* Q: schedule() and implied barriers on arm64

From: Peter Zijlstra @ 2015-10-16 15:18 UTC
To: Will Deacon, Paul McKenney
Cc: linux-kernel, Oleg Nesterov, Ingo Molnar

Hi,

IIRC Paul relies on schedule() implying a full memory barrier with strong
transitivity for RCU.

If not, ignore this email.

If so, however, I suspect AARGH64 is borken and would need (just like PPC):

	#define smp_mb__before_spinlock()	smp_mb()

The problem is that schedule() (when a NO-OP) does:

	smp_mb__before_spinlock();
	LOCK rq->lock

	clear_bit()

	UNLOCK rq->lock

And nothing there implies a full barrier on AARGH64, since
smp_mb__before_spinlock() defaults to WMB, LOCK is an "ldaxr" or
load-acquire, UNLOCK is "stlrh" or store-release and clear_bit() isn't
anything.

Pretty much every other arch has LOCK implying a full barrier, either
because its strongly ordered or because it needs one for the ACQUIRE
semantics.
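[Not part of the original thread: a minimal sketch of the definitions Peter
is referring to, reconstructed from roughly v4.3-era sources; exact file
locations and comments may differ. The generic fallback for
smp_mb__before_spinlock() was only a write barrier, and the proposal is
that arm64 would need the same full-barrier override that powerpc already
carries.]

	/*
	 * include/linux/spinlock.h -- generic fallback: a write barrier is
	 * enough to keep a STORE issued before the lock from leaking into
	 * the critical section, but it is not a full barrier.
	 */
	#ifndef smp_mb__before_spinlock
	#define smp_mb__before_spinlock()	smp_wmb()
	#endif

	/*
	 * arch/powerpc/include/asm/barrier.h -- powerpc promotes it to a
	 * full barrier; the suggestion above is that arm64 would need the
	 * same.
	 */
	#define smp_mb__before_spinlock()	smp_mb()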
* Re: Q: schedule() and implied barriers on arm64

From: Paul E. McKenney @ 2015-10-16 16:04 UTC
To: Peter Zijlstra
Cc: Will Deacon, linux-kernel, Oleg Nesterov, Ingo Molnar

On Fri, Oct 16, 2015 at 05:18:30PM +0200, Peter Zijlstra wrote:
> Hi,
>
> IIRC Paul relies on schedule() implying a full memory barrier with
> strong transitivity for RCU.
>
> If not, ignore this email.

Not so sure about schedule(), but definitely need strong transitivity for
the rcu_node structure's ->lock field.  And the atomic operations on the
rcu_dyntick structure's fields when entering or leaving the idle loop.

With schedule, the thread later reports the quiescent state, which involves
acquiring the rcu_node structure's ->lock field.  So I -think- that the
locks in the scheduler can be weakly transitive.

> If so, however, I suspect AARGH64 is borken and would need (just like
> PPC):
>
>	#define smp_mb__before_spinlock()	smp_mb()
>
> The problem is that schedule() (when a NO-OP) does:
>
>	smp_mb__before_spinlock();
>	LOCK rq->lock
>
>	clear_bit()
>
>	UNLOCK rq->lock
>
> And nothing there implies a full barrier on AARGH64, since
> smp_mb__before_spinlock() defaults to WMB, LOCK is an "ldaxr" or
> load-acquire, UNLOCK is "stlrh" or store-release and clear_bit() isn't
> anything.
>
> Pretty much every other arch has LOCK implying a full barrier, either
> because its strongly ordered or because it needs one for the ACQUIRE
> semantics.

Well, arm64 might well need smp_mb__after_unlock_lock() to be non-empty.
But I thought that it used a dmb in the spinlock code somewhere or
another...

							Thanx, Paul
* Re: Q: schedule() and implied barriers on arm64

From: Peter Zijlstra @ 2015-10-16 16:16 UTC
To: Paul E. McKenney
Cc: Will Deacon, linux-kernel, Oleg Nesterov, Ingo Molnar

On Fri, Oct 16, 2015 at 09:04:22AM -0700, Paul E. McKenney wrote:
> On Fri, Oct 16, 2015 at 05:18:30PM +0200, Peter Zijlstra wrote:
> > Hi,
> >
> > IIRC Paul relies on schedule() implying a full memory barrier with
> > strong transitivity for RCU.
> >
> > If not, ignore this email.
>
> Not so sure about schedule(), but definitely need strong transitivity
> for the rcu_node structure's ->lock field.  And the atomic operations
> on the rcu_dyntick structure's fields when entering or leaving the
> idle loop.
>
> With schedule, the thread later reports the quiescent state, which
> involves acquiring the rcu_node structure's ->lock field.  So I -think-
> that the locks in the scheduler can be weakly transitive.

So I _thought_ you needed this to separate the preempt_disabled sections.
Such that rcu_note_context_switch() is guaranteed to be done before a new
preempt_disabled region starts.

But if you really only need program order guarantees for that, and deal
with everything else from your tick, then that's fine too.

Maybe some previous RCU variant relied on this?

> > If so, however, I suspect AARGH64 is borken and would need (just like
> > PPC):
> >
> >	#define smp_mb__before_spinlock()	smp_mb()
> >
> > The problem is that schedule() (when a NO-OP) does:
> >
> >	smp_mb__before_spinlock();
> >	LOCK rq->lock
> >
> >	clear_bit()
> >
> >	UNLOCK rq->lock
> >
> > And nothing there implies a full barrier on AARGH64, since
> > smp_mb__before_spinlock() defaults to WMB, LOCK is an "ldaxr" or
> > load-acquire, UNLOCK is "stlrh" or store-release and clear_bit() isn't
> > anything.
> >
> > Pretty much every other arch has LOCK implying a full barrier, either
> > because its strongly ordered or because it needs one for the ACQUIRE
> > semantics.
>
> But I thought that it used a dmb in the spinlock code somewhere or
> another...

arm does, arm64 not so much.

> Well, arm64 might well need smp_mb__after_unlock_lock() to be non-empty.

Its UNLOCK+LOCK should be RCsc, so that should be good.  Its just that
LOCK+UNLOCK isn't anything.
* Re: Q: schedule() and implied barriers on arm64

From: Paul E. McKenney @ 2015-10-16 16:28 UTC
To: Peter Zijlstra
Cc: Will Deacon, linux-kernel, Oleg Nesterov, Ingo Molnar

On Fri, Oct 16, 2015 at 06:16:08PM +0200, Peter Zijlstra wrote:
> On Fri, Oct 16, 2015 at 09:04:22AM -0700, Paul E. McKenney wrote:
> > On Fri, Oct 16, 2015 at 05:18:30PM +0200, Peter Zijlstra wrote:
> > > Hi,
> > >
> > > IIRC Paul relies on schedule() implying a full memory barrier with
> > > strong transitivity for RCU.
> > >
> > > If not, ignore this email.
> >
> > Not so sure about schedule(), but definitely need strong transitivity
> > for the rcu_node structure's ->lock field.  And the atomic operations
> > on the rcu_dyntick structure's fields when entering or leaving the
> > idle loop.
> >
> > With schedule, the thread later reports the quiescent state, which
> > involves acquiring the rcu_node structure's ->lock field.  So I -think-
> > that the locks in the scheduler can be weakly transitive.
>
> So I _thought_ you needed this to separate the preempt_disabled
> sections.  Such that rcu_note_context_switch() is guaranteed to be done
> before a new preempt_disabled region starts.
>
> But if you really only need program order guarantees for that, and deal
> with everything else from your tick, then that's fine too.
>
> Maybe some previous RCU variant relied on this?

Yes, older versions did rely on this.  Now, only the CPU itself observes
RCU's state changes during context switch.  I couldn't tell you exactly
when this changed.  :-/

With the exception of some synchronize_sched_expedited() cases, but in
those cases, RCU code acquires the CPU's leaf rcu_node structure's ->lock,
and with the required strong transitivity.

> > > If so, however, I suspect AARGH64 is borken and would need (just like
> > > PPC):
> > >
> > >	#define smp_mb__before_spinlock()	smp_mb()
> > >
> > > The problem is that schedule() (when a NO-OP) does:
> > >
> > >	smp_mb__before_spinlock();
> > >	LOCK rq->lock
> > >
> > >	clear_bit()
> > >
> > >	UNLOCK rq->lock
> > >
> > > And nothing there implies a full barrier on AARGH64, since
> > > smp_mb__before_spinlock() defaults to WMB, LOCK is an "ldaxr" or
> > > load-acquire, UNLOCK is "stlrh" or store-release and clear_bit() isn't
> > > anything.
> > >
> > > Pretty much every other arch has LOCK implying a full barrier, either
> > > because its strongly ordered or because it needs one for the ACQUIRE
> > > semantics.
> >
> > But I thought that it used a dmb in the spinlock code somewhere or
> > another...
>
> arm does, arm64 not so much.
>
> > Well, arm64 might well need smp_mb__after_unlock_lock() to be non-empty.
>
> Its UNLOCK+LOCK should be RCsc, so that should be good.  Its just that
> LOCK+UNLOCK isn't anything.

Ah!  If RCU relies on LOCK+UNLOCK being a barrier of any sort, that is a
bug in RCU that needs fixing.

							Thanx, Paul
* Re: Q: schedule() and implied barriers on arm64

From: Peter Zijlstra @ 2015-10-16 16:39 UTC
To: Paul E. McKenney
Cc: Will Deacon, linux-kernel, Oleg Nesterov, Ingo Molnar

On Fri, Oct 16, 2015 at 09:28:24AM -0700, Paul E. McKenney wrote:
> > Maybe some previous RCU variant relied on this?
>
> Yes, older versions did rely on this.  Now, only the CPU itself observes
> RCU's state changes during context switch.  I couldn't tell you exactly
> when this changed.  :-/
>
> With the exception of some synchronize_sched_expedited() cases, but in
> those cases, RCU code acquires the CPU's leaf rcu_node structure's
> ->lock, and with the required strong transitivity.

OK, so I can scrap this 'requirement' from my list.  All sorted, thanks!

> > > Well, arm64 might well need smp_mb__after_unlock_lock() to be non-empty.
> >
> > Its UNLOCK+LOCK should be RCsc, so that should be good.  Its just that
> > LOCK+UNLOCK isn't anything.
>
> Ah!  If RCU relies on LOCK+UNLOCK being a barrier of any sort, that is a
> bug in RCU that needs fixing.

Don't think RCU does that, but it's what schedule() provides in the weakest
case.  Hence my question here.
* Re: Q: schedule() and implied barriers on arm64

From: Catalin Marinas @ 2015-10-16 16:55 UTC
To: Peter Zijlstra
Cc: Paul E. McKenney, Will Deacon, Linux Kernel Mailing List, Oleg Nesterov, Ingo Molnar

I'll try to reply in Will's absence, though I gave up trying to understand
these threads long time ago ;).

On 16 October 2015 at 17:16, Peter Zijlstra <peterz@infradead.org> wrote:
> On Fri, Oct 16, 2015 at 09:04:22AM -0700, Paul E. McKenney wrote:
>> On Fri, Oct 16, 2015 at 05:18:30PM +0200, Peter Zijlstra wrote:
>> > If so, however, I suspect AARGH64 is borken and would need (just like
>> > PPC):
>> >
>> >	#define smp_mb__before_spinlock()	smp_mb()
>> >
>> > The problem is that schedule() (when a NO-OP) does:
>> >
>> >	smp_mb__before_spinlock();
>> >	LOCK rq->lock
>> >
>> >	clear_bit()
>> >
>> >	UNLOCK rq->lock
>> >
>> > And nothing there implies a full barrier on AARGH64, since
>> > smp_mb__before_spinlock() defaults to WMB, LOCK is an "ldaxr" or
>> > load-acquire, UNLOCK is "stlrh" or store-release and clear_bit() isn't
>> > anything.
>> >
>> > Pretty much every other arch has LOCK implying a full barrier, either
>> > because its strongly ordered or because it needs one for the ACQUIRE
>> > semantics.
>>
>> But I thought that it used a dmb in the spinlock code somewhere or
>> another...
>
> arm does, arm64 not so much.

arm64 indeed does not have a dmb after spin_lock, it only has a
load-acquire.  So with the default smp_mb__before_spinlock() + spin_lock
we have:

	smp_wmb()
	loop
		load-acquire
		store

So (I think) this guarantees that any writes before wmb+lock would be
visible before any reads _and_ writes after wmb+lock.  However, the
ordering with reads before wmb+lock is not guaranteed.

-- 
Catalin
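[Not part of the original thread: a hedged sketch of the outcome Catalin's
last sentence leaves open.  cpu0() mimics the schedule() NO-OP path with the
default smp_mb__before_spinlock(); cpu1() uses a real smp_mb().  The lock
and variable names are illustrative only, and the store to y merely stands
in for the clear_bit() done under rq->lock.]

	static DEFINE_SPINLOCK(rq_lock);
	static int x, y;		/* both initially zero */
	static int r0, r1;

	static void cpu0(void)
	{
		r0 = READ_ONCE(x);	/* read issued before the lock         */
		smp_wmb();		/* default smp_mb__before_spinlock()   */
		spin_lock(&rq_lock);	/* arm64: load-acquire, no dmb         */
		WRITE_ONCE(y, 1);	/* stands in for the clear_bit() store */
		spin_unlock(&rq_lock);	/* arm64: store-release                */
	}

	static void cpu1(void)
	{
		r1 = READ_ONCE(y);
		smp_mb();
		WRITE_ONCE(x, 1);
	}

	/*
	 * If the wmb+lock sequence were a full barrier, r0 == 1 && r1 == 1
	 * would be forbidden.  Here nothing orders cpu0()'s read of x
	 * against its later store to y: smp_wmb() only orders stores and
	 * the load-acquire only keeps later accesses from moving before
	 * it, so the store to y may become visible before the read of x
	 * is satisfied, and the outcome is architecturally allowed.
	 */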
* Re: Q: schedule() and implied barriers on arm64

From: Paul E. McKenney @ 2015-10-16 17:28 UTC
To: Catalin Marinas
Cc: Peter Zijlstra, Will Deacon, Linux Kernel Mailing List, Oleg Nesterov, Ingo Molnar

On Fri, Oct 16, 2015 at 05:55:35PM +0100, Catalin Marinas wrote:
> I'll try to reply in Will's absence, though I gave up trying to
> understand these threads long time ago ;).
>
> On 16 October 2015 at 17:16, Peter Zijlstra <peterz@infradead.org> wrote:
> > On Fri, Oct 16, 2015 at 09:04:22AM -0700, Paul E. McKenney wrote:
> >> On Fri, Oct 16, 2015 at 05:18:30PM +0200, Peter Zijlstra wrote:
> >> > If so, however, I suspect AARGH64 is borken and would need (just like
> >> > PPC):
> >> >
> >> >	#define smp_mb__before_spinlock()	smp_mb()
> >> >
> >> > The problem is that schedule() (when a NO-OP) does:
> >> >
> >> >	smp_mb__before_spinlock();
> >> >	LOCK rq->lock
> >> >
> >> >	clear_bit()
> >> >
> >> >	UNLOCK rq->lock
> >> >
> >> > And nothing there implies a full barrier on AARGH64, since
> >> > smp_mb__before_spinlock() defaults to WMB, LOCK is an "ldaxr" or
> >> > load-acquire, UNLOCK is "stlrh" or store-release and clear_bit() isn't
> >> > anything.
> >> >
> >> > Pretty much every other arch has LOCK implying a full barrier, either
> >> > because its strongly ordered or because it needs one for the ACQUIRE
> >> > semantics.
> >>
> >> But I thought that it used a dmb in the spinlock code somewhere or
> >> another...
> >
> > arm does, arm64 not so much.
>
> arm64 indeed does not have a dmb after spin_lock, it only has a
> load-acquire.  So with the default smp_mb__before_spinlock() +
> spin_lock we have:
>
>	smp_wmb()
>	loop
>		load-acquire
>		store
>
> So (I think) this guarantees that any writes before wmb+lock would be
> visible before any reads _and_ writes after wmb+lock.  However, the
> ordering with reads before wmb+lock is not guaranteed.

So RCU needs the following sort of guarantee:

	void task1(unsigned long flags)
	{
		WRITE_ONCE(x, 1);
		WRITE_ONCE(z, 1);
		raw_spin_unlock_irqrestore(&rnp->lock, flags);
	}

	void task2(unsigned long *flags)
	{
		raw_spin_lock_irqsave(&rnp->lock, *flags);
		smp_mb__after_unlock_lock();
		r1 = READ_ONCE(y);
		r2 = READ_ONCE(z);
	}

	void task3(void)
	{
		WRITE_ONCE(y, 1);
		smp_mb();
		r3 = READ_ONCE(x);
	}

	BUG_ON(!r1 && r2 && !r3); /* After the dust settles. */

In other words, if task2() acquires the lock after task1() releases it,
all CPUs must agree on the order of the operations in the two critical
sections, even if these other CPUs don't acquire the lock.

This same guarantee is needed if task1() and then task2() run in
succession on the same CPU with no additional synchronization of any sort.

Does this work on arm64?

							Thanx, Paul
* Re: Q: schedule() and implied barriers on arm64

From: Peter Zijlstra @ 2015-10-16 19:07 UTC
To: Paul E. McKenney
Cc: Catalin Marinas, Will Deacon, Linux Kernel Mailing List, Oleg Nesterov, Ingo Molnar

On Fri, Oct 16, 2015 at 10:28:11AM -0700, Paul E. McKenney wrote:
> In other words, if task2() acquires the lock after task1() releases it,
> all CPUs must agree on the order of the operations in the two critical
> sections, even if these other CPUs don't acquire the lock.
>
> This same guarantee is needed if task1() and then task2() run in
> succession on the same CPU with no additional synchronization of any sort.
>
> Does this work on arm64?

Yes, their load-acquire and store-release are RCsc.
* Re: Q: schedule() and implied barriers on arm64

From: Paul E. McKenney @ 2015-10-16 19:20 UTC
To: Peter Zijlstra
Cc: Catalin Marinas, Will Deacon, Linux Kernel Mailing List, Oleg Nesterov, Ingo Molnar

On Fri, Oct 16, 2015 at 09:07:41PM +0200, Peter Zijlstra wrote:
> On Fri, Oct 16, 2015 at 10:28:11AM -0700, Paul E. McKenney wrote:
> > In other words, if task2() acquires the lock after task1() releases it,
> > all CPUs must agree on the order of the operations in the two critical
> > sections, even if these other CPUs don't acquire the lock.
> >
> > This same guarantee is needed if task1() and then task2() run in
> > succession on the same CPU with no additional synchronization of any sort.
> >
> > Does this work on arm64?
>
> Yes, their load-acquire and store-release are RCsc.

Whew!!!

							Thanx, Paul
* Re: Q: schedule() and implied barriers on arm64

From: Catalin Marinas @ 2015-10-19 15:18 UTC
To: Paul E. McKenney
Cc: Peter Zijlstra, Will Deacon, Linux Kernel Mailing List, Oleg Nesterov, Ingo Molnar

On Fri, Oct 16, 2015 at 10:28:11AM -0700, Paul E. McKenney wrote:
> So RCU needs the following sort of guarantee:
>
>	void task1(unsigned long flags)
>	{
>		WRITE_ONCE(x, 1);
>		WRITE_ONCE(z, 1);
>		raw_spin_unlock_irqrestore(&rnp->lock, flags);
>	}
>
>	void task2(unsigned long *flags)
>	{
>		raw_spin_lock_irqsave(&rnp->lock, *flags);
>		smp_mb__after_unlock_lock();
>		r1 = READ_ONCE(y);
>		r2 = READ_ONCE(z);
>	}
>
>	void task3(void)
>	{
>		WRITE_ONCE(y, 1);
>		smp_mb();
>		r3 = READ_ONCE(x);
>	}
>
>	BUG_ON(!r1 && r2 && !r3); /* After the dust settles. */
>
> In other words, if task2() acquires the lock after task1() releases it,
> all CPUs must agree on the order of the operations in the two critical
> sections, even if these other CPUs don't acquire the lock.
>
> This same guarantee is needed if task1() and then task2() run in
> succession on the same CPU with no additional synchronization of any sort.
>
> Does this work on arm64?

I think it does.

If r3 == 0, it means that READ_ONCE(x) in task3 is "observed" (in ARM ARM
terms) by task1 before WRITE_ONCE(x, 1).  The smp_mb() in task3 implies
that WRITE_ONCE(y, 1) is also observed by task1.

A store-release is multi-copy atomic when "observed" with a load-acquire
(from task2).  When on the same CPU, they are always observed in program
order.  The store-release on ARM has the property that writes observed by
task1 before the store-release (that is, WRITE_ONCE(y, 1) in task3) will
be observed by other observers (task2) before the store-release is
observed (the unlock).

The above rules guarantee that, when r3 == 0, WRITE_ONCE(y, 1) in task3 is
observed by task2 (and task1), hence r1 == 1.

(a more formal proof would have to wait for Will to come back from
holiday ;))

-- 
Catalin
* Re: Q: schedule() and implied barriers on arm64

From: Peter Zijlstra @ 2015-10-16 19:06 UTC
To: Catalin Marinas
Cc: Paul E. McKenney, Will Deacon, Linux Kernel Mailing List, Oleg Nesterov, Ingo Molnar

On Fri, Oct 16, 2015 at 05:55:35PM +0100, Catalin Marinas wrote:
> arm64 indeed does not have a dmb after spin_lock, it only has a
> load-acquire.  So with the default smp_mb__before_spinlock() +
> spin_lock we have:
>
>	smp_wmb()
>	loop
>		load-acquire
>		store
>
> So (I think) this guarantees that any writes before wmb+lock would be
> visible before any reads _and_ writes after wmb+lock.  However, the
> ordering with reads before wmb+lock is not guaranteed.

That is my understanding as well, and stores could creep up from below the
unlock and then the reads and those stores can cross and you've lost.

In any case, its all moot now, since Paul no longer requires schedule() to
imply a full barrier.
* Re: Q: schedule() and implied barriers on arm64

From: Ingo Molnar @ 2015-10-19 7:06 UTC
To: Peter Zijlstra
Cc: Catalin Marinas, Paul E. McKenney, Will Deacon, Linux Kernel Mailing List, Oleg Nesterov, Linus Torvalds, Andrew Morton

* Peter Zijlstra <peterz@infradead.org> wrote:

> In any case, its all moot now, since Paul no longer requires schedule()
> to imply a full barrier.
>
> [...]

Nevertheless from a least-surprise POV it might be worth guaranteeing it,
because I bet there's tons of code that assumes that schedule() is a heavy
operation and it's such an easy mistake to make.  Since we are so close to
having that guarantee, we might as well codify it?

Just like system calls are assumed to be barriers in general - and system
calls are more lightweight than schedule() ...

Thanks,

	Ingo
* Re: Q: schedule() and implied barriers on arm64

From: Peter Zijlstra @ 2015-10-19 9:04 UTC
To: Ingo Molnar
Cc: Catalin Marinas, Paul E. McKenney, Will Deacon, Linux Kernel Mailing List, Oleg Nesterov, Linus Torvalds, Andrew Morton

On Mon, Oct 19, 2015 at 09:06:05AM +0200, Ingo Molnar wrote:
>
> * Peter Zijlstra <peterz@infradead.org> wrote:
>
> > In any case, its all moot now, since Paul no longer requires schedule()
> > to imply a full barrier.
> >
> > [...]
>
> Nevertheless from a least-surprise POV it might be worth guaranteeing it,
> because I bet there's tons of code that assumes that schedule() is a
> heavy operation and it's such an easy mistake to make.  Since we are so
> close to having that guarantee, we might as well codify it?
>
> Just like system calls are assumed to be barriers in general -

Are they?  I know they are on some platforms, but I'm not sure we've
audited them all and established this.

> and system calls are more lightweight than schedule() ...

Hopefully, although if you enable nohz_full there's a fair chance to
reverse that :-)
* Re: Q: schedule() and implied barriers on arm64

From: Catalin Marinas @ 2015-10-19 15:21 UTC
To: Ingo Molnar
Cc: Peter Zijlstra, Paul E. McKenney, Will Deacon, Linux Kernel Mailing List, Oleg Nesterov, Linus Torvalds, Andrew Morton

On Mon, Oct 19, 2015 at 09:06:05AM +0200, Ingo Molnar wrote:
> * Peter Zijlstra <peterz@infradead.org> wrote:
>
> > In any case, its all moot now, since Paul no longer requires schedule()
> > to imply a full barrier.
> >
> > [...]
>
> Nevertheless from a least-surprise POV it might be worth guaranteeing it,
> because I bet there's tons of code that assumes that schedule() is a
> heavy operation and it's such an easy mistake to make.  Since we are so
> close to having that guarantee, we might as well codify it?

FWIW, the arm64 __switch_to() has a heavy barrier (DSB) but the reason for
this was to cope with potentially interrupted cache or TLB maintenance
(which require a DSB on the same CPU) and thread migration to another CPU.

-- 
Catalin
* Re: Q: schedule() and implied barriers on arm64

From: Peter Zijlstra @ 2015-10-19 16:24 UTC
To: Catalin Marinas
Cc: Ingo Molnar, Paul E. McKenney, Will Deacon, Linux Kernel Mailing List, Oleg Nesterov, Linus Torvalds, Andrew Morton

On Mon, Oct 19, 2015 at 04:21:08PM +0100, Catalin Marinas wrote:
> On Mon, Oct 19, 2015 at 09:06:05AM +0200, Ingo Molnar wrote:
> > * Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > > In any case, its all moot now, since Paul no longer requires
> > > schedule() to imply a full barrier.
> > >
> > > [...]
> >
> > Nevertheless from a least-surprise POV it might be worth guaranteeing
> > it, because I bet there's tons of code that assumes that schedule() is
> > a heavy operation and it's such an easy mistake to make.  Since we are
> > so close to having that guarantee, we might as well codify it?
>
> FWIW, the arm64 __switch_to() has a heavy barrier (DSB) but the reason
> for this was to cope with potentially interrupted cache or TLB
> maintenance (which require a DSB on the same CPU) and thread migration
> to another CPU.

Right, but there's a path through schedule() that does not pass through
__switch_to(); when we pick the current task as the most eligible task and
next == prev.

In that case there really only is the wmb, a spin lock, an atomic op and a
spin unlock (and a whole bunch of 'normal' code of course).
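[Not part of the original thread: a heavily abridged sketch of the
__schedule() path being discussed, roughly as it looked around v4.3.  This
is reconstructed from memory, so helper names and the surrounding details
may differ; the elided parts are marked.]

	static void __sched __schedule(void)
	{
		struct task_struct *prev, *next;
		struct rq *rq;
		int cpu = smp_processor_id();

		rq = cpu_rq(cpu);
		prev = rq->curr;
		/* ... */

		smp_mb__before_spinlock();	/* smp_wmb() by default       */
		raw_spin_lock_irq(&rq->lock);	/* arm64: load-acquire only   */
		/* ... */

		next = pick_next_task(rq, prev);
		clear_tsk_need_resched(prev);	/* the "atomic op": clear_bit() */
		clear_preempt_need_resched();

		if (likely(prev != next)) {
			/* ... */
			/* context_switch() unlocks the rq; on arm64 the
			 * __switch_to() inside it carries a DSB. */
			rq = context_switch(rq, prev, next);
		} else {
			/* next == prev: no __switch_to(), hence no DSB  */
			raw_spin_unlock_irq(&rq->lock);	/* store-release only */
		}
		/* ... */
	}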
* Re: Q: schedule() and implied barriers on arm64

From: Ingo Molnar @ 2015-10-20 8:37 UTC
To: Peter Zijlstra
Cc: Catalin Marinas, Paul E. McKenney, Will Deacon, Linux Kernel Mailing List, Oleg Nesterov, Linus Torvalds, Andrew Morton

* Peter Zijlstra <peterz@infradead.org> wrote:

> On Mon, Oct 19, 2015 at 04:21:08PM +0100, Catalin Marinas wrote:
> > On Mon, Oct 19, 2015 at 09:06:05AM +0200, Ingo Molnar wrote:
> > > * Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > > In any case, its all moot now, since Paul no longer requires
> > > > schedule() to imply a full barrier.
> > > >
> > > > [...]
> > >
> > > Nevertheless from a least-surprise POV it might be worth guaranteeing
> > > it, because I bet there's tons of code that assumes that schedule()
> > > is a heavy operation and it's such an easy mistake to make.  Since we
> > > are so close to having that guarantee, we might as well codify it?
> >
> > FWIW, the arm64 __switch_to() has a heavy barrier (DSB) but the reason
> > for this was to cope with potentially interrupted cache or TLB
> > maintenance (which require a DSB on the same CPU) and thread migration
> > to another CPU.
>
> Right, but there's a path through schedule() that does not pass through
> __switch_to(); when we pick the current task as the most eligible task
> and next == prev.
>
> In that case there really only is the wmb, a spin lock, an atomic op and
> a spin unlock (and a whole bunch of 'normal' code of course).

Yeah, so my concern is that this is a rare race that might be 'surprising'
for developers relying on various schedule() constructs.  Especially as
it's a full barrier on x86 (the most prominent SMP platform at the moment)
there's a real danger of hard to debug bugs creeping to other
architectures.

So I think we should just do the small step of making it a full barrier
everywhere - it's very close to it in any case, and it shouldn't really
matter for performance.

Agreed?

Thanks,

	Ingo
* Re: Q: schedule() and implied barriers on arm64

From: Will Deacon @ 2015-10-27 16:19 UTC
To: Peter Zijlstra
Cc: Catalin Marinas, Ingo Molnar, Paul E. McKenney, Linux Kernel Mailing List, Oleg Nesterov, Linus Torvalds, Andrew Morton

On Mon, Oct 19, 2015 at 06:24:23PM +0200, Peter Zijlstra wrote:
> On Mon, Oct 19, 2015 at 04:21:08PM +0100, Catalin Marinas wrote:
> > On Mon, Oct 19, 2015 at 09:06:05AM +0200, Ingo Molnar wrote:
> > > * Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > > In any case, its all moot now, since Paul no longer requires
> > > > schedule() to imply a full barrier.
> > > >
> > > > [...]
> > >
> > > Nevertheless from a least-surprise POV it might be worth guaranteeing
> > > it, because I bet there's tons of code that assumes that schedule()
> > > is a heavy operation and it's such an easy mistake to make.  Since we
> > > are so close to having that guarantee, we might as well codify it?
> >
> > FWIW, the arm64 __switch_to() has a heavy barrier (DSB) but the reason
> > for this was to cope with potentially interrupted cache or TLB
> > maintenance (which require a DSB on the same CPU) and thread migration
> > to another CPU.
>
> Right, but there's a path through schedule() that does not pass through
> __switch_to(); when we pick the current task as the most eligible task
> and next == prev.
>
> In that case there really only is the wmb, a spin lock, an atomic op and
> a spin unlock (and a whole bunch of 'normal' code of course).

... and the 'normal' code will have a control hazard somewhere, followed by
the implicit ISB in exception return, so there's a barrier of sorts there
too.

The problem is that people say "full barrier" without defining what it
really means, and we end up going round the houses on things like
transitivity (which ctrl + isb doesn't always give you).

Will
* Re: Q: schedule() and implied barriers on arm64

From: Peter Zijlstra @ 2015-10-27 18:40 UTC
To: Will Deacon
Cc: Catalin Marinas, Ingo Molnar, Paul E. McKenney, Linux Kernel Mailing List, Oleg Nesterov, Linus Torvalds, Andrew Morton

On Tue, Oct 27, 2015 at 04:19:48PM +0000, Will Deacon wrote:
> ... and the 'normal' code will have a control hazard somewhere, followed
> by the implicit ISB in exception return, so there's a barrier of sorts
> there too.

Which exception return?

> The problem is that people say "full barrier" without defining what it
> really means, and we end up going round the houses on things like
> transitivity (which ctrl + isb doesn't always give you).

I pretty much meant smp_mb() here :-)
* Re: Q: schedule() and implied barriers on arm64

From: Will Deacon @ 2015-10-28 10:39 UTC
To: Peter Zijlstra
Cc: Catalin Marinas, Ingo Molnar, Paul E. McKenney, Linux Kernel Mailing List, Oleg Nesterov, Linus Torvalds, Andrew Morton

On Tue, Oct 27, 2015 at 07:40:20PM +0100, Peter Zijlstra wrote:
> On Tue, Oct 27, 2015 at 04:19:48PM +0000, Will Deacon wrote:
> > ... and the 'normal' code will have a control hazard somewhere, followed
> > by the implicit ISB in exception return, so there's a barrier of sorts
> > there too.
>
> Which exception return?

The return to userspace after the interrupt/fault/system call that got us
into the kernel.

> > The problem is that people say "full barrier" without defining what it
> > really means, and we end up going round the houses on things like
> > transitivity (which ctrl + isb doesn't always give you).
>
> I pretty much meant smp_mb() here :-)

In which case, we don't provide the transitivity guarantees that you would
get from an smp_mb().

Will
* Re: Q: schedule() and implied barriers on arm64

From: Catalin Marinas @ 2015-10-16 17:11 UTC
To: Paul McKenney
Cc: Peter Zijlstra, Will Deacon, Linux Kernel Mailing List, Oleg Nesterov, Ingo Molnar

On 16 October 2015 at 17:04, Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> On Fri, Oct 16, 2015 at 05:18:30PM +0200, Peter Zijlstra wrote:
>> If so, however, I suspect AARGH64 is borken and would need (just like
>> PPC):
>>
>>	#define smp_mb__before_spinlock()	smp_mb()
>>
>> The problem is that schedule() (when a NO-OP) does:
>>
>>	smp_mb__before_spinlock();
>>	LOCK rq->lock
>>
>>	clear_bit()
>>
>>	UNLOCK rq->lock
>>
>> And nothing there implies a full barrier on AARGH64, since
>> smp_mb__before_spinlock() defaults to WMB, LOCK is an "ldaxr" or
>> load-acquire, UNLOCK is "stlrh" or store-release and clear_bit() isn't
>> anything.
>>
>> Pretty much every other arch has LOCK implying a full barrier, either
>> because its strongly ordered or because it needs one for the ACQUIRE
>> semantics.
>
> Well, arm64 might well need smp_mb__after_unlock_lock() to be non-empty.
> But I thought that it used a dmb in the spinlock code somewhere or
> another...

unlock+lock on arm64 is a full barrier, so, IIUC the semantics of
smp_mb__after_unlock_lock(), I don't think we need this to be non-empty.

Maybe redefining smp_mb__before_spinlock() to be smp_mb() (rather than wmb)
on arm64 if we need smp_mb__before_spinlock+lock to act as a barrier for
both reads and writes (see my other reply to Peter).

-- 
Catalin