* [PATCH RFC] sched: Make wake_up_nohz_cpu() handle CPUs going offline @ 2016-06-30 17:58 Paul E. McKenney 2016-06-30 23:29 ` Frederic Weisbecker 0 siblings, 1 reply; 9+ messages in thread From: Paul E. McKenney @ 2016-06-30 17:58 UTC (permalink / raw) To: peterz, fweisbec, tglx; +Cc: linux-kernel, rgkernel Both timers and hrtimers are maintained on the outgoing CPU until CPU_DEAD time, at which point they are migrated to a surviving CPU. If a mod_timer() executes between CPU_DYING and CPU_DEAD time, x86 systems will splat in native_smp_send_reschedule() when attempting to wake up the just-now-offlined CPU, as shown below from a NO_HZ_FULL kernel: [ 7976.741556] WARNING: CPU: 0 PID: 661 at /home/paulmck/public_git/linux-rcu/arch/x86/kernel/smp.c:125 native_smp_send_reschedule+0x39/0x40 [ 7976.741595] Modules linked in: [ 7976.741595] CPU: 0 PID: 661 Comm: rcu_torture_rea Not tainted 4.7.0-rc2+ #1 [ 7976.741595] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 [ 7976.741595] 0000000000000000 ffff88000002fcc8 ffffffff8138ab2e 0000000000000000 [ 7976.741595] 0000000000000000 ffff88000002fd08 ffffffff8105cabc 0000007d1fd0ee18 [ 7976.741595] 0000000000000001 ffff88001fd16d40 ffff88001fd0ee00 ffff88001fd0ee00 [ 7976.741595] Call Trace: [ 7976.741595] [<ffffffff8138ab2e>] dump_stack+0x67/0x99 [ 7976.741595] [<ffffffff8105cabc>] __warn+0xcc/0xf0 [ 7976.741595] [<ffffffff8105cb98>] warn_slowpath_null+0x18/0x20 [ 7976.741595] [<ffffffff8103cba9>] native_smp_send_reschedule+0x39/0x40 [ 7976.741595] [<ffffffff81089bc2>] wake_up_nohz_cpu+0x82/0x190 [ 7976.741595] [<ffffffff810d275a>] internal_add_timer+0x7a/0x80 [ 7976.741595] [<ffffffff810d3ee7>] mod_timer+0x187/0x2b0 [ 7976.741595] [<ffffffff810c89dd>] rcu_torture_reader+0x33d/0x380 [ 7976.741595] [<ffffffff810c66f0>] ? sched_torture_read_unlock+0x30/0x30 [ 7976.741595] [<ffffffff810c86a0>] ? 
rcu_bh_torture_read_lock+0x80/0x80 [ 7976.741595] [<ffffffff8108068f>] kthread+0xdf/0x100 [ 7976.741595] [<ffffffff819dd83f>] ret_from_fork+0x1f/0x40 [ 7976.741595] [<ffffffff810805b0>] ? kthread_create_on_node+0x200/0x200 However, in this case, the wakeup is redundant, because the timer migration will reprogram timer hardware as needed. Note that the fact that preemption is disabled does not avoid the splat, as the offline operation has already passed both the synchronize_sched() and the stop_machine() that would be blocked by disabled preemption. This commit therefore modifies wake_up_nohz_cpu() to avoid attempting to wake up offline CPUs. It also adds a comment stating that the caller must tolerate lost wakeups when the target CPU is going offline, and suggesting the CPU_DEAD notifier as a recovery mechanism. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> --- core.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 7f2cae4620c7..08502966e7df 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -590,9 +590,14 @@ static bool wake_up_full_nohz_cpu(int cpu) return false; } +/* + * Wake up the specified CPU. If the CPU is going offline, it is the + * caller's responsibility to deal with the lost wakeup, for example, + * by hooking into the CPU_DEAD notifier like timers and hrtimers do. + */ void wake_up_nohz_cpu(int cpu) { - if (!wake_up_full_nohz_cpu(cpu)) + if (cpu_online(cpu) && !wake_up_full_nohz_cpu(cpu)) wake_up_idle_cpu(cpu); } ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH RFC] sched: Make wake_up_nohz_cpu() handle CPUs going offline 2016-06-30 17:58 [PATCH RFC] sched: Make wake_up_nohz_cpu() handle CPUs going offline Paul E. McKenney @ 2016-06-30 23:29 ` Frederic Weisbecker 2016-07-01 18:40 ` Paul E. McKenney 2016-07-12 14:19 ` Peter Zijlstra 0 siblings, 2 replies; 9+ messages in thread From: Frederic Weisbecker @ 2016-06-30 23:29 UTC (permalink / raw) To: Paul E. McKenney; +Cc: peterz, tglx, linux-kernel, rgkernel On Thu, Jun 30, 2016 at 10:58:45AM -0700, Paul E. McKenney wrote: > Both timers and hrtimers are maintained on the outgoing CPU until > CPU_DEAD time, at which point they are migrated to a surviving CPU. If a > mod_timer() executes between CPU_DYING and CPU_DEAD time, x86 systems > will splat in native_smp_send_reschedule() when attempting to wake up > the just-now-offlined CPU, as shown below from a NO_HZ_FULL kernel: > > [ 7976.741556] WARNING: CPU: 0 PID: 661 at /home/paulmck/public_git/linux-rcu/arch/x86/kernel/smp.c:125 native_smp_send_reschedule+0x39/0x40 > [ 7976.741595] Modules linked in: > [ 7976.741595] CPU: 0 PID: 661 Comm: rcu_torture_rea Not tainted 4.7.0-rc2+ #1 > [ 7976.741595] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 > [ 7976.741595] 0000000000000000 ffff88000002fcc8 ffffffff8138ab2e 0000000000000000 > [ 7976.741595] 0000000000000000 ffff88000002fd08 ffffffff8105cabc 0000007d1fd0ee18 > [ 7976.741595] 0000000000000001 ffff88001fd16d40 ffff88001fd0ee00 ffff88001fd0ee00 > [ 7976.741595] Call Trace: > [ 7976.741595] [<ffffffff8138ab2e>] dump_stack+0x67/0x99 > [ 7976.741595] [<ffffffff8105cabc>] __warn+0xcc/0xf0 > [ 7976.741595] [<ffffffff8105cb98>] warn_slowpath_null+0x18/0x20 > [ 7976.741595] [<ffffffff8103cba9>] native_smp_send_reschedule+0x39/0x40 > [ 7976.741595] [<ffffffff81089bc2>] wake_up_nohz_cpu+0x82/0x190 > [ 7976.741595] [<ffffffff810d275a>] internal_add_timer+0x7a/0x80 > [ 7976.741595] [<ffffffff810d3ee7>] mod_timer+0x187/0x2b0 > [ 7976.741595] 
[<ffffffff810c89dd>] rcu_torture_reader+0x33d/0x380 > [ 7976.741595] [<ffffffff810c66f0>] ? sched_torture_read_unlock+0x30/0x30 > [ 7976.741595] [<ffffffff810c86a0>] ? rcu_bh_torture_read_lock+0x80/0x80 > [ 7976.741595] [<ffffffff8108068f>] kthread+0xdf/0x100 > [ 7976.741595] [<ffffffff819dd83f>] ret_from_fork+0x1f/0x40 > [ 7976.741595] [<ffffffff810805b0>] ? kthread_create_on_node+0x200/0x200 > > However, in this case, the wakeup is redundant, because the timer > migration will reprogram timer hardware as needed. Note that the fact > that preemption is disabled does not avoid the splat, as the offline > operation has already passed both the synchronize_sched() and the > stop_machine() that would be blocked by disabled preemption. > > This commit therefore modifies wake_up_nohz_cpu() to avoid attempting > to wake up offline CPUs. It also adds a comment stating that the > caller must tolerate lost wakeups when the target CPU is going offline, > and suggesting the CPU_DEAD notifier as a recovery mechanism. > > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > Cc: Peter Zijlstra <peterz@infradead.org> > Cc: Frederic Weisbecker <fweisbec@gmail.com> > Cc: Thomas Gleixner <tglx@linutronix.de> > --- > core.c | 7 ++++++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > index 7f2cae4620c7..08502966e7df 100644 > --- a/kernel/sched/core.c > +++ b/kernel/sched/core.c > @@ -590,9 +590,14 @@ static bool wake_up_full_nohz_cpu(int cpu) > return false; > } > > +/* > + * Wake up the specified CPU. If the CPU is going offline, it is the > + * caller's responsibility to deal with the lost wakeup, for example, > + * by hooking into the CPU_DEAD notifier like timers and hrtimers do. 
> + */ > void wake_up_nohz_cpu(int cpu) > { > - if (!wake_up_full_nohz_cpu(cpu)) > + if (cpu_online(cpu) && !wake_up_full_nohz_cpu(cpu)) So at this point, as we passed CPU_DYING, I believe the CPU isn't visible in the domains anymore (correct me if I'm wrong), therefore get_nohz_timer_target() can't return it, unless smp_processor_id() is the only alternative. Hence, that call to wake_up_nohz_cpu() can only happen to online CPUs or the current one (pinned). And wake_up_idle_cpu() on the current CPU is a no-op. So only wake_up_full_nohz_cpu() is concerned. Then perhaps it would be better to move that cpu_online() check to wake_up_full_nohz_cpu() ? BTW, it seems that rcutorture stops its kthreads after CPU_DYING, is it expected that it queues timers at this stage? Thanks. > wake_up_idle_cpu(cpu); > } > > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH RFC] sched: Make wake_up_nohz_cpu() handle CPUs going offline 2016-06-30 23:29 ` Frederic Weisbecker @ 2016-07-01 18:40 ` Paul E. McKenney 2016-07-01 23:49 ` Frederic Weisbecker 2016-07-12 14:19 ` Peter Zijlstra 1 sibling, 1 reply; 9+ messages in thread From: Paul E. McKenney @ 2016-07-01 18:40 UTC (permalink / raw) To: Frederic Weisbecker; +Cc: peterz, tglx, linux-kernel, rgkernel On Fri, Jul 01, 2016 at 01:29:59AM +0200, Frederic Weisbecker wrote: > On Thu, Jun 30, 2016 at 10:58:45AM -0700, Paul E. McKenney wrote: > > Both timers and hrtimers are maintained on the outgoing CPU until > > CPU_DEAD time, at which point they are migrated to a surviving CPU. If a > > mod_timer() executes between CPU_DYING and CPU_DEAD time, x86 systems > > will splat in native_smp_send_reschedule() when attempting to wake up > > the just-now-offlined CPU, as shown below from a NO_HZ_FULL kernel: > > > > [ 7976.741556] WARNING: CPU: 0 PID: 661 at /home/paulmck/public_git/linux-rcu/arch/x86/kernel/smp.c:125 native_smp_send_reschedule+0x39/0x40 > > [ 7976.741595] Modules linked in: > > [ 7976.741595] CPU: 0 PID: 661 Comm: rcu_torture_rea Not tainted 4.7.0-rc2+ #1 > > [ 7976.741595] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 > > [ 7976.741595] 0000000000000000 ffff88000002fcc8 ffffffff8138ab2e 0000000000000000 > > [ 7976.741595] 0000000000000000 ffff88000002fd08 ffffffff8105cabc 0000007d1fd0ee18 > > [ 7976.741595] 0000000000000001 ffff88001fd16d40 ffff88001fd0ee00 ffff88001fd0ee00 > > [ 7976.741595] Call Trace: > > [ 7976.741595] [<ffffffff8138ab2e>] dump_stack+0x67/0x99 > > [ 7976.741595] [<ffffffff8105cabc>] __warn+0xcc/0xf0 > > [ 7976.741595] [<ffffffff8105cb98>] warn_slowpath_null+0x18/0x20 > > [ 7976.741595] [<ffffffff8103cba9>] native_smp_send_reschedule+0x39/0x40 > > [ 7976.741595] [<ffffffff81089bc2>] wake_up_nohz_cpu+0x82/0x190 > > [ 7976.741595] [<ffffffff810d275a>] internal_add_timer+0x7a/0x80 > > [ 7976.741595] 
[<ffffffff810d3ee7>] mod_timer+0x187/0x2b0 > > [ 7976.741595] [<ffffffff810c89dd>] rcu_torture_reader+0x33d/0x380 > > [ 7976.741595] [<ffffffff810c66f0>] ? sched_torture_read_unlock+0x30/0x30 > > [ 7976.741595] [<ffffffff810c86a0>] ? rcu_bh_torture_read_lock+0x80/0x80 > > [ 7976.741595] [<ffffffff8108068f>] kthread+0xdf/0x100 > > [ 7976.741595] [<ffffffff819dd83f>] ret_from_fork+0x1f/0x40 > > [ 7976.741595] [<ffffffff810805b0>] ? kthread_create_on_node+0x200/0x200 > > > > However, in this case, the wakeup is redundant, because the timer > > migration will reprogram timer hardware as needed. Note that the fact > > that preemption is disabled does not avoid the splat, as the offline > > operation has already passed both the synchronize_sched() and the > > stop_machine() that would be blocked by disabled preemption. > > > > This commit therefore modifies wake_up_nohz_cpu() to avoid attempting > > to wake up offline CPUs. It also adds a comment stating that the > > caller must tolerate lost wakeups when the target CPU is going offline, > > and suggesting the CPU_DEAD notifier as a recovery mechanism. > > > > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > > Cc: Peter Zijlstra <peterz@infradead.org> > > Cc: Frederic Weisbecker <fweisbec@gmail.com> > > Cc: Thomas Gleixner <tglx@linutronix.de> > > --- > > core.c | 7 ++++++- > > 1 file changed, 6 insertions(+), 1 deletion(-) > > > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > > index 7f2cae4620c7..08502966e7df 100644 > > --- a/kernel/sched/core.c > > +++ b/kernel/sched/core.c > > @@ -590,9 +590,14 @@ static bool wake_up_full_nohz_cpu(int cpu) > > return false; > > } > > > > +/* > > + * Wake up the specified CPU. If the CPU is going offline, it is the > > + * caller's responsibility to deal with the lost wakeup, for example, > > + * by hooking into the CPU_DEAD notifier like timers and hrtimers do. 
> > + */ > > void wake_up_nohz_cpu(int cpu) > > { > > - if (!wake_up_full_nohz_cpu(cpu)) > > + if (cpu_online(cpu) && !wake_up_full_nohz_cpu(cpu)) > > So at this point, as we passed CPU_DYING, I believe the CPU isn't visible in the domains > anymore (correct me if I'm wrong), therefore get_nohz_timer_target() can't return it, > unless smp_processor_id() is the only alternative. Right, but the timers have been posted long before even CPU_UP_PREPARE. From what I can see, they are left alone until CPU_DEAD. Which means that if you try to mod_timer() them between CPU_DYING and CPU_DEAD, you can get the above splat. Or am I missing something subtle here? > Hence, that call to wake_up_nohz_cpu() can only happen to online CPUs or the current > one (pinned). And wake_up_idle_cpu() on the current CPU is a no-op. So only > wake_up_full_nohz_cpu() is concerned. Then perhaps it would be better to move that > cpu_online() check to wake_up_full_nohz_cpu() ? As in the patch shown below? Either way works for me. > BTW, it seems that rcutorture stops its kthreads after CPU_DYING, is it expected that > it queues timers at this stage? Hmmm... From what I can see, rcutorture cleans up its priority-boost kthreads at CPU_DOWN_PREPARE time. The other threads are allowed to migrate wherever the scheduler wants, give or take the task shuffling. The task shuffling only excludes one CPU at a time, and I have seen this occur when multiple CPUs were running, e.g., 0, 2, and 3 while offlining 1. Besides which, doesn't the scheduler prevent anything but the idle thread from running after CPU_DYING time? Thanx, Paul ------------------------------------------------------------------------ diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 7f2cae4620c7..08502966e7df 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -590,9 +590,14 @@ static bool wake_up_full_nohz_cpu(int cpu) return false; } +/* + * Wake up the specified CPU. 
If the CPU is going offline, it is the + * caller's responsibility to deal with the lost wakeup, for example, + * by hooking into the CPU_DEAD notifier like timers and hrtimers do. + */ void wake_up_nohz_cpu(int cpu) { - if (!wake_up_full_nohz_cpu(cpu)) + if (cpu_online(cpu) && !wake_up_full_nohz_cpu(cpu)) wake_up_idle_cpu(cpu); } ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH RFC] sched: Make wake_up_nohz_cpu() handle CPUs going offline 2016-07-01 18:40 ` Paul E. McKenney @ 2016-07-01 23:49 ` Frederic Weisbecker 2016-07-02 0:15 ` Paul E. McKenney 0 siblings, 1 reply; 9+ messages in thread From: Frederic Weisbecker @ 2016-07-01 23:49 UTC (permalink / raw) To: Paul E. McKenney; +Cc: peterz, tglx, linux-kernel, rgkernel On Fri, Jul 01, 2016 at 11:40:54AM -0700, Paul E. McKenney wrote: > On Fri, Jul 01, 2016 at 01:29:59AM +0200, Frederic Weisbecker wrote: > > > +/* > > > + * Wake up the specified CPU. If the CPU is going offline, it is the > > > + * caller's responsibility to deal with the lost wakeup, for example, > > > + * by hooking into the CPU_DEAD notifier like timers and hrtimers do. > > > + */ > > > void wake_up_nohz_cpu(int cpu) > > > { > > > - if (!wake_up_full_nohz_cpu(cpu)) > > > + if (cpu_online(cpu) && !wake_up_full_nohz_cpu(cpu)) > > > > So at this point, as we passed CPU_DYING, I believe the CPU isn't visible in the domains > > anymore (correct me if I'm wrong), therefore get_nohz_timer_target() can't return it, > > unless smp_processor_id() is the only alternative. > > Right, but the timers have been posted long before even CPU_UP_PREPARE. > From what I can see, they are left alone until CPU_DEAD. Which means > that if you try to mod_timer() them between CPU_DYING and CPU_DEAD, > you can get the above splat. > > Or am I missing somthing subtle here? Yes that's exactly what I meant. It happens on mod_timer() calls between CPU_DYING and CPU_DEAD. I just wanted to clarify the conditions for it to happen: the fact that it shouldn't concern remote CPU targets, only local pinned timers. > > Hence, that call to wake_up_nohz_cpu() can only happen to online CPUs or the current > > one (pinned). And wake_up_idle_cpu() on the current CPU is a no-op. So only > > wake_up_full_nohz_cpu() is concerned. Then perhaps it would be better to move that > > cpu_online() check to wake_up_full_nohz_cpu() ? 
> > As in the patch shown below? Either way works for me. Hmm, the patch doesn't seem to be different than the previous one :-) > > > BTW, it seems that rcutorture stops its kthreads after CPU_DYING, is it expected that > > it queues timers at this stage? > > Hmmm... From what I can see, rcutorture cleans up its priority-boost > kthreads at CPU_DOWN_PREPARE time. The other threads are allowed to > migrate wherever the scheduler wants, give or take the task shuffling. > The task shuffling only excludes one CPU at a time, and I have seen > this occur when multiple CPUs were running, e.g., 0, 2, and 3 while > offlining 1. But if rcutorture kthreads are cleaned up at CPU_DOWN_PREPARE, they shouldn't be calling mod_timer() on CPU_DYING time. Or there are other rcutorture threads? > > Besides which, doesn't the scheduler prevent anything but the idle > thread from running after CPU_DYING time? Indeed migrate_tasks() is called on CPU_DYING but pinned kthreads, outside smpboot, have their own way to deal with hotplug through notifiers. Thanks. > > Thanx, Paul > > ------------------------------------------------------------------------ > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > index 7f2cae4620c7..08502966e7df 100644 > --- a/kernel/sched/core.c > +++ b/kernel/sched/core.c > @@ -590,9 +590,14 @@ static bool wake_up_full_nohz_cpu(int cpu) > return false; > } > > +/* > + * Wake up the specified CPU. If the CPU is going offline, it is the > + * caller's responsibility to deal with the lost wakeup, for example, > + * by hooking into the CPU_DEAD notifier like timers and hrtimers do. > + */ > void wake_up_nohz_cpu(int cpu) > { > - if (!wake_up_full_nohz_cpu(cpu)) > + if (cpu_online(cpu) && !wake_up_full_nohz_cpu(cpu)) > wake_up_idle_cpu(cpu); > } > > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH RFC] sched: Make wake_up_nohz_cpu() handle CPUs going offline 2016-07-01 23:49 ` Frederic Weisbecker @ 2016-07-02 0:15 ` Paul E. McKenney 2016-07-04 12:21 ` Frederic Weisbecker 0 siblings, 1 reply; 9+ messages in thread From: Paul E. McKenney @ 2016-07-02 0:15 UTC (permalink / raw) To: Frederic Weisbecker; +Cc: peterz, tglx, linux-kernel, rgkernel On Sat, Jul 02, 2016 at 01:49:56AM +0200, Frederic Weisbecker wrote: > On Fri, Jul 01, 2016 at 11:40:54AM -0700, Paul E. McKenney wrote: > > On Fri, Jul 01, 2016 at 01:29:59AM +0200, Frederic Weisbecker wrote: > > > > +/* > > > > + * Wake up the specified CPU. If the CPU is going offline, it is the > > > > + * caller's responsibility to deal with the lost wakeup, for example, > > > > + * by hooking into the CPU_DEAD notifier like timers and hrtimers do. > > > > + */ > > > > void wake_up_nohz_cpu(int cpu) > > > > { > > > > - if (!wake_up_full_nohz_cpu(cpu)) > > > > + if (cpu_online(cpu) && !wake_up_full_nohz_cpu(cpu)) > > > > > > So at this point, as we passed CPU_DYING, I believe the CPU isn't visible in the domains > > > anymore (correct me if I'm wrong), therefore get_nohz_timer_target() can't return it, > > > unless smp_processor_id() is the only alternative. > > > > Right, but the timers have been posted long before even CPU_UP_PREPARE. > > From what I can see, they are left alone until CPU_DEAD. Which means > > that if you try to mod_timer() them between CPU_DYING and CPU_DEAD, > > you can get the above splat. > > > > Or am I missing somthing subtle here? > > Yes that's exactly what I meant. It happens on mod_timer() calls > between CPU_DYING and CPU_DEAD. I just wanted to clarify the > conditions for it to happen: the fact that it shouldn't concern > remote CPU targets, only local pinned timers. OK. What happens in the following sequence of events? o CPU 5 posts a timer, which might well be locally pinned. This is rcu_torture_reader() posting its on-stack timer creatively named "t". 
o CPU 5 starts going offline, so that rcu_torture_reader() gets migrated to CPU 6. o CPU 5 reaches CPU_DYING but has not yet reached CPU_DEAD. o CPU 6 invokes mod_timer() on its timer "t". Wouldn't that trigger the scenario that I am seeing? > > > Hence, that call to wake_up_nohz_cpu() can only happen to online CPUs or the current > > > one (pinned). And wake_up_idle_cpu() on the current CPU is a no-op. So only > > > wake_up_full_nohz_cpu() is concerned. Then perhaps it would be better to move that > > > cpu_online() check to wake_up_full_nohz_cpu() ? > > > > As in the patch shown below? Either way works for me. > > Hmm, the patch doesn't seem to be different than the previous one :-) Indeed it does not! How about the one shown below this time? > > > BTW, it seems that rcutorture stops its kthreads after CPU_DYING, is it expected that > > > it queues timers at this stage? > > > > Hmmm... From what I can see, rcutorture cleans up its priority-boost > > kthreads at CPU_DOWN_PREPARE time. The other threads are allowed to > > migrate wherever the scheduler wants, give or take the task shuffling. > > The task shuffling only excludes one CPU at a time, and I have seen > > this occur when multiple CPUs were running, e.g., 0, 2, and 3 while > > offlining 1. > > But if rcutorture kthreads are cleaned up at CPU_DOWN_PREPARE, they > shouldn't be calling mod_timer() on CPU_DYING time. Or there are other > rcutorture threads? The rcu_torture_reader() kthreads aren't associated with any particular CPU, so when CPUs go offline, they just get migrated to other CPUs. This allows them to execute on those other CPUs between CPU_DYING and CPU_DEAD time, correct? Other rcutorture kthreads -are- bound to specific CPUs, but they are testing priority boosting, not simple reading. > > Besides which, doesn't the scheduler prevent anything but the idle > > thread from running after CPU_DYING time? 
> > Indeed migrate_tasks() is called on CPU_DYING but pinned kthreads, outside > smpboot, have their own way to deal with hotplug through notifiers. Agreed, but the rcu_torture_reader() kthreads aren't pinned, so they should migrate automatically at CPU_DYING time. Thanx, Paul ------------------------------------------------------------------------ diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 7f2cae4620c7..1a91fc733a0f 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -580,6 +580,8 @@ static bool wake_up_full_nohz_cpu(int cpu) * If needed we can still optimize that later with an * empty IRQ. */ + if (cpu_is_offline(cpu)) + return true; if (tick_nohz_full_cpu(cpu)) { if (cpu != smp_processor_id() || tick_nohz_tick_stopped()) @@ -590,6 +592,11 @@ static bool wake_up_full_nohz_cpu(int cpu) return false; } +/* + * Wake up the specified CPU. If the CPU is going offline, it is the + * caller's responsibility to deal with the lost wakeup, for example, + * by hooking into the CPU_DEAD notifier like timers and hrtimers do. + */ void wake_up_nohz_cpu(int cpu) { if (!wake_up_full_nohz_cpu(cpu)) ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH RFC] sched: Make wake_up_nohz_cpu() handle CPUs going offline 2016-07-02 0:15 ` Paul E. McKenney @ 2016-07-04 12:21 ` Frederic Weisbecker 2016-07-04 16:55 ` Paul E. McKenney 0 siblings, 1 reply; 9+ messages in thread From: Frederic Weisbecker @ 2016-07-04 12:21 UTC (permalink / raw) To: Paul E. McKenney; +Cc: peterz, tglx, linux-kernel, rgkernel On Fri, Jul 01, 2016 at 05:15:06PM -0700, Paul E. McKenney wrote: > On Sat, Jul 02, 2016 at 01:49:56AM +0200, Frederic Weisbecker wrote: > > On Fri, Jul 01, 2016 at 11:40:54AM -0700, Paul E. McKenney wrote: > > > On Fri, Jul 01, 2016 at 01:29:59AM +0200, Frederic Weisbecker wrote: > > > > > +/* > > > > > + * Wake up the specified CPU. If the CPU is going offline, it is the > > > > > + * caller's responsibility to deal with the lost wakeup, for example, > > > > > + * by hooking into the CPU_DEAD notifier like timers and hrtimers do. > > > > > + */ > > > > > void wake_up_nohz_cpu(int cpu) > > > > > { > > > > > - if (!wake_up_full_nohz_cpu(cpu)) > > > > > + if (cpu_online(cpu) && !wake_up_full_nohz_cpu(cpu)) > > > > > > > > So at this point, as we passed CPU_DYING, I believe the CPU isn't visible in the domains > > > > anymore (correct me if I'm wrong), therefore get_nohz_timer_target() can't return it, > > > > unless smp_processor_id() is the only alternative. > > > > > > Right, but the timers have been posted long before even CPU_UP_PREPARE. > > > From what I can see, they are left alone until CPU_DEAD. Which means > > > that if you try to mod_timer() them between CPU_DYING and CPU_DEAD, > > > you can get the above splat. > > > > > > Or am I missing somthing subtle here? > > > > Yes that's exactly what I meant. It happens on mod_timer() calls > > between CPU_DYING and CPU_DEAD. I just wanted to clarify the > > conditions for it to happen: the fact that it shouldn't concern > > remote CPU targets, only local pinned timers. > > OK. What happens in the following sequence of events? 
> > o CPU 5 posts a timer, which might well be locally pinned. > This is rcu_torture_reader() posting its on-stack timer > creatively named "t". > > o CPU 5 starts going offline, so that rcu_torture_reader() gets > migrated to CPU 6. > > o CPU 5 reaches CPU_DYING but has not yet reached CPU_DEAD. > > o CPU 6 invokes mod_timer() on its timer "t". > > Wouldn't that trigger the scenario that I am seeing? No I don't think so because "t" is then going to be enqueued to CPU 6. __mod_timer() -> get_target_base() uses local accessor on timer base. So the target of pinned timers is always the CPU of the caller. > > > > BTW, it seems that rcutorture stops its kthreads after CPU_DYING, is it expected that > > > > it queues timers at this stage? > > > > > > Hmmm... From what I can see, rcutorture cleans up its priority-boost > > > kthreads at CPU_DOWN_PREPARE time. The other threads are allowed to > > > migrate wherever the scheduler wants, give or take the task shuffling. > > > The task shuffling only excludes one CPU at a time, and I have seen > > > this occur when multiple CPUs were running, e.g., 0, 2, and 3 while > > > offlining 1. > > > > But if rcutorture kthreads are cleaned up at CPU_DOWN_PREPARE, they > > shouldn't be calling mod_timer() on CPU_DYING time. Or there are other > > rcutorture threads? > > The rcu_torture_reader() kthreads aren't associated with any particular > CPU, so when CPUs go offline, they just get migrated to other CPUs. > This allows them to execute on those other CPUs between CPU_DYING and > CPU_DEAD time, correct? Indeed. Timers get migrated on CPU_DEAD only. So if rcu_torture_reader() enqueued a pinned timer to CPU 5, then get migrated to CPU 6, the timer may well fire after CPU_DYING on CPU 5 and even re-enqueue itself in CPU 5, provided the timer is self-enqueued. > > Other rcutorture kthreads -are- bound to specific CPUs, but they are > testing priority boosting, not simple reading. I see. 
> > > > Besides which, doesn't the scheduler prevent anything but the idle > > > thread from running after CPU_DYING time? > > > > Indeed migrate_tasks() is called on CPU_DYING but pinned kthreads, outside > > smpboot, have their own way to deal with hotplug through notifiers. > > Agreed, but the rcu_torture_reader() kthreads aren't pinned, so they > > should migrate automatically at CPU_DYING time. Yes indeed. > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > index 7f2cae4620c7..1a91fc733a0f 100644 > --- a/kernel/sched/core.c > +++ b/kernel/sched/core.c > @@ -580,6 +580,8 @@ static bool wake_up_full_nohz_cpu(int cpu) > * If needed we can still optimize that later with an > * empty IRQ. > */ > + if (cpu_is_offline(cpu)) > + return true; Preferably put this under the tick_nohz_full_cpu() below because it has a static key optimization. Distros build NO_HZ_FULL but don't use it 99.99999% of the time. > if (tick_nohz_full_cpu(cpu)) { > if (cpu != smp_processor_id() || > tick_nohz_tick_stopped()) > @@ -590,6 +592,11 @@ static bool wake_up_full_nohz_cpu(int cpu) > return false; > } > > +/* > + * Wake up the specified CPU. If the CPU is going offline, it is the > + * caller's responsibility to deal with the lost wakeup, for example, > + * by hooking into the CPU_DEAD notifier like timers and hrtimers do. > + */ I think it's more transparent than that for the caller. migrate_timers() is called soon after and it takes care of waking up the destination of the migration if necessary. So the caller shouldn't care after all. But the cpu_is_offline() check above may need a comment about that. Thanks! ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH RFC] sched: Make wake_up_nohz_cpu() handle CPUs going offline 2016-07-04 12:21 ` Frederic Weisbecker @ 2016-07-04 16:55 ` Paul E. McKenney 2016-07-12 14:41 ` Paul E. McKenney 0 siblings, 1 reply; 9+ messages in thread From: Paul E. McKenney @ 2016-07-04 16:55 UTC (permalink / raw) To: Frederic Weisbecker; +Cc: peterz, tglx, linux-kernel, rgkernel On Mon, Jul 04, 2016 at 02:21:43PM +0200, Frederic Weisbecker wrote: > On Fri, Jul 01, 2016 at 05:15:06PM -0700, Paul E. McKenney wrote: > > On Sat, Jul 02, 2016 at 01:49:56AM +0200, Frederic Weisbecker wrote: > > > On Fri, Jul 01, 2016 at 11:40:54AM -0700, Paul E. McKenney wrote: > > > > On Fri, Jul 01, 2016 at 01:29:59AM +0200, Frederic Weisbecker wrote: > > > > > > +/* > > > > > > + * Wake up the specified CPU. If the CPU is going offline, it is the > > > > > > + * caller's responsibility to deal with the lost wakeup, for example, > > > > > > + * by hooking into the CPU_DEAD notifier like timers and hrtimers do. > > > > > > + */ > > > > > > void wake_up_nohz_cpu(int cpu) > > > > > > { > > > > > > - if (!wake_up_full_nohz_cpu(cpu)) > > > > > > + if (cpu_online(cpu) && !wake_up_full_nohz_cpu(cpu)) > > > > > > > > > > So at this point, as we passed CPU_DYING, I believe the CPU isn't visible in the domains > > > > > anymore (correct me if I'm wrong), therefore get_nohz_timer_target() can't return it, > > > > > unless smp_processor_id() is the only alternative. > > > > > > > > Right, but the timers have been posted long before even CPU_UP_PREPARE. > > > > From what I can see, they are left alone until CPU_DEAD. Which means > > > > that if you try to mod_timer() them between CPU_DYING and CPU_DEAD, > > > > you can get the above splat. > > > > > > > > Or am I missing somthing subtle here? > > > > > > Yes that's exactly what I meant. It happens on mod_timer() calls > > > between CPU_DYING and CPU_DEAD. 
I just wanted to clarify the > > > conditions for it to happen: the fact that it shouldn't concern > > > remote CPU targets, only local pinned timers. > > > > OK. What happens in the following sequence of events? > > > > o CPU 5 posts a timer, which might well be locally pinned. > > This is rcu_torture_reader() posting its on-stack timer > > creatively named "t". > > > > o CPU 5 starts going offline, so that rcu_torture_reader() gets > > migrated to CPU 6. > > > > o CPU 5 reaches CPU_DYING but has not yet reached CPU_DEAD. > > > > o CPU 6 invokes mod_timer() on its timer "t". > > > > Wouldn't that trigger the scenario that I am seeing? > > No I don't think so because "t" is then going to be enqueued to CPU 6. > __mod_timer() -> get_target_base() uses local accessor on timer base. > > So the target of pinned timers is always the CPU of the caller. > > > > > > BTW, it seems that rcutorture stops its kthreads after CPU_DYING, is it expected that > > > > > it queues timers at this stage? > > > > > > > > Hmmm... From what I can see, rcutorture cleans up its priority-boost > > > > kthreads at CPU_DOWN_PREPARE time. The other threads are allowed to > > > > migrate wherever the scheduler wants, give or take the task shuffling. > > > > The task shuffling only excludes one CPU at a time, and I have seen > > > > this occur when multiple CPUs were running, e.g., 0, 2, and 3 while > > > > offlining 1. > > > > > > But if rcutorture kthreads are cleaned up at CPU_DOWN_PREPARE, they > > > shouldn't be calling mod_timer() on CPU_DYING time. Or there are other > > > rcutorture threads? > > > > The rcu_torture_reader() kthreads aren't associated with any particular > > CPU, so when CPUs go offline, they just get migrated to other CPUs. > > This allows them to execute on those other CPUs between CPU_DYING and > > CPU_DEAD time, correct? > > Indeed. Timers get migrated on CPU_DEAD only. 
> So if rcu_torture_reader()
> enqueued a pinned timer to CPU 5, then gets migrated to CPU 6, the timer
> may well fire after CPU_DYING on CPU 5 and even re-enqueue itself on CPU 5,
> provided the timer is self-enqueued.

It is not self-enqueued, but rcu_torture_reader() will re-enqueue it any
time after it fires.  This means that it might re-enqueue it while it is
still executing.  If I understand correctly, at that point, the timer
would still be referencing the outgoing CPU.  But this is an odd scenario,
so I might well have missed something.

At this point, although I am sure we have a problem, I am not sure we
understand (to say nothing of agree on) exactly what the problem is.  ;-)

The timer is not self-enqueued.  It is posted like this, from
rcu_torture_reader() in kernel/rcu/rcutorture.c:

	mod_timer(&t, jiffies + 1);

That same function sets up the timer upon entry like this:

	setup_timer_on_stack(&t, rcu_torture_timer, 0);

The splat is quite rare, which is why I suspect that it is reasonable
to believe that it involves a race with the timer firing.

> > Other rcutorture kthreads -are- bound to specific CPUs, but they are
> > testing priority boosting, not simple reading.
>
> I see.
>
> > > > Besides which, doesn't the scheduler prevent anything but the idle
> > > > thread from running after CPU_DYING time?
> > >
> > > Indeed, migrate_tasks() is called on CPU_DYING, but pinned kthreads, outside
> > > smpboot, have their own way to deal with hotplug through notifiers.
> >
> > Agreed, but the rcu_torture_reader() kthreads aren't pinned, so they
> > should migrate automatically at CPU_DYING time.
>
> Yes indeed.
>
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 7f2cae4620c7..1a91fc733a0f 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -580,6 +580,8 @@ static bool wake_up_full_nohz_cpu(int cpu)
> >  	 * If needed we can still optimize that later with an
> >  	 * empty IRQ.
> >  	 */
> > +	if (cpu_is_offline(cpu))
> > +		return true;
>
> Preferably put this under the tick_nohz_full_cpu() below because
> it has a static-key optimization.  Distros build NO_HZ_FULL but
> don't use it 99.99999% of the time.

Except that I need to return "true" if the CPU is offline even
if it is not a tick_nohz_full_cpu(), don't I?

In fact, the stack trace shows native_smp_send_reschedule(), which leads
me to believe that the splat happened via the "return false" at the bottom
of wake_up_full_nohz_cpu().  So I don't see a way to completely hide
the check under tick_nohz_full_cpu().

Or am I missing something here?

> >  	if (tick_nohz_full_cpu(cpu)) {
> >  		if (cpu != smp_processor_id() ||
> >  		    tick_nohz_tick_stopped())
> > @@ -590,6 +592,11 @@ static bool wake_up_full_nohz_cpu(int cpu)
> >  	return false;
> >  }
> >
> > +/*
> > + * Wake up the specified CPU.  If the CPU is going offline, it is the
> > + * caller's responsibility to deal with the lost wakeup, for example,
> > + * by hooking into the CPU_DEAD notifier like timers and hrtimers do.
> > + */
>
> I think it's more transparent than that for the caller. migrate_timers() is
> called soon after, and it takes care of waking up the destination of the
> migration if necessary. So the caller shouldn't care after all.

For the current two callers, agreed.  But is this function designed to
be only used by timers and hrtimers?  If so, shouldn't there be some
kind of comment to that effect?  If not, shouldn't people considering
use of this function be warned that it is not unconditional?

> But the cpu_is_offline() check above may need a comment about that.

Like this?

	if (cpu_is_offline(cpu))
		return true;  /* Don't try to wake offline CPUs. */

							Thanx, Paul

^ permalink raw reply	[flat|nested] 9+ messages in thread
* Re: [PATCH RFC] sched: Make wake_up_nohz_cpu() handle CPUs going offline
  2016-07-04 16:55             ` Paul E. McKenney
@ 2016-07-12 14:41               ` Paul E. McKenney
  0 siblings, 0 replies; 9+ messages in thread
From: Paul E. McKenney @ 2016-07-12 14:41 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: peterz, tglx, linux-kernel, rgkernel

On Mon, Jul 04, 2016 at 09:55:34AM -0700, Paul E. McKenney wrote:
> On Mon, Jul 04, 2016 at 02:21:43PM +0200, Frederic Weisbecker wrote:
> > On Fri, Jul 01, 2016 at 05:15:06PM -0700, Paul E. McKenney wrote:
> > > On Sat, Jul 02, 2016 at 01:49:56AM +0200, Frederic Weisbecker wrote:
> > > > On Fri, Jul 01, 2016 at 11:40:54AM -0700, Paul E. McKenney wrote:

[ . . . ]

> > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > > index 7f2cae4620c7..1a91fc733a0f 100644
> > > --- a/kernel/sched/core.c
> > > +++ b/kernel/sched/core.c
> > > @@ -580,6 +580,8 @@ static bool wake_up_full_nohz_cpu(int cpu)
> > >  	 * If needed we can still optimize that later with an
> > >  	 * empty IRQ.
> > >  	 */
> > > +	if (cpu_is_offline(cpu))
> > > +		return true;
> >
> > Preferably put this under the tick_nohz_full_cpu() below because
> > it has a static-key optimization.  Distros build NO_HZ_FULL but
> > don't use it 99.99999% of the time.
>
> Except that I need to return "true" if the CPU is offline even
> if it is not a tick_nohz_full_cpu(), don't I?
>
> In fact, the stack trace shows native_smp_send_reschedule(), which leads
> me to believe that the splat happened via the "return false" at the bottom
> of wake_up_full_nohz_cpu().  So I don't see a way to completely hide
> the check under tick_nohz_full_cpu().
>
> Or am I missing something here?
>
> > >  	if (tick_nohz_full_cpu(cpu)) {
> > >  		if (cpu != smp_processor_id() ||
> > >  		    tick_nohz_tick_stopped())
> > > @@ -590,6 +592,11 @@ static bool wake_up_full_nohz_cpu(int cpu)
> > >  	return false;
> > >  }
> > >
> > > +/*
> > > + * Wake up the specified CPU.
> > > + * If the CPU is going offline, it is the
> > > + * caller's responsibility to deal with the lost wakeup, for example,
> > > + * by hooking into the CPU_DEAD notifier like timers and hrtimers do.
> > > + */
> >
> > I think it's more transparent than that for the caller. migrate_timers() is
> > called soon after, and it takes care of waking up the destination of the
> > migration if necessary. So the caller shouldn't care after all.
>
> For the current two callers, agreed.  But is this function designed to
> be only used by timers and hrtimers?  If so, shouldn't there be some
> kind of comment to that effect?  If not, shouldn't people considering
> use of this function be warned that it is not unconditional?
>
> > But the cpu_is_offline() check above may need a comment about that.
>
> Like this?
>
> 	if (cpu_is_offline(cpu))
> 		return true;  /* Don't try to wake offline CPUs. */

Hearing no objections, I am proceeding as shown below.

							Thanx, Paul

------------------------------------------------------------------------

commit b55692986a87f0cea92ddfebe7181598f774c2c2
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Thu Jun 30 10:37:20 2016 -0700

    sched: Make wake_up_nohz_cpu() handle CPUs going offline

    Both timers and hrtimers are maintained on the outgoing CPU until
    CPU_DEAD time, at which point they are migrated to a surviving CPU.
    If a mod_timer() executes between CPU_DYING and CPU_DEAD time, x86
    systems will splat in native_smp_send_reschedule() when attempting
    to wake up the just-now-offlined CPU, as shown below from a
    NO_HZ_FULL kernel:

    [ 7976.741556] WARNING: CPU: 0 PID: 661 at /home/paulmck/public_git/linux-rcu/arch/x86/kernel/smp.c:125 native_smp_send_reschedule+0x39/0x40
    [ 7976.741595] Modules linked in:
    [ 7976.741595] CPU: 0 PID: 661 Comm: rcu_torture_rea Not tainted 4.7.0-rc2+ #1
    [ 7976.741595] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
    [ 7976.741595]  0000000000000000 ffff88000002fcc8 ffffffff8138ab2e 0000000000000000
    [ 7976.741595]  0000000000000000 ffff88000002fd08 ffffffff8105cabc 0000007d1fd0ee18
    [ 7976.741595]  0000000000000001 ffff88001fd16d40 ffff88001fd0ee00 ffff88001fd0ee00
    [ 7976.741595] Call Trace:
    [ 7976.741595]  [<ffffffff8138ab2e>] dump_stack+0x67/0x99
    [ 7976.741595]  [<ffffffff8105cabc>] __warn+0xcc/0xf0
    [ 7976.741595]  [<ffffffff8105cb98>] warn_slowpath_null+0x18/0x20
    [ 7976.741595]  [<ffffffff8103cba9>] native_smp_send_reschedule+0x39/0x40
    [ 7976.741595]  [<ffffffff81089bc2>] wake_up_nohz_cpu+0x82/0x190
    [ 7976.741595]  [<ffffffff810d275a>] internal_add_timer+0x7a/0x80
    [ 7976.741595]  [<ffffffff810d3ee7>] mod_timer+0x187/0x2b0
    [ 7976.741595]  [<ffffffff810c89dd>] rcu_torture_reader+0x33d/0x380
    [ 7976.741595]  [<ffffffff810c66f0>] ? sched_torture_read_unlock+0x30/0x30
    [ 7976.741595]  [<ffffffff810c86a0>] ? rcu_bh_torture_read_lock+0x80/0x80
    [ 7976.741595]  [<ffffffff8108068f>] kthread+0xdf/0x100
    [ 7976.741595]  [<ffffffff819dd83f>] ret_from_fork+0x1f/0x40
    [ 7976.741595]  [<ffffffff810805b0>] ? kthread_create_on_node+0x200/0x200

    However, in this case, the wakeup is redundant, because the timer
    migration will reprogram timer hardware as needed.
    Note that the fact that preemption is disabled does not avoid the
    splat, as the offline operation has already passed both the
    synchronize_sched() and the stop_machine() that would be blocked by
    disabled preemption.

    This commit therefore modifies wake_up_nohz_cpu() to avoid attempting
    to wake up offline CPUs.  It also adds a comment stating that the
    caller must tolerate lost wakeups when the target CPU is going
    offline, and suggesting the CPU_DEAD notifier as a recovery mechanism.

    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Frederic Weisbecker <fweisbec@gmail.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 7f2cae4620c7..26c01b2e9881 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -580,6 +580,8 @@ static bool wake_up_full_nohz_cpu(int cpu)
 	 * If needed we can still optimize that later with an
 	 * empty IRQ.
 	 */
+	if (cpu_is_offline(cpu))
+		return true;  /* Don't try to wake offline CPUs. */
 	if (tick_nohz_full_cpu(cpu)) {
 		if (cpu != smp_processor_id() ||
 		    tick_nohz_tick_stopped())
@@ -590,6 +592,11 @@ static bool wake_up_full_nohz_cpu(int cpu)
 	return false;
 }

+/*
+ * Wake up the specified CPU.  If the CPU is going offline, it is the
+ * caller's responsibility to deal with the lost wakeup, for example,
+ * by hooking into the CPU_DEAD notifier like timers and hrtimers do.
+ */
 void wake_up_nohz_cpu(int cpu)
 {
 	if (!wake_up_full_nohz_cpu(cpu))

^ permalink raw reply related	[flat|nested] 9+ messages in thread
* Re: [PATCH RFC] sched: Make wake_up_nohz_cpu() handle CPUs going offline
  2016-06-30 23:29 ` Frederic Weisbecker
  2016-07-01 18:40   ` Paul E. McKenney
@ 2016-07-12 14:19   ` Peter Zijlstra
  1 sibling, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2016-07-12 14:19 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: Paul E. McKenney, tglx, linux-kernel, rgkernel

On Fri, Jul 01, 2016 at 01:29:59AM +0200, Frederic Weisbecker wrote:
> >  void wake_up_nohz_cpu(int cpu)
> >  {
> > -	if (!wake_up_full_nohz_cpu(cpu))
> > +	if (cpu_online(cpu) && !wake_up_full_nohz_cpu(cpu))
>
> So at this point, as we passed CPU_DYING, I believe the CPU isn't visible in the domains
> anymore (correct me if I'm wrong),

So rebuilding the domains is an utter trainwreck atm.  But I suspect
that's wrong.  Esp. with cpusets enabled we rebuild the domains very
late from a workqueue.  That is why the scheduler has cpu_active_mask
to constrain the domains during hotplug.

Now I need to go sort through that trainwreck because deadline needs it,
but I've not had the opportunity :/

> therefore get_nohz_timer_target() can't return it,
> unless smp_processor_id() is the only alternative.

With the below that should be true I think.

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 6c0cdb5a73f8..b35cacbe9b9e 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -626,7 +626,7 @@ int get_nohz_timer_target(void)
 
 	rcu_read_lock();
 	for_each_domain(cpu, sd) {
-		for_each_cpu(i, sched_domain_span(sd)) {
+		for_each_cpu_and(i, sched_domain_span(sd), cpu_active_mask) {
 			if (!idle_cpu(i) && is_housekeeping_cpu(cpu)) {
 				cpu = i;
 				goto unlock;

^ permalink raw reply related	[flat|nested] 9+ messages in thread
end of thread, other threads:[~2016-07-12 14:41 UTC | newest]

Thread overview: 9+ messages
2016-06-30 17:58 [PATCH RFC] sched: Make wake_up_nohz_cpu() handle CPUs going offline Paul E. McKenney
2016-06-30 23:29 ` Frederic Weisbecker
2016-07-01 18:40   ` Paul E. McKenney
2016-07-01 23:49     ` Frederic Weisbecker
2016-07-02  0:15       ` Paul E. McKenney
2016-07-04 12:21         ` Frederic Weisbecker
2016-07-04 16:55           ` Paul E. McKenney
2016-07-12 14:41             ` Paul E. McKenney
2016-07-12 14:19 ` Peter Zijlstra