* [RFC][PATCH 0/2] sched: Allow interrupts to be enabled in idle balance @ 2012-12-22 0:30 Steven Rostedt 2012-12-22 0:30 ` [RFC][PATCH 1/2] sched: Move idle_balance() to post_schedule Steven Rostedt 2012-12-22 0:30 ` [RFC][PATCH 2/2] sched: Enable interrupts in idle_balance() Steven Rostedt 0 siblings, 2 replies; 5+ messages in thread From: Steven Rostedt @ 2012-12-22 0:30 UTC (permalink / raw) To: linux-kernel Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra, Frederic Weisbecker, Clark Williams The idle_balance() code is treated special in the scheduler for no real apparent reason. There's an unlikely check for !rq->nr_running in the hot path of the scheduler, for just the idle load balancing. As sched classes can have their own post schedule routine, and the idle task has its own scheduler class, we can move the idle_balance() into the idle task's post schedule routine, and out of the hot path of the scheduler (no sepecial check). An added benefit for doing this is that this enables us to enable interrupts in the idle balance code. Shrinking the time interrupts are disabled while doing load balancing. These patches have been thoroughly tested (on a 40 core box as well as a 4 core and UP config), and are prime for inclusion (for 3.9). But I'm posted them now as an RFC for comments. Thanks! -- Steve ^ permalink raw reply [flat|nested] 5+ messages in thread
* [RFC][PATCH 1/2] sched: Move idle_balance() to post_schedule 2012-12-22 0:30 [RFC][PATCH 0/2] sched: Allow interrupts to be enabled in idle balance Steven Rostedt @ 2012-12-22 0:30 ` Steven Rostedt 2012-12-22 1:00 ` Steven Rostedt 2013-01-05 14:14 ` Frederic Weisbecker 2012-12-22 0:30 ` [RFC][PATCH 2/2] sched: Enable interrupts in idle_balance() Steven Rostedt 1 sibling, 2 replies; 5+ messages in thread From: Steven Rostedt @ 2012-12-22 0:30 UTC (permalink / raw) To: linux-kernel Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra, Frederic Weisbecker, Clark Williams [-- Attachment #1: idle-balance-post-sched.patch --] [-- Type: text/plain, Size: 2638 bytes --] The idle_balance() code is called to do task load balancing just before going to idle. This makes sense as the CPU is about to sleep anyway. But currently it's called in the middle of the scheduler and in a place that must have interrupts disabled. That means, while the load balancing is going on, if a task wakes up on this CPU, it wont get to run while the interrupts are disabled. The idle task doing the balancing will be clueless about it. There's no real reason that the idle_balance() needs to be called in the middle of schedule anyway. The only benefit is that if a task is pulled to this CPU, it can be scheduled without the need to schedule the idle task. But load balancing and migrating the task makes a switch to idle and back negligible. By using the post_schedule function pointer of the sched class, the unlikely branch in the hot path of the scheduler can be removed, and the idle task itself can do the load balancing. Another advantage of this, is that by moving the idle_balance to the post_schedule routine, interrupts can now be enabled in the load balance allowing for interrupts and wakeups to still occur on that CPU while a balance is taking place. The enabling of interrupts will come as a separate patch. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Index: linux-trace.git/kernel/sched/core.c =================================================================== --- linux-trace.git.orig/kernel/sched/core.c +++ linux-trace.git/kernel/sched/core.c @@ -2926,9 +2926,6 @@ need_resched: pre_schedule(rq, prev); - if (unlikely(!rq->nr_running)) - idle_balance(cpu, rq); - put_prev_task(rq, prev); next = pick_next_task(rq); clear_tsk_need_resched(prev); Index: linux-trace.git/kernel/sched/idle_task.c =================================================================== --- linux-trace.git.orig/kernel/sched/idle_task.c +++ linux-trace.git/kernel/sched/idle_task.c @@ -25,6 +25,7 @@ static void check_preempt_curr_idle(stru static struct task_struct *pick_next_task_idle(struct rq *rq) { schedstat_inc(rq, sched_goidle); + rq->post_schedule = 1; return rq->idle; } @@ -69,6 +70,12 @@ static unsigned int get_rr_interval_idle return 0; } + +static void post_schedule_idle(struct rq *rq) +{ + idle_balance(smp_processor_id(), rq); +} + /* * Simple, special scheduling class for the per-CPU idle tasks: */ @@ -91,6 +98,8 @@ const struct sched_class idle_sched_clas .set_curr_task = set_curr_task_idle, .task_tick = task_tick_idle, + .post_schedule = post_schedule_idle, + .get_rr_interval = get_rr_interval_idle, .prio_changed = prio_changed_idle, ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC][PATCH 1/2] sched: Move idle_balance() to post_schedule 2012-12-22 0:30 ` [RFC][PATCH 1/2] sched: Move idle_balance() to post_schedule Steven Rostedt @ 2012-12-22 1:00 ` Steven Rostedt 2013-01-05 14:14 ` Frederic Weisbecker 1 sibling, 0 replies; 5+ messages in thread From: Steven Rostedt @ 2012-12-22 1:00 UTC (permalink / raw) To: linux-kernel Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra, Frederic Weisbecker, Clark Williams --- So much for saying it was well tested. As my email box is separate from the development boxes, I didn't copy the write set of patches over. This patch was tested thoroughly on 40 core and 4 core, but not UP. The UP had some bugs that were fixed (but not sent :-p) Here's the correct patch with the UP fixes. The other patch shouldn't be affected. --- sched: Move idle_balance() to post_schedule The idle_balance() code is called to do task load balancing just before going to idle. This makes sense as the CPU is about to sleep anyway. But currently it's called in the middle of the scheduler and in a place that must have interrupts disabled. That means, while the load balancing is going on, if a task wakes up on this CPU, it wont get to run while the interrupts are disabled. The idle task doing the balancing will be clueless about it. There's no real reason that the idle_balance() needs to be called in the middle of schedule anyway. The only benefit is that if a task is pulled to this CPU, it can be scheduled without the need to schedule the idle task. But load balancing and migrating the task makes a switch to idle and back negligible. By using the post_schedule function pointer of the sched class, the unlikely branch in the hot path of the scheduler can be removed, and the idle task itself can do the load balancing. Another advantage of this, is that by moving the idle_balance to the post_schedule routine, interrupts can now be enabled in the load balance allowing for interrupts and wakeups to still occur on that CPU while a balance is taking place. The enabling of interrupts will come as a separate patch. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Index: linux-trace.git/kernel/sched/core.c =================================================================== --- linux-trace.git.orig/kernel/sched/core.c +++ linux-trace.git/kernel/sched/core.c @@ -2926,9 +2926,6 @@ need_resched: pre_schedule(rq, prev); - if (unlikely(!rq->nr_running)) - idle_balance(cpu, rq); - put_prev_task(rq, prev); next = pick_next_task(rq); clear_tsk_need_resched(prev); Index: linux-trace.git/kernel/sched/idle_task.c =================================================================== --- linux-trace.git.orig/kernel/sched/idle_task.c +++ linux-trace.git/kernel/sched/idle_task.c @@ -13,6 +13,11 @@ select_task_rq_idle(struct task_struct * { return task_cpu(p); /* IDLE tasks as never migrated */ } + +static void post_schedule_idle(struct rq *rq) +{ + idle_balance(smp_processor_id(), rq); +} #endif /* CONFIG_SMP */ /* * Idle tasks are unconditionally rescheduled: @@ -25,6 +30,10 @@ static void check_preempt_curr_idle(stru static struct task_struct *pick_next_task_idle(struct rq *rq) { schedstat_inc(rq, sched_goidle); +#ifdef CONFIG_SMP + /* Trigger the post schedule to do an idle_balance */ + rq->post_schedule = 1; +#endif return rq->idle; } @@ -86,6 +95,7 @@ const struct sched_class idle_sched_clas #ifdef CONFIG_SMP .select_task_rq = select_task_rq_idle, + .post_schedule = post_schedule_idle, #endif .set_curr_task = set_curr_task_idle, ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC][PATCH 1/2] sched: Move idle_balance() to post_schedule 2012-12-22 0:30 ` [RFC][PATCH 1/2] sched: Move idle_balance() to post_schedule Steven Rostedt 2012-12-22 1:00 ` Steven Rostedt @ 2013-01-05 14:14 ` Frederic Weisbecker 1 sibling, 0 replies; 5+ messages in thread From: Frederic Weisbecker @ 2013-01-05 14:14 UTC (permalink / raw) To: Steven Rostedt Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra, Clark Williams 2012/12/22 Steven Rostedt <rostedt@goodmis.org>: > The idle_balance() code is called to do task load balancing just before > going to idle. This makes sense as the CPU is about to sleep anyway. > But currently it's called in the middle of the scheduler and in a place > that must have interrupts disabled. That means, while the load balancing > is going on, if a task wakes up on this CPU, it wont get to run while > the interrupts are disabled. The idle task doing the balancing will be > clueless about it. > > There's no real reason that the idle_balance() needs to be called in the > middle of schedule anyway. The only benefit is that if a task is pulled > to this CPU, it can be scheduled without the need to schedule the idle > task. But load balancing and migrating the task makes a switch to idle > and back negligible. This cleanup looks nice as it does not only let us enable interrupts there but also debloats a bit the schedule() code from idle specific code. So it would be a pity if the optimization that goes away with your cleanup has any measurable impact. Is there any sensible benchmark that can be run against this patch? Something that may involve a lot of back and forth to idle with some bunch of tasks running around on other CPUs? ^ permalink raw reply [flat|nested] 5+ messages in thread
* [RFC][PATCH 2/2] sched: Enable interrupts in idle_balance() 2012-12-22 0:30 [RFC][PATCH 0/2] sched: Allow interrupts to be enabled in idle balance Steven Rostedt 2012-12-22 0:30 ` [RFC][PATCH 1/2] sched: Move idle_balance() to post_schedule Steven Rostedt @ 2012-12-22 0:30 ` Steven Rostedt 1 sibling, 0 replies; 5+ messages in thread From: Steven Rostedt @ 2012-12-22 0:30 UTC (permalink / raw) To: linux-kernel Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra, Frederic Weisbecker, Clark Williams [-- Attachment #1: idle-balance-resched.patch --] [-- Type: text/plain, Size: 1241 bytes --] Now that the idle_balance is called from the post_schedule of the idle task sched class, it is safe to enable interrupts. This allows for better interaction of tasks waking up and other interrupts that are triggered while the idle balance is in process. Preemption is still disabled, but perhaps that can change as well. That may need some more investigation. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Index: linux-trace.git/kernel/sched/fair.c =================================================================== --- linux-trace.git.orig/kernel/sched/fair.c +++ linux-trace.git/kernel/sched/fair.c @@ -5231,9 +5231,10 @@ void idle_balance(int this_cpu, struct r update_rq_runnable_avg(this_rq, 1); /* - * Drop the rq->lock, but keep IRQ/preempt disabled. + * Drop the rq->lock, but keep preempt disabled. */ - raw_spin_unlock(&this_rq->lock); + preempt_disable(); + raw_spin_unlock_irq(&this_rq->lock); update_blocked_averages(this_cpu); rcu_read_lock(); @@ -5260,7 +5261,8 @@ void idle_balance(int this_cpu, struct r } rcu_read_unlock(); - raw_spin_lock(&this_rq->lock); + raw_spin_lock_irq(&this_rq->lock); + preempt_enable(); if (pulled_task || time_after(jiffies, this_rq->next_balance)) { /* ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2013-01-05 14:14 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-12-22 0:30 [RFC][PATCH 0/2] sched: Allow interrupts to be enabled in idle balance Steven Rostedt 2012-12-22 0:30 ` [RFC][PATCH 1/2] sched: Move idle_balance() to post_schedule Steven Rostedt 2012-12-22 1:00 ` Steven Rostedt 2013-01-05 14:14 ` Frederic Weisbecker 2012-12-22 0:30 ` [RFC][PATCH 2/2] sched: Enable interrupts in idle_balance() Steven Rostedt
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox