From: "Gregory Haskins" <ghaskins@novell.com>
To: "Peter Zijlstra" <peterz@infradead.org>
Cc: <mingo@elte.hu>, <rostedt@goodmis.org>, <tglx@linutronix.de>,
	"David Bahi" <DBahi@novell.com>, <linux-kernel@vger.kernel.org>,
	<linux-rt-users@vger.kernel.org>
Subject: Re: [PATCH 1/3] sched: enable interrupts and drop rq-lock during newidle balancing
Date: Tue, 24 Jun 2008 07:15:00 -0600
Message-ID: <4860BB14.BA47.005A.0@novell.com>
In-Reply-To: <1214302405.4351.21.camel@twins>

>>> On Tue, Jun 24, 2008 at  6:13 AM, in message <1214302405.4351.21.camel@twins>,
Peter Zijlstra <peterz@infradead.org> wrote: 
> On Mon, 2008-06-23 at 17:04 -0600, Gregory Haskins wrote:
>> We do find_busiest_groups() et. al. without locks held for normal balancing,
>> so lets do it for newidle as well.  It will allow other cpus to make
>> forward progress (against our RQ) while we try to balance and allow 
>> some interrupts to occur.
> 
> Is running f_b_g really that expensive? 

According to our oprofile data, yes.  I speculate that it works out that way because most newidle
attempts result in "no imbalance".  But we were spending ~60%+ of our time in find_busiest_group()
because of all the heavy context switching that goes on in PREEMPT_RT.  So while f_b_g() is
probably cheaper than double-lock/move_tasks(), it runs far more often by comparison.  Prior to
this patch, every one of those runs was a preempt-disabled/irq-disabled/rq->lock critical
section.

So while it is not clear whether f_b_g() is the actual cost, it is a convenient (and legal, afaict)
place to deterministically reduce the rq->lock scope.  Additionally, doing so measurably helps
performance, so I think it's a win.  Without this patch you have to hope that double_lock_balance()
releases this_rq->lock, and even then we're not checking for NEEDS_RESCHED.
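
To illustrate the pattern described above, here is a minimal userspace sketch using pthreads. It is not the kernel code: struct toy_rq, toy_find_busiest() and toy_newidle_balance() are hypothetical stand-ins for struct rq, f_b_g() and load_balance_newidle(), and a mutex stands in for rq->lock. It only shows the shape of the idea: drop the lock around the read-only scan, retake it, and recheck nr_running before acting on the result.

```c
#include <pthread.h>

/* Hypothetical stand-in for a per-CPU runqueue; not the kernel's struct rq. */
struct toy_rq {
	pthread_mutex_t lock;
	int nr_running;
};

/* Hypothetical read-only scan playing the role of f_b_g(). */
static int toy_find_busiest(const int *loads, int n)
{
	int i, best = -1, max = 1;	/* only queues with more than one task qualify */

	for (i = 0; i < n; i++) {
		if (loads[i] > max) {
			max = loads[i];
			best = i;
		}
	}
	return best;
}

/*
 * Called with rq->lock held; returns with rq->lock held.
 * Returns the index of the queue to pull from, or -1 if there is nothing to do.
 */
static int toy_newidle_balance(struct toy_rq *rq, const int *loads, int n)
{
	int busiest;

	/* Drop the lock so others can make progress against rq while we scan. */
	pthread_mutex_unlock(&rq->lock);
	busiest = toy_find_busiest(loads, n);
	pthread_mutex_lock(&rq->lock);

	/* rq may have gained work while it was unlocked: recheck before acting. */
	if (rq->nr_running)
		return -1;

	return busiest;
}
```

The recheck after relocking is the important part: once the lock has been dropped, anything learned before the drop has to be treated as stale.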

Note: I have a refresh of this patch coming shortly, and I will drop the one you NAKed.

Thanks Peter!

Regards,
-Greg

> 
>> Signed-off-by: Gregory Haskins <ghaskins@novell.com>
>> ---
>> 
>>  kernel/sched.c |   44 ++++++++++++++++++++++++++++++++++++++------
>>  1 files changed, 38 insertions(+), 6 deletions(-)
>> 
>> diff --git a/kernel/sched.c b/kernel/sched.c
>> index 31f91d9..490e6bc 100644
>> --- a/kernel/sched.c
>> +++ b/kernel/sched.c
>> @@ -3333,6 +3333,16 @@ load_balance_newidle(int this_cpu, struct rq *this_rq, struct sched_domain *sd)
>>  	int sd_idle = 0;
>>  	int all_pinned = 0;
>>  	cpumask_t cpus = CPU_MASK_ALL;
>> +	int nr_running;
>> +
>> +	schedstat_inc(sd, lb_count[CPU_NEWLY_IDLE]);
>> +
>> +	/*
>> +	 * We are in a preempt-disabled section, so dropping the lock/irq
>> +	 * here simply means that other cores may acquire the lock,
>> +	 * and interrupts may occur.
>> +	 */
>> +	spin_unlock_irq(&this_rq->lock);
>>  
>>  	/*
>>  	 * When power savings policy is enabled for the parent domain, idle
>> @@ -3344,7 +3354,6 @@ load_balance_newidle(int this_cpu, struct rq *this_rq, struct sched_domain *sd)
>>  	    !test_sd_parent(sd, SD_POWERSAVINGS_BALANCE))
>>  		sd_idle = 1;
>>  
>> -	schedstat_inc(sd, lb_count[CPU_NEWLY_IDLE]);
>>  redo:
>>  	group = find_busiest_group(sd, this_cpu, &imbalance, CPU_NEWLY_IDLE,
>>  				   &sd_idle, &cpus, NULL);
>> @@ -3366,14 +3375,33 @@ redo:
>>  
>>  	ld_moved = 0;
>>  	if (busiest->nr_running > 1) {
>> -		/* Attempt to move tasks */
>> -		double_lock_balance(this_rq, busiest);
>> -		/* this_rq->clock is already updated */
>> -		update_rq_clock(busiest);
>> +		local_irq_disable();
>> +		double_rq_lock(this_rq, busiest);
>> +
>> +		BUG_ON(this_cpu != smp_processor_id());
>> +
>> +		/*
>> +		 * Checking rq->nr_running covers both the case where
>> +		 * newidle-balancing pulls a task, as well as if something
>> +		 * else issued a NEEDS_RESCHED (since we would only need
>> +		 * a reschedule if something was moved to us)
>> +		 */
>> +		if (this_rq->nr_running) {
>> +			double_rq_unlock(this_rq, busiest);
>> +			local_irq_enable();
>> +			goto out_balanced;
>> +		}
>> +
>>  		ld_moved = move_tasks(this_rq, this_cpu, busiest,
>>  					imbalance, sd, CPU_NEWLY_IDLE,
>>  					&all_pinned);
>> -		spin_unlock(&busiest->lock);
>> +
>> +		nr_running = this_rq->nr_running;
>> +		double_rq_unlock(this_rq, busiest);
>> +		local_irq_enable();
>> +
>> +		if (nr_running)
>> +			goto out_balanced;
>>  
>>  		if (unlikely(all_pinned)) {
>>  			cpu_clear(cpu_of(busiest), cpus);
>> @@ -3382,6 +3410,8 @@ redo:
>>  		}
>>  	}
>>  
>> +	spin_lock_irq(&this_rq->lock);
>> +
>>  	if (!ld_moved) {
>>  		schedstat_inc(sd, lb_failed[CPU_NEWLY_IDLE]);
>>  		if (!sd_idle && sd->flags & SD_SHARE_CPUPOWER &&
>> @@ -3393,6 +3423,8 @@ redo:
>>  	return ld_moved;
>>  
>>  out_balanced:
>> +	spin_lock_irq(&this_rq->lock);
>> +
>>  	schedstat_inc(sd, lb_balanced[CPU_NEWLY_IDLE]);
>>  	if (!sd_idle && sd->flags & SD_SHARE_CPUPOWER &&
>>  	    !test_sd_parent(sd, SD_POWERSAVINGS_BALANCE))
>> 
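
The quoted patch takes both runqueue locks before checking this_rq->nr_running and moving tasks. A rough userspace sketch of that step follows, assuming the usual deadlock-avoidance approach of taking the two locks in address order. All names here (struct demo_rq, demo_double_lock(), demo_double_unlock(), demo_pull_one()) are hypothetical, and pthread mutexes stand in for the rq spinlocks; this is an illustration, not the kernel implementation.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for a runqueue; not the kernel's struct rq. */
struct demo_rq {
	pthread_mutex_t lock;
	int nr_running;
};

/* Take both locks in address order so two callers cannot deadlock AB/BA. */
static void demo_double_lock(struct demo_rq *a, struct demo_rq *b)
{
	if (a == b) {
		pthread_mutex_lock(&a->lock);
	} else if ((uintptr_t)a < (uintptr_t)b) {
		pthread_mutex_lock(&a->lock);
		pthread_mutex_lock(&b->lock);
	} else {
		pthread_mutex_lock(&b->lock);
		pthread_mutex_lock(&a->lock);
	}
}

static void demo_double_unlock(struct demo_rq *a, struct demo_rq *b)
{
	pthread_mutex_unlock(&a->lock);
	if (a != b)
		pthread_mutex_unlock(&b->lock);
}

/*
 * Pull one task from busiest to this_rq, unless this_rq picked up work
 * while it was unlocked. Returns the number of tasks moved (0 or 1).
 */
static int demo_pull_one(struct demo_rq *this_rq, struct demo_rq *busiest)
{
	int moved = 0;

	demo_double_lock(this_rq, busiest);

	/* Recheck under both locks: only pull if we are still idle. */
	if (!this_rq->nr_running && busiest->nr_running > 1) {
		busiest->nr_running--;
		this_rq->nr_running++;
		moved = 1;
	}

	demo_double_unlock(this_rq, busiest);
	return moved;
}
```

Because both locks are always taken fresh and in a fixed order, the caller never depends on whether a trylock happened to fail in order to get this_rq released.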



