From: "Gregory Haskins" <ghaskins@novell.com>
To: "Peter Zijlstra" <peterz@infradead.org>
Cc: <mingo@elte.hu>, <rostedt@goodmis.org>, <tglx@linutronix.de>,
	"David Bahi" <DBahi@novell.com>, <linux-kernel@vger.kernel.org>,
	<linux-rt-users@vger.kernel.org>
Subject: Re: [PATCH 1/3] sched: enable interrupts and drop rq-lock during newidle balancing
Date: Tue, 24 Jun 2008 07:15:00 -0600
Message-ID: <4860BB14.BA47.005A.0@novell.com>
In-Reply-To: <1214302405.4351.21.camel@twins>

>>> On Tue, Jun 24, 2008 at  6:13 AM, in message <1214302405.4351.21.camel@twins>,
Peter Zijlstra <peterz@infradead.org> wrote: 
> On Mon, 2008-06-23 at 17:04 -0600, Gregory Haskins wrote:
>> We do find_busiest_group() et al. without locks held for normal balancing,
>> so let's do it for newidle as well.  It will allow other cpus to make
>> forward progress (against our RQ) while we try to balance and allow
>> some interrupts to occur.
> 
> Is running f_b_g really that expensive? 

According to our oprofile data, yes.  I speculate that it works out that way because most newidle
attempts result in "no imbalance".  But we were spending roughly 60% or more of the time in
find_busiest_group() because of all the heavy context switching that goes on in PREEMPT_RT.  So
while f_b_g() is probably cheaper than double-lock/move_tasks(), it occurs vastly more often in
comparison.  Prior to this patch, every one of those occurrences was a
preempt-disabled/irq-disabled/rq->lock critical section.
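
As a rough illustration of that frequency-versus-cost argument, here is a minimal sketch.
The struct, the helper name and every number used with it are hypothetical; none of them come
from the oprofile data mentioned above.  It only shows that the share of CPU time is
(cost per call) x (calls per second), so a cheap operation can dominate if it runs often enough.

```c
/* Hypothetical cost model; the names and any numbers fed to it are
 * illustrative assumptions, not measurements from this thread. */
struct op_profile {
	double cost_us;       /* assumed cost of one invocation, in microseconds */
	double calls_per_sec; /* assumed invocation rate */
};

/* Fraction of one CPU's time spent in the operation (1.0 == 100%). */
static double cpu_share(struct op_profile op)
{
	return op.cost_us * op.calls_per_sec / 1e6;
}
```

With assumed values of 5 us at 100000 calls/s for the scan and 50 us at 1000 calls/s for the
move, the scan accounts for a share of 0.5 and the move for 0.05, even though each individual
scan is ten times cheaper.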

So while it is not clear whether f_b_g() is the actual cost, it is a convenient (and legal, afaict)
place to deterministically reduce the rq->lock scope.  Additionally, doing so measurably helps
performance, so I think it's a win.  Without this patch you have to hope that double_lock_balance()
releases this_rq, and even then we're not checking for the NEEDS_RESCHED.
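
The pattern described above (drop the lock for the long scan, retake it, then revalidate before
acting) can be sketched in userspace.  This is a toy model under stated assumptions, not the
kernel code: struct toy_rq, toy_double_lock(), toy_remote_enqueue() and toy_newidle_balance()
are hypothetical names, pthread mutexes stand in for rq->lock, the two queues are assumed to be
distinct, and interrupt disabling has no userspace analogue so it is left out.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a runqueue; this is not the kernel's struct rq. */
struct toy_rq {
	pthread_mutex_t lock;
	int nr_running;
};

/* Take both locks in address order so two balancers cannot deadlock.
 * Assumes a != b. */
static void toy_double_lock(struct toy_rq *a, struct toy_rq *b)
{
	if ((uintptr_t)a > (uintptr_t)b) {
		struct toy_rq *tmp = a;
		a = b;
		b = tmp;
	}
	pthread_mutex_lock(&a->lock);
	pthread_mutex_lock(&b->lock);
}

/* Example hook: stands in for another CPU waking a task onto rq. */
static void toy_remote_enqueue(struct toy_rq *rq)
{
	pthread_mutex_lock(&rq->lock);
	rq->nr_running++;
	pthread_mutex_unlock(&rq->lock);
}

/*
 * Called with this_rq->lock held; returns with it held.
 * unlocked_hook (may be NULL) runs inside the unlocked window and models
 * whatever other CPUs do to this_rq in the meantime.
 * Returns the number of tasks pulled (0 or 1).
 */
static int toy_newidle_balance(struct toy_rq *this_rq, struct toy_rq *busiest,
			       void (*unlocked_hook)(struct toy_rq *))
{
	int moved = 0;

	/* Shrink the critical section: others may now take this_rq->lock. */
	pthread_mutex_unlock(&this_rq->lock);

	/* The long scan (find_busiest_group() in the patch) would run here. */
	if (unlocked_hook)
		unlocked_hook(this_rq);

	toy_double_lock(this_rq, busiest);

	/* Revalidate: work may have been queued on us while we were unlocked. */
	if (this_rq->nr_running == 0 && busiest->nr_running > 1) {
		busiest->nr_running--;
		this_rq->nr_running++;
		moved = 1;
	}

	/* Keep this_rq->lock for the caller, release only the remote lock. */
	pthread_mutex_unlock(&busiest->lock);
	return moved;
}
```

Passing toy_remote_enqueue as the hook exercises the revalidation path: the queue is no longer
empty when the locks are retaken, so nothing is pulled.  Unlike the patch, the sketch keeps
this_rq->lock across the move instead of dropping both locks and retaking one, purely to keep
it short.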

Note: I have a refresh of this patch coming shortly, and I will drop the one you NAKed.

Thanks Peter!

Regards,
-Greg

> 
>> Signed-off-by: Gregory Haskins <ghaskins@novell.com>
>> ---
>> 
>>  kernel/sched.c |   44 ++++++++++++++++++++++++++++++++++++++------
>>  1 files changed, 38 insertions(+), 6 deletions(-)
>> 
>> diff --git a/kernel/sched.c b/kernel/sched.c
>> index 31f91d9..490e6bc 100644
>> --- a/kernel/sched.c
>> +++ b/kernel/sched.c
>> @@ -3333,6 +3333,16 @@ load_balance_newidle(int this_cpu, struct rq *this_rq, struct sched_domain *sd)
>>  	int sd_idle = 0;
>>  	int all_pinned = 0;
>>  	cpumask_t cpus = CPU_MASK_ALL;
>> +	int nr_running;
>> +
>> +	schedstat_inc(sd, lb_count[CPU_NEWLY_IDLE]);
>> +
>> +	/*
>> +	 * We are in a preempt-disabled section, so dropping the lock/irq
>> +	 * here simply means that other cores may acquire the lock,
>> +	 * and interrupts may occur.
>> +	 */
>> +	spin_unlock_irq(&this_rq->lock);
>>  
>>  	/*
>>  	 * When power savings policy is enabled for the parent domain, idle
>> @@ -3344,7 +3354,6 @@ load_balance_newidle(int this_cpu, struct rq *this_rq, struct sched_domain *sd)
>>  	    !test_sd_parent(sd, SD_POWERSAVINGS_BALANCE))
>>  		sd_idle = 1;
>>  
>> -	schedstat_inc(sd, lb_count[CPU_NEWLY_IDLE]);
>>  redo:
>>  	group = find_busiest_group(sd, this_cpu, &imbalance, CPU_NEWLY_IDLE,
>>  				   &sd_idle, &cpus, NULL);
>> @@ -3366,14 +3375,33 @@ redo:
>>  
>>  	ld_moved = 0;
>>  	if (busiest->nr_running > 1) {
>> -		/* Attempt to move tasks */
>> -		double_lock_balance(this_rq, busiest);
>> -		/* this_rq->clock is already updated */
>> -		update_rq_clock(busiest);
>> +		local_irq_disable();
>> +		double_rq_lock(this_rq, busiest);
>> +
>> +		BUG_ON(this_cpu != smp_processor_id());
>> +
>> +		/*
>> +		 * Checking rq->nr_running covers both the case where
>> +		 * newidle-balancing pulls a task, as well as if something
>> +		 * else issued a NEEDS_RESCHED (since we would only need
>> +		 * a reschedule if something was moved to us)
>> +		 */
>> +		if (this_rq->nr_running) {
>> +			double_rq_unlock(this_rq, busiest);
>> +			local_irq_enable();
>> +			goto out_balanced;
>> +		}
>> +
>>  		ld_moved = move_tasks(this_rq, this_cpu, busiest,
>>  					imbalance, sd, CPU_NEWLY_IDLE,
>>  					&all_pinned);
>> -		spin_unlock(&busiest->lock);
>> +
>> +		nr_running = this_rq->nr_running;
>> +		double_rq_unlock(this_rq, busiest);
>> +		local_irq_enable();
>> +
>> +		if (nr_running)
>> +			goto out_balanced;
>>  
>>  		if (unlikely(all_pinned)) {
>>  			cpu_clear(cpu_of(busiest), cpus);
>> @@ -3382,6 +3410,8 @@ redo:
>>  		}
>>  	}
>>  
>> +	spin_lock_irq(&this_rq->lock);
>> +
>>  	if (!ld_moved) {
>>  		schedstat_inc(sd, lb_failed[CPU_NEWLY_IDLE]);
>>  		if (!sd_idle && sd->flags & SD_SHARE_CPUPOWER &&
>> @@ -3393,6 +3423,8 @@ redo:
>>  	return ld_moved;
>>  
>>  out_balanced:
>> +	spin_lock_irq(&this_rq->lock);
>> +
>>  	schedstat_inc(sd, lb_balanced[CPU_NEWLY_IDLE]);
>>  	if (!sd_idle && sd->flags & SD_SHARE_CPUPOWER &&
>>  	    !test_sd_parent(sd, SD_POWERSAVINGS_BALANCE))
>> 



