Re: [PATCH] sched_rt: fix overload bug on rt group scheduling

public inbox for linux-kernel@vger.kernel.org

From: Gregory Haskins <ghaskins@novell.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>, Steven Rostedt <rostedt@goodmis.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH] sched_rt: fix overload bug on rt group scheduling
Date: Thu, 02 Apr 2009 07:21:26 -0400
Message-ID: <49D49FB6.3000101@novell.com>
In-Reply-To: <1238654901.8530.5373.camel@twins>

Peter Zijlstra wrote:
> On Wed, 2009-04-01 at 20:58 -0400, Gregory Haskins wrote:
>   
>> Hi Peter,
>>
>> Peter Zijlstra wrote:
>>     
>>> Fixes an easily triggerable BUG() when setting process affinities.
>>>
>>> Make sure to count the number of migratable tasks in the same place:
>>> the root rt_rq. Otherwise the number doesn't make sense and we'll hit
>>> the BUG in set_cpus_allowed_rt().
>>>
>>> Also, make sure we only count tasks, not groups (this is probably
>>> already taken care of by the fact that rt_se->nr_cpus_allowed will be 0
>>> for groups, but be more explicit)
>>>
>>> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
>>> Tested-by: Thomas Gleixner <tglx@linutronix.de>
>>> CC: stable@kernel.org
>>> ---
>>>  kernel/sched_rt.c |   16 +++++++++++++++-
>>>  1 files changed, 15 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c
>>> index de4469a..c1ee8dc 100644
>>> --- a/kernel/sched_rt.c
>>> +++ b/kernel/sched_rt.c
>>> @@ -10,6 +10,8 @@ static inline struct task_struct *rt_task_of(struct sched_rt_entity *rt_se)
>>>  
>>>  #ifdef CONFIG_RT_GROUP_SCHED
>>>  
>>> +#define rt_entity_is_task(rt_se) (!(rt_se)->my_q)
>>> +
>>>  static inline struct rq *rq_of_rt_rq(struct rt_rq *rt_rq)
>>>  {
>>>  	return rt_rq->rq;
>>> @@ -22,6 +24,8 @@ static inline struct rt_rq *rt_rq_of_se(struct sched_rt_entity *rt_se)
>>>  
>>>  #else /* CONFIG_RT_GROUP_SCHED */
>>>  
>>> +#define rt_entity_is_task(rt_se) (1)
>>> +
>>>  static inline struct rq *rq_of_rt_rq(struct rt_rq *rt_rq)
>>>  {
>>>  	return container_of(rt_rq, struct rq, rt);
>>> @@ -73,7 +77,7 @@ static inline void rt_clear_overload(struct rq *rq)
>>>  
>>>  static void update_rt_migration(struct rt_rq *rt_rq)
>>>  {
>>> -	if (rt_rq->rt_nr_migratory && (rt_rq->rt_nr_running > 1)) {
>>> +	if (rt_rq->rt_nr_migratory > 1) {
>>>   
>>>       
>> The rest of the patch is making sense to me, but I am a little concerned
>> about this change.
>>
>> The original logic was designed to catch the condition when you might
>> have a non-migratory task running, and a migratory task queued.  This
>> would mean nr_running == 2, and nr_migratory == 1, which is eligible for
>> overload handling.  (Of course, the opposite could be true: the
>> migratory task is running and the non-migratory one is queued.  We
>> cannot discern the difference here and we go into overload anyway.
>> This is suboptimal but functionally correct.)
>>
>> What can happen now is you could have that above condition but we will
>> not go into overload unless there are at least two migratory tasks
>> queued.  This will undoubtedly allow a potential scheduling latency on
>> task #2.
>>
>> I think we really need to qualify overload on both running > 1 and at
>> least one migratory task.  Is there a way to get this state, even if by
>> other means?
>>     
>
> Ah, yes, I missed that bit. I ripped out the rt_nr_running because I 1)
> didn't think of this, and 2) rt_nr_running is accounted per rt_rq, not
> per-cpu, so it doesn't match.
>
> Since rt_nr_running is also used in a per rt_rq setting, changing that
> isn't possible and we'd need to introduce another per-cpu variant if you
> want to reinstate this.
>   
Yeah, I actually don't care if it's literally a nr_running stat
reinstated, or some other way to restore "correctness" ;)
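
To make the scenario concrete, here is a minimal userspace sketch (not
kernel code) of the two checks being compared.  struct rt_counts and the
function names are made up for illustration; only the two conditions
themselves come from the diff quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the counters discussed in this thread. */
struct rt_counts {
	unsigned int nr_running;   /* runnable RT tasks, including the running one */
	unsigned int nr_migratory; /* of those, tasks allowed on more than one CPU */
};

/* Check before the patch: a migratory task exists and more than one task is runnable. */
static bool overloaded_old(const struct rt_counts *c)
{
	return c->nr_migratory && c->nr_running > 1;
}

/* Check after the patch: at least two migratory tasks. */
static bool overloaded_new(const struct rt_counts *c)
{
	return c->nr_migratory > 1;
}

/* The case raised above: one non-migratory task plus one migratory task. */
void demo_pinned_plus_migratory(void)
{
	struct rt_counts c = { .nr_running = 2, .nr_migratory = 1 };

	assert(overloaded_old(&c));  /* old check marks the CPU as overloaded */
	assert(!overloaded_new(&c)); /* new check does not */
}
```

Calling demo_pinned_plus_migratory() from a small main() exercises the
nr_running == 2, nr_migratory == 1 case where the two checks disagree.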

Double bonus if you can solve that problem I mentioned above where I
can't tell if it's really eligible for overload in all cases (but goes
into overload anyway to be conservative).  I had been thinking of doing
something like subtracting from the nr_migratory count when a migratory
task is put on the cpu.  But this is kind of messy because you need to
handle all the places that can manipulate nr_migratory to make sure it
doesn't break.
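
A rough userspace sketch of that idea, assuming the counter tracks only
migratory tasks that are waiting rather than running.  All names here
(struct cpu_rt_state, task_enqueue() and friends) are hypothetical, and
the sketch deliberately ignores the messy part: affinity changes and
every other path that touches the migratory count.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical task descriptor: only the property relevant here. */
struct rt_task {
	bool migratory; /* allowed to run on more than one CPU */
};

/* Hypothetical per-CPU state; this is not the kernel's struct rt_rq. */
struct cpu_rt_state {
	unsigned int nr_waiting_migratory; /* migratory tasks queued but not running */
};

/* A task becomes runnable and waits on the queue. */
static void task_enqueue(struct cpu_rt_state *s, const struct rt_task *t)
{
	if (t->migratory)
		s->nr_waiting_migratory++;
}

/* A waiting task is put on the CPU, so it no longer counts as waiting. */
static void task_start_running(struct cpu_rt_state *s, const struct rt_task *t)
{
	if (t->migratory)
		s->nr_waiting_migratory--;
}

/* A running task is preempted and goes back to waiting. */
static void task_stop_running(struct cpu_rt_state *s, const struct rt_task *t)
{
	if (t->migratory)
		s->nr_waiting_migratory++;
}

/* Overloaded only if some waiting task could actually move elsewhere. */
static bool cpu_overloaded(const struct cpu_rt_state *s)
{
	return s->nr_waiting_migratory > 0;
}

void demo_waiting_migratory(void)
{
	struct cpu_rt_state s = { 0 };
	struct rt_task pinned = { .migratory = false };
	struct rt_task mig = { .migratory = true };

	/* Migratory task running, pinned task waiting: nothing can be pushed. */
	task_enqueue(&s, &mig);
	task_start_running(&s, &mig);
	task_enqueue(&s, &pinned);
	assert(!cpu_overloaded(&s));

	/* Roles swap: pinned task running, migratory task waiting. */
	task_stop_running(&s, &mig);
	task_start_running(&s, &pinned);
	assert(cpu_overloaded(&s));
}
```

With this accounting the two orderings are distinguishable: only the one
with a migratory task left waiting is reported as overloaded.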

Thanks Peter!
-Greg





Thread overview: 8+ messages
2009-04-01 16:40 [PATCH] sched_rt: fix overload bug on rt group scheduling Peter Zijlstra
2009-04-02  0:58 ` Gregory Haskins
2009-04-02  6:48   ` Peter Zijlstra
2009-04-02 11:21     ` Gregory Haskins [this message]
2009-07-08 15:37       ` [PATCH] sched_rt: fix overload bug on rt group scheduling -v2 Peter Zijlstra
2009-07-08 15:54         ` Gregory Haskins
2009-07-10  4:05         ` Gregory Haskins
2009-07-10 10:41         ` [tip:sched/urgent] sched_rt: Fix overload bug on rt group scheduling tip-bot for Peter Zijlstra
