All of lore.kernel.org
 help / color / mirror / Atom feed
From: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>,
	linux-kernel@vger.kernel.org, linux-rt-users@vger.kernel.org,
	hpj@urpla.net, stable <stable@kernel.org>,
	Dmitry Adamushko <dmitry.adamushko@gmail.com>
Subject: Re: [PATCH] sched: fix race in schedule
Date: Tue, 11 Mar 2008 10:10:36 -0700	[thread overview]
Message-ID: <47D6BD0C.6070401@ct.jp.nec.com> (raw)
In-Reply-To: <1205224835.8514.183.camel@twins>

Peter Zijlstra wrote:
> On Mon, 2008-03-10 at 19:12 -0700, Hiroshi Shimamoto wrote:
>> Peter Zijlstra wrote:
>>> On Mon, 2008-03-10 at 13:01 -0700, Hiroshi Shimamoto wrote:
>>>
>>>> thanks, your patch looks nice to me.
>>>> I had focused setprio, on_rq=0 and running=1 situation, it makes me to
>>>> fix these functions.
>>>> But one point, I've just noticed. I'm not sure on same situation against
>>>> sched_rt. I think the pre_schedule() of rt has chance to drop rq lock.
>>>> Is it OK?
>>> Ah, you are quite right, that'll teach me to rush out a patch just
>>> because dinner is ready :-). 
>>>
>>> How about we submit the following patch for mainline and CC -stable to
>>> fix .23 and .24:
>>>
>> Unfortunately, I encountered similar panic with this patch on -rt.
>> I'll look into this, again. I might have missed something...
>>
>> Unable to handle kernel NULL pointer dereference at 0000000000000128 RIP:
>>  [<ffffffff802297f5>] pick_next_task_fair+0x2d/0x42
> 
> :-(
> 
> OK, so that means I'm not getting it.
> 
> So what does your patch do that mine doesn't?
> 
> It removes the dependency of running (=task_current()) from on_rq
> (p->se.on_rq).
> 
> So how can a current task not be on the runqueue?
> 
> Only sched.c:dequeue_task() and sched_fair.c:account_entity_dequeue()
> set on_rq to 0, the only one changing rq->curr is schedule().
> 
> So the only scheme I can come up with is that we did dequeue p (on_rq ==
> 0), but we didn't yet schedule so rq->curr == p.
> 
> Is this how you ended up with your previuos analysis that it must be due
> to a hole introduced by double_lock_balance()?
> 
> Because now we can seemingly call deactivate_task() and put_prev_task()
> in non-atomic fashion, but by placing the put_prev_task() before the
> load balance calls we should avoid doing that.
> 
> So what else is going on... /me puzzled

thanks for taking time.
I've tested it with debug tracer, it took several hours to half day to
reproduce it on my box. And I got the possible scenario.

Before begin, I can tell that se->on_rq is changed at enqueue_task() or
dequeue_task() in sched.c.

Here is the flow to panic which I got;
   CPU0                               CPU1
    |                              schedule()
    |                              ->deactivate_task()
    |                              -->dequeue_task()
    |                                  * on_rq=0
    |                              ->put_prev_task_fair()
    |                              ->idle_balance()
    |                              -->load_balance_newidle()
(a wakeup function)                    |
    |                              --->double_lock_balance()
    *get lock                          *rel lock
    * wake up target is CPU1's curr    |
->enqueue_task()                       |
    * on_rq=1                          |
->rt_mutex_setprio()                   |
    * on_rq=1, ruuning=1               |
-->dequeue_task()!!                    |
-->put_prev_task_fair()!!              |
    * sched_class is changed           |
-->set_curr_task_fair()!!              |
-->enqueue_task()!!                    |
    *rel lock                          *get lock again
    :                                  |
                                       :
                                   ->pick_next_task_fair()
                                       => panic

The difference from the previous scenario is;
some one enqueues CPU1,s current task before setprio,
it makes on_rq=1 and it causes set_curr_task_fair() called.
It means that the cfs_rq state of the current task is set to
before put_prev_task_fair(), I think.

Does this scenario help to solve this issue?
And I only test it on -rt because I can reproduce it on -rt only.

thanks,
Hiroshi Shimamoto

  reply	other threads:[~2008-03-11 17:10 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-03-10 18:01 [PATCH] sched: fix race in schedule Hiroshi Shimamoto
2008-03-10 18:36 ` Peter Zijlstra
2008-03-10 20:01   ` Hiroshi Shimamoto
2008-03-10 20:34     ` Peter Zijlstra
2008-03-10 20:54       ` Hiroshi Shimamoto
2008-03-10 21:01         ` Peter Zijlstra
2008-03-10 21:07           ` Hiroshi Shimamoto
2008-03-11  2:12       ` Hiroshi Shimamoto
2008-03-11  8:40         ` Peter Zijlstra
2008-03-11 17:10           ` Hiroshi Shimamoto [this message]
2008-03-11 23:38             ` Dmitry Adamushko
2008-03-12 13:27               ` Peter Zijlstra
2008-03-12 14:48                 ` Dmitry Adamushko
2008-03-12 14:57                   ` Peter Zijlstra
2008-03-14 17:58                     ` Hiroshi Shimamoto
2008-03-14 22:47                       ` Dmitry Adamushko
2008-03-14 22:57                         ` Peter Zijlstra
2008-03-20  5:44 ` Sripathi Kodi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=47D6BD0C.6070401@ct.jp.nec.com \
    --to=h-shimamoto@ct.jp.nec.com \
    --cc=dmitry.adamushko@gmail.com \
    --cc=hpj@urpla.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rt-users@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=peterz@infradead.org \
    --cc=stable@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.