public inbox for linux-kernel@vger.kernel.org
From: Peter Zijlstra <peterz@infradead.org>
To: Tejun Heo <tj@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>,
	Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>,
	Andrew Morton <akpm@linux-foundation.org>,
	Rusty Russell <rusty@rustcorp.com.au>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH/RFC] timer: fix deadlock on cpu hotplug
Date: Tue, 21 Sep 2010 17:40:18 +0200
Message-ID: <1285083618.2275.884.camel@laptop>
In-Reply-To: <4C98D0EB.30002@kernel.org>

On Tue, 2010-09-21 at 17:36 +0200, Tejun Heo wrote:
> Hello,
> 
> On 09/21/2010 04:20 PM, Heiko Carstens wrote:
> > For some reason the scheduler decided to throttle RT tasks on the runqueue
> > of cpu 5 (rt_throttled = 1). So as long as rt_throttled == 1 we won't see the
> > migration thread coming back to execution.
> > The only thing that would unthrottle the runqueue would be the rt_period_timer.
> > The timer is indeed scheduled, however in the dump I have it has been expired
> > for more than four hours.
> > The reason is simply that the timer is pending on the offlined cpu 0 and
> > therefore would never fire before it gets migrated to an online cpu. Before
> > the cpu hotplug mechanisms (cpu hotplug notifier with state CPU_DEAD) would
> > migrate the timer to an online cpu stop_machine() must complete ---> deadlock.
> > 
> > The fix _seems_ to be simple: just migrate timers after __cpu_disable() has
> > been called and use the CPU_DYING state. The subtle difference is of course
> > that the migration code now gets executed on the cpu that actually just is
> > going to disable itself instead of an arbitrary cpu that stays online.
> 
> I think this is the second time we're seeing deadlock during cpu down
> due to RT throttling and timer problem.  The rather delicate
> dependency there makes me somewhat nervous.  If possible, I think it
> would be better if we can simply turn the RT throttling off when
> cpu_stop kicks in.  It's intended to be a mechanism to monopolize all
> CPU cycles to begin with.  Would that be difficult?

I've wanted to pull the whole migration thread out from SCHED_FIFO for a
while. Doing that is probably the easiest thing.

Still would be nice to also cure this problem differently.
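
For illustration only, the dependency chain described in the quoted text can
be written down as a toy model in plain C. This is a sketch, not kernel code:
struct hotplug_model, enum timer_migration, cpu_down_completes() and
check_model() are hypothetical names invented for this example, and the
stopper_exempt flag is only a stand-in for "the stop/migration work is not
subject to RT throttling". The model assumes, as the quoted description says,
that a timer pending on the offlined cpu never fires, and that migrating at
CPU_DYING happens on the cpu going down without waiting for stop_machine()
to complete.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: every name here is hypothetical, not a kernel API. */
enum timer_migration {
	MIGRATE_AT_CPU_DEAD,	/* notifier runs after stop_machine() completes */
	MIGRATE_AT_CPU_DYING	/* runs on the dying cpu after __cpu_disable() */
};

struct hotplug_model {
	bool timer_on_offline_cpu;	/* rt_period_timer pending on the offlined cpu */
	bool rt_throttled;		/* rt_throttled == 1 on another cpu's runqueue */
	bool stopper_exempt;		/* stop work not subject to RT throttling */
};

/* Returns true if stop_machine() can complete, false if it deadlocks. */
static bool cpu_down_completes(struct hotplug_model m, enum timer_migration when)
{
	/* Migrating at CPU_DYING does not depend on stop_machine() finishing. */
	if (when == MIGRATE_AT_CPU_DYING)
		m.timer_on_offline_cpu = false;

	/* Only a timer on an online cpu can fire and unthrottle the runqueue. */
	if (!m.timer_on_offline_cpu)
		m.rt_throttled = false;

	/* The migration thread runs if it is unthrottled or exempt. */
	return m.stopper_exempt || !m.rt_throttled;
}

/* Walks through the reported case and the two proposed ways out. */
static void check_model(void)
{
	struct hotplug_model m = { true, true, false };

	/* Reported case: migration waits for CPU_DEAD, so nothing unthrottles. */
	assert(!cpu_down_completes(m, MIGRATE_AT_CPU_DEAD));

	/* Heiko's proposal: migrate the timer at CPU_DYING. */
	assert(cpu_down_completes(m, MIGRATE_AT_CPU_DYING));

	/* Taking the stop work out of RT throttling also breaks the cycle. */
	m.stopper_exempt = true;
	assert(cpu_down_completes(m, MIGRATE_AT_CPU_DEAD));
}
```

Calling check_model() from a main() exercises all three cases. The model
deliberately ignores everything else the hotplug path does; it only shows
that either change removes one edge from the cycle.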


Thread overview: 12+ messages
2010-09-21 14:20 [PATCH/RFC] timer: fix deadlock on cpu hotplug Heiko Carstens
2010-09-21 15:36 ` Tejun Heo
2010-09-21 15:40   ` Peter Zijlstra [this message]
2010-09-22  8:37     ` Heiko Carstens
2010-09-22  9:22       ` Peter Zijlstra
2010-09-22 14:29         ` Peter Zijlstra
2010-09-23 13:31           ` Heiko Carstens
2010-09-25  0:19             ` Peter Zijlstra
2010-10-11 12:31             ` Peter Zijlstra
2010-10-11 13:51               ` Heiko Carstens
2010-10-18 19:16           ` [tip:sched/core] sched: Create special class for stop/migrate work tip-bot for Peter Zijlstra
2010-09-21 15:39 ` [PATCH/RFC] timer: fix deadlock on cpu hotplug Thomas Gleixner
