From: Ben Greear <greearb@candelatech.com>
To: Tejun Heo <tj@kernel.org>
Cc: linux-kernel@vger.kernel.org, eric.dumazet@gmail.com,
stable@vger.kernel.org, torvalds@linux-foundation.org
Subject: Re: [PATCH v3] Fix lockup related to stop_machine being stuck in __do_softirq.
Date: Mon, 10 Jun 2013 10:08:19 -0700
Message-ID: <51B60803.9020706@candelatech.com>
In-Reply-To: <20130606214014.GK5045@htj.dyndns.org>
On 06/06/2013 02:40 PM, Tejun Heo wrote:
> On Thu, Jun 06, 2013 at 02:29:49PM -0700, greearb@candelatech.com wrote:
>> From: Ben Greear <greearb@candelatech.com>
>>
>> The stop machine logic can lock up if all but one of
>> the migration threads make it through the disable-irq
>> step and the one remaining thread gets stuck in
>> __do_softirq. The reason __do_softirq can hang is
>> that it has a bail-out based on jiffies timeout, but
>> in the lockup case, jiffies itself is not incremented.
>>
>> To work around this, re-add the max_restart counter in __do_softirq
>> and stop processing softirqs after 10 restarts.
>>
>> Thanks to Tejun Heo and Rusty Russell and others for
>> helping me track this down.
>>
>> This was introduced in 3.9 by commit: c10d73671ad30f5469
>> (softirq: reduce latencies).
>>
>> It may be worth looking into ath9k to see if it has issues with
>> its irq handler at a later date.
>>
>> The hang stack traces look something like this:
> ...
>> Signed-off-by: Ben Greear <greearb@candelatech.com>
>
> Acked-by: Tejun Heo <tj@kernel.org>
>
> Linus, while this doesn't fix the root cause of the problem - softirq
> runaway - I still think this is a worthwhile protection to have. Ben
> is in the process of finding out why the softirq runaway happens in
> the first place. We probably want to add Cc: stable@vger.kernel.org
> tag.
This patch does not seem to be in mainline yet.
Do I need to do something else, or just be patient?

Thanks,
Ben
>
> Thanks.
>
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
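
For readers following the thread, the failure mode in the quoted patch description can be illustrated with a small userspace simulation. This is only a sketch under stated assumptions, not the kernel code or the patch itself: fake_jiffies, softirq_pending, simulate_do_softirq, SAFETY_CAP and the 2-jiffy time budget are all hypothetical names and values chosen for illustration. Only the limit of 10 restarts comes from the patch description. The simulation models a softirq that is always re-raised while the tick counter never advances, so the time-based bail-out alone never fires.

```c
#include <stdbool.h>

#define MAX_SOFTIRQ_TIME_JIFFIES 2  /* assumed time budget, illustrative only */
#define MAX_SOFTIRQ_RESTART 10      /* restart limit named in the patch description */
#define SAFETY_CAP 1000000L         /* simulation-only guard so the demo terminates */

/* Hypothetical stand-in for jiffies. It is never incremented, which
 * models the lockup case where the timer tick is not running. */
static unsigned long fake_jiffies;

/* Hypothetical: models a runaway softirq that is always pending again. */
static bool softirq_pending(void)
{
    return true;
}

/* Returns the number of passes over the pending softirqs before the loop
 * bails out, or -1 if it would have spun forever (SAFETY_CAP reached). */
static long simulate_do_softirq(bool use_restart_limit)
{
    unsigned long end = fake_jiffies + MAX_SOFTIRQ_TIME_JIFFIES;
    int max_restart = MAX_SOFTIRQ_RESTART;
    long passes = 0;

    for (;;) {
        passes++;                   /* one pass over the pending handlers */

        if (!softirq_pending())
            break;                  /* nothing left to do */

        if (passes >= SAFETY_CAP)
            return -1;              /* would hang without another bail-out */

        /* Time-based bail-out: useless while fake_jiffies stays frozen. */
        if (fake_jiffies >= end)
            break;

        /* Counter-based bail-out: does not depend on time advancing. */
        if (use_restart_limit && --max_restart == 0)
            break;
    }
    return passes;
}
```

Calling simulate_do_softirq(false) reaches the safety cap, which corresponds to the hang described above, while simulate_do_softirq(true) returns after a bounded number of passes even though the simulated clock never moves. The point of a counter here is that it needs no external progress (such as a timer interrupt) in order to terminate the loop.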