* [PATCH v1] kernel/hung_task.c: Dump all UNINTERRUPTIBLE tasks
@ 2016-08-02 15:23 robert.foss
2016-08-04 13:22 ` Tetsuo Handa
2016-08-10 22:43 ` Andrew Morton
0 siblings, 2 replies; 6+ messages in thread
From: robert.foss @ 2016-08-02 15:23 UTC (permalink / raw)
To: adurbin, penguin-kernel, robert.foss, akpm, linux-kernel
From: Aaron Durbin <adurbin@chromium.org>
When the panic path is taken for khungtaskd, dump all
tasks with the UNINTERRUPTIBLE state. That way, any
inter-dependent tasks that caused one another to hang
will be saved in the crash output.
Signed-off-by: Aaron Durbin <adurbin@chromium.org>
Tested-by: Robert Foss <robert.foss@collabora.com>
Signed-off-by: Robert Foss <robert.foss@collabora.com>
---
kernel/hung_task.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index d234022..946caf9 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -122,6 +122,8 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
 	touch_nmi_watchdog();

 	if (sysctl_hung_task_panic) {
+		/* Dump all tasks. */
+		show_state_filter(TASK_UNINTERRUPTIBLE);
 		trigger_all_cpu_backtrace();
 		panic("hung_task: blocked tasks");
 	}
--
2.7.4
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH v1] kernel/hung_task.c: Dump all UNINTERRUPTIBLE tasks
2016-08-02 15:23 [PATCH v1] kernel/hung_task.c: Dump all UNINTERRUPTIBLE tasks robert.foss
@ 2016-08-04 13:22 ` Tetsuo Handa
2016-08-04 15:29 ` Robert Foss
2016-08-10 22:43 ` Andrew Morton
1 sibling, 1 reply; 6+ messages in thread
From: Tetsuo Handa @ 2016-08-04 13:22 UTC (permalink / raw)
To: robert.foss, adurbin, akpm, linux-kernel
Robert Foss wrote:
> From: Aaron Durbin <adurbin@chromium.org>
>
> When the panic path is taken for khungtaskd, dump all
> tasks with the UNINTERRUPTIBLE state. That way, any
> inter-dependent tasks that caused one another to hang
> will be saved in the crash output.
How useful do you think this change is? If kdump is configured, you
can obtain the same information from the vmcore using the crash utility.
If kdump is not configured, that might be deliberate, in order to reboot
as quickly as possible using panic_timeout < 0.
Also, in my experience, inter-dependent tasks are not always blocked
in the UNINTERRUPTIBLE state. If they are in an AB-BA deadlock, this change
will help only when they are waiting in the unkillable versions of the wait
primitives. I think most simple AB-BA deadlocks have been detected by
lockdep and fixed by now. An example where kswapd is reported as a
hung task ( http://lkml.kernel.org/r/20160211225929.GU14668@dastard ) is
an AB-BA deadlock which lockdep can't detect, and where the inter-dependent
tasks are not blocked for long in the UNINTERRUPTIBLE state.
Also, slide 18 of http://I-love.SAKURA.ne.jp/tomoyo/LCJ2014-en.pdf uses
SystemTap to install a hook for SysRq-t when a hung task is reported.
You will be able to run SysRq-w using it.
I don't NACK this patch. But I think this patch could be conditional (e.g.
show_state_filter(0) if sysctl_hung_task_panic == 2 and
show_state_filter(TASK_UNINTERRUPTIBLE) if sysctl_hung_task_panic == 3)
for environments where SystemTap can't be used.
>
> Signed-off-by: Aaron Durbin <adurbin@chromium.org>
> Tested-by: Robert Foss <robert.foss@collabora.com>
> Signed-off-by: Robert Foss <robert.foss@collabora.com>
> ---
> kernel/hung_task.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
> index d234022..946caf9 100644
> --- a/kernel/hung_task.c
> +++ b/kernel/hung_task.c
> @@ -122,6 +122,8 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
> touch_nmi_watchdog();
>
> if (sysctl_hung_task_panic) {
> + /* Dump all tasks. */
> + show_state_filter(TASK_UNINTERRUPTIBLE);
> trigger_all_cpu_backtrace();
> panic("hung_task: blocked tasks");
> }
> --
> 2.7.4
>
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v1] kernel/hung_task.c: Dump all UNINTERRUPTIBLE tasks
2016-08-04 13:22 ` Tetsuo Handa
@ 2016-08-04 15:29 ` Robert Foss
0 siblings, 0 replies; 6+ messages in thread
From: Robert Foss @ 2016-08-04 15:29 UTC (permalink / raw)
To: Tetsuo Handa, adurbin, akpm, linux-kernel
On 2016-08-04 09:22 AM, Tetsuo Handa wrote:
> Robert Foss wrote:
>> From: Aaron Durbin <adurbin@chromium.org>
>>
>> When the panic path is taken for khungtaskd, dump all
>> tasks with the UNINTERRUPTIBLE state. That way, any
>> inter-dependent tasks that caused one another to hang
>> will be saved in the crash output.
>
> How useful do you think this change is? If kdump is configured, you
> can obtain the same information from the vmcore using the crash utility.
> If kdump is not configured, that might be deliberate, in order to reboot
> as quickly as possible using panic_timeout < 0.
The general idea is to provide more potentially helpful information
since we have access to it.
kdump may or may not be configured on a system, but even without kdump
we have access to this set of information, which we might as well supply
to help figure out the root cause of the hung task.
Having the information available by default simplifies debugging when
interacting with non-technical end users.
>
> Also, in my experience, inter-dependent tasks are not always blocked
> in the UNINTERRUPTIBLE state. If they are in an AB-BA deadlock, this change
> will help only when they are waiting in the unkillable versions of the wait
> primitives. I think most simple AB-BA deadlocks have been detected by
> lockdep and fixed by now. An example where kswapd is reported as a
> hung task ( http://lkml.kernel.org/r/20160211225929.GU14668@dastard ) is
> an AB-BA deadlock which lockdep can't detect, and where the inter-dependent
> tasks are not blocked for long in the UNINTERRUPTIBLE state.
If a task is blocked in a state other than UNINTERRUPTIBLE, this patch
won't do anything. But the fact that this logging does not catch every
hang caused by inter-dependent tasks does not mean that the patch won't
be helpful in some scenarios.
>
> Also, slide 18 of http://I-love.SAKURA.ne.jp/tomoyo/LCJ2014-en.pdf uses
> SystemTap to install a hook for SysRq-t when a hung task is reported.
> You will be able to run SysRq-w using it.
>
> I don't NACK this patch. But I think this patch could be conditional (e.g.
> show_state_filter(0) if sysctl_hung_task_panic == 2 and
> show_state_filter(TASK_UNINTERRUPTIBLE) if sysctl_hung_task_panic == 3)
> for environments where SystemTap can't be used.
That sounds like a good alternative to me. If the snippet below looks
good to you, I'll submit it as v2.
 	if (sysctl_hung_task_panic) {
+		if (sysctl_hung_task_panic == 2)
+			show_state_filter(0);
+		else if (sysctl_hung_task_panic == 3)
+			show_state_filter(TASK_UNINTERRUPTIBLE);
+
 		trigger_all_cpu_backtrace();
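
The selection logic proposed above can be modelled outside the kernel
roughly as follows. This is only a userspace sketch, not kernel code:
hung_task_dump_filter() and task_matches_filter() are hypothetical helper
names that appear nowhere in the patch, the two state constants are
redefined locally for illustration, and the assumption that a filter of 0
selects every task is based on show_state() being a wrapper around
show_state_filter(0).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Task state bits, redefined locally so the sketch builds in userspace. */
#define TASK_INTERRUPTIBLE   1UL
#define TASK_UNINTERRUPTIBLE 2UL

/*
 * Hypothetical helper (not in the patch): map a sysctl_hung_task_panic
 * value to the mask that would be handed to show_state_filter().
 * Returns false when no task dump is requested for that value.
 */
static bool hung_task_dump_filter(int panic_level, unsigned long *filter)
{
	assert(filter != NULL);

	switch (panic_level) {
	case 2:
		*filter = 0;                    /* assumed: 0 means all tasks */
		return true;
	case 3:
		*filter = TASK_UNINTERRUPTIBLE; /* blocked tasks only */
		return true;
	default:
		return false;                   /* e.g. 1: panic without a task dump */
	}
}

/* Hypothetical model of the per-task check: a zero mask matches everything. */
static bool task_matches_filter(unsigned long task_state, unsigned long filter)
{
	return filter == 0 || (task_state & filter) != 0;
}
```

With this model, a value of 2 would dump every task (comparable to
SysRq-t), a value of 3 would dump only tasks whose state includes
TASK_UNINTERRUPTIBLE (comparable to SysRq-w), and a value of 1 keeps the
existing behaviour of panicking without the extra task dump.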
>
>>
>> Signed-off-by: Aaron Durbin <adurbin@chromium.org>
>> Tested-by: Robert Foss <robert.foss@collabora.com>
>> Signed-off-by: Robert Foss <robert.foss@collabora.com>
>> ---
>> kernel/hung_task.c | 2 ++
>> 1 file changed, 2 insertions(+)
>>
>> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
>> index d234022..946caf9 100644
>> --- a/kernel/hung_task.c
>> +++ b/kernel/hung_task.c
>> @@ -122,6 +122,8 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
>> touch_nmi_watchdog();
>>
>> if (sysctl_hung_task_panic) {
>> + /* Dump all tasks. */
>> + show_state_filter(TASK_UNINTERRUPTIBLE);
>> trigger_all_cpu_backtrace();
>> panic("hung_task: blocked tasks");
>> }
>> --
>> 2.7.4
>>
>>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v1] kernel/hung_task.c: Dump all UNINTERRUPTIBLE tasks
2016-08-02 15:23 [PATCH v1] kernel/hung_task.c: Dump all UNINTERRUPTIBLE tasks robert.foss
2016-08-04 13:22 ` Tetsuo Handa
@ 2016-08-10 22:43 ` Andrew Morton
2016-08-11 16:35 ` Robert Foss
1 sibling, 1 reply; 6+ messages in thread
From: Andrew Morton @ 2016-08-10 22:43 UTC (permalink / raw)
To: robert.foss; +Cc: adurbin, penguin-kernel, linux-kernel
On Tue, 2 Aug 2016 11:23:11 -0400 robert.foss@collabora.com wrote:
> From: Aaron Durbin <adurbin@chromium.org>
>
> When the panic path is taken for khungtaskd, dump all
> tasks with the UNINTERRUPTIBLE state. That way, any
> inter-dependent tasks that caused one another to hang
> will be saved in the crash output.
>
> ...
>
> --- a/kernel/hung_task.c
> +++ b/kernel/hung_task.c
> @@ -122,6 +122,8 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
> touch_nmi_watchdog();
>
> if (sysctl_hung_task_panic) {
> + /* Dump all tasks. */
> + show_state_filter(TASK_UNINTERRUPTIBLE);
> trigger_all_cpu_backtrace();
> panic("hung_task: blocked tasks");
> }
Well, it's going to produce more gunk for the operator to read through
and understand.
I'd like to hear a little more about the value of this change: what
particular problem prompted it, etc.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v1] kernel/hung_task.c: Dump all UNINTERRUPTIBLE tasks
2016-08-10 22:43 ` Andrew Morton
@ 2016-08-11 16:35 ` Robert Foss
2016-08-29 19:04 ` Robert Foss
0 siblings, 1 reply; 6+ messages in thread
From: Robert Foss @ 2016-08-11 16:35 UTC (permalink / raw)
To: Andrew Morton; +Cc: adurbin, penguin-kernel, linux-kernel
On 2016-08-10 06:43 PM, Andrew Morton wrote:
> On Tue, 2 Aug 2016 11:23:11 -0400 robert.foss@collabora.com wrote:
>
>> From: Aaron Durbin <adurbin@chromium.org>
>>
>> When the panic path is taken for khungtaskd, dump all
>> tasks with the UNINTERRUPTIBLE state. That way, any
>> inter-dependent tasks that caused one another to hang
>> will be saved in the crash output.
>>
>> ...
>>
>> --- a/kernel/hung_task.c
>> +++ b/kernel/hung_task.c
>> @@ -122,6 +122,8 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
>> touch_nmi_watchdog();
>>
>> if (sysctl_hung_task_panic) {
>> + /* Dump all tasks. */
>> + show_state_filter(TASK_UNINTERRUPTIBLE);
>> trigger_all_cpu_backtrace();
>> panic("hung_task: blocked tasks");
>> }
>
> Well, it's going to produce more gunk for the operator to read through
> and understand.
>
> I'd like to hear a little more about the value of this change: what
> particular problem prompted it, etc.
>
It would indeed provide more gunk. What makes it useful is that it is
enabled by default and enables rapid debugging of devices that are not
physically accessible or otherwise accessible for debugging.
So the primary use case would be when a user of a device is seeing some
issues and submits the logs from the device.
Without any further action from the user, the problem could potentially
be solved.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v1] kernel/hung_task.c: Dump all UNINTERRUPTIBLE tasks
2016-08-11 16:35 ` Robert Foss
@ 2016-08-29 19:04 ` Robert Foss
0 siblings, 0 replies; 6+ messages in thread
From: Robert Foss @ 2016-08-29 19:04 UTC (permalink / raw)
To: Andrew Morton; +Cc: adurbin, penguin-kernel, linux-kernel
On 2016-08-11 12:35 PM, Robert Foss wrote:
>
>
> On 2016-08-10 06:43 PM, Andrew Morton wrote:
>> On Tue, 2 Aug 2016 11:23:11 -0400 robert.foss@collabora.com wrote:
>>
>>> From: Aaron Durbin <adurbin@chromium.org>
>>>
>>> When the panic path is taken for khungtaskd, dump all
>>> tasks with the UNINTERRUPTIBLE state. That way, any
>>> inter-dependent tasks that caused one another to hang
>>> will be saved in the crash output.
>>>
>>> ...
>>>
>>> --- a/kernel/hung_task.c
>>> +++ b/kernel/hung_task.c
>>> @@ -122,6 +122,8 @@ static void check_hung_task(struct task_struct
>>> *t, unsigned long timeout)
>>> touch_nmi_watchdog();
>>>
>>> if (sysctl_hung_task_panic) {
>>> + /* Dump all tasks. */
>>> + show_state_filter(TASK_UNINTERRUPTIBLE);
>>> trigger_all_cpu_backtrace();
>>> panic("hung_task: blocked tasks");
>>> }
>>
>> Well, it's going to produce more gunk for the operator to read through
>> and understand.
>>
>> I'd like to hear a little more about the value of this change: what
>> particular problem prompted it, etc.
>>
>
> It would indeed provide more gunk. What makes it useful is that it is
> enabled by default and enables rapid debugging of devices that are not
> physically accessible or otherwise accessible for debugging.
>
> So the primary use case would be when a user of a device is seeing some
> issues and submits the logs from the device.
> Without any further action from the user, the problem could potentially
> be solved.
The debug output could be formatted better. Would that make this patch
more appealing?
^ permalink raw reply [flat|nested] 6+ messages in thread