From: Waiman Long <llong@redhat.com>
To: Steven Rostedt <rostedt@goodmis.org>,
"Masami Hiramatsu (Google)" <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>, Will Deacon <will@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Boqun Feng <boqun.feng@gmail.com>,
Joel Granados <joel.granados@kernel.org>,
Anna Schumaker <anna.schumaker@oracle.com>,
Lance Yang <ioworker0@gmail.com>,
Kent Overstreet <kent.overstreet@linux.dev>,
Yongliang Gao <leonylgao@tencent.com>,
Tomasz Figa <tfiga@chromium.org>,
Sergey Senozhatsky <senozhatsky@chromium.org>,
linux-kernel@vger.kernel.org,
Linux Memory Management List <linux-mm@kvack.org>
Subject: Re: [PATCH 1/2] hung_task: Show the blocker task if the task is hung on mutex
Date: Wed, 19 Feb 2025 15:18:57 -0500
Message-ID: <0fa9dd8e-2d83-487e-bfb1-1f5d20cd9fe6@redhat.com>
In-Reply-To: <20250219112308.5d905680@gandalf.local.home>

On 2/19/25 11:23 AM, Steven Rostedt wrote:
> On Wed, 19 Feb 2025 22:00:49 +0900
> "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:
>
>> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
>>
>> The "hung_task" shows a long-time uninterruptible slept task, but most
>> often, it's blocked on a mutex acquired by another task. Without
>> dumping such a task, investigating the root cause of the hung task
>> problem is very difficult.
>>
>> Fortunately CONFIG_DEBUG_MUTEXES=y allows us to identify the mutex
>> blocking the task. And the mutex has "owner" information, which can
>> be used to find the owner task and dump it with hung tasks.
>>
>> With this change, the hung task shows blocker task's info like below;
>>
> We've hit bugs like this in the field a few times, and it was very
> difficult to debug. Something like this would have made our lives much
> easier!

I agree that it will be a useful feature.

>> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
>> ---
>> kernel/hung_task.c | 38 ++++++++++++++++++++++++++++++++++++++
>> kernel/locking/mutex-debug.c | 1 +
>> kernel/locking/mutex.c | 9 +++++++++
>> kernel/locking/mutex.h | 6 ++++++
>> 4 files changed, 54 insertions(+)
>>
>> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
>> index 04efa7a6e69b..d1ce69504090 100644
>> --- a/kernel/hung_task.c
>> +++ b/kernel/hung_task.c
>> @@ -25,6 +25,8 @@
>>
>> #include <trace/events/sched.h>
>>
>> +#include "locking/mutex.h"
>> +
>> /*
>> * The number of tasks checked:
>> */
>> @@ -93,6 +95,41 @@ static struct notifier_block panic_block = {
>> .notifier_call = hung_task_panic,
>> };
>>
>> +
>> +#ifdef CONFIG_DEBUG_MUTEXES
>> +static void debug_show_blocker(struct task_struct *task)
>> +{
>> +	struct task_struct *g, *t;
>> +	unsigned long owner;
>> +	struct mutex *lock;
>> +
>> +	if (!task->blocked_on)
>> +		return;
>> +
>> +	lock = task->blocked_on->mutex;
> This is a catch 22. To look at the task's blocked_on, we need the
> lock->wait_lock held, otherwise this could be an issue. But to get that
> lock, we need to look at the task's blocked_on field! As this can race.
>
> Another thing is that the waiter is on the task's stack. Perhaps we need to
> move this into sched/core.c and be able to lock the task's rq. Because even
> something like:
>
> waiter = READ_ONCE(task->blocked_on);
>
> May be garbage if the task were to suddenly wake up and run.
>
> Now if we were able to lock the task's rq, which would prevent it from
> being woken up, then the blocked_on field would not be at risk of being
> corrupted.

The mutex_waiter structure is allocated on the waiting task's stack, so
accessing it from another task is tricky. One way to work around this
issue is to add a new blocked_on_mutex field to task_struct that points
directly to the relevant mutex. Yes, that increases the size of
task_struct by 8 bytes, but it is a pretty large structure anyway. If
this field is accessed with READ_ONCE()/WRITE_ONCE(), we don't need to
take a lock to read it, though taking the wait_lock may still be needed
to examine other information inside the mutex.
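
To make the idea concrete, here is a minimal userspace sketch, not
kernel code. The struct names, the three helpers, and the simplified
READ_ONCE()/WRITE_ONCE() stand-ins are all assumptions made up for
illustration; only the blocked_on_mutex field name comes from the
proposal above.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified userspace stand-ins for the kernel macros (assumption:
 * a plain volatile access; relies on the GCC/Clang __typeof__ extension).
 */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical models of struct mutex and struct task_struct. */
struct mutex_model {
	const char *name;
};

struct task_model {
	const char *comm;
	/* Proposed field: the mutex this task is currently blocked on. */
	struct mutex_model *blocked_on_mutex;
};

/* Waiter side: publish the mutex before going to sleep on it. */
static void task_set_blocked_on(struct task_model *t, struct mutex_model *m)
{
	assert(m != NULL);
	WRITE_ONCE(t->blocked_on_mutex, m);
}

/* Waiter side: clear the field once the mutex has been acquired. */
static void task_clear_blocked_on(struct task_model *t)
{
	WRITE_ONCE(t->blocked_on_mutex, NULL);
}

/*
 * Watchdog side: take a single snapshot of the pointer without any lock.
 * The result may be NULL or already stale by the time it is used.
 */
static struct mutex_model *task_blocked_on(struct task_model *t)
{
	return READ_ONCE(t->blocked_on_mutex);
}
```

The point of the sketch is that the hung task detector would read a
pointer stored in the task itself rather than chasing a mutex_waiter on
another task's stack. It says nothing about whether the mutex is still
valid when the snapshot is used, which is where the wait_lock may still
come in.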

Cheers,
Longman