From: Lance Yang <lance.yang@linux.dev>
To: Sergey Senozhatsky <senozhatsky@chromium.org>,
"Masami Hiramatsu (Google)" <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>, Will Deacon <will@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Boqun Feng <boqun.feng@gmail.com>,
Waiman Long <longman@redhat.com>,
Joel Granados <joel.granados@kernel.org>,
Anna Schumaker <anna.schumaker@oracle.com>,
Lance Yang <ioworker0@gmail.com>,
Kent Overstreet <kent.overstreet@linux.dev>,
Yongliang Gao <leonylgao@tencent.com>,
Steven Rostedt <rostedt@goodmis.org>,
Tomasz Figa <tfiga@chromium.org>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 1/2] hung_task: Show the blocker task if the task is hung on mutex
Date: Wed, 30 Jul 2025 17:22:59 +0800
Message-ID: <392dc660-b4b0-4a0e-8e70-51e19c62589c@linux.dev>
In-Reply-To: <tfzs3z7yjs6ppobm53hxwjzhhptgq2aqc2obylblz5rk7mdstg@bkas4xcq66xk>
On 2025/7/30 15:59, Sergey Senozhatsky wrote:
> On (25/02/25 16:02), Masami Hiramatsu (Google) wrote:
>> The "hung_task" shows a long-time uninterruptible slept task, but most
>> often, it's blocked on a mutex acquired by another task. Without
>> dumping such a task, investigating the root cause of the hung task
>> problem is very difficult.
>>
>> This introduce task_struct::blocker_mutex to point the mutex lock
>> which this task is waiting for. Since the mutex has "owner"
>> information, we can find the owner task and dump it with hung tasks.
>>
>> Note: the owner can be changed while dumping the owner task, so
>> this is "likely" the owner of the mutex.
>>
>> With this change, the hung task shows blocker task's info like below;
>>
>> INFO: task cat:115 blocked for more than 122 seconds.
>> Not tainted 6.14.0-rc3-00003-ga8946be3de00 #156
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> task:cat state:D stack:13432 pid:115 tgid:115 ppid:106 task_flags:0x400100 flags:0x00000002
>> Call Trace:
>> <TASK>
>> __schedule+0x731/0x960
>> ? schedule_preempt_disabled+0x54/0xa0
>> schedule+0xb7/0x140
>> ? __mutex_lock+0x51b/0xa60
>> ? __mutex_lock+0x51b/0xa60
>> schedule_preempt_disabled+0x54/0xa0
>> __mutex_lock+0x51b/0xa60
>> read_dummy+0x23/0x70
>> full_proxy_read+0x6a/0xc0
>> vfs_read+0xc2/0x340
>> ? __pfx_direct_file_splice_eof+0x10/0x10
>> ? do_sendfile+0x1bd/0x2e0
>> ksys_read+0x76/0xe0
>> do_syscall_64+0xe3/0x1c0
>> ? exc_page_fault+0xa9/0x1d0
>> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>> RIP: 0033:0x4840cd
>> RSP: 002b:00007ffe99071828 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
>> RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
>> RDX: 0000000000001000 RSI: 00007ffe99071870 RDI: 0000000000000003
>> RBP: 00007ffe99071870 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
>> R13: 00000000132fd3a0 R14: 0000000000000001 R15: ffffffffffffffff
>> </TASK>
>> INFO: task cat:115 is blocked on a mutex likely owned by task cat:114.
>> task:cat state:S stack:13432 pid:114 tgid:114 ppid:106 task_flags:0x400100 flags:0x00000002
>> Call Trace:
>> <TASK>
>> __schedule+0x731/0x960
>> ? schedule_timeout+0xa8/0x120
>> schedule+0xb7/0x140
>> schedule_timeout+0xa8/0x120
>> ? __pfx_process_timeout+0x10/0x10
>> msleep_interruptible+0x3e/0x60
>> read_dummy+0x2d/0x70
>> full_proxy_read+0x6a/0xc0
>> vfs_read+0xc2/0x340
>> ? __pfx_direct_file_splice_eof+0x10/0x10
>> ? do_sendfile+0x1bd/0x2e0
>> ksys_read+0x76/0xe0
>> do_syscall_64+0xe3/0x1c0
>> ? exc_page_fault+0xa9/0x1d0
>> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>> RIP: 0033:0x4840cd
>> RSP: 002b:00007ffe3e0147b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
>> RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
>> RDX: 0000000000001000 RSI: 00007ffe3e014800 RDI: 0000000000000003
>> RBP: 00007ffe3e014800 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
>> R13: 000000001a0a93a0 R14: 0000000000000001 R15: ffffffffffffffff
>> </TASK>
>
> One thing that gives me a bit of "inconvenience" is that in certain
> cases this significantly increases the amount of stack traces to go
> through. A distilled real life example:
> - task T1 acquires lock L1, attempts to acquire L2
> - task T2 acquires lock L2, attempts to acquire L3
> - task T3 acquires lock L3, attempts to acquire L1
>
> So we'd now see:
> - a backtrace of T1, followed by a backtrace of T2 (owner of L2)
> - a backtrace of T2, followed by a backtrace of T3 (owner of L3)
> - a backtrace of T3, followed by a backtrace of T1 (owner of L1)
>
> Notice how each task is backtraced twice. I wonder if it's worth it
> to de-dup the backtraces. E.g. in
>
> task cat:115 is blocked on a mutex likely owned by task cat:114
>
> if we know that cat:114 is also blocked on a lock, then we probably
> can just say "is blocked on a mutex likely owned by task cat:114" and
> continue iterating through tasks. That "cat:114" will be backtraced
> individually later, as it's also blocked on a lock, owned by another
> task.
>
> Does this make any sense?
Good spot! There is room for improvement.
In a deadlock chain like T1->T2->T3->T1, by definition, T1, T2, and T3
are all hung tasks, and the detector's primary responsibility is to
generate a report for each of them. The current implementation, when
reporting on one task, also dumps the backtrace of its blocker.
This results in a task's backtrace being printed twice: once as a
blocker and again as a primary hung task.
Regarding the de-duplication idea: it is elegant, but it does introduce
more complexity into the detector. We should also consider that in many
real-world cases, the blocking chain is just one level deep, where this
isn't an issue, IMHO ;)
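For illustration only, here is a minimal userspace sketch of the
de-dup idea, not the hung_task code itself. struct task, the "hung"
flag and all helper names are made up for this example, and it assumes
that a blocker which is itself hung gets its own report in the same
scan:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for task_struct; not the kernel's layout. */
struct task {
	const char *comm;
	int pid;
	int hung;             /* assumption: detector reports this task itself */
	struct task *blocker; /* likely owner of the mutex we wait on, or NULL */
};

static int dumps; /* number of backtraces printed in the current scan */

static void dump_backtrace(const struct task *t)
{
	printf("  <backtrace of %s:%d>\n", t->comm, t->pid);
	dumps++;
}

/*
 * Report every hung task. With dedup set, a blocker that is itself hung
 * is only named, because it is backtraced when the scan reaches it.
 * Returns the number of backtraces printed.
 */
int report_hung_tasks(struct task *tasks, size_t n, int dedup)
{
	assert(tasks != NULL || n == 0);
	dumps = 0;
	for (size_t i = 0; i < n; i++) {
		struct task *t = &tasks[i];

		if (!t->hung)
			continue;
		printf("INFO: task %s:%d blocked\n", t->comm, t->pid);
		dump_backtrace(t);
		if (!t->blocker)
			continue;
		printf("INFO: task %s:%d is blocked on a mutex likely owned by task %s:%d.\n",
		       t->comm, t->pid, t->blocker->comm, t->blocker->pid);
		if (!(dedup && t->blocker->hung))
			dump_backtrace(t->blocker);
	}
	return dumps;
}

/* The T1 -> T2 -> T3 -> T1 cycle from Sergey's example. */
int demo_cycle(int dedup)
{
	struct task t[3] = {
		{ "T1", 1, 1, NULL },
		{ "T2", 2, 1, NULL },
		{ "T3", 3, 1, NULL },
	};

	t[0].blocker = &t[1];
	t[1].blocker = &t[2];
	t[2].blocker = &t[0];
	return report_hung_tasks(t, 3, dedup);
}

/* One-level chain: a hung waiter whose mutex owner is merely sleeping. */
int demo_one_level(int dedup)
{
	struct task t[2] = {
		{ "cat", 115, 1, NULL },
		{ "cat", 114, 0, NULL },
	};

	t[0].blocker = &t[1];
	return report_hung_tasks(t, 2, dedup);
}
```

In this toy model the three-task cycle produces six backtraces without
de-dup and three with it, while the one-level case produces two either
way, which is the case where the extra logic buys nothing.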
Thanks,
Lance