[PATCH v4 0/2] hung_task: Dump the blocking task stacktrace

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH v4 0/2] hung_task: Dump the blocking task stacktrace
@ 2025-02-25  7:02 Masami Hiramatsu (Google)
  2025-02-25  7:02 ` [PATCH v4 1/2] hung_task: Show the blocker task if the task is hung on mutex Masami Hiramatsu (Google)
  2025-02-25  7:02 ` [PATCH v4 2/2] samples: Add hung_task detector mutex blocking sample Masami Hiramatsu (Google)
  0 siblings, 2 replies; 30+ messages in thread
From: Masami Hiramatsu (Google) @ 2025-02-25  7:02 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon, Andrew Morton
  Cc: Boqun Feng, Waiman Long, Joel Granados, Masami Hiramatsu,
	Anna Schumaker, Lance Yang, Kent Overstreet, Yongliang Gao,
	Steven Rostedt, Tomasz Figa, Sergey Senozhatsky, linux-kernel

Hi,

Here is the 4th version of the dumping mutex blocker in hung_task
message. The previous version is here;

https://lore.kernel.org/all/174018982058.2766225.1721562132740498299.stgit@mhiramat.tok.corp.google.com/

This version makes CONFIG_DETECT_HUNG_TASK_BLOCKER depending
on !CONFIG_PREEMPT_RT.

The hung_task detector is very useful for detecting the lockup.
However, since it only dumps the blocked (uninterruptible sleep)
processes, it is not enough to identify the root cause of that
lockup.

For example, if a process holds a mutex and sleep an event in
interruptible state long time, the other processes will wait on
the mutex in uninterruptible state. In this case, the waiter
processes are dumped, but the blocker process is not shown
because it is sleep in interruptible state.

This adds a feature to dump the blocker task which holds a mutex
when detecting a hung task. e.g.

 INFO: task cat:115 blocked for more than 122 seconds.
       Not tainted 6.14.0-rc3-00003-ga8946be3de00 #156
 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 task:cat             state:D stack:13432 pid:115   tgid:115   ppid:106    task_flags:0x400100 flags:0x00000002
 Call Trace:
  <TASK>
  __schedule+0x731/0x960
  ? schedule_preempt_disabled+0x54/0xa0
  schedule+0xb7/0x140
  ? __mutex_lock+0x51b/0xa60
  ? __mutex_lock+0x51b/0xa60
  schedule_preempt_disabled+0x54/0xa0
  __mutex_lock+0x51b/0xa60
  read_dummy+0x23/0x70
  full_proxy_read+0x6a/0xc0
  vfs_read+0xc2/0x340
  ? __pfx_direct_file_splice_eof+0x10/0x10
  ? do_sendfile+0x1bd/0x2e0
  ksys_read+0x76/0xe0
  do_syscall_64+0xe3/0x1c0
  ? exc_page_fault+0xa9/0x1d0
  entry_SYSCALL_64_after_hwframe+0x77/0x7f
 RIP: 0033:0x4840cd
 RSP: 002b:00007ffe99071828 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
 RDX: 0000000000001000 RSI: 00007ffe99071870 RDI: 0000000000000003
 RBP: 00007ffe99071870 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
 R13: 00000000132fd3a0 R14: 0000000000000001 R15: ffffffffffffffff
  </TASK>
 INFO: task cat:115 is blocked on a mutex likely owned by task cat:114.
 task:cat             state:S stack:13432 pid:114   tgid:114   ppid:106    task_flags:0x400100 flags:0x00000002
 Call Trace:
  <TASK>
  __schedule+0x731/0x960
  ? schedule_timeout+0xa8/0x120
  schedule+0xb7/0x140
  schedule_timeout+0xa8/0x120
  ? __pfx_process_timeout+0x10/0x10
  msleep_interruptible+0x3e/0x60
  read_dummy+0x2d/0x70
  full_proxy_read+0x6a/0xc0
  vfs_read+0xc2/0x340
  ? __pfx_direct_file_splice_eof+0x10/0x10
  ? do_sendfile+0x1bd/0x2e0
  ksys_read+0x76/0xe0
  do_syscall_64+0xe3/0x1c0
  ? exc_page_fault+0xa9/0x1d0
  entry_SYSCALL_64_after_hwframe+0x77/0x7f
 RIP: 0033:0x4840cd
 RSP: 002b:00007ffe3e0147b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
 RDX: 0000000000001000 RSI: 00007ffe3e014800 RDI: 0000000000000003
 RBP: 00007ffe3e014800 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
 R13: 000000001a0a93a0 R14: 0000000000000001 R15: ffffffffffffffff
  </TASK>

TBD:
We can extend this feature to cover other locks like rwsem and rt_mutex,
but rwsem requires to dump all the tasks which acquire and wait that
rwsem. We can follow the waiter link but the output will be a bit
different compared with mutex case.

Thank you,

---

Masami Hiramatsu (Google) (2):
      hung_task: Show the blocker task if the task is hung on mutex
      samples: Add hung_task detector mutex blocking sample


 include/linux/mutex.h               |    2 +
 include/linux/sched.h               |    4 ++
 kernel/hung_task.c                  |   36 +++++++++++++++++++
 kernel/locking/mutex.c              |   14 +++++++
 lib/Kconfig.debug                   |   11 ++++++
 samples/Kconfig                     |    9 +++++
 samples/Makefile                    |    1 +
 samples/hung_task/Makefile          |    2 +
 samples/hung_task/hung_task_mutex.c |   66 +++++++++++++++++++++++++++++++++++
 9 files changed, 145 insertions(+)
 create mode 100644 samples/hung_task/Makefile
 create mode 100644 samples/hung_task/hung_task_mutex.c

--
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v4 1/2] hung_task: Show the blocker task if the task is hung on mutex
  2025-02-25  7:02 [PATCH v4 0/2] hung_task: Dump the blocking task stacktrace Masami Hiramatsu (Google)
@ 2025-02-25  7:02 ` Masami Hiramatsu (Google)
  2025-02-26  1:23   ` Waiman Long
                     ` (4 more replies)
  2025-02-25  7:02 ` [PATCH v4 2/2] samples: Add hung_task detector mutex blocking sample Masami Hiramatsu (Google)
  1 sibling, 5 replies; 30+ messages in thread
From: Masami Hiramatsu (Google) @ 2025-02-25  7:02 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon, Andrew Morton
  Cc: Boqun Feng, Waiman Long, Joel Granados, Masami Hiramatsu,
	Anna Schumaker, Lance Yang, Kent Overstreet, Yongliang Gao,
	Steven Rostedt, Tomasz Figa, Sergey Senozhatsky, linux-kernel

From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

The "hung_task" shows a long-time uninterruptible slept task, but most
often, it's blocked on a mutex acquired by another task. Without
dumping such a task, investigating the root cause of the hung task
problem is very difficult.

This introduce task_struct::blocker_mutex to point the mutex lock
which this task is waiting for. Since the mutex has "owner"
information, we can find the owner task and dump it with hung tasks.

Note: the owner can be changed while dumping the owner task, so
this is "likely" the owner of the mutex.

With this change, the hung task shows blocker task's info like below;

 INFO: task cat:115 blocked for more than 122 seconds.
       Not tainted 6.14.0-rc3-00003-ga8946be3de00 #156
 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 task:cat             state:D stack:13432 pid:115   tgid:115   ppid:106    task_flags:0x400100 flags:0x00000002
 Call Trace:
  <TASK>
  __schedule+0x731/0x960
  ? schedule_preempt_disabled+0x54/0xa0
  schedule+0xb7/0x140
  ? __mutex_lock+0x51b/0xa60
  ? __mutex_lock+0x51b/0xa60
  schedule_preempt_disabled+0x54/0xa0
  __mutex_lock+0x51b/0xa60
  read_dummy+0x23/0x70
  full_proxy_read+0x6a/0xc0
  vfs_read+0xc2/0x340
  ? __pfx_direct_file_splice_eof+0x10/0x10
  ? do_sendfile+0x1bd/0x2e0
  ksys_read+0x76/0xe0
  do_syscall_64+0xe3/0x1c0
  ? exc_page_fault+0xa9/0x1d0
  entry_SYSCALL_64_after_hwframe+0x77/0x7f
 RIP: 0033:0x4840cd
 RSP: 002b:00007ffe99071828 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
 RDX: 0000000000001000 RSI: 00007ffe99071870 RDI: 0000000000000003
 RBP: 00007ffe99071870 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
 R13: 00000000132fd3a0 R14: 0000000000000001 R15: ffffffffffffffff
  </TASK>
 INFO: task cat:115 is blocked on a mutex likely owned by task cat:114.
 task:cat             state:S stack:13432 pid:114   tgid:114   ppid:106    task_flags:0x400100 flags:0x00000002
 Call Trace:
  <TASK>
  __schedule+0x731/0x960
  ? schedule_timeout+0xa8/0x120
  schedule+0xb7/0x140
  schedule_timeout+0xa8/0x120
  ? __pfx_process_timeout+0x10/0x10
  msleep_interruptible+0x3e/0x60
  read_dummy+0x2d/0x70
  full_proxy_read+0x6a/0xc0
  vfs_read+0xc2/0x340
  ? __pfx_direct_file_splice_eof+0x10/0x10
  ? do_sendfile+0x1bd/0x2e0
  ksys_read+0x76/0xe0
  do_syscall_64+0xe3/0x1c0
  ? exc_page_fault+0xa9/0x1d0
  entry_SYSCALL_64_after_hwframe+0x77/0x7f
 RIP: 0033:0x4840cd
 RSP: 002b:00007ffe3e0147b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
 RDX: 0000000000001000 RSI: 00007ffe3e014800 RDI: 0000000000000003
 RBP: 00007ffe3e014800 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
 R13: 000000001a0a93a0 R14: 0000000000000001 R15: ffffffffffffffff
  </TASK>

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
 Changes in v4:
  - Make this option depends on !PREEMPT_RT because it changes mutex to
    rt_mutex.
---
 include/linux/mutex.h  |    2 ++
 include/linux/sched.h  |    4 ++++
 kernel/hung_task.c     |   36 ++++++++++++++++++++++++++++++++++++
 kernel/locking/mutex.c |   14 ++++++++++++++
 lib/Kconfig.debug      |   11 +++++++++++
 5 files changed, 67 insertions(+)

diff --git a/include/linux/mutex.h b/include/linux/mutex.h
index 2bf91b57591b..2143d05116be 100644
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -202,4 +202,6 @@ DEFINE_GUARD(mutex, struct mutex *, mutex_lock(_T), mutex_unlock(_T))
 DEFINE_GUARD_COND(mutex, _try, mutex_trylock(_T))
 DEFINE_GUARD_COND(mutex, _intr, mutex_lock_interruptible(_T) == 0)
 
+extern unsigned long mutex_get_owner(struct mutex *lock);
+
 #endif /* __LINUX_MUTEX_H */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 9632e3318e0d..0cebdd736d44 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1217,6 +1217,10 @@ struct task_struct {
 	struct mutex_waiter		*blocked_on;
 #endif
 
+#ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
+	struct mutex			*blocker_mutex;
+#endif
+
 #ifdef CONFIG_DEBUG_ATOMIC_SLEEP
 	int				non_block_count;
 #endif
diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index 04efa7a6e69b..ccd7217fcec1 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -93,6 +93,41 @@ static struct notifier_block panic_block = {
 	.notifier_call = hung_task_panic,
 };
 
+
+#ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
+static void debug_show_blocker(struct task_struct *task)
+{
+	struct task_struct *g, *t;
+	unsigned long owner;
+	struct mutex *lock;
+
+	RCU_LOCKDEP_WARN(!rcu_read_lock_held(), "No rcu lock held");
+
+	lock = READ_ONCE(task->blocker_mutex);
+	if (!lock)
+		return;
+
+	owner = mutex_get_owner(lock);
+	if (unlikely(!owner)) {
+		pr_err("INFO: task %s:%d is blocked on a mutex, but the owner is not found.\n",
+			task->comm, task->pid);
+		return;
+	}
+
+	/* Ensure the owner information is correct. */
+	for_each_process_thread(g, t) {
+		if ((unsigned long)t == owner) {
+			pr_err("INFO: task %s:%d is blocked on a mutex likely owned by task %s:%d.\n",
+				task->comm, task->pid, t->comm, t->pid);
+			sched_show_task(t);
+			return;
+		}
+	}
+}
+#else
+#define debug_show_blocker(t)	do {} while (0)
+#endif
+
 static void check_hung_task(struct task_struct *t, unsigned long timeout)
 {
 	unsigned long switch_count = t->nvcsw + t->nivcsw;
@@ -152,6 +187,7 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
 		pr_err("\"echo 0 > /proc/sys/kernel/hung_task_timeout_secs\""
 			" disables this message.\n");
 		sched_show_task(t);
+		debug_show_blocker(t);
 		hung_task_show_lock = true;
 
 		if (sysctl_hung_task_all_cpu_backtrace)
diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index b36f23de48f1..6a543c204a14 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -72,6 +72,14 @@ static inline unsigned long __owner_flags(unsigned long owner)
 	return owner & MUTEX_FLAGS;
 }
 
+/* Do not use the return value as a pointer directly. */
+unsigned long mutex_get_owner(struct mutex *lock)
+{
+	unsigned long owner = atomic_long_read(&lock->owner);
+
+	return (unsigned long)__owner_task(owner);
+}
+
 /*
  * Returns: __mutex_owner(lock) on failure or NULL on success.
  */
@@ -180,6 +188,9 @@ static void
 __mutex_add_waiter(struct mutex *lock, struct mutex_waiter *waiter,
 		   struct list_head *list)
 {
+#ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
+	WRITE_ONCE(current->blocker_mutex, lock);
+#endif
 	debug_mutex_add_waiter(lock, waiter, current);
 
 	list_add_tail(&waiter->list, list);
@@ -195,6 +206,9 @@ __mutex_remove_waiter(struct mutex *lock, struct mutex_waiter *waiter)
 		__mutex_clear_flag(lock, MUTEX_FLAGS);
 
 	debug_mutex_remove_waiter(lock, waiter, current);
+#ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
+	WRITE_ONCE(current->blocker_mutex, NULL);
+#endif
 }
 
 /*
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 1af972a92d06..77d8c7e5ce96 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1260,6 +1260,17 @@ config BOOTPARAM_HUNG_TASK_PANIC
 
 	  Say N if unsure.
 
+config DETECT_HUNG_TASK_BLOCKER
+	bool "Dump Hung Tasks Blocker"
+	depends on DETECT_HUNG_TASK
+	depends on !PREEMPT_RT
+	default y
+	help
+	  Say Y here to show the blocker task's stacktrace who acquires
+	  the mutex lock which "hung tasks" are waiting.
+	  This will add overhead a bit but shows suspicious tasks and
+	  call trace if it comes from waiting a mutex.
+
 config WQ_WATCHDOG
 	bool "Detect Workqueue Stalls"
 	depends on DEBUG_KERNEL


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 1/2] hung_task: Show the blocker task if the task is hung on mutex
  2025-02-25  7:02 ` [PATCH v4 1/2] hung_task: Show the blocker task if the task is hung on mutex Masami Hiramatsu (Google)
@ 2025-02-26  1:23   ` Waiman Long
  2025-03-06  2:32     ` Masami Hiramatsu
  2025-02-26  1:44   ` Lance Yang
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 30+ messages in thread
From: Waiman Long @ 2025-02-26  1:23 UTC (permalink / raw)
  To: Masami Hiramatsu (Google), Peter Zijlstra, Ingo Molnar,
	Will Deacon, Andrew Morton
  Cc: Boqun Feng, Joel Granados, Anna Schumaker, Lance Yang,
	Kent Overstreet, Yongliang Gao, Steven Rostedt, Tomasz Figa,
	Sergey Senozhatsky, linux-kernel

On 2/25/25 2:02 AM, Masami Hiramatsu (Google) wrote:
> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
>
> The "hung_task" shows a long-time uninterruptible slept task, but most
> often, it's blocked on a mutex acquired by another task. Without
> dumping such a task, investigating the root cause of the hung task
> problem is very difficult.
>
> This introduce task_struct::blocker_mutex to point the mutex lock
> which this task is waiting for. Since the mutex has "owner"
> information, we can find the owner task and dump it with hung tasks.
>
> Note: the owner can be changed while dumping the owner task, so
> this is "likely" the owner of the mutex.
>
> With this change, the hung task shows blocker task's info like below;
>
>   INFO: task cat:115 blocked for more than 122 seconds.
>         Not tainted 6.14.0-rc3-00003-ga8946be3de00 #156
>   "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>   task:cat             state:D stack:13432 pid:115   tgid:115   ppid:106    task_flags:0x400100 flags:0x00000002
>   Call Trace:
>    <TASK>
>    __schedule+0x731/0x960
>    ? schedule_preempt_disabled+0x54/0xa0
>    schedule+0xb7/0x140
>    ? __mutex_lock+0x51b/0xa60
>    ? __mutex_lock+0x51b/0xa60
>    schedule_preempt_disabled+0x54/0xa0
>    __mutex_lock+0x51b/0xa60
>    read_dummy+0x23/0x70
>    full_proxy_read+0x6a/0xc0
>    vfs_read+0xc2/0x340
>    ? __pfx_direct_file_splice_eof+0x10/0x10
>    ? do_sendfile+0x1bd/0x2e0
>    ksys_read+0x76/0xe0
>    do_syscall_64+0xe3/0x1c0
>    ? exc_page_fault+0xa9/0x1d0
>    entry_SYSCALL_64_after_hwframe+0x77/0x7f
>   RIP: 0033:0x4840cd
>   RSP: 002b:00007ffe99071828 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
>   RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
>   RDX: 0000000000001000 RSI: 00007ffe99071870 RDI: 0000000000000003
>   RBP: 00007ffe99071870 R08: 0000000000000000 R09: 0000000000000000
>   R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
>   R13: 00000000132fd3a0 R14: 0000000000000001 R15: ffffffffffffffff
>    </TASK>
>   INFO: task cat:115 is blocked on a mutex likely owned by task cat:114.
>   task:cat             state:S stack:13432 pid:114   tgid:114   ppid:106    task_flags:0x400100 flags:0x00000002
>   Call Trace:
>    <TASK>
>    __schedule+0x731/0x960
>    ? schedule_timeout+0xa8/0x120
>    schedule+0xb7/0x140
>    schedule_timeout+0xa8/0x120
>    ? __pfx_process_timeout+0x10/0x10
>    msleep_interruptible+0x3e/0x60
>    read_dummy+0x2d/0x70
>    full_proxy_read+0x6a/0xc0
>    vfs_read+0xc2/0x340
>    ? __pfx_direct_file_splice_eof+0x10/0x10
>    ? do_sendfile+0x1bd/0x2e0
>    ksys_read+0x76/0xe0
>    do_syscall_64+0xe3/0x1c0
>    ? exc_page_fault+0xa9/0x1d0
>    entry_SYSCALL_64_after_hwframe+0x77/0x7f
>   RIP: 0033:0x4840cd
>   RSP: 002b:00007ffe3e0147b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
>   RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
>   RDX: 0000000000001000 RSI: 00007ffe3e014800 RDI: 0000000000000003
>   RBP: 00007ffe3e014800 R08: 0000000000000000 R09: 0000000000000000
>   R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
>   R13: 000000001a0a93a0 R14: 0000000000000001 R15: ffffffffffffffff
>    </TASK>
>
> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> ---
>   Changes in v4:
>    - Make this option depends on !PREEMPT_RT because it changes mutex to
>      rt_mutex.
> ---
>   include/linux/mutex.h  |    2 ++
>   include/linux/sched.h  |    4 ++++
>   kernel/hung_task.c     |   36 ++++++++++++++++++++++++++++++++++++
>   kernel/locking/mutex.c |   14 ++++++++++++++
>   lib/Kconfig.debug      |   11 +++++++++++
>   5 files changed, 67 insertions(+)
>
> diff --git a/include/linux/mutex.h b/include/linux/mutex.h
> index 2bf91b57591b..2143d05116be 100644
> --- a/include/linux/mutex.h
> +++ b/include/linux/mutex.h
> @@ -202,4 +202,6 @@ DEFINE_GUARD(mutex, struct mutex *, mutex_lock(_T), mutex_unlock(_T))
>   DEFINE_GUARD_COND(mutex, _try, mutex_trylock(_T))
>   DEFINE_GUARD_COND(mutex, _intr, mutex_lock_interruptible(_T) == 0)
>   
> +extern unsigned long mutex_get_owner(struct mutex *lock);
> +
>   #endif /* __LINUX_MUTEX_H */
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 9632e3318e0d..0cebdd736d44 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1217,6 +1217,10 @@ struct task_struct {
>   	struct mutex_waiter		*blocked_on;
>   #endif
>   
> +#ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
> +	struct mutex			*blocker_mutex;
> +#endif
> +
>   #ifdef CONFIG_DEBUG_ATOMIC_SLEEP
>   	int				non_block_count;
>   #endif
> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
> index 04efa7a6e69b..ccd7217fcec1 100644
> --- a/kernel/hung_task.c
> +++ b/kernel/hung_task.c
> @@ -93,6 +93,41 @@ static struct notifier_block panic_block = {
>   	.notifier_call = hung_task_panic,
>   };
>   
> +
> +#ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
> +static void debug_show_blocker(struct task_struct *task)
> +{
> +	struct task_struct *g, *t;
> +	unsigned long owner;
> +	struct mutex *lock;
> +
> +	RCU_LOCKDEP_WARN(!rcu_read_lock_held(), "No rcu lock held");
> +
> +	lock = READ_ONCE(task->blocker_mutex);
> +	if (!lock)
> +		return;
> +
> +	owner = mutex_get_owner(lock);
> +	if (unlikely(!owner)) {
> +		pr_err("INFO: task %s:%d is blocked on a mutex, but the owner is not found.\n",
> +			task->comm, task->pid);
> +		return;
> +	}
> +
> +	/* Ensure the owner information is correct. */
> +	for_each_process_thread(g, t) {
> +		if ((unsigned long)t == owner) {
> +			pr_err("INFO: task %s:%d is blocked on a mutex likely owned by task %s:%d.\n",
> +				task->comm, task->pid, t->comm, t->pid);
> +			sched_show_task(t);
> +			return;
> +		}
> +	}
> +}
> +#else
> +#define debug_show_blocker(t)	do {} while (0)
> +#endif
> +
>   static void check_hung_task(struct task_struct *t, unsigned long timeout)
>   {
>   	unsigned long switch_count = t->nvcsw + t->nivcsw;
> @@ -152,6 +187,7 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
>   		pr_err("\"echo 0 > /proc/sys/kernel/hung_task_timeout_secs\""
>   			" disables this message.\n");
>   		sched_show_task(t);
> +		debug_show_blocker(t);
>   		hung_task_show_lock = true;
>   
>   		if (sysctl_hung_task_all_cpu_backtrace)
> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
> index b36f23de48f1..6a543c204a14 100644
> --- a/kernel/locking/mutex.c
> +++ b/kernel/locking/mutex.c
> @@ -72,6 +72,14 @@ static inline unsigned long __owner_flags(unsigned long owner)
>   	return owner & MUTEX_FLAGS;
>   }
>   
> +/* Do not use the return value as a pointer directly. */
> +unsigned long mutex_get_owner(struct mutex *lock)
> +{
> +	unsigned long owner = atomic_long_read(&lock->owner);
> +
> +	return (unsigned long)__owner_task(owner);
> +}
> +
>   /*
>    * Returns: __mutex_owner(lock) on failure or NULL on success.
>    */
> @@ -180,6 +188,9 @@ static void
>   __mutex_add_waiter(struct mutex *lock, struct mutex_waiter *waiter,
>   		   struct list_head *list)
>   {
> +#ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
> +	WRITE_ONCE(current->blocker_mutex, lock);
> +#endif
>   	debug_mutex_add_waiter(lock, waiter, current);
>   
>   	list_add_tail(&waiter->list, list);
> @@ -195,6 +206,9 @@ __mutex_remove_waiter(struct mutex *lock, struct mutex_waiter *waiter)
>   		__mutex_clear_flag(lock, MUTEX_FLAGS);
>   
>   	debug_mutex_remove_waiter(lock, waiter, current);
> +#ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
> +	WRITE_ONCE(current->blocker_mutex, NULL);
> +#endif
>   }
>   
>   /*
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 1af972a92d06..77d8c7e5ce96 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -1260,6 +1260,17 @@ config BOOTPARAM_HUNG_TASK_PANIC
>   
>   	  Say N if unsure.
>   
> +config DETECT_HUNG_TASK_BLOCKER
> +	bool "Dump Hung Tasks Blocker"
> +	depends on DETECT_HUNG_TASK
> +	depends on !PREEMPT_RT
> +	default y
> +	help
> +	  Say Y here to show the blocker task's stacktrace who acquires
> +	  the mutex lock which "hung tasks" are waiting.
> +	  This will add overhead a bit but shows suspicious tasks and
> +	  call trace if it comes from waiting a mutex.
> +
>   config WQ_WATCHDOG
>   	bool "Detect Workqueue Stalls"
>   	depends on DEBUG_KERNEL
>
Reviewed-by: Waiman Long <longman@redhat.com>


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 1/2] hung_task: Show the blocker task if the task is hung on mutex
  2025-02-26  1:23   ` Waiman Long
@ 2025-03-06  2:32     ` Masami Hiramatsu
       [not found]       ` <5f7bc403-be75-4ae3-b6ff-5ff0673847f9@redhat.com>
  0 siblings, 1 reply; 30+ messages in thread
From: Masami Hiramatsu @ 2025-03-06  2:32 UTC (permalink / raw)
  To: Andrew Morton, Waiman Long
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng,
	Joel Granados, Anna Schumaker, Lance Yang, Kent Overstreet,
	Yongliang Gao, Steven Rostedt, Tomasz Figa, Sergey Senozhatsky,
	linux-kernel

On Tue, 25 Feb 2025 20:23:41 -0500
Waiman Long <llong@redhat.com> wrote:

> On 2/25/25 2:02 AM, Masami Hiramatsu (Google) wrote:
> > From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> >
> > The "hung_task" shows a long-time uninterruptible slept task, but most
> > often, it's blocked on a mutex acquired by another task. Without
> > dumping such a task, investigating the root cause of the hung task
> > problem is very difficult.
> >
> > This introduce task_struct::blocker_mutex to point the mutex lock
> > which this task is waiting for. Since the mutex has "owner"
> > information, we can find the owner task and dump it with hung tasks.
> >
> > Note: the owner can be changed while dumping the owner task, so
> > this is "likely" the owner of the mutex.
> >
> > With this change, the hung task shows blocker task's info like below;
> >
> >   INFO: task cat:115 blocked for more than 122 seconds.
> >         Not tainted 6.14.0-rc3-00003-ga8946be3de00 #156
> >   "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >   task:cat             state:D stack:13432 pid:115   tgid:115   ppid:106    task_flags:0x400100 flags:0x00000002
> >   Call Trace:
> >    <TASK>
> >    __schedule+0x731/0x960
> >    ? schedule_preempt_disabled+0x54/0xa0
> >    schedule+0xb7/0x140
> >    ? __mutex_lock+0x51b/0xa60
> >    ? __mutex_lock+0x51b/0xa60
> >    schedule_preempt_disabled+0x54/0xa0
> >    __mutex_lock+0x51b/0xa60
> >    read_dummy+0x23/0x70
> >    full_proxy_read+0x6a/0xc0
> >    vfs_read+0xc2/0x340
> >    ? __pfx_direct_file_splice_eof+0x10/0x10
> >    ? do_sendfile+0x1bd/0x2e0
> >    ksys_read+0x76/0xe0
> >    do_syscall_64+0xe3/0x1c0
> >    ? exc_page_fault+0xa9/0x1d0
> >    entry_SYSCALL_64_after_hwframe+0x77/0x7f
> >   RIP: 0033:0x4840cd
> >   RSP: 002b:00007ffe99071828 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
> >   RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
> >   RDX: 0000000000001000 RSI: 00007ffe99071870 RDI: 0000000000000003
> >   RBP: 00007ffe99071870 R08: 0000000000000000 R09: 0000000000000000
> >   R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
> >   R13: 00000000132fd3a0 R14: 0000000000000001 R15: ffffffffffffffff
> >    </TASK>
> >   INFO: task cat:115 is blocked on a mutex likely owned by task cat:114.
> >   task:cat             state:S stack:13432 pid:114   tgid:114   ppid:106    task_flags:0x400100 flags:0x00000002
> >   Call Trace:
> >    <TASK>
> >    __schedule+0x731/0x960
> >    ? schedule_timeout+0xa8/0x120
> >    schedule+0xb7/0x140
> >    schedule_timeout+0xa8/0x120
> >    ? __pfx_process_timeout+0x10/0x10
> >    msleep_interruptible+0x3e/0x60
> >    read_dummy+0x2d/0x70
> >    full_proxy_read+0x6a/0xc0
> >    vfs_read+0xc2/0x340
> >    ? __pfx_direct_file_splice_eof+0x10/0x10
> >    ? do_sendfile+0x1bd/0x2e0
> >    ksys_read+0x76/0xe0
> >    do_syscall_64+0xe3/0x1c0
> >    ? exc_page_fault+0xa9/0x1d0
> >    entry_SYSCALL_64_after_hwframe+0x77/0x7f
> >   RIP: 0033:0x4840cd
> >   RSP: 002b:00007ffe3e0147b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
> >   RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
> >   RDX: 0000000000001000 RSI: 00007ffe3e014800 RDI: 0000000000000003
> >   RBP: 00007ffe3e014800 R08: 0000000000000000 R09: 0000000000000000
> >   R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
> >   R13: 000000001a0a93a0 R14: 0000000000000001 R15: ffffffffffffffff
> >    </TASK>
> >
> > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > ---
> >   Changes in v4:
> >    - Make this option depends on !PREEMPT_RT because it changes mutex to
> >      rt_mutex.
> > ---
> >   include/linux/mutex.h  |    2 ++
> >   include/linux/sched.h  |    4 ++++
> >   kernel/hung_task.c     |   36 ++++++++++++++++++++++++++++++++++++
> >   kernel/locking/mutex.c |   14 ++++++++++++++
> >   lib/Kconfig.debug      |   11 +++++++++++
> >   5 files changed, 67 insertions(+)
> >
> > diff --git a/include/linux/mutex.h b/include/linux/mutex.h
> > index 2bf91b57591b..2143d05116be 100644
> > --- a/include/linux/mutex.h
> > +++ b/include/linux/mutex.h
> > @@ -202,4 +202,6 @@ DEFINE_GUARD(mutex, struct mutex *, mutex_lock(_T), mutex_unlock(_T))
> >   DEFINE_GUARD_COND(mutex, _try, mutex_trylock(_T))
> >   DEFINE_GUARD_COND(mutex, _intr, mutex_lock_interruptible(_T) == 0)
> >   
> > +extern unsigned long mutex_get_owner(struct mutex *lock);
> > +
> >   #endif /* __LINUX_MUTEX_H */
> > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > index 9632e3318e0d..0cebdd736d44 100644
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -1217,6 +1217,10 @@ struct task_struct {
> >   	struct mutex_waiter		*blocked_on;
> >   #endif
> >   
> > +#ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
> > +	struct mutex			*blocker_mutex;
> > +#endif
> > +
> >   #ifdef CONFIG_DEBUG_ATOMIC_SLEEP
> >   	int				non_block_count;
> >   #endif
> > diff --git a/kernel/hung_task.c b/kernel/hung_task.c
> > index 04efa7a6e69b..ccd7217fcec1 100644
> > --- a/kernel/hung_task.c
> > +++ b/kernel/hung_task.c
> > @@ -93,6 +93,41 @@ static struct notifier_block panic_block = {
> >   	.notifier_call = hung_task_panic,
> >   };
> >   
> > +
> > +#ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
> > +static void debug_show_blocker(struct task_struct *task)
> > +{
> > +	struct task_struct *g, *t;
> > +	unsigned long owner;
> > +	struct mutex *lock;
> > +
> > +	RCU_LOCKDEP_WARN(!rcu_read_lock_held(), "No rcu lock held");
> > +
> > +	lock = READ_ONCE(task->blocker_mutex);
> > +	if (!lock)
> > +		return;
> > +
> > +	owner = mutex_get_owner(lock);
> > +	if (unlikely(!owner)) {
> > +		pr_err("INFO: task %s:%d is blocked on a mutex, but the owner is not found.\n",
> > +			task->comm, task->pid);
> > +		return;
> > +	}
> > +
> > +	/* Ensure the owner information is correct. */
> > +	for_each_process_thread(g, t) {
> > +		if ((unsigned long)t == owner) {
> > +			pr_err("INFO: task %s:%d is blocked on a mutex likely owned by task %s:%d.\n",
> > +				task->comm, task->pid, t->comm, t->pid);
> > +			sched_show_task(t);
> > +			return;
> > +		}
> > +	}
> > +}
> > +#else
> > +#define debug_show_blocker(t)	do {} while (0)
> > +#endif
> > +
> >   static void check_hung_task(struct task_struct *t, unsigned long timeout)
> >   {
> >   	unsigned long switch_count = t->nvcsw + t->nivcsw;
> > @@ -152,6 +187,7 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
> >   		pr_err("\"echo 0 > /proc/sys/kernel/hung_task_timeout_secs\""
> >   			" disables this message.\n");
> >   		sched_show_task(t);
> > +		debug_show_blocker(t);
> >   		hung_task_show_lock = true;
> >   
> >   		if (sysctl_hung_task_all_cpu_backtrace)
> > diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
> > index b36f23de48f1..6a543c204a14 100644
> > --- a/kernel/locking/mutex.c
> > +++ b/kernel/locking/mutex.c
> > @@ -72,6 +72,14 @@ static inline unsigned long __owner_flags(unsigned long owner)
> >   	return owner & MUTEX_FLAGS;
> >   }
> >   
> > +/* Do not use the return value as a pointer directly. */
> > +unsigned long mutex_get_owner(struct mutex *lock)
> > +{
> > +	unsigned long owner = atomic_long_read(&lock->owner);
> > +
> > +	return (unsigned long)__owner_task(owner);
> > +}
> > +
> >   /*
> >    * Returns: __mutex_owner(lock) on failure or NULL on success.
> >    */
> > @@ -180,6 +188,9 @@ static void
> >   __mutex_add_waiter(struct mutex *lock, struct mutex_waiter *waiter,
> >   		   struct list_head *list)
> >   {
> > +#ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
> > +	WRITE_ONCE(current->blocker_mutex, lock);
> > +#endif
> >   	debug_mutex_add_waiter(lock, waiter, current);
> >   
> >   	list_add_tail(&waiter->list, list);
> > @@ -195,6 +206,9 @@ __mutex_remove_waiter(struct mutex *lock, struct mutex_waiter *waiter)
> >   		__mutex_clear_flag(lock, MUTEX_FLAGS);
> >   
> >   	debug_mutex_remove_waiter(lock, waiter, current);
> > +#ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
> > +	WRITE_ONCE(current->blocker_mutex, NULL);
> > +#endif
> >   }
> >   
> >   /*
> > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> > index 1af972a92d06..77d8c7e5ce96 100644
> > --- a/lib/Kconfig.debug
> > +++ b/lib/Kconfig.debug
> > @@ -1260,6 +1260,17 @@ config BOOTPARAM_HUNG_TASK_PANIC
> >   
> >   	  Say N if unsure.
> >   
> > +config DETECT_HUNG_TASK_BLOCKER
> > +	bool "Dump Hung Tasks Blocker"
> > +	depends on DETECT_HUNG_TASK
> > +	depends on !PREEMPT_RT
> > +	default y
> > +	help
> > +	  Say Y here to show the blocker task's stacktrace who acquires
> > +	  the mutex lock which "hung tasks" are waiting.
> > +	  This will add overhead a bit but shows suspicious tasks and
> > +	  call trace if it comes from waiting a mutex.
> > +
> >   config WQ_WATCHDOG
> >   	bool "Detect Workqueue Stalls"
> >   	depends on DEBUG_KERNEL
> >
> Reviewed-by: Waiman Long <longman@redhat.com>
> 

Thanks Waiman! BTW, who will pick this patch?
Andrew, could you pick this series?

Thank you,



-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 30+ messages in thread

[parent not found: <5f7bc403-be75-4ae3-b6ff-5ff0673847f9@redhat.com>]

* Re: [PATCH v4 1/2] hung_task: Show the blocker task if the task is hung on mutex
       [not found]       ` <5f7bc403-be75-4ae3-b6ff-5ff0673847f9@redhat.com>
@ 2025-03-06  3:10         ` Waiman Long
  2025-03-13  3:55           ` Masami Hiramatsu
  0 siblings, 1 reply; 30+ messages in thread
From: Waiman Long @ 2025-03-06  3:10 UTC (permalink / raw)
  To: Waiman Long, Peter Zijlstra
  Cc: Ingo Molnar, Will Deacon, Boqun Feng, Joel Granados,
	Anna Schumaker, Lance Yang, Kent Overstreet, Yongliang Gao,
	Steven Rostedt, Tomasz Figa, Sergey Senozhatsky, linux-kernel,
	Masami Hiramatsu (Google), Andrew Morton

On 3/5/25 10:07 PM, Waiman Long wrote:
>
>
> On 3/5/25 9:32 PM, Masami Hiramatsu (Google) wrote:
>>>> +config DETECT_HUNG_TASK_BLOCKER
>>>> +	bool "Dump Hung Tasks Blocker"
>>>> +	depends on DETECT_HUNG_TASK
>>>> +	depends on !PREEMPT_RT
>>>> +	default y
>>>> +	help
>>>> +	  Say Y here to show the blocker task's stacktrace who acquires
>>>> +	  the mutex lock which "hung tasks" are waiting.
>>>> +	  This will add overhead a bit but shows suspicious tasks and
>>>> +	  call trace if it comes from waiting a mutex.
>>>> +
>>>>    config WQ_WATCHDOG
>>>>    	bool "Detect Workqueue Stalls"
>>>>    	depends on DEBUG_KERNEL
>>>>
>>> Reviewed-by: Waiman Long<longman@redhat.com>
>>>
>> Thanks Waiman! BTW, who will pick this patch?
>> Andrew, could you pick this series?
>
> Peter, do you mind routing this patch via Andrew?
>
Resend as plain text.

Regards,
Longman


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 1/2] hung_task: Show the blocker task if the task is hung on mutex
  2025-03-06  3:10         ` Waiman Long
@ 2025-03-13  3:55           ` Masami Hiramatsu
  0 siblings, 0 replies; 30+ messages in thread
From: Masami Hiramatsu @ 2025-03-13  3:55 UTC (permalink / raw)
  To: Peter Zijlstra, Waiman Long, Andrew Morton
  Cc: Ingo Molnar, Will Deacon, Boqun Feng, Joel Granados,
	Anna Schumaker, Lance Yang, Kent Overstreet, Yongliang Gao,
	Steven Rostedt, Tomasz Figa, Sergey Senozhatsky, linux-kernel,
	Masami Hiramatsu (Google)

On Wed, 5 Mar 2025 22:10:10 -0500
Waiman Long <llong@redhat.com> wrote:

> On 3/5/25 10:07 PM, Waiman Long wrote:
> >
> >
> > On 3/5/25 9:32 PM, Masami Hiramatsu (Google) wrote:
> >>>> +config DETECT_HUNG_TASK_BLOCKER
> >>>> +	bool "Dump Hung Tasks Blocker"
> >>>> +	depends on DETECT_HUNG_TASK
> >>>> +	depends on !PREEMPT_RT
> >>>> +	default y
> >>>> +	help
> >>>> +	  Say Y here to show the blocker task's stacktrace who acquires
> >>>> +	  the mutex lock which "hung tasks" are waiting.
> >>>> +	  This will add overhead a bit but shows suspicious tasks and
> >>>> +	  call trace if it comes from waiting a mutex.
> >>>> +
> >>>>    config WQ_WATCHDOG
> >>>>    	bool "Detect Workqueue Stalls"
> >>>>    	depends on DEBUG_KERNEL
> >>>>
> >>> Reviewed-by: Waiman Long<longman@redhat.com>
> >>>
> >> Thanks Waiman! BTW, who will pick this patch?
> >> Andrew, could you pick this series?
> >
> > Peter, do you mind routing this patch via Andrew?

Gently ping.

Thank you,

> >
> Resend as plain text.
> 
> Regards,
> Longman
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 1/2] hung_task: Show the blocker task if the task is hung on mutex
  2025-02-25  7:02 ` [PATCH v4 1/2] hung_task: Show the blocker task if the task is hung on mutex Masami Hiramatsu (Google)
  2025-02-26  1:23   ` Waiman Long
@ 2025-02-26  1:44   ` Lance Yang
  2025-02-26  4:38   ` Sergey Senozhatsky
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 30+ messages in thread
From: Lance Yang @ 2025-02-26  1:44 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Andrew Morton,
	Boqun Feng, Waiman Long, Joel Granados, Anna Schumaker,
	Kent Overstreet, Yongliang Gao, Steven Rostedt, Tomasz Figa,
	Sergey Senozhatsky, linux-kernel

On Tue, Feb 25, 2025 at 3:02 PM Masami Hiramatsu (Google)
<mhiramat@kernel.org> wrote:
>
> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
>
> The "hung_task" shows a long-time uninterruptible slept task, but most
> often, it's blocked on a mutex acquired by another task. Without
> dumping such a task, investigating the root cause of the hung task
> problem is very difficult.
>
> This introduce task_struct::blocker_mutex to point the mutex lock
> which this task is waiting for. Since the mutex has "owner"
> information, we can find the owner task and dump it with hung tasks.
>
> Note: the owner can be changed while dumping the owner task, so
> this is "likely" the owner of the mutex.
>
> With this change, the hung task shows blocker task's info like below;
>
>  INFO: task cat:115 blocked for more than 122 seconds.
>        Not tainted 6.14.0-rc3-00003-ga8946be3de00 #156
>  "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>  task:cat             state:D stack:13432 pid:115   tgid:115   ppid:106    task_flags:0x400100 flags:0x00000002
>  Call Trace:
>   <TASK>
>   __schedule+0x731/0x960
>   ? schedule_preempt_disabled+0x54/0xa0
>   schedule+0xb7/0x140
>   ? __mutex_lock+0x51b/0xa60
>   ? __mutex_lock+0x51b/0xa60
>   schedule_preempt_disabled+0x54/0xa0
>   __mutex_lock+0x51b/0xa60
>   read_dummy+0x23/0x70
>   full_proxy_read+0x6a/0xc0
>   vfs_read+0xc2/0x340
>   ? __pfx_direct_file_splice_eof+0x10/0x10
>   ? do_sendfile+0x1bd/0x2e0
>   ksys_read+0x76/0xe0
>   do_syscall_64+0xe3/0x1c0
>   ? exc_page_fault+0xa9/0x1d0
>   entry_SYSCALL_64_after_hwframe+0x77/0x7f
>  RIP: 0033:0x4840cd
>  RSP: 002b:00007ffe99071828 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
>  RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
>  RDX: 0000000000001000 RSI: 00007ffe99071870 RDI: 0000000000000003
>  RBP: 00007ffe99071870 R08: 0000000000000000 R09: 0000000000000000
>  R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
>  R13: 00000000132fd3a0 R14: 0000000000000001 R15: ffffffffffffffff
>   </TASK>
>  INFO: task cat:115 is blocked on a mutex likely owned by task cat:114.
>  task:cat             state:S stack:13432 pid:114   tgid:114   ppid:106    task_flags:0x400100 flags:0x00000002
>  Call Trace:
>   <TASK>
>   __schedule+0x731/0x960
>   ? schedule_timeout+0xa8/0x120
>   schedule+0xb7/0x140
>   schedule_timeout+0xa8/0x120
>   ? __pfx_process_timeout+0x10/0x10
>   msleep_interruptible+0x3e/0x60
>   read_dummy+0x2d/0x70
>   full_proxy_read+0x6a/0xc0
>   vfs_read+0xc2/0x340
>   ? __pfx_direct_file_splice_eof+0x10/0x10
>   ? do_sendfile+0x1bd/0x2e0
>   ksys_read+0x76/0xe0
>   do_syscall_64+0xe3/0x1c0
>   ? exc_page_fault+0xa9/0x1d0
>   entry_SYSCALL_64_after_hwframe+0x77/0x7f
>  RIP: 0033:0x4840cd
>  RSP: 002b:00007ffe3e0147b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
>  RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
>  RDX: 0000000000001000 RSI: 00007ffe3e014800 RDI: 0000000000000003
>  RBP: 00007ffe3e014800 R08: 0000000000000000 R09: 0000000000000000
>  R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
>  R13: 000000001a0a93a0 R14: 0000000000000001 R15: ffffffffffffffff
>   </TASK>
>
> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Feel free to add:
Reviewed-by: Lance Yang <ioworker0@gmail.com>

Thanks,
Lance

> ---
>  Changes in v4:
>   - Make this option depends on !PREEMPT_RT because it changes mutex to
>     rt_mutex.
> ---
>  include/linux/mutex.h  |    2 ++
>  include/linux/sched.h  |    4 ++++
>  kernel/hung_task.c     |   36 ++++++++++++++++++++++++++++++++++++
>  kernel/locking/mutex.c |   14 ++++++++++++++
>  lib/Kconfig.debug      |   11 +++++++++++
>  5 files changed, 67 insertions(+)
>
> diff --git a/include/linux/mutex.h b/include/linux/mutex.h
> index 2bf91b57591b..2143d05116be 100644
> --- a/include/linux/mutex.h
> +++ b/include/linux/mutex.h
> @@ -202,4 +202,6 @@ DEFINE_GUARD(mutex, struct mutex *, mutex_lock(_T), mutex_unlock(_T))
>  DEFINE_GUARD_COND(mutex, _try, mutex_trylock(_T))
>  DEFINE_GUARD_COND(mutex, _intr, mutex_lock_interruptible(_T) == 0)
>
> +extern unsigned long mutex_get_owner(struct mutex *lock);
> +
>  #endif /* __LINUX_MUTEX_H */
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 9632e3318e0d..0cebdd736d44 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1217,6 +1217,10 @@ struct task_struct {
>         struct mutex_waiter             *blocked_on;
>  #endif
>
> +#ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
> +       struct mutex                    *blocker_mutex;
> +#endif
> +
>  #ifdef CONFIG_DEBUG_ATOMIC_SLEEP
>         int                             non_block_count;
>  #endif
> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
> index 04efa7a6e69b..ccd7217fcec1 100644
> --- a/kernel/hung_task.c
> +++ b/kernel/hung_task.c
> @@ -93,6 +93,41 @@ static struct notifier_block panic_block = {
>         .notifier_call = hung_task_panic,
>  };
>
> +
> +#ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
> +static void debug_show_blocker(struct task_struct *task)
> +{
> +       struct task_struct *g, *t;
> +       unsigned long owner;
> +       struct mutex *lock;
> +
> +       RCU_LOCKDEP_WARN(!rcu_read_lock_held(), "No rcu lock held");
> +
> +       lock = READ_ONCE(task->blocker_mutex);
> +       if (!lock)
> +               return;
> +
> +       owner = mutex_get_owner(lock);
> +       if (unlikely(!owner)) {
> +               pr_err("INFO: task %s:%d is blocked on a mutex, but the owner is not found.\n",
> +                       task->comm, task->pid);
> +               return;
> +       }
> +
> +       /* Ensure the owner information is correct. */
> +       for_each_process_thread(g, t) {
> +               if ((unsigned long)t == owner) {
> +                       pr_err("INFO: task %s:%d is blocked on a mutex likely owned by task %s:%d.\n",
> +                               task->comm, task->pid, t->comm, t->pid);
> +                       sched_show_task(t);
> +                       return;
> +               }
> +       }
> +}
> +#else
> +#define debug_show_blocker(t)  do {} while (0)
> +#endif
> +
>  static void check_hung_task(struct task_struct *t, unsigned long timeout)
>  {
>         unsigned long switch_count = t->nvcsw + t->nivcsw;
> @@ -152,6 +187,7 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
>                 pr_err("\"echo 0 > /proc/sys/kernel/hung_task_timeout_secs\""
>                         " disables this message.\n");
>                 sched_show_task(t);
> +               debug_show_blocker(t);
>                 hung_task_show_lock = true;
>
>                 if (sysctl_hung_task_all_cpu_backtrace)
> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
> index b36f23de48f1..6a543c204a14 100644
> --- a/kernel/locking/mutex.c
> +++ b/kernel/locking/mutex.c
> @@ -72,6 +72,14 @@ static inline unsigned long __owner_flags(unsigned long owner)
>         return owner & MUTEX_FLAGS;
>  }
>
> +/* Do not use the return value as a pointer directly. */
> +unsigned long mutex_get_owner(struct mutex *lock)
> +{
> +       unsigned long owner = atomic_long_read(&lock->owner);
> +
> +       return (unsigned long)__owner_task(owner);
> +}
> +
>  /*
>   * Returns: __mutex_owner(lock) on failure or NULL on success.
>   */
> @@ -180,6 +188,9 @@ static void
>  __mutex_add_waiter(struct mutex *lock, struct mutex_waiter *waiter,
>                    struct list_head *list)
>  {
> +#ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
> +       WRITE_ONCE(current->blocker_mutex, lock);
> +#endif
>         debug_mutex_add_waiter(lock, waiter, current);
>
>         list_add_tail(&waiter->list, list);
> @@ -195,6 +206,9 @@ __mutex_remove_waiter(struct mutex *lock, struct mutex_waiter *waiter)
>                 __mutex_clear_flag(lock, MUTEX_FLAGS);
>
>         debug_mutex_remove_waiter(lock, waiter, current);
> +#ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
> +       WRITE_ONCE(current->blocker_mutex, NULL);
> +#endif
>  }
>
>  /*
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 1af972a92d06..77d8c7e5ce96 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -1260,6 +1260,17 @@ config BOOTPARAM_HUNG_TASK_PANIC
>
>           Say N if unsure.
>
> +config DETECT_HUNG_TASK_BLOCKER
> +       bool "Dump Hung Tasks Blocker"
> +       depends on DETECT_HUNG_TASK
> +       depends on !PREEMPT_RT
> +       default y
> +       help
> +         Say Y here to show the blocker task's stacktrace who acquires
> +         the mutex lock which "hung tasks" are waiting.
> +         This will add overhead a bit but shows suspicious tasks and
> +         call trace if it comes from waiting a mutex.
> +
>  config WQ_WATCHDOG
>         bool "Detect Workqueue Stalls"
>         depends on DEBUG_KERNEL
>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 1/2] hung_task: Show the blocker task if the task is hung on mutex
  2025-02-25  7:02 ` [PATCH v4 1/2] hung_task: Show the blocker task if the task is hung on mutex Masami Hiramatsu (Google)
  2025-02-26  1:23   ` Waiman Long
  2025-02-26  1:44   ` Lance Yang
@ 2025-02-26  4:38   ` Sergey Senozhatsky
  2025-02-26 15:07     ` Steven Rostedt
  2025-03-13 22:29   ` Andrew Morton
  2025-07-30  7:59   ` Sergey Senozhatsky
  4 siblings, 1 reply; 30+ messages in thread
From: Sergey Senozhatsky @ 2025-02-26  4:38 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Andrew Morton,
	Boqun Feng, Waiman Long, Joel Granados, Anna Schumaker,
	Lance Yang, Kent Overstreet, Yongliang Gao, Steven Rostedt,
	Tomasz Figa, Sergey Senozhatsky, linux-kernel

On (25/02/25 16:02), Masami Hiramatsu (Google) wrote:
> The "hung_task" shows a long-time uninterruptible slept task, but most
> often, it's blocked on a mutex acquired by another task. Without
> dumping such a task, investigating the root cause of the hung task
> problem is very difficult.
> 
> This introduce task_struct::blocker_mutex to point the mutex lock
> which this task is waiting for. Since the mutex has "owner"
> information, we can find the owner task and dump it with hung tasks.
> 
> Note: the owner can be changed while dumping the owner task, so
> this is "likely" the owner of the mutex.

I assume another possibility can be that the owner is still around,
let's say a kworker that simply forgot to mutex_unlock(), so we'll
get its backtrace but it can be misleading, because kworker in
question might be doing something completely unrelated.  But this
is still better than nothing.

FWIW
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 1/2] hung_task: Show the blocker task if the task is hung on mutex
  2025-02-26  4:38   ` Sergey Senozhatsky
@ 2025-02-26 15:07     ` Steven Rostedt
  0 siblings, 0 replies; 30+ messages in thread
From: Steven Rostedt @ 2025-02-26 15:07 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Masami Hiramatsu (Google), Peter Zijlstra, Ingo Molnar,
	Will Deacon, Andrew Morton, Boqun Feng, Waiman Long,
	Joel Granados, Anna Schumaker, Lance Yang, Kent Overstreet,
	Yongliang Gao, Tomasz Figa, linux-kernel

On Wed, 26 Feb 2025 13:38:19 +0900
Sergey Senozhatsky <senozhatsky@chromium.org> wrote:

> I assume another possibility can be that the owner is still around,
> let's say a kworker that simply forgot to mutex_unlock(), so we'll
> get its backtrace but it can be misleading, because kworker in
> question might be doing something completely unrelated.

Well, if that happens, then we have much bigger problems than this ;-)

-- Steve

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 1/2] hung_task: Show the blocker task if the task is hung on mutex
  2025-02-25  7:02 ` [PATCH v4 1/2] hung_task: Show the blocker task if the task is hung on mutex Masami Hiramatsu (Google)
                     ` (2 preceding siblings ...)
  2025-02-26  4:38   ` Sergey Senozhatsky
@ 2025-03-13 22:29   ` Andrew Morton
  2025-03-14  3:57     ` Masami Hiramatsu
  2025-07-30  7:59   ` Sergey Senozhatsky
  4 siblings, 1 reply; 30+ messages in thread
From: Andrew Morton @ 2025-03-13 22:29 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, Waiman Long,
	Joel Granados, Anna Schumaker, Lance Yang, Kent Overstreet,
	Yongliang Gao, Steven Rostedt, Tomasz Figa, Sergey Senozhatsky,
	linux-kernel

On Tue, 25 Feb 2025 16:02:34 +0900 "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:

> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> 
> The "hung_task" shows a long-time uninterruptible slept task, but most
> often, it's blocked on a mutex acquired by another task. Without
> dumping such a task, investigating the root cause of the hung task
> problem is very difficult.
> 
> This introduce task_struct::blocker_mutex to point the mutex lock
> which this task is waiting for. Since the mutex has "owner"
> information, we can find the owner task and dump it with hung tasks.
> 
> Note: the owner can be changed while dumping the owner task, so
> this is "likely" the owner of the mutex.
> 
> With this change, the hung task shows blocker task's info like below;

Seems useful.

> ...
>
> +static void debug_show_blocker(struct task_struct *task)
> +{
>
> ...
>
> +}
> +#else
> +#define debug_show_blocker(t)	do {} while (0)
> +#endif
> +

Nit. It's unpleasing to have one side a C function and the other a
macro.  Plus C functions are simply better - only use a macro if one
has to!

So,

--- a/kernel/hung_task.c~hung_task-show-the-blocker-task-if-the-task-is-hung-on-mutex-fix
+++ a/kernel/hung_task.c
@@ -125,7 +125,9 @@ static void debug_show_blocker(struct ta
 	}
 }
 #else
-#define debug_show_blocker(t)	do {} while (0)
+static inline void debug_show_blocker(struct task_struct *task)
+{
+}
 #endif
 
 static void check_hung_task(struct task_struct *t, unsigned long timeout)
_


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 1/2] hung_task: Show the blocker task if the task is hung on mutex
  2025-03-13 22:29   ` Andrew Morton
@ 2025-03-14  3:57     ` Masami Hiramatsu
  0 siblings, 0 replies; 30+ messages in thread
From: Masami Hiramatsu @ 2025-03-14  3:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, Waiman Long,
	Joel Granados, Anna Schumaker, Lance Yang, Kent Overstreet,
	Yongliang Gao, Steven Rostedt, Tomasz Figa, Sergey Senozhatsky,
	linux-kernel

On Thu, 13 Mar 2025 15:29:46 -0700
Andrew Morton <akpm@linux-foundation.org> wrote:

> On Tue, 25 Feb 2025 16:02:34 +0900 "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:
> 
> > From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > 
> > The "hung_task" shows a long-time uninterruptible slept task, but most
> > often, it's blocked on a mutex acquired by another task. Without
> > dumping such a task, investigating the root cause of the hung task
> > problem is very difficult.
> > 
> > This introduce task_struct::blocker_mutex to point the mutex lock
> > which this task is waiting for. Since the mutex has "owner"
> > information, we can find the owner task and dump it with hung tasks.
> > 
> > Note: the owner can be changed while dumping the owner task, so
> > this is "likely" the owner of the mutex.
> > 
> > With this change, the hung task shows blocker task's info like below;
> 
> Seems useful.
> 
> > ...
> >
> > +static void debug_show_blocker(struct task_struct *task)
> > +{
> >
> > ...
> >
> > +}
> > +#else
> > +#define debug_show_blocker(t)	do {} while (0)
> > +#endif
> > +
> 
> Nit. It's unpleasing to have one side a C function and the other a
> macro.  Plus C functions are simply better - only use a macro if one
> has to!

Ah, that's nice to know. Thanks for the fix!

> 
> So,
> 
> --- a/kernel/hung_task.c~hung_task-show-the-blocker-task-if-the-task-is-hung-on-mutex-fix
> +++ a/kernel/hung_task.c
> @@ -125,7 +125,9 @@ static void debug_show_blocker(struct ta
>  	}
>  }
>  #else
> -#define debug_show_blocker(t)	do {} while (0)
> +static inline void debug_show_blocker(struct task_struct *task)
> +{
> +}
>  #endif
>  
>  static void check_hung_task(struct task_struct *t, unsigned long timeout)
> _
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 1/2] hung_task: Show the blocker task if the task is hung on mutex
  2025-02-25  7:02 ` [PATCH v4 1/2] hung_task: Show the blocker task if the task is hung on mutex Masami Hiramatsu (Google)
                     ` (3 preceding siblings ...)
  2025-03-13 22:29   ` Andrew Morton
@ 2025-07-30  7:59   ` Sergey Senozhatsky
  2025-07-30  8:51     ` Masami Hiramatsu
  2025-07-30  9:22     ` Lance Yang
  4 siblings, 2 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2025-07-30  7:59 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Andrew Morton,
	Boqun Feng, Waiman Long, Joel Granados, Anna Schumaker,
	Lance Yang, Kent Overstreet, Yongliang Gao, Steven Rostedt,
	Tomasz Figa, Sergey Senozhatsky, linux-kernel

On (25/02/25 16:02), Masami Hiramatsu (Google) wrote:
> The "hung_task" shows a long-time uninterruptible slept task, but most
> often, it's blocked on a mutex acquired by another task. Without
> dumping such a task, investigating the root cause of the hung task
> problem is very difficult.
> 
> This introduce task_struct::blocker_mutex to point the mutex lock
> which this task is waiting for. Since the mutex has "owner"
> information, we can find the owner task and dump it with hung tasks.
> 
> Note: the owner can be changed while dumping the owner task, so
> this is "likely" the owner of the mutex.
> 
> With this change, the hung task shows blocker task's info like below;
> 
>  INFO: task cat:115 blocked for more than 122 seconds.
>        Not tainted 6.14.0-rc3-00003-ga8946be3de00 #156
>  "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>  task:cat             state:D stack:13432 pid:115   tgid:115   ppid:106    task_flags:0x400100 flags:0x00000002
>  Call Trace:
>   <TASK>
>   __schedule+0x731/0x960
>   ? schedule_preempt_disabled+0x54/0xa0
>   schedule+0xb7/0x140
>   ? __mutex_lock+0x51b/0xa60
>   ? __mutex_lock+0x51b/0xa60
>   schedule_preempt_disabled+0x54/0xa0
>   __mutex_lock+0x51b/0xa60
>   read_dummy+0x23/0x70
>   full_proxy_read+0x6a/0xc0
>   vfs_read+0xc2/0x340
>   ? __pfx_direct_file_splice_eof+0x10/0x10
>   ? do_sendfile+0x1bd/0x2e0
>   ksys_read+0x76/0xe0
>   do_syscall_64+0xe3/0x1c0
>   ? exc_page_fault+0xa9/0x1d0
>   entry_SYSCALL_64_after_hwframe+0x77/0x7f
>  RIP: 0033:0x4840cd
>  RSP: 002b:00007ffe99071828 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
>  RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
>  RDX: 0000000000001000 RSI: 00007ffe99071870 RDI: 0000000000000003
>  RBP: 00007ffe99071870 R08: 0000000000000000 R09: 0000000000000000
>  R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
>  R13: 00000000132fd3a0 R14: 0000000000000001 R15: ffffffffffffffff
>   </TASK>
>  INFO: task cat:115 is blocked on a mutex likely owned by task cat:114.
>  task:cat             state:S stack:13432 pid:114   tgid:114   ppid:106    task_flags:0x400100 flags:0x00000002
>  Call Trace:
>   <TASK>
>   __schedule+0x731/0x960
>   ? schedule_timeout+0xa8/0x120
>   schedule+0xb7/0x140
>   schedule_timeout+0xa8/0x120
>   ? __pfx_process_timeout+0x10/0x10
>   msleep_interruptible+0x3e/0x60
>   read_dummy+0x2d/0x70
>   full_proxy_read+0x6a/0xc0
>   vfs_read+0xc2/0x340
>   ? __pfx_direct_file_splice_eof+0x10/0x10
>   ? do_sendfile+0x1bd/0x2e0
>   ksys_read+0x76/0xe0
>   do_syscall_64+0xe3/0x1c0
>   ? exc_page_fault+0xa9/0x1d0
>   entry_SYSCALL_64_after_hwframe+0x77/0x7f
>  RIP: 0033:0x4840cd
>  RSP: 002b:00007ffe3e0147b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
>  RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
>  RDX: 0000000000001000 RSI: 00007ffe3e014800 RDI: 0000000000000003
>  RBP: 00007ffe3e014800 R08: 0000000000000000 R09: 0000000000000000
>  R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
>  R13: 000000001a0a93a0 R14: 0000000000000001 R15: ffffffffffffffff
>   </TASK>

One thing that gives me a bit of "inconvenience" is that in certain
cases this significantly increases the amount of stack traces to go
through.  A distilled real life example:
- task T1 acquires lock L1, attempts to acquire L2
- task T2 acquires lock L2, attempts to acquire L3
- task T3 acquires lock L3, attempts to acquire L1

So we'd now see:
- a backtrace of T1, followed by a backtrace of T2 (owner of L2)
- a backtrace of T2, followed by a backtrace of T3 (owner of L3)
- a backtrace of T3, followed by a backtrace of T1 (owner of L1)

Notice how each task is backtraced twice.  I wonder if it's worth it
to de-dup the backtraces.  E.g. in

	task cat:115 is blocked on a mutex likely owned by task cat:114

if we know that cat:114 is also blocked on a lock, then we probably
can just say "is blocked on a mutex likely owned by task cat:114" and
continue iterating through tasks.  That "cat:114" will be backtraced
individually later, as it's also blocked on a lock, owned by another
task.

Does this make any sense?

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 1/2] hung_task: Show the blocker task if the task is hung on mutex
  2025-07-30  7:59   ` Sergey Senozhatsky
@ 2025-07-30  8:51     ` Masami Hiramatsu
  2025-07-30  9:36       ` Lance Yang
                         ` (2 more replies)
  2025-07-30  9:22     ` Lance Yang
  1 sibling, 3 replies; 30+ messages in thread
From: Masami Hiramatsu @ 2025-07-30  8:51 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Andrew Morton,
	Boqun Feng, Waiman Long, Joel Granados, Anna Schumaker,
	Lance Yang, Kent Overstreet, Yongliang Gao, Steven Rostedt,
	Tomasz Figa, linux-kernel

On Wed, 30 Jul 2025 16:59:22 +0900
Sergey Senozhatsky <senozhatsky@chromium.org> wrote:

> One thing that gives me a bit of "inconvenience" is that in certain
> cases this significantly increases the amount of stack traces to go
> through.  A distilled real life example:
> - task T1 acquires lock L1, attempts to acquire L2
> - task T2 acquires lock L2, attempts to acquire L3
> - task T3 acquires lock L3, attempts to acquire L1
> 
> So we'd now see:
> - a backtrace of T1, followed by a backtrace of T2 (owner of L2)
> - a backtrace of T2, followed by a backtrace of T3 (owner of L3)
> - a backtrace of T3, followed by a backtrace of T1 (owner of L1)
> 
> Notice how each task is backtraced twice.  I wonder if it's worth it
> to de-dup the backtraces.  E.g. in
> 
> 	task cat:115 is blocked on a mutex likely owned by task cat:114
> 
> if we know that cat:114 is also blocked on a lock, then we probably
> can just say "is blocked on a mutex likely owned by task cat:114" and
> continue iterating through tasks.  That "cat:114" will be backtraced
> individually later, as it's also blocked on a lock, owned by another
> task.
> 
> Does this make any sense?

Hrm, OK. So what about dump the blocker task only if that task is
NOT blocked? (because if the task is blocked, it should be dumped
afterwards (or already))

Thank you,

-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 1/2] hung_task: Show the blocker task if the task is hung on mutex
  2025-07-30  8:51     ` Masami Hiramatsu
@ 2025-07-30  9:36       ` Lance Yang
  2025-07-30 10:01         ` Masami Hiramatsu
  2025-07-30 10:16         ` Sergey Senozhatsky
  2025-07-30  9:53       ` [RFC PATCH] hung_task: Dump blocker task if it is not hung Masami Hiramatsu (Google)
  2025-07-30  9:56       ` [PATCH v4 1/2] hung_task: Show the blocker task if the task is hung on mutex Sergey Senozhatsky
  2 siblings, 2 replies; 30+ messages in thread
From: Lance Yang @ 2025-07-30  9:36 UTC (permalink / raw)
  To: Masami Hiramatsu (Google), Sergey Senozhatsky
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Andrew Morton,
	Boqun Feng, Waiman Long, Joel Granados, Anna Schumaker,
	Lance Yang, Kent Overstreet, Yongliang Gao, Steven Rostedt,
	Tomasz Figa, linux-kernel



On 2025/7/30 16:51, Masami Hiramatsu (Google) wrote:
> On Wed, 30 Jul 2025 16:59:22 +0900
> Sergey Senozhatsky <senozhatsky@chromium.org> wrote:
> 
>> One thing that gives me a bit of "inconvenience" is that in certain
>> cases this significantly increases the amount of stack traces to go
>> through.  A distilled real life example:
>> - task T1 acquires lock L1, attempts to acquire L2
>> - task T2 acquires lock L2, attempts to acquire L3
>> - task T3 acquires lock L3, attempts to acquire L1
>>
>> So we'd now see:
>> - a backtrace of T1, followed by a backtrace of T2 (owner of L2)
>> - a backtrace of T2, followed by a backtrace of T3 (owner of L3)
>> - a backtrace of T3, followed by a backtrace of T1 (owner of L1)
>>
>> Notice how each task is backtraced twice.  I wonder if it's worth it
>> to de-dup the backtraces.  E.g. in
>>
>> 	task cat:115 is blocked on a mutex likely owned by task cat:114
>>
>> if we know that cat:114 is also blocked on a lock, then we probably
>> can just say "is blocked on a mutex likely owned by task cat:114" and
>> continue iterating through tasks.  That "cat:114" will be backtraced
>> individually later, as it's also blocked on a lock, owned by another
>> task.
>>
>> Does this make any sense?
> 
> Hrm, OK. So what about dump the blocker task only if that task is
> NOT blocked? (because if the task is blocked, it should be dumped
> afterwards (or already))

Hmm... I'm concerned about a potential side effect of that logic.

Consider a simple, non-circular blocking chain like T1 -> T2 -> T3.

In this scenario, T1, T2, and T3 would all be dumped as hung tasks.
However, with the proposed rule (dump only if NOT blocked), when the
detector processes T1, it would see that its blocker (T2) is also
blocked and would therefore skip printing any blocker information about
T2.

The key issue is that we would lose the crucial T1 -> T2 relationship
information from the log.

While all three tasks would still be dumped, we would no longer be able
to see the explicit dependency chain. It seems like the blocker tracking
itself would be broken in this case.

Thanks,
Lance



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 1/2] hung_task: Show the blocker task if the task is hung on mutex
  2025-07-30  9:36       ` Lance Yang
@ 2025-07-30 10:01         ` Masami Hiramatsu
  2025-07-30 10:42           ` Lance Yang
  2025-07-30 10:16         ` Sergey Senozhatsky
  1 sibling, 1 reply; 30+ messages in thread
From: Masami Hiramatsu @ 2025-07-30 10:01 UTC (permalink / raw)
  To: Lance Yang
  Cc: Sergey Senozhatsky, Peter Zijlstra, Ingo Molnar, Will Deacon,
	Andrew Morton, Boqun Feng, Waiman Long, Joel Granados,
	Anna Schumaker, Lance Yang, Kent Overstreet, Yongliang Gao,
	Steven Rostedt, Tomasz Figa, linux-kernel

On Wed, 30 Jul 2025 17:36:04 +0800
Lance Yang <lance.yang@linux.dev> wrote:

> 
> 
> On 2025/7/30 16:51, Masami Hiramatsu (Google) wrote:
> > On Wed, 30 Jul 2025 16:59:22 +0900
> > Sergey Senozhatsky <senozhatsky@chromium.org> wrote:
> > 
> >> One thing that gives me a bit of "inconvenience" is that in certain
> >> cases this significantly increases the amount of stack traces to go
> >> through.  A distilled real life example:
> >> - task T1 acquires lock L1, attempts to acquire L2
> >> - task T2 acquires lock L2, attempts to acquire L3
> >> - task T3 acquires lock L3, attempts to acquire L1
> >>
> >> So we'd now see:
> >> - a backtrace of T1, followed by a backtrace of T2 (owner of L2)
> >> - a backtrace of T2, followed by a backtrace of T3 (owner of L3)
> >> - a backtrace of T3, followed by a backtrace of T1 (owner of L1)
> >>
> >> Notice how each task is backtraced twice.  I wonder if it's worth it
> >> to de-dup the backtraces.  E.g. in
> >>
> >> 	task cat:115 is blocked on a mutex likely owned by task cat:114
> >>
> >> if we know that cat:114 is also blocked on a lock, then we probably
> >> can just say "is blocked on a mutex likely owned by task cat:114" and
> >> continue iterating through tasks.  That "cat:114" will be backtraced
> >> individually later, as it's also blocked on a lock, owned by another
> >> task.
> >>
> >> Does this make any sense?
> > 
> > Hrm, OK. So what about dump the blocker task only if that task is
> > NOT blocked? (because if the task is blocked, it should be dumped
> > afterwards (or already))
> 
> Hmm... I'm concerned about a potential side effect of that logic.
> 
> Consider a simple, non-circular blocking chain like T1 -> T2 -> T3.
> 
> In this scenario, T1, T2, and T3 would all be dumped as hung tasks.
> However, with the proposed rule (dump only if NOT blocked), when the
> detector processes T1, it would see that its blocker (T2) is also
> blocked and would therefore skip printing any blocker information about
> T2.
> 
> The key issue is that we would lose the crucial T1 -> T2 relationship
> information from the log.

I just skip printing T2's stack dump, but still show "T1 is blocked by T2"
so the relationship is still clear.

Thank you,

> 
> While all three tasks would still be dumped, we would no longer be able
> to see the explicit dependency chain. It seems like the blocker tracking
> itself would be broken in this case.
> 
> Thanks,
> Lance
> 
> 
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 1/2] hung_task: Show the blocker task if the task is hung on mutex
  2025-07-30 10:01         ` Masami Hiramatsu
@ 2025-07-30 10:42           ` Lance Yang
  0 siblings, 0 replies; 30+ messages in thread
From: Lance Yang @ 2025-07-30 10:42 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Sergey Senozhatsky, Peter Zijlstra, Ingo Molnar, Will Deacon,
	Andrew Morton, Boqun Feng, Waiman Long, Joel Granados,
	Anna Schumaker, Lance Yang, Kent Overstreet, Yongliang Gao,
	Steven Rostedt, Tomasz Figa, linux-kernel



On 2025/7/30 18:01, Masami Hiramatsu (Google) wrote:
> On Wed, 30 Jul 2025 17:36:04 +0800
> Lance Yang <lance.yang@linux.dev> wrote:
> 
>>
>>
>> On 2025/7/30 16:51, Masami Hiramatsu (Google) wrote:
>>> On Wed, 30 Jul 2025 16:59:22 +0900
>>> Sergey Senozhatsky <senozhatsky@chromium.org> wrote:
>>>
>>>> One thing that gives me a bit of "inconvenience" is that in certain
>>>> cases this significantly increases the amount of stack traces to go
>>>> through.  A distilled real life example:
>>>> - task T1 acquires lock L1, attempts to acquire L2
>>>> - task T2 acquires lock L2, attempts to acquire L3
>>>> - task T3 acquires lock L3, attempts to acquire L1
>>>>
>>>> So we'd now see:
>>>> - a backtrace of T1, followed by a backtrace of T2 (owner of L2)
>>>> - a backtrace of T2, followed by a backtrace of T3 (owner of L3)
>>>> - a backtrace of T3, followed by a backtrace of T1 (owner of L1)
>>>>
>>>> Notice how each task is backtraced twice.  I wonder if it's worth it
>>>> to de-dup the backtraces.  E.g. in
>>>>
>>>> 	task cat:115 is blocked on a mutex likely owned by task cat:114
>>>>
>>>> if we know that cat:114 is also blocked on a lock, then we probably
>>>> can just say "is blocked on a mutex likely owned by task cat:114" and
>>>> continue iterating through tasks.  That "cat:114" will be backtraced
>>>> individually later, as it's also blocked on a lock, owned by another
>>>> task.
>>>>
>>>> Does this make any sense?
>>>
>>> Hrm, OK. So what about dump the blocker task only if that task is
>>> NOT blocked? (because if the task is blocked, it should be dumped
>>> afterwards (or already))
>>
>> Hmm... I'm concerned about a potential side effect of that logic.
>>
>> Consider a simple, non-circular blocking chain like T1 -> T2 -> T3.
>>
>> In this scenario, T1, T2, and T3 would all be dumped as hung tasks.
>> However, with the proposed rule (dump only if NOT blocked), when the
>> detector processes T1, it would see that its blocker (T2) is also
>> blocked and would therefore skip printing any blocker information about
>> T2.
>>
>> The key issue is that we would lose the crucial T1 -> T2 relationship
>> information from the log.
> 
> I just skip printing T2's stack dump, but still show "T1 is blocked by T2"
> so the relationship is still clear.

Ah, I see! That approach makes sense to me ;)


Thanks,
Lance


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 1/2] hung_task: Show the blocker task if the task is hung on mutex
  2025-07-30  9:36       ` Lance Yang
  2025-07-30 10:01         ` Masami Hiramatsu
@ 2025-07-30 10:16         ` Sergey Senozhatsky
  2025-07-30 10:40           ` Lance Yang
  1 sibling, 1 reply; 30+ messages in thread
From: Sergey Senozhatsky @ 2025-07-30 10:16 UTC (permalink / raw)
  To: Lance Yang
  Cc: Masami Hiramatsu (Google), Sergey Senozhatsky, Peter Zijlstra,
	Ingo Molnar, Will Deacon, Andrew Morton, Boqun Feng, Waiman Long,
	Joel Granados, Anna Schumaker, Lance Yang, Kent Overstreet,
	Yongliang Gao, Steven Rostedt, Tomasz Figa, linux-kernel

On (25/07/30 17:36), Lance Yang wrote:
> On 2025/7/30 16:51, Masami Hiramatsu (Google) wrote:
> > On Wed, 30 Jul 2025 16:59:22 +0900
> > Sergey Senozhatsky <senozhatsky@chromium.org> wrote:
> > 
> > > One thing that gives me a bit of "inconvenience" is that in certain
> > > cases this significantly increases the amount of stack traces to go
> > > through.  A distilled real life example:
> > > - task T1 acquires lock L1, attempts to acquire L2
> > > - task T2 acquires lock L2, attempts to acquire L3
> > > - task T3 acquires lock L3, attempts to acquire L1
> > > 
> > > So we'd now see:
> > > - a backtrace of T1, followed by a backtrace of T2 (owner of L2)
> > > - a backtrace of T2, followed by a backtrace of T3 (owner of L3)
> > > - a backtrace of T3, followed by a backtrace of T1 (owner of L1)
> > > 
> > > Notice how each task is backtraced twice.  I wonder if it's worth it
> > > to de-dup the backtraces.  E.g. in
> > > 
> > > 	task cat:115 is blocked on a mutex likely owned by task cat:114
> > > 
> > > if we know that cat:114 is also blocked on a lock, then we probably
> > > can just say "is blocked on a mutex likely owned by task cat:114" and
> > > continue iterating through tasks.  That "cat:114" will be backtraced
> > > individually later, as it's also blocked on a lock, owned by another
> > > task.
> > > 
> > > Does this make any sense?
> > 
> > Hrm, OK. So what about dump the blocker task only if that task is
> > NOT blocked? (because if the task is blocked, it should be dumped
> > afterwards (or already))
> 
> Hmm... I'm concerned about a potential side effect of that logic.
> 
> Consider a simple, non-circular blocking chain like T1 -> T2 -> T3.
> 
> In this scenario, T1, T2, and T3 would all be dumped as hung tasks.
> However, with the proposed rule (dump only if NOT blocked), when the
> detector processes T1, it would see that its blocker (T2) is also
> blocked and would therefore skip printing any blocker information about
> T2.

That's not what I proposed.  The suggestions here is to print only
"is blocked likely owned by task cat:114" and do not append the
backtrace of that cat:114, because it will be printed separately
(since it's a blocked task).  But we should do so only if blocker
is also blocked.  So the relation between T1 and T2 will still be
exposed.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 1/2] hung_task: Show the blocker task if the task is hung on mutex
  2025-07-30 10:16         ` Sergey Senozhatsky
@ 2025-07-30 10:40           ` Lance Yang
  0 siblings, 0 replies; 30+ messages in thread
From: Lance Yang @ 2025-07-30 10:40 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Masami Hiramatsu (Google), Peter Zijlstra, Ingo Molnar,
	Will Deacon, Andrew Morton, Boqun Feng, Waiman Long,
	Joel Granados, Anna Schumaker, Lance Yang, Kent Overstreet,
	Yongliang Gao, Steven Rostedt, Tomasz Figa, linux-kernel



On 2025/7/30 18:16, Sergey Senozhatsky wrote:
> On (25/07/30 17:36), Lance Yang wrote:
>> On 2025/7/30 16:51, Masami Hiramatsu (Google) wrote:
>>> On Wed, 30 Jul 2025 16:59:22 +0900
>>> Sergey Senozhatsky <senozhatsky@chromium.org> wrote:
>>>
>>>> One thing that gives me a bit of "inconvenience" is that in certain
>>>> cases this significantly increases the amount of stack traces to go
>>>> through.  A distilled real life example:
>>>> - task T1 acquires lock L1, attempts to acquire L2
>>>> - task T2 acquires lock L2, attempts to acquire L3
>>>> - task T3 acquires lock L3, attempts to acquire L1
>>>>
>>>> So we'd now see:
>>>> - a backtrace of T1, followed by a backtrace of T2 (owner of L2)
>>>> - a backtrace of T2, followed by a backtrace of T3 (owner of L3)
>>>> - a backtrace of T3, followed by a backtrace of T1 (owner of L1)
>>>>
>>>> Notice how each task is backtraced twice.  I wonder if it's worth it
>>>> to de-dup the backtraces.  E.g. in
>>>>
>>>> 	task cat:115 is blocked on a mutex likely owned by task cat:114
>>>>
>>>> if we know that cat:114 is also blocked on a lock, then we probably
>>>> can just say "is blocked on a mutex likely owned by task cat:114" and
>>>> continue iterating through tasks.  That "cat:114" will be backtraced
>>>> individually later, as it's also blocked on a lock, owned by another
>>>> task.
>>>>
>>>> Does this make any sense?
>>>
>>> Hrm, OK. So what about dump the blocker task only if that task is
>>> NOT blocked? (because if the task is blocked, it should be dumped
>>> afterwards (or already))
>>
>> Hmm... I'm concerned about a potential side effect of that logic.
>>
>> Consider a simple, non-circular blocking chain like T1 -> T2 -> T3.
>>
>> In this scenario, T1, T2, and T3 would all be dumped as hung tasks.
>> However, with the proposed rule (dump only if NOT blocked), when the
>> detector processes T1, it would see that its blocker (T2) is also
>> blocked and would therefore skip printing any blocker information about
>> T2.
> 
> That's not what I proposed.  The suggestions here is to print only
> "is blocked likely owned by task cat:114" and do not append the
> backtrace of that cat:114, because it will be printed separately
> (since it's a blocked task).  But we should do so only if blocker
> is also blocked.  So the relation between T1 and T2 will still be
> exposed.

You're right, thanks for clarifying! I misunderstood the key detail :(

Thanks,
Lance


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [RFC PATCH] hung_task: Dump blocker task if it is not hung
  2025-07-30  8:51     ` Masami Hiramatsu
  2025-07-30  9:36       ` Lance Yang
@ 2025-07-30  9:53       ` Masami Hiramatsu (Google)
  2025-07-30 13:28         ` Sergey Senozhatsky
  2025-07-30 13:46         ` Lance Yang
  2025-07-30  9:56       ` [PATCH v4 1/2] hung_task: Show the blocker task if the task is hung on mutex Sergey Senozhatsky
  2 siblings, 2 replies; 30+ messages in thread
From: Masami Hiramatsu (Google) @ 2025-07-30  9:53 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Andrew Morton,
	Boqun Feng, Waiman Long, Joel Granados, Anna Schumaker,
	Lance Yang, Kent Overstreet, Yongliang Gao, Steven Rostedt,
	Tomasz Figa, linux-kernel

From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Dump the lock blocker task if it is not hung because if the blocker
task is also hung, it should be dumped by the detector. This will
de-duplicate the same stackdumps if the blocker task is also blocked
by another task (and hung).

Suggested-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
 kernel/hung_task.c |   86 ++++++++++++++++++++++++++++++----------------------
 1 file changed, 49 insertions(+), 37 deletions(-)

diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index d2432df2b905..52d72beb2233 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -94,9 +94,49 @@ static struct notifier_block panic_block = {
 	.notifier_call = hung_task_panic,
 };
 
+static bool task_is_hung(struct task_struct *t, unsigned long timeout)
+{
+	unsigned long switch_count = t->nvcsw + t->nivcsw;
+	unsigned int state;
+
+	/*
+	 * skip the TASK_KILLABLE tasks -- these can be killed
+	 * skip the TASK_IDLE tasks -- those are genuinely idle
+	 */
+	state = READ_ONCE(t->__state);
+	if (!(state & TASK_UNINTERRUPTIBLE) ||
+		(state & TASK_WAKEKILL) ||
+		(state & TASK_NOLOAD))
+		return false;
+
+	/*
+	 * Ensure the task is not frozen.
+	 * Also, skip vfork and any other user process that freezer should skip.
+	 */
+	if (unlikely(READ_ONCE(t->__state) & TASK_FROZEN))
+		return false;
+
+	/*
+	 * When a freshly created task is scheduled once, changes its state to
+	 * TASK_UNINTERRUPTIBLE without having ever been switched out once, it
+	 * musn't be checked.
+	 */
+	if (unlikely(!switch_count))
+		return false;
+
+	if (switch_count != t->last_switch_count) {
+		t->last_switch_count = switch_count;
+		t->last_switch_time = jiffies;
+		return false;
+	}
+	if (time_is_after_jiffies(t->last_switch_time + timeout * HZ))
+		return false;
+
+	return true;
+}
 
 #ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
-static void debug_show_blocker(struct task_struct *task)
+static void debug_show_blocker(struct task_struct *task, unsigned long timeout)
 {
 	struct task_struct *g, *t;
 	unsigned long owner, blocker, blocker_type;
@@ -153,41 +193,21 @@ static void debug_show_blocker(struct task_struct *task)
 			       task->comm, task->pid, t->comm, t->pid);
 			break;
 		}
-		sched_show_task(t);
+		/* Avoid duplicated task dump, skip if the task is also hung. */
+		if (!task_is_hung(t, timeout))
+			sched_show_task(t);
 		return;
 	}
 }
 #else
-static inline void debug_show_blocker(struct task_struct *task)
+static inline void debug_show_blocker(struct task_struct *task, unsigned long timeout)
 {
 }
 #endif
 
 static void check_hung_task(struct task_struct *t, unsigned long timeout)
 {
-	unsigned long switch_count = t->nvcsw + t->nivcsw;
-
-	/*
-	 * Ensure the task is not frozen.
-	 * Also, skip vfork and any other user process that freezer should skip.
-	 */
-	if (unlikely(READ_ONCE(t->__state) & TASK_FROZEN))
-		return;
-
-	/*
-	 * When a freshly created task is scheduled once, changes its state to
-	 * TASK_UNINTERRUPTIBLE without having ever been switched out once, it
-	 * musn't be checked.
-	 */
-	if (unlikely(!switch_count))
-		return;
-
-	if (switch_count != t->last_switch_count) {
-		t->last_switch_count = switch_count;
-		t->last_switch_time = jiffies;
-		return;
-	}
-	if (time_is_after_jiffies(t->last_switch_time + timeout * HZ))
+	if (!task_is_hung(t, timeout))
 		return;
 
 	/*
@@ -222,7 +242,7 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
 		pr_err("\"echo 0 > /proc/sys/kernel/hung_task_timeout_secs\""
 			" disables this message.\n");
 		sched_show_task(t);
-		debug_show_blocker(t);
+		debug_show_blocker(t, timeout);
 		hung_task_show_lock = true;
 
 		if (sysctl_hung_task_all_cpu_backtrace)
@@ -278,7 +298,6 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
 	hung_task_show_lock = false;
 	rcu_read_lock();
 	for_each_process_thread(g, t) {
-		unsigned int state;
 
 		if (!max_count--)
 			goto unlock;
@@ -287,15 +306,8 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
 				goto unlock;
 			last_break = jiffies;
 		}
-		/*
-		 * skip the TASK_KILLABLE tasks -- these can be killed
-		 * skip the TASK_IDLE tasks -- those are genuinely idle
-		 */
-		state = READ_ONCE(t->__state);
-		if ((state & TASK_UNINTERRUPTIBLE) &&
-		    !(state & TASK_WAKEKILL) &&
-		    !(state & TASK_NOLOAD))
-			check_hung_task(t, timeout);
+
+		check_hung_task(t, timeout);
 	}
  unlock:
 	rcu_read_unlock();


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH] hung_task: Dump blocker task if it is not hung
  2025-07-30  9:53       ` [RFC PATCH] hung_task: Dump blocker task if it is not hung Masami Hiramatsu (Google)
@ 2025-07-30 13:28         ` Sergey Senozhatsky
  2025-07-30 13:55           ` Masami Hiramatsu
  2025-07-30 13:46         ` Lance Yang
  1 sibling, 1 reply; 30+ messages in thread
From: Sergey Senozhatsky @ 2025-07-30 13:28 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Sergey Senozhatsky, Peter Zijlstra, Ingo Molnar, Will Deacon,
	Andrew Morton, Boqun Feng, Waiman Long, Joel Granados,
	Anna Schumaker, Lance Yang, Kent Overstreet, Yongliang Gao,
	Steven Rostedt, Tomasz Figa, linux-kernel

On (25/07/30 18:53), Masami Hiramatsu (Google) wrote:
> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> 
> Dump the lock blocker task if it is not hung because if the blocker
> task is also hung, it should be dumped by the detector. This will
> de-duplicate the same stackdumps if the blocker task is also blocked
> by another task (and hung).

[..]

>  #ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
> -static void debug_show_blocker(struct task_struct *task)
> +static void debug_show_blocker(struct task_struct *task, unsigned long timeout)
>  {
>  	struct task_struct *g, *t;
>  	unsigned long owner, blocker, blocker_type;
> @@ -153,41 +193,21 @@ static void debug_show_blocker(struct task_struct *task)
>  			       task->comm, task->pid, t->comm, t->pid);
>  			break;
>  		}
> -		sched_show_task(t);
> +		/* Avoid duplicated task dump, skip if the task is also hung. */
> +		if (!task_is_hung(t, timeout))
> +			sched_show_task(t);
>  		return;
>  	}

This patch seems to be against the tree that is significantly
behind the current linux-next.  Namely it's in conflict with
linux-next's commit 77da18de55ac6.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH] hung_task: Dump blocker task if it is not hung
  2025-07-30 13:28         ` Sergey Senozhatsky
@ 2025-07-30 13:55           ` Masami Hiramatsu
  0 siblings, 0 replies; 30+ messages in thread
From: Masami Hiramatsu @ 2025-07-30 13:55 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Andrew Morton,
	Boqun Feng, Waiman Long, Joel Granados, Anna Schumaker,
	Lance Yang, Kent Overstreet, Yongliang Gao, Steven Rostedt,
	Tomasz Figa, linux-kernel

On Wed, 30 Jul 2025 22:28:45 +0900
Sergey Senozhatsky <senozhatsky@chromium.org> wrote:

> On (25/07/30 18:53), Masami Hiramatsu (Google) wrote:
> > From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > 
> > Dump the lock blocker task if it is not hung because if the blocker
> > task is also hung, it should be dumped by the detector. This will
> > de-duplicate the same stackdumps if the blocker task is also blocked
> > by another task (and hung).
> 
> [..]
> 
> >  #ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
> > -static void debug_show_blocker(struct task_struct *task)
> > +static void debug_show_blocker(struct task_struct *task, unsigned long timeout)
> >  {
> >  	struct task_struct *g, *t;
> >  	unsigned long owner, blocker, blocker_type;
> > @@ -153,41 +193,21 @@ static void debug_show_blocker(struct task_struct *task)
> >  			       task->comm, task->pid, t->comm, t->pid);
> >  			break;
> >  		}
> > -		sched_show_task(t);
> > +		/* Avoid duplicated task dump, skip if the task is also hung. */
> > +		if (!task_is_hung(t, timeout))
> > +			sched_show_task(t);
> >  		return;
> >  	}
> 
> This patch seems to be against the tree that is significantly
> behind the current linux-next.  Namely it's in conflict with
> linux-next's commit 77da18de55ac6.

Ah, yes. I just used v6.16 for testing. OK, let me update it
against the linux-next.

Thank you,

-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH] hung_task: Dump blocker task if it is not hung
  2025-07-30  9:53       ` [RFC PATCH] hung_task: Dump blocker task if it is not hung Masami Hiramatsu (Google)
  2025-07-30 13:28         ` Sergey Senozhatsky
@ 2025-07-30 13:46         ` Lance Yang
  2025-07-30 21:50           ` Masami Hiramatsu
  1 sibling, 1 reply; 30+ messages in thread
From: Lance Yang @ 2025-07-30 13:46 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Andrew Morton,
	Boqun Feng, Waiman Long, Joel Granados, Anna Schumaker,
	Lance Yang, Kent Overstreet, Yongliang Gao, Steven Rostedt,
	Tomasz Figa, linux-kernel, Sergey Senozhatsky



On 2025/7/30 17:53, Masami Hiramatsu (Google) wrote:
> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> 
> Dump the lock blocker task if it is not hung because if the blocker
> task is also hung, it should be dumped by the detector. This will
> de-duplicate the same stackdumps if the blocker task is also blocked
> by another task (and hung).

Makes sense to me ;)

> 
> Suggested-by: Sergey Senozhatsky <senozhatsky@chromium.org>
> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> ---
>   kernel/hung_task.c |   86 ++++++++++++++++++++++++++++++----------------------
>   1 file changed, 49 insertions(+), 37 deletions(-)
> 
> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
> index d2432df2b905..52d72beb2233 100644
> --- a/kernel/hung_task.c
> +++ b/kernel/hung_task.c
> @@ -94,9 +94,49 @@ static struct notifier_block panic_block = {
>   	.notifier_call = hung_task_panic,
>   };
>   
> +static bool task_is_hung(struct task_struct *t, unsigned long timeout)
> +{
> +	unsigned long switch_count = t->nvcsw + t->nivcsw;
> +	unsigned int state;
> +
> +	/*
> +	 * skip the TASK_KILLABLE tasks -- these can be killed
> +	 * skip the TASK_IDLE tasks -- those are genuinely idle
> +	 */
> +	state = READ_ONCE(t->__state);
> +	if (!(state & TASK_UNINTERRUPTIBLE) ||
> +		(state & TASK_WAKEKILL) ||
> +		(state & TASK_NOLOAD))
> +		return false;
> +
> +	/*
> +	 * Ensure the task is not frozen.
> +	 * Also, skip vfork and any other user process that freezer should skip.
> +	 */
> +	if (unlikely(READ_ONCE(t->__state) & TASK_FROZEN))
> +		return false;


Nit: the two separate checks on t->__state could be combined into
a single read and one conditional check ;)

Something like:

	state = READ_ONCE(t->__state);

	if (!(state & TASK_UNINTERRUPTIBLE) ||
		(state & (TASK_WAKEKILL | TASK_NOLOAD | TASK_FROZEN)))
			return false;


Otherwise, looks good to me:
Acked-by: Lance Yang <lance.yang@linux.dev>

Thanks,
Lance

> +
> +	/*
> +	 * When a freshly created task is scheduled once, changes its state to
> +	 * TASK_UNINTERRUPTIBLE without having ever been switched out once, it
> +	 * musn't be checked.
> +	 */
> +	if (unlikely(!switch_count))
> +		return false;
> +
> +	if (switch_count != t->last_switch_count) {
> +		t->last_switch_count = switch_count;
> +		t->last_switch_time = jiffies;
> +		return false;
> +	}
> +	if (time_is_after_jiffies(t->last_switch_time + timeout * HZ))
> +		return false;
> +
> +	return true;
> +}
>   
>   #ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
> -static void debug_show_blocker(struct task_struct *task)
> +static void debug_show_blocker(struct task_struct *task, unsigned long timeout)
>   {
>   	struct task_struct *g, *t;
>   	unsigned long owner, blocker, blocker_type;
> @@ -153,41 +193,21 @@ static void debug_show_blocker(struct task_struct *task)
>   			       task->comm, task->pid, t->comm, t->pid);
>   			break;
>   		}
> -		sched_show_task(t);
> +		/* Avoid duplicated task dump, skip if the task is also hung. */
> +		if (!task_is_hung(t, timeout))
> +			sched_show_task(t);
>   		return;
>   	}
>   }
>   #else
> -static inline void debug_show_blocker(struct task_struct *task)
> +static inline void debug_show_blocker(struct task_struct *task, unsigned long timeout)
>   {
>   }
>   #endif
>   
>   static void check_hung_task(struct task_struct *t, unsigned long timeout)
>   {
> -	unsigned long switch_count = t->nvcsw + t->nivcsw;
> -
> -	/*
> -	 * Ensure the task is not frozen.
> -	 * Also, skip vfork and any other user process that freezer should skip.
> -	 */
> -	if (unlikely(READ_ONCE(t->__state) & TASK_FROZEN))
> -		return;
> -
> -	/*
> -	 * When a freshly created task is scheduled once, changes its state to
> -	 * TASK_UNINTERRUPTIBLE without having ever been switched out once, it
> -	 * musn't be checked.
> -	 */
> -	if (unlikely(!switch_count))
> -		return;
> -
> -	if (switch_count != t->last_switch_count) {
> -		t->last_switch_count = switch_count;
> -		t->last_switch_time = jiffies;
> -		return;
> -	}
> -	if (time_is_after_jiffies(t->last_switch_time + timeout * HZ))
> +	if (!task_is_hung(t, timeout))
>   		return;
>   
>   	/*
> @@ -222,7 +242,7 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
>   		pr_err("\"echo 0 > /proc/sys/kernel/hung_task_timeout_secs\""
>   			" disables this message.\n");
>   		sched_show_task(t);
> -		debug_show_blocker(t);
> +		debug_show_blocker(t, timeout);
>   		hung_task_show_lock = true;
>   
>   		if (sysctl_hung_task_all_cpu_backtrace)
> @@ -278,7 +298,6 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
>   	hung_task_show_lock = false;
>   	rcu_read_lock();
>   	for_each_process_thread(g, t) {
> -		unsigned int state;
>   
>   		if (!max_count--)
>   			goto unlock;
> @@ -287,15 +306,8 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
>   				goto unlock;
>   			last_break = jiffies;
>   		}
> -		/*
> -		 * skip the TASK_KILLABLE tasks -- these can be killed
> -		 * skip the TASK_IDLE tasks -- those are genuinely idle
> -		 */
> -		state = READ_ONCE(t->__state);
> -		if ((state & TASK_UNINTERRUPTIBLE) &&
> -		    !(state & TASK_WAKEKILL) &&
> -		    !(state & TASK_NOLOAD))
> -			check_hung_task(t, timeout);
> +
> +		check_hung_task(t, timeout);
>   	}
>    unlock:
>   	rcu_read_unlock();
>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH] hung_task: Dump blocker task if it is not hung
  2025-07-30 13:46         ` Lance Yang
@ 2025-07-30 21:50           ` Masami Hiramatsu
  0 siblings, 0 replies; 30+ messages in thread
From: Masami Hiramatsu @ 2025-07-30 21:50 UTC (permalink / raw)
  To: Lance Yang
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Andrew Morton,
	Boqun Feng, Waiman Long, Joel Granados, Anna Schumaker,
	Lance Yang, Kent Overstreet, Yongliang Gao, Steven Rostedt,
	Tomasz Figa, linux-kernel, Sergey Senozhatsky

On Wed, 30 Jul 2025 21:46:16 +0800
Lance Yang <lance.yang@linux.dev> wrote:

> 
> 
> On 2025/7/30 17:53, Masami Hiramatsu (Google) wrote:
> > From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > 
> > Dump the lock blocker task if it is not hung because if the blocker
> > task is also hung, it should be dumped by the detector. This will
> > de-duplicate the same stackdumps if the blocker task is also blocked
> > by another task (and hung).
> 
> Makes sense to me ;)
> 
> > 
> > Suggested-by: Sergey Senozhatsky <senozhatsky@chromium.org>
> > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > ---
> >   kernel/hung_task.c |   86 ++++++++++++++++++++++++++++++----------------------
> >   1 file changed, 49 insertions(+), 37 deletions(-)
> > 
> > diff --git a/kernel/hung_task.c b/kernel/hung_task.c
> > index d2432df2b905..52d72beb2233 100644
> > --- a/kernel/hung_task.c
> > +++ b/kernel/hung_task.c
> > @@ -94,9 +94,49 @@ static struct notifier_block panic_block = {
> >   	.notifier_call = hung_task_panic,
> >   };
> >   
> > +static bool task_is_hung(struct task_struct *t, unsigned long timeout)
> > +{
> > +	unsigned long switch_count = t->nvcsw + t->nivcsw;
> > +	unsigned int state;
> > +
> > +	/*
> > +	 * skip the TASK_KILLABLE tasks -- these can be killed
> > +	 * skip the TASK_IDLE tasks -- those are genuinely idle
> > +	 */
> > +	state = READ_ONCE(t->__state);
> > +	if (!(state & TASK_UNINTERRUPTIBLE) ||
> > +		(state & TASK_WAKEKILL) ||
> > +		(state & TASK_NOLOAD))
> > +		return false;
> > +
> > +	/*
> > +	 * Ensure the task is not frozen.
> > +	 * Also, skip vfork and any other user process that freezer should skip.
> > +	 */
> > +	if (unlikely(READ_ONCE(t->__state) & TASK_FROZEN))
> > +		return false;
> 
> 
> Nit: the two separate checks on t->__state could be combined into
> a single read and one conditional check ;)
> 
> Something like:
> 
> 	state = READ_ONCE(t->__state);
> 
> 	if (!(state & TASK_UNINTERRUPTIBLE) ||
> 		(state & (TASK_WAKEKILL | TASK_NOLOAD | TASK_FROZEN)))
> 			return false;

Ah, Indeed.

> 
> 
> Otherwise, looks good to me:
> Acked-by: Lance Yang <lance.yang@linux.dev>

Thanks, let me update it. (also on the next tree)

Thank you!

> 
> Thanks,
> Lance
> 
> > +
> > +	/*
> > +	 * When a freshly created task is scheduled once, changes its state to
> > +	 * TASK_UNINTERRUPTIBLE without having ever been switched out once, it
> > +	 * musn't be checked.
> > +	 */
> > +	if (unlikely(!switch_count))
> > +		return false;
> > +
> > +	if (switch_count != t->last_switch_count) {
> > +		t->last_switch_count = switch_count;
> > +		t->last_switch_time = jiffies;
> > +		return false;
> > +	}
> > +	if (time_is_after_jiffies(t->last_switch_time + timeout * HZ))
> > +		return false;
> > +
> > +	return true;
> > +}
> >   
> >   #ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
> > -static void debug_show_blocker(struct task_struct *task)
> > +static void debug_show_blocker(struct task_struct *task, unsigned long timeout)
> >   {
> >   	struct task_struct *g, *t;
> >   	unsigned long owner, blocker, blocker_type;
> > @@ -153,41 +193,21 @@ static void debug_show_blocker(struct task_struct *task)
> >   			       task->comm, task->pid, t->comm, t->pid);
> >   			break;
> >   		}
> > -		sched_show_task(t);
> > +		/* Avoid duplicated task dump, skip if the task is also hung. */
> > +		if (!task_is_hung(t, timeout))
> > +			sched_show_task(t);
> >   		return;
> >   	}
> >   }
> >   #else
> > -static inline void debug_show_blocker(struct task_struct *task)
> > +static inline void debug_show_blocker(struct task_struct *task, unsigned long timeout)
> >   {
> >   }
> >   #endif
> >   
> >   static void check_hung_task(struct task_struct *t, unsigned long timeout)
> >   {
> > -	unsigned long switch_count = t->nvcsw + t->nivcsw;
> > -
> > -	/*
> > -	 * Ensure the task is not frozen.
> > -	 * Also, skip vfork and any other user process that freezer should skip.
> > -	 */
> > -	if (unlikely(READ_ONCE(t->__state) & TASK_FROZEN))
> > -		return;
> > -
> > -	/*
> > -	 * When a freshly created task is scheduled once, changes its state to
> > -	 * TASK_UNINTERRUPTIBLE without having ever been switched out once, it
> > -	 * musn't be checked.
> > -	 */
> > -	if (unlikely(!switch_count))
> > -		return;
> > -
> > -	if (switch_count != t->last_switch_count) {
> > -		t->last_switch_count = switch_count;
> > -		t->last_switch_time = jiffies;
> > -		return;
> > -	}
> > -	if (time_is_after_jiffies(t->last_switch_time + timeout * HZ))
> > +	if (!task_is_hung(t, timeout))
> >   		return;
> >   
> >   	/*
> > @@ -222,7 +242,7 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
> >   		pr_err("\"echo 0 > /proc/sys/kernel/hung_task_timeout_secs\""
> >   			" disables this message.\n");
> >   		sched_show_task(t);
> > -		debug_show_blocker(t);
> > +		debug_show_blocker(t, timeout);
> >   		hung_task_show_lock = true;
> >   
> >   		if (sysctl_hung_task_all_cpu_backtrace)
> > @@ -278,7 +298,6 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
> >   	hung_task_show_lock = false;
> >   	rcu_read_lock();
> >   	for_each_process_thread(g, t) {
> > -		unsigned int state;
> >   
> >   		if (!max_count--)
> >   			goto unlock;
> > @@ -287,15 +306,8 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
> >   				goto unlock;
> >   			last_break = jiffies;
> >   		}
> > -		/*
> > -		 * skip the TASK_KILLABLE tasks -- these can be killed
> > -		 * skip the TASK_IDLE tasks -- those are genuinely idle
> > -		 */
> > -		state = READ_ONCE(t->__state);
> > -		if ((state & TASK_UNINTERRUPTIBLE) &&
> > -		    !(state & TASK_WAKEKILL) &&
> > -		    !(state & TASK_NOLOAD))
> > -			check_hung_task(t, timeout);
> > +
> > +		check_hung_task(t, timeout);
> >   	}
> >    unlock:
> >   	rcu_read_unlock();
> >
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 1/2] hung_task: Show the blocker task if the task is hung on mutex
  2025-07-30  8:51     ` Masami Hiramatsu
  2025-07-30  9:36       ` Lance Yang
  2025-07-30  9:53       ` [RFC PATCH] hung_task: Dump blocker task if it is not hung Masami Hiramatsu (Google)
@ 2025-07-30  9:56       ` Sergey Senozhatsky
  2 siblings, 0 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2025-07-30  9:56 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Sergey Senozhatsky, Peter Zijlstra, Ingo Molnar, Will Deacon,
	Andrew Morton, Boqun Feng, Waiman Long, Joel Granados,
	Anna Schumaker, Lance Yang, Kent Overstreet, Yongliang Gao,
	Steven Rostedt, Tomasz Figa, linux-kernel

On (25/07/30 17:51), Masami Hiramatsu wrote:
[..]
> > Notice how each task is backtraced twice.  I wonder if it's worth it
> > to de-dup the backtraces.  E.g. in
> > 
> > 	task cat:115 is blocked on a mutex likely owned by task cat:114
> > 
> > if we know that cat:114 is also blocked on a lock, then we probably
> > can just say "is blocked on a mutex likely owned by task cat:114" and
> > continue iterating through tasks.  That "cat:114" will be backtraced
> > individually later, as it's also blocked on a lock, owned by another
> > task.
> > 
> > Does this make any sense?
> 
> Hrm, OK. So what about dump the blocker task only if that task is
> NOT blocked? (because if the task is blocked, it should be dumped
> afterwards (or already))

Yes, I think this is precisely what I tried to suggest.

I'm not saying that we should fix it, I just noticed that while looking
at some crash reports I found this
   "wait.. how is this possible... aah, same PID, so I saw that
   backtrace already"
a little inconvenient.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 1/2] hung_task: Show the blocker task if the task is hung on mutex
  2025-07-30  7:59   ` Sergey Senozhatsky
  2025-07-30  8:51     ` Masami Hiramatsu
@ 2025-07-30  9:22     ` Lance Yang
  2025-07-30  9:46       ` Sergey Senozhatsky
  1 sibling, 1 reply; 30+ messages in thread
From: Lance Yang @ 2025-07-30  9:22 UTC (permalink / raw)
  To: Sergey Senozhatsky, Masami Hiramatsu (Google)
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Andrew Morton,
	Boqun Feng, Waiman Long, Joel Granados, Anna Schumaker,
	Lance Yang, Kent Overstreet, Yongliang Gao, Steven Rostedt,
	Tomasz Figa, linux-kernel



On 2025/7/30 15:59, Sergey Senozhatsky wrote:
> On (25/02/25 16:02), Masami Hiramatsu (Google) wrote:
>> The "hung_task" shows a long-time uninterruptible slept task, but most
>> often, it's blocked on a mutex acquired by another task. Without
>> dumping such a task, investigating the root cause of the hung task
>> problem is very difficult.
>>
>> This introduce task_struct::blocker_mutex to point the mutex lock
>> which this task is waiting for. Since the mutex has "owner"
>> information, we can find the owner task and dump it with hung tasks.
>>
>> Note: the owner can be changed while dumping the owner task, so
>> this is "likely" the owner of the mutex.
>>
>> With this change, the hung task shows blocker task's info like below;
>>
>>   INFO: task cat:115 blocked for more than 122 seconds.
>>         Not tainted 6.14.0-rc3-00003-ga8946be3de00 #156
>>   "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>   task:cat             state:D stack:13432 pid:115   tgid:115   ppid:106    task_flags:0x400100 flags:0x00000002
>>   Call Trace:
>>    <TASK>
>>    __schedule+0x731/0x960
>>    ? schedule_preempt_disabled+0x54/0xa0
>>    schedule+0xb7/0x140
>>    ? __mutex_lock+0x51b/0xa60
>>    ? __mutex_lock+0x51b/0xa60
>>    schedule_preempt_disabled+0x54/0xa0
>>    __mutex_lock+0x51b/0xa60
>>    read_dummy+0x23/0x70
>>    full_proxy_read+0x6a/0xc0
>>    vfs_read+0xc2/0x340
>>    ? __pfx_direct_file_splice_eof+0x10/0x10
>>    ? do_sendfile+0x1bd/0x2e0
>>    ksys_read+0x76/0xe0
>>    do_syscall_64+0xe3/0x1c0
>>    ? exc_page_fault+0xa9/0x1d0
>>    entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>   RIP: 0033:0x4840cd
>>   RSP: 002b:00007ffe99071828 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
>>   RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
>>   RDX: 0000000000001000 RSI: 00007ffe99071870 RDI: 0000000000000003
>>   RBP: 00007ffe99071870 R08: 0000000000000000 R09: 0000000000000000
>>   R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
>>   R13: 00000000132fd3a0 R14: 0000000000000001 R15: ffffffffffffffff
>>    </TASK>
>>   INFO: task cat:115 is blocked on a mutex likely owned by task cat:114.
>>   task:cat             state:S stack:13432 pid:114   tgid:114   ppid:106    task_flags:0x400100 flags:0x00000002
>>   Call Trace:
>>    <TASK>
>>    __schedule+0x731/0x960
>>    ? schedule_timeout+0xa8/0x120
>>    schedule+0xb7/0x140
>>    schedule_timeout+0xa8/0x120
>>    ? __pfx_process_timeout+0x10/0x10
>>    msleep_interruptible+0x3e/0x60
>>    read_dummy+0x2d/0x70
>>    full_proxy_read+0x6a/0xc0
>>    vfs_read+0xc2/0x340
>>    ? __pfx_direct_file_splice_eof+0x10/0x10
>>    ? do_sendfile+0x1bd/0x2e0
>>    ksys_read+0x76/0xe0
>>    do_syscall_64+0xe3/0x1c0
>>    ? exc_page_fault+0xa9/0x1d0
>>    entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>   RIP: 0033:0x4840cd
>>   RSP: 002b:00007ffe3e0147b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
>>   RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
>>   RDX: 0000000000001000 RSI: 00007ffe3e014800 RDI: 0000000000000003
>>   RBP: 00007ffe3e014800 R08: 0000000000000000 R09: 0000000000000000
>>   R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
>>   R13: 000000001a0a93a0 R14: 0000000000000001 R15: ffffffffffffffff
>>    </TASK>
> 
> One thing that gives me a bit of "inconvenience" is that in certain
> cases this significantly increases the amount of stack traces to go
> through.  A distilled real life example:
> - task T1 acquires lock L1, attempts to acquire L2
> - task T2 acquires lock L2, attempts to acquire L3
> - task T3 acquires lock L3, attempts to acquire L1
> 
> So we'd now see:
> - a backtrace of T1, followed by a backtrace of T2 (owner of L2)
> - a backtrace of T2, followed by a backtrace of T3 (owner of L3)
> - a backtrace of T3, followed by a backtrace of T1 (owner of L1)
> 
> Notice how each task is backtraced twice.  I wonder if it's worth it
> to de-dup the backtraces.  E.g. in
> 
> 	task cat:115 is blocked on a mutex likely owned by task cat:114
> 
> if we know that cat:114 is also blocked on a lock, then we probably
> can just say "is blocked on a mutex likely owned by task cat:114" and
> continue iterating through tasks.  That "cat:114" will be backtraced
> individually later, as it's also blocked on a lock, owned by another
> task.
> 
> Does this make any sense?

Good spot! There is room for improvement.

In a deadlock chain like T1->T2->T3->T1, by definition, T1, T2, and T3
are all hung tasks, and the detector's primary responsibility is to
generate a report for each of them. The current implementation, when
reporting on one task, also dumps the backtrace of its blocker.

This results in a task's backtrace being printed twice — once as a
blocker and again as a primary hung task.

Regarding the de-duplication idea: it is elegant, but it does introduce
more complexity into the detector. We should also consider that in many
real-world cases, the blocking chain is just one level deep, where this
isn't an issue, IHMO ;)

Thanks,
Lance



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 1/2] hung_task: Show the blocker task if the task is hung on mutex
  2025-07-30  9:22     ` Lance Yang
@ 2025-07-30  9:46       ` Sergey Senozhatsky
  0 siblings, 0 replies; 30+ messages in thread
From: Sergey Senozhatsky @ 2025-07-30  9:46 UTC (permalink / raw)
  To: Lance Yang
  Cc: Sergey Senozhatsky, Masami Hiramatsu (Google), Peter Zijlstra,
	Ingo Molnar, Will Deacon, Andrew Morton, Boqun Feng, Waiman Long,
	Joel Granados, Anna Schumaker, Lance Yang, Kent Overstreet,
	Yongliang Gao, Steven Rostedt, Tomasz Figa, linux-kernel

On (25/07/30 17:22), Lance Yang wrote:
[..]
> We should also consider that in many real-world cases, the blocking
> chain is just one level deep

I don't know if this is the case, but consider example:

- task T1 owns lock L1 and is blocked (e.g. on a very huge/slow I/O)
- tasks T2..TN are blocked on L1

There will be N backtraces of T1, with every T2..TN backtrace, which
on a system with small pstore or very slow serial console can in theory
be a little problematic.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v4 2/2] samples: Add hung_task detector mutex blocking sample
  2025-02-25  7:02 [PATCH v4 0/2] hung_task: Dump the blocking task stacktrace Masami Hiramatsu (Google)
  2025-02-25  7:02 ` [PATCH v4 1/2] hung_task: Show the blocker task if the task is hung on mutex Masami Hiramatsu (Google)
@ 2025-02-25  7:02 ` Masami Hiramatsu (Google)
  2025-02-26  1:50   ` Lance Yang
  1 sibling, 1 reply; 30+ messages in thread
From: Masami Hiramatsu (Google) @ 2025-02-25  7:02 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon, Andrew Morton
  Cc: Boqun Feng, Waiman Long, Joel Granados, Masami Hiramatsu,
	Anna Schumaker, Lance Yang, Kent Overstreet, Yongliang Gao,
	Steven Rostedt, Tomasz Figa, Sergey Senozhatsky, linux-kernel

From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Add a hung_task detector mutex blocking test sample code.

This module will create a dummy file on the debugfs. That file will
cause the read process to sleep for enough long time (256 seconds)
while holding a mutex. As a result, the second process will wait on
the mutex for a prolonged duration and be detected by the hung_task
detector.

Usage is;

 > cd /sys/kernel/debug/hung_task
 > cat mutex & cat mutex

and wait for hung_task message.

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
 samples/Kconfig                     |    9 +++++
 samples/Makefile                    |    1 +
 samples/hung_task/Makefile          |    2 +
 samples/hung_task/hung_task_mutex.c |   66 +++++++++++++++++++++++++++++++++++
 4 files changed, 78 insertions(+)
 create mode 100644 samples/hung_task/Makefile
 create mode 100644 samples/hung_task/hung_task_mutex.c

diff --git a/samples/Kconfig b/samples/Kconfig
index 820e00b2ed68..09011be2391a 100644
--- a/samples/Kconfig
+++ b/samples/Kconfig
@@ -300,6 +300,15 @@ config SAMPLE_CHECK_EXEC
 	  demonstrate how they should be used with execveat(2) +
 	  AT_EXECVE_CHECK.
 
+config SAMPLE_HUNG_TASK
+	tristate "Hung task detector test code"
+	depends on DETECT_HUNG_TASK && DEBUG_FS
+	help
+	  Build a module which provide a simple debugfs file. If user reads
+	  the file, it will sleep long time (256 seconds) with holding a
+	  mutex. Thus if there are 2 or more processes read this file, it
+	  will be detected by the hung_task watchdog.
+
 source "samples/rust/Kconfig"
 
 source "samples/damon/Kconfig"
diff --git a/samples/Makefile b/samples/Makefile
index f24cd0d72dd0..bf6e6fca5410 100644
--- a/samples/Makefile
+++ b/samples/Makefile
@@ -42,3 +42,4 @@ obj-$(CONFIG_SAMPLE_FPROBE)		+= fprobe/
 obj-$(CONFIG_SAMPLES_RUST)		+= rust/
 obj-$(CONFIG_SAMPLE_DAMON_WSSE)		+= damon/
 obj-$(CONFIG_SAMPLE_DAMON_PRCL)		+= damon/
+obj-$(CONFIG_SAMPLE_HUNG_TASK)		+= hung_task/
diff --git a/samples/hung_task/Makefile b/samples/hung_task/Makefile
new file mode 100644
index 000000000000..fe9dde799880
--- /dev/null
+++ b/samples/hung_task/Makefile
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-only
+obj-$(CONFIG_SAMPLE_HUNG_TASK) += hung_task_mutex.o
\ No newline at end of file
diff --git a/samples/hung_task/hung_task_mutex.c b/samples/hung_task/hung_task_mutex.c
new file mode 100644
index 000000000000..7a29f2246d22
--- /dev/null
+++ b/samples/hung_task/hung_task_mutex.c
@@ -0,0 +1,66 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * hung_task_mutex.c - Sample code which causes hung task by mutex
+ *
+ * Usage: load this module and read `<debugfs>/hung_task/mutex`
+ *        by 2 or more processes.
+ *
+ * This is for testing kernel hung_task error message.
+ * Note that this will make your system freeze and maybe
+ * cause panic. So do not use this except for the test.
+ */
+
+#include <linux/debugfs.h>
+#include <linux/delay.h>
+#include <linux/fs.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+
+#define HUNG_TASK_DIR   "hung_task"
+#define HUNG_TASK_FILE  "mutex"
+#define SLEEP_SECOND 256
+
+static const char dummy_string[] = "This is a dummy string.";
+static DEFINE_MUTEX(dummy_mutex);
+struct dentry *hung_task_dir;
+
+static ssize_t read_dummy(struct file *file, char __user *user_buf,
+			  size_t count, loff_t *ppos)
+{
+	/* If the second task waits on the lock, it is uninterruptible sleep. */
+	guard(mutex)(&dummy_mutex);
+
+	/* When the first task sleep here, it is interruptible. */
+	msleep_interruptible(SLEEP_SECOND * 1000);
+
+	return simple_read_from_buffer(user_buf, count, ppos,
+				dummy_string, sizeof(dummy_string));
+}
+
+static const struct file_operations hung_task_fops = {
+	.read = read_dummy,
+};
+
+static int __init hung_task_sample_init(void)
+{
+	hung_task_dir = debugfs_create_dir(HUNG_TASK_DIR, NULL);
+	if (IS_ERR(hung_task_dir))
+		return PTR_ERR(hung_task_dir);
+
+	debugfs_create_file(HUNG_TASK_FILE, 0400, hung_task_dir,
+			    NULL, &hung_task_fops);
+
+	return 0;
+}
+
+static void __exit hung_task_sample_exit(void)
+{
+	debugfs_remove_recursive(hung_task_dir);
+}
+
+module_init(hung_task_sample_init);
+module_exit(hung_task_sample_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Masami Hiramatsu");
+MODULE_DESCRIPTION("Simple sleep under mutex file for testing hung task");


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 2/2] samples: Add hung_task detector mutex blocking sample
  2025-02-25  7:02 ` [PATCH v4 2/2] samples: Add hung_task detector mutex blocking sample Masami Hiramatsu (Google)
@ 2025-02-26  1:50   ` Lance Yang
  2025-02-26  7:04     ` Masami Hiramatsu
  0 siblings, 1 reply; 30+ messages in thread
From: Lance Yang @ 2025-02-26  1:50 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Andrew Morton,
	Boqun Feng, Waiman Long, Joel Granados, Anna Schumaker,
	Kent Overstreet, Yongliang Gao, Steven Rostedt, Tomasz Figa,
	Sergey Senozhatsky, linux-kernel

On Tue, Feb 25, 2025 at 3:02 PM Masami Hiramatsu (Google)
<mhiramat@kernel.org> wrote:
>
> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
>
> Add a hung_task detector mutex blocking test sample code.
>
> This module will create a dummy file on the debugfs. That file will
> cause the read process to sleep for enough long time (256 seconds)
> while holding a mutex. As a result, the second process will wait on
> the mutex for a prolonged duration and be detected by the hung_task
> detector.
>
> Usage is;
>
>  > cd /sys/kernel/debug/hung_task
>  > cat mutex & cat mutex
>
> and wait for hung_task message.
>
> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> ---
>  samples/Kconfig                     |    9 +++++
>  samples/Makefile                    |    1 +
>  samples/hung_task/Makefile          |    2 +
>  samples/hung_task/hung_task_mutex.c |   66 +++++++++++++++++++++++++++++++++++
>  4 files changed, 78 insertions(+)
>  create mode 100644 samples/hung_task/Makefile
>  create mode 100644 samples/hung_task/hung_task_mutex.c
>
> diff --git a/samples/Kconfig b/samples/Kconfig
> index 820e00b2ed68..09011be2391a 100644
> --- a/samples/Kconfig
> +++ b/samples/Kconfig
> @@ -300,6 +300,15 @@ config SAMPLE_CHECK_EXEC
>           demonstrate how they should be used with execveat(2) +
>           AT_EXECVE_CHECK.
>
> +config SAMPLE_HUNG_TASK
> +       tristate "Hung task detector test code"
> +       depends on DETECT_HUNG_TASK && DEBUG_FS
> +       help
> +         Build a module which provide a simple debugfs file. If user reads
> +         the file, it will sleep long time (256 seconds) with holding a
> +         mutex. Thus if there are 2 or more processes read this file, it
> +         will be detected by the hung_task watchdog.
> +
>  source "samples/rust/Kconfig"

I'm just wondering if we should also make the SAMPLE_HUNG_TASK
depend on !PREEMPT_RT for now?

Thanks,
Lance

>
>  source "samples/damon/Kconfig"
> diff --git a/samples/Makefile b/samples/Makefile
> index f24cd0d72dd0..bf6e6fca5410 100644
> --- a/samples/Makefile
> +++ b/samples/Makefile
> @@ -42,3 +42,4 @@ obj-$(CONFIG_SAMPLE_FPROBE)           += fprobe/
>  obj-$(CONFIG_SAMPLES_RUST)             += rust/
>  obj-$(CONFIG_SAMPLE_DAMON_WSSE)                += damon/
>  obj-$(CONFIG_SAMPLE_DAMON_PRCL)                += damon/
> +obj-$(CONFIG_SAMPLE_HUNG_TASK)         += hung_task/
> diff --git a/samples/hung_task/Makefile b/samples/hung_task/Makefile
> new file mode 100644
> index 000000000000..fe9dde799880
> --- /dev/null
> +++ b/samples/hung_task/Makefile
> @@ -0,0 +1,2 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +obj-$(CONFIG_SAMPLE_HUNG_TASK) += hung_task_mutex.o
> \ No newline at end of file
> diff --git a/samples/hung_task/hung_task_mutex.c b/samples/hung_task/hung_task_mutex.c
> new file mode 100644
> index 000000000000..7a29f2246d22
> --- /dev/null
> +++ b/samples/hung_task/hung_task_mutex.c
> @@ -0,0 +1,66 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * hung_task_mutex.c - Sample code which causes hung task by mutex
> + *
> + * Usage: load this module and read `<debugfs>/hung_task/mutex`
> + *        by 2 or more processes.
> + *
> + * This is for testing kernel hung_task error message.
> + * Note that this will make your system freeze and maybe
> + * cause panic. So do not use this except for the test.
> + */
> +
> +#include <linux/debugfs.h>
> +#include <linux/delay.h>
> +#include <linux/fs.h>
> +#include <linux/module.h>
> +#include <linux/mutex.h>
> +
> +#define HUNG_TASK_DIR   "hung_task"
> +#define HUNG_TASK_FILE  "mutex"
> +#define SLEEP_SECOND 256
> +
> +static const char dummy_string[] = "This is a dummy string.";
> +static DEFINE_MUTEX(dummy_mutex);
> +struct dentry *hung_task_dir;
> +
> +static ssize_t read_dummy(struct file *file, char __user *user_buf,
> +                         size_t count, loff_t *ppos)
> +{
> +       /* If the second task waits on the lock, it is uninterruptible sleep. */
> +       guard(mutex)(&dummy_mutex);
> +
> +       /* When the first task sleep here, it is interruptible. */
> +       msleep_interruptible(SLEEP_SECOND * 1000);
> +
> +       return simple_read_from_buffer(user_buf, count, ppos,
> +                               dummy_string, sizeof(dummy_string));
> +}
> +
> +static const struct file_operations hung_task_fops = {
> +       .read = read_dummy,
> +};
> +
> +static int __init hung_task_sample_init(void)
> +{
> +       hung_task_dir = debugfs_create_dir(HUNG_TASK_DIR, NULL);
> +       if (IS_ERR(hung_task_dir))
> +               return PTR_ERR(hung_task_dir);
> +
> +       debugfs_create_file(HUNG_TASK_FILE, 0400, hung_task_dir,
> +                           NULL, &hung_task_fops);
> +
> +       return 0;
> +}
> +
> +static void __exit hung_task_sample_exit(void)
> +{
> +       debugfs_remove_recursive(hung_task_dir);
> +}
> +
> +module_init(hung_task_sample_init);
> +module_exit(hung_task_sample_exit);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_AUTHOR("Masami Hiramatsu");
> +MODULE_DESCRIPTION("Simple sleep under mutex file for testing hung task");
>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 2/2] samples: Add hung_task detector mutex blocking sample
  2025-02-26  1:50   ` Lance Yang
@ 2025-02-26  7:04     ` Masami Hiramatsu
  2025-02-26 11:58       ` Lance Yang
  0 siblings, 1 reply; 30+ messages in thread
From: Masami Hiramatsu @ 2025-02-26  7:04 UTC (permalink / raw)
  To: Lance Yang
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Andrew Morton,
	Boqun Feng, Waiman Long, Joel Granados, Anna Schumaker,
	Kent Overstreet, Yongliang Gao, Steven Rostedt, Tomasz Figa,
	Sergey Senozhatsky, linux-kernel

On Wed, 26 Feb 2025 09:50:32 +0800
Lance Yang <ioworker0@gmail.com> wrote:

> On Tue, Feb 25, 2025 at 3:02 PM Masami Hiramatsu (Google)
> <mhiramat@kernel.org> wrote:
> >
> > From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> >
> > Add a hung_task detector mutex blocking test sample code.
> >
> > This module will create a dummy file on the debugfs. That file will
> > cause the read process to sleep for enough long time (256 seconds)
> > while holding a mutex. As a result, the second process will wait on
> > the mutex for a prolonged duration and be detected by the hung_task
> > detector.
> >
> > Usage is;
> >
> >  > cd /sys/kernel/debug/hung_task
> >  > cat mutex & cat mutex
> >
> > and wait for hung_task message.
> >
> > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > ---
> >  samples/Kconfig                     |    9 +++++
> >  samples/Makefile                    |    1 +
> >  samples/hung_task/Makefile          |    2 +
> >  samples/hung_task/hung_task_mutex.c |   66 +++++++++++++++++++++++++++++++++++
> >  4 files changed, 78 insertions(+)
> >  create mode 100644 samples/hung_task/Makefile
> >  create mode 100644 samples/hung_task/hung_task_mutex.c
> >
> > diff --git a/samples/Kconfig b/samples/Kconfig
> > index 820e00b2ed68..09011be2391a 100644
> > --- a/samples/Kconfig
> > +++ b/samples/Kconfig
> > @@ -300,6 +300,15 @@ config SAMPLE_CHECK_EXEC
> >           demonstrate how they should be used with execveat(2) +
> >           AT_EXECVE_CHECK.
> >
> > +config SAMPLE_HUNG_TASK
> > +       tristate "Hung task detector test code"
> > +       depends on DETECT_HUNG_TASK && DEBUG_FS
> > +       help
> > +         Build a module which provide a simple debugfs file. If user reads
> > +         the file, it will sleep long time (256 seconds) with holding a
> > +         mutex. Thus if there are 2 or more processes read this file, it
> > +         will be detected by the hung_task watchdog.
> > +
> >  source "samples/rust/Kconfig"
> 
> I'm just wondering if we should also make the SAMPLE_HUNG_TASK
> depend on !PREEMPT_RT for now?

Yeah, I also have a question. This does not check the blocker,
but just providing the testing interface with mutex. So there is
no direct connection with DETECT_HUNG_TASK_BLOCKER. Thus I didn't
add the dependency.

Thank you,

> 
> Thanks,
> Lance
> 
> >
> >  source "samples/damon/Kconfig"
> > diff --git a/samples/Makefile b/samples/Makefile
> > index f24cd0d72dd0..bf6e6fca5410 100644
> > --- a/samples/Makefile
> > +++ b/samples/Makefile
> > @@ -42,3 +42,4 @@ obj-$(CONFIG_SAMPLE_FPROBE)           += fprobe/
> >  obj-$(CONFIG_SAMPLES_RUST)             += rust/
> >  obj-$(CONFIG_SAMPLE_DAMON_WSSE)                += damon/
> >  obj-$(CONFIG_SAMPLE_DAMON_PRCL)                += damon/
> > +obj-$(CONFIG_SAMPLE_HUNG_TASK)         += hung_task/
> > diff --git a/samples/hung_task/Makefile b/samples/hung_task/Makefile
> > new file mode 100644
> > index 000000000000..fe9dde799880
> > --- /dev/null
> > +++ b/samples/hung_task/Makefile
> > @@ -0,0 +1,2 @@
> > +# SPDX-License-Identifier: GPL-2.0-only
> > +obj-$(CONFIG_SAMPLE_HUNG_TASK) += hung_task_mutex.o
> > \ No newline at end of file
> > diff --git a/samples/hung_task/hung_task_mutex.c b/samples/hung_task/hung_task_mutex.c
> > new file mode 100644
> > index 000000000000..7a29f2246d22
> > --- /dev/null
> > +++ b/samples/hung_task/hung_task_mutex.c
> > @@ -0,0 +1,66 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +/*
> > + * hung_task_mutex.c - Sample code which causes hung task by mutex
> > + *
> > + * Usage: load this module and read `<debugfs>/hung_task/mutex`
> > + *        by 2 or more processes.
> > + *
> > + * This is for testing kernel hung_task error message.
> > + * Note that this will make your system freeze and maybe
> > + * cause panic. So do not use this except for the test.
> > + */
> > +
> > +#include <linux/debugfs.h>
> > +#include <linux/delay.h>
> > +#include <linux/fs.h>
> > +#include <linux/module.h>
> > +#include <linux/mutex.h>
> > +
> > +#define HUNG_TASK_DIR   "hung_task"
> > +#define HUNG_TASK_FILE  "mutex"
> > +#define SLEEP_SECOND 256
> > +
> > +static const char dummy_string[] = "This is a dummy string.";
> > +static DEFINE_MUTEX(dummy_mutex);
> > +struct dentry *hung_task_dir;
> > +
> > +static ssize_t read_dummy(struct file *file, char __user *user_buf,
> > +                         size_t count, loff_t *ppos)
> > +{
> > +       /* If the second task waits on the lock, it is uninterruptible sleep. */
> > +       guard(mutex)(&dummy_mutex);
> > +
> > +       /* When the first task sleep here, it is interruptible. */
> > +       msleep_interruptible(SLEEP_SECOND * 1000);
> > +
> > +       return simple_read_from_buffer(user_buf, count, ppos,
> > +                               dummy_string, sizeof(dummy_string));
> > +}
> > +
> > +static const struct file_operations hung_task_fops = {
> > +       .read = read_dummy,
> > +};
> > +
> > +static int __init hung_task_sample_init(void)
> > +{
> > +       hung_task_dir = debugfs_create_dir(HUNG_TASK_DIR, NULL);
> > +       if (IS_ERR(hung_task_dir))
> > +               return PTR_ERR(hung_task_dir);
> > +
> > +       debugfs_create_file(HUNG_TASK_FILE, 0400, hung_task_dir,
> > +                           NULL, &hung_task_fops);
> > +
> > +       return 0;
> > +}
> > +
> > +static void __exit hung_task_sample_exit(void)
> > +{
> > +       debugfs_remove_recursive(hung_task_dir);
> > +}
> > +
> > +module_init(hung_task_sample_init);
> > +module_exit(hung_task_sample_exit);
> > +
> > +MODULE_LICENSE("GPL");
> > +MODULE_AUTHOR("Masami Hiramatsu");
> > +MODULE_DESCRIPTION("Simple sleep under mutex file for testing hung task");
> >


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 2/2] samples: Add hung_task detector mutex blocking sample
  2025-02-26  7:04     ` Masami Hiramatsu
@ 2025-02-26 11:58       ` Lance Yang
  0 siblings, 0 replies; 30+ messages in thread
From: Lance Yang @ 2025-02-26 11:58 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Andrew Morton,
	Boqun Feng, Waiman Long, Joel Granados, Anna Schumaker,
	Kent Overstreet, Yongliang Gao, Steven Rostedt, Tomasz Figa,
	Sergey Senozhatsky, linux-kernel

On Wed, Feb 26, 2025 at 3:04 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Wed, 26 Feb 2025 09:50:32 +0800
> Lance Yang <ioworker0@gmail.com> wrote:
>
> > On Tue, Feb 25, 2025 at 3:02 PM Masami Hiramatsu (Google)
> > <mhiramat@kernel.org> wrote:
> > >
> > > From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > >
> > > Add a hung_task detector mutex blocking test sample code.
> > >
> > > This module will create a dummy file on the debugfs. That file will
> > > cause the read process to sleep for enough long time (256 seconds)
> > > while holding a mutex. As a result, the second process will wait on
> > > the mutex for a prolonged duration and be detected by the hung_task
> > > detector.
> > >
> > > Usage is;
> > >
> > >  > cd /sys/kernel/debug/hung_task
> > >  > cat mutex & cat mutex
> > >
> > > and wait for hung_task message.
> > >
> > > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > > ---
> > >  samples/Kconfig                     |    9 +++++
> > >  samples/Makefile                    |    1 +
> > >  samples/hung_task/Makefile          |    2 +
> > >  samples/hung_task/hung_task_mutex.c |   66 +++++++++++++++++++++++++++++++++++
> > >  4 files changed, 78 insertions(+)
> > >  create mode 100644 samples/hung_task/Makefile
> > >  create mode 100644 samples/hung_task/hung_task_mutex.c
> > >
> > > diff --git a/samples/Kconfig b/samples/Kconfig
> > > index 820e00b2ed68..09011be2391a 100644
> > > --- a/samples/Kconfig
> > > +++ b/samples/Kconfig
> > > @@ -300,6 +300,15 @@ config SAMPLE_CHECK_EXEC
> > >           demonstrate how they should be used with execveat(2) +
> > >           AT_EXECVE_CHECK.
> > >
> > > +config SAMPLE_HUNG_TASK
> > > +       tristate "Hung task detector test code"
> > > +       depends on DETECT_HUNG_TASK && DEBUG_FS
> > > +       help
> > > +         Build a module which provide a simple debugfs file. If user reads
> > > +         the file, it will sleep long time (256 seconds) with holding a
> > > +         mutex. Thus if there are 2 or more processes read this file, it
> > > +         will be detected by the hung_task watchdog.
> > > +
> > >  source "samples/rust/Kconfig"
> >
> > I'm just wondering if we should also make the SAMPLE_HUNG_TASK
> > depend on !PREEMPT_RT for now?
>
> Yeah, I also have a question. This does not check the blocker,
> but just providing the testing interface with mutex. So there is
> no direct connection with DETECT_HUNG_TASK_BLOCKER. Thus I didn't
> add the dependency.

Yep, whether we add it or not is not a big deal ;)

Thanks,
Lance

>
> Thank you,
>
> >
> > Thanks,
> > Lance
> >
> > >
> > >  source "samples/damon/Kconfig"
> > > diff --git a/samples/Makefile b/samples/Makefile
> > > index f24cd0d72dd0..bf6e6fca5410 100644
> > > --- a/samples/Makefile
> > > +++ b/samples/Makefile
> > > @@ -42,3 +42,4 @@ obj-$(CONFIG_SAMPLE_FPROBE)           += fprobe/
> > >  obj-$(CONFIG_SAMPLES_RUST)             += rust/
> > >  obj-$(CONFIG_SAMPLE_DAMON_WSSE)                += damon/
> > >  obj-$(CONFIG_SAMPLE_DAMON_PRCL)                += damon/
> > > +obj-$(CONFIG_SAMPLE_HUNG_TASK)         += hung_task/
> > > diff --git a/samples/hung_task/Makefile b/samples/hung_task/Makefile
> > > new file mode 100644
> > > index 000000000000..fe9dde799880
> > > --- /dev/null
> > > +++ b/samples/hung_task/Makefile
> > > @@ -0,0 +1,2 @@
> > > +# SPDX-License-Identifier: GPL-2.0-only
> > > +obj-$(CONFIG_SAMPLE_HUNG_TASK) += hung_task_mutex.o
> > > \ No newline at end of file
> > > diff --git a/samples/hung_task/hung_task_mutex.c b/samples/hung_task/hung_task_mutex.c
> > > new file mode 100644
> > > index 000000000000..7a29f2246d22
> > > --- /dev/null
> > > +++ b/samples/hung_task/hung_task_mutex.c
> > > @@ -0,0 +1,66 @@
> > > +// SPDX-License-Identifier: GPL-2.0-or-later
> > > +/*
> > > + * hung_task_mutex.c - Sample code which causes hung task by mutex
> > > + *
> > > + * Usage: load this module and read `<debugfs>/hung_task/mutex`
> > > + *        by 2 or more processes.
> > > + *
> > > + * This is for testing kernel hung_task error message.
> > > + * Note that this will make your system freeze and maybe
> > > + * cause panic. So do not use this except for the test.
> > > + */
> > > +
> > > +#include <linux/debugfs.h>
> > > +#include <linux/delay.h>
> > > +#include <linux/fs.h>
> > > +#include <linux/module.h>
> > > +#include <linux/mutex.h>
> > > +
> > > +#define HUNG_TASK_DIR   "hung_task"
> > > +#define HUNG_TASK_FILE  "mutex"
> > > +#define SLEEP_SECOND 256
> > > +
> > > +static const char dummy_string[] = "This is a dummy string.";
> > > +static DEFINE_MUTEX(dummy_mutex);
> > > +struct dentry *hung_task_dir;
> > > +
> > > +static ssize_t read_dummy(struct file *file, char __user *user_buf,
> > > +                         size_t count, loff_t *ppos)
> > > +{
> > > +       /* If the second task waits on the lock, it is uninterruptible sleep. */
> > > +       guard(mutex)(&dummy_mutex);
> > > +
> > > +       /* When the first task sleep here, it is interruptible. */
> > > +       msleep_interruptible(SLEEP_SECOND * 1000);
> > > +
> > > +       return simple_read_from_buffer(user_buf, count, ppos,
> > > +                               dummy_string, sizeof(dummy_string));
> > > +}
> > > +
> > > +static const struct file_operations hung_task_fops = {
> > > +       .read = read_dummy,
> > > +};
> > > +
> > > +static int __init hung_task_sample_init(void)
> > > +{
> > > +       hung_task_dir = debugfs_create_dir(HUNG_TASK_DIR, NULL);
> > > +       if (IS_ERR(hung_task_dir))
> > > +               return PTR_ERR(hung_task_dir);
> > > +
> > > +       debugfs_create_file(HUNG_TASK_FILE, 0400, hung_task_dir,
> > > +                           NULL, &hung_task_fops);
> > > +
> > > +       return 0;
> > > +}
> > > +
> > > +static void __exit hung_task_sample_exit(void)
> > > +{
> > > +       debugfs_remove_recursive(hung_task_dir);
> > > +}
> > > +
> > > +module_init(hung_task_sample_init);
> > > +module_exit(hung_task_sample_exit);
> > > +
> > > +MODULE_LICENSE("GPL");
> > > +MODULE_AUTHOR("Masami Hiramatsu");
> > > +MODULE_DESCRIPTION("Simple sleep under mutex file for testing hung task");
> > >
>
>
> --
> Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2025-07-30 21:50 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-25  7:02 [PATCH v4 0/2] hung_task: Dump the blocking task stacktrace Masami Hiramatsu (Google)
2025-02-25  7:02 ` [PATCH v4 1/2] hung_task: Show the blocker task if the task is hung on mutex Masami Hiramatsu (Google)
2025-02-26  1:23   ` Waiman Long
2025-03-06  2:32     ` Masami Hiramatsu
     [not found]       ` <5f7bc403-be75-4ae3-b6ff-5ff0673847f9@redhat.com>
2025-03-06  3:10         ` Waiman Long
2025-03-13  3:55           ` Masami Hiramatsu
2025-02-26  1:44   ` Lance Yang
2025-02-26  4:38   ` Sergey Senozhatsky
2025-02-26 15:07     ` Steven Rostedt
2025-03-13 22:29   ` Andrew Morton
2025-03-14  3:57     ` Masami Hiramatsu
2025-07-30  7:59   ` Sergey Senozhatsky
2025-07-30  8:51     ` Masami Hiramatsu
2025-07-30  9:36       ` Lance Yang
2025-07-30 10:01         ` Masami Hiramatsu
2025-07-30 10:42           ` Lance Yang
2025-07-30 10:16         ` Sergey Senozhatsky
2025-07-30 10:40           ` Lance Yang
2025-07-30  9:53       ` [RFC PATCH] hung_task: Dump blocker task if it is not hung Masami Hiramatsu (Google)
2025-07-30 13:28         ` Sergey Senozhatsky
2025-07-30 13:55           ` Masami Hiramatsu
2025-07-30 13:46         ` Lance Yang
2025-07-30 21:50           ` Masami Hiramatsu
2025-07-30  9:56       ` [PATCH v4 1/2] hung_task: Show the blocker task if the task is hung on mutex Sergey Senozhatsky
2025-07-30  9:22     ` Lance Yang
2025-07-30  9:46       ` Sergey Senozhatsky
2025-02-25  7:02 ` [PATCH v4 2/2] samples: Add hung_task detector mutex blocking sample Masami Hiramatsu (Google)
2025-02-26  1:50   ` Lance Yang
2025-02-26  7:04     ` Masami Hiramatsu
2025-02-26 11:58       ` Lance Yang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).