* [PATCH RFC 0/3] extend hung task blocker tracking to rwsems
@ 2025-06-12 4:19 Lance Yang
2025-06-12 4:19 ` [PATCH RFC 1/3] locking/rwsem: make owner helpers globally available Lance Yang
` (3 more replies)
0 siblings, 4 replies; 15+ messages in thread
From: Lance Yang @ 2025-06-12 4:19 UTC (permalink / raw)
To: akpm
Cc: zi.li, anna.schumaker, boqun.feng, joel.granados, jstultz,
kent.overstreet, leonylgao, linux-kernel, longman, mhiramat,
mingo, mingzhe.yang, peterz, rostedt, senozhatsky, tfiga, will
Hi all,
Inspired by mutex blocker tracking[1], and having already extended it to
semaphores, let's now add support for reader-writer semaphores (rwsems).
The approach is simple: when a task enters TASK_UNINTERRUPTIBLE while
waiting for an rwsem, we just call hung_task_set_blocker(). The hung task
detector can then query the rwsem's owner to identify the lock holder.
Tracking works reliably for writers, as there can only be a single writer
holding the lock, and its task struct is stored in the owner field.
The main challenge lies with readers. The owner field points to only one
of many concurrent readers, so we might lose track of the blocker if that
specific reader unlocks, even while others remain. This is not a
significant issue, however. In practice, long-lasting lock contention is
almost always caused by a writer. Therefore, reliably tracking the writer
is the primary goal of this patch series ;)
With this change, the hung task detector can now show blocker task's info
like below:
[Thu Jun 12 11:01:33 2025] INFO: task rw_sem_thread2:36526 blocked for more than 122 seconds.
[Thu Jun 12 11:01:33 2025] Tainted: G S O 6.16.0-rc1 #1
[Thu Jun 12 11:01:33 2025] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Thu Jun 12 11:01:33 2025] task:rw_sem_thread2 state:D stack:0 pid:36526 tgid:36526 ppid:2 task_flags:0x208040 flags:0x00004000
[Thu Jun 12 11:01:33 2025] Call Trace:
[Thu Jun 12 11:01:33 2025] <TASK>
[Thu Jun 12 11:01:33 2025] __schedule+0x7c7/0x1930
[Thu Jun 12 11:01:33 2025] ? __pfx___schedule+0x10/0x10
[Thu Jun 12 11:01:33 2025] ? _raw_spin_lock_irq+0x8a/0xe0
[Thu Jun 12 11:01:33 2025] ? __pfx__raw_spin_lock_irq+0x10/0x10
[Thu Jun 12 11:01:33 2025] schedule+0x6a/0x180
[Thu Jun 12 11:01:33 2025] schedule_preempt_disabled+0x15/0x30
[Thu Jun 12 11:01:33 2025] rwsem_down_write_slowpath+0x447/0x1090
[Thu Jun 12 11:01:33 2025] ? __pfx_rwsem_down_write_slowpath+0x10/0x10
[Thu Jun 12 11:01:33 2025] ? __pfx___schedule+0x10/0x10
[Thu Jun 12 11:01:33 2025] ? __pfx___might_resched+0x10/0x10
[Thu Jun 12 11:01:33 2025] ? __pfx_thread2_func+0x10/0x10 [rw_sem_test_2]
[Thu Jun 12 11:01:33 2025] down_write+0x125/0x140
[Thu Jun 12 11:01:33 2025] ? __pfx_down_write+0x10/0x10
[Thu Jun 12 11:01:33 2025] ? msleep+0x91/0xf0
[Thu Jun 12 11:01:33 2025] ? __raw_spin_lock_irqsave+0x8c/0xf0
[Thu Jun 12 11:01:33 2025] thread2_func+0x37/0x70 [rw_sem_test_2]
[Thu Jun 12 11:01:33 2025] kthread+0x39f/0x750
[Thu Jun 12 11:01:33 2025] ? __pfx_kthread+0x10/0x10
[Thu Jun 12 11:01:33 2025] ? __pfx__raw_spin_lock_irq+0x10/0x10
[Thu Jun 12 11:01:33 2025] ? __pfx_kthread+0x10/0x10
[Thu Jun 12 11:01:33 2025] ret_from_fork+0x25d/0x320
[Thu Jun 12 11:01:33 2025] ? __pfx_kthread+0x10/0x10
[Thu Jun 12 11:01:33 2025] ret_from_fork_asm+0x1a/0x30
[Thu Jun 12 11:01:33 2025] </TASK>
[Thu Jun 12 11:01:33 2025] INFO: task rw_sem_thread2:36526 <writer> blocked on an rw-semaphore likely owned by task rw_sem_thread1:36525 <writer>
[Thu Jun 12 11:01:33 2025] task:rw_sem_thread1 state:S stack:0 pid:36525 tgid:36525 ppid:2 task_flags:0x208040 flags:0x00004000
[Thu Jun 12 11:01:33 2025] Call Trace:
[Thu Jun 12 11:01:33 2025] <TASK>
[Thu Jun 12 11:01:33 2025] __schedule+0x7c7/0x1930
[Thu Jun 12 11:01:33 2025] ? __pfx___schedule+0x10/0x10
[Thu Jun 12 11:01:33 2025] ? __mod_timer+0x304/0xa80
[Thu Jun 12 11:01:33 2025] ? irq_work_queue+0x6a/0xa0
[Thu Jun 12 11:01:33 2025] ? __pfx_vprintk_emit+0x10/0x10
[Thu Jun 12 11:01:33 2025] schedule+0x6a/0x180
[Thu Jun 12 11:01:33 2025] schedule_timeout+0xfb/0x230
[Thu Jun 12 11:01:33 2025] ? __pfx_schedule_timeout+0x10/0x10
[Thu Jun 12 11:01:33 2025] ? __pfx_process_timeout+0x10/0x10
[Thu Jun 12 11:01:33 2025] ? down_write+0xc4/0x140
[Thu Jun 12 11:01:33 2025] msleep_interruptible+0xbe/0x150
[Thu Jun 12 11:01:33 2025] ? __pfx_thread1_func+0x10/0x10 [rw_sem_test_2]
[Thu Jun 12 11:01:33 2025] thread1_func+0x37/0x60 [rw_sem_test_2]
[Thu Jun 12 11:01:33 2025] kthread+0x39f/0x750
[Thu Jun 12 11:01:33 2025] ? __pfx_kthread+0x10/0x10
[Thu Jun 12 11:01:33 2025] ? __pfx__raw_spin_lock_irq+0x10/0x10
[Thu Jun 12 11:01:33 2025] ? __pfx_kthread+0x10/0x10
[Thu Jun 12 11:01:33 2025] ret_from_fork+0x25d/0x320
[Thu Jun 12 11:01:33 2025] ? __pfx_kthread+0x10/0x10
[Thu Jun 12 11:01:33 2025] ret_from_fork_asm+0x1a/0x30
[Thu Jun 12 11:01:33 2025] </TASK>
[1] https://lore.kernel.org/all/174046694331.2194069.15472952050240807469.stgit@mhiramat.tok.corp.google.com/
Thanks,
Lance
Lance Yang (3):
locking/rwsem: make owner helpers globally available
locking/rwsem: clear reader-owner on unlock to reduce false positives
hung_task: extend hung task blocker tracking to rwsems
include/linux/hung_task.h | 18 +++++++++---------
include/linux/rwsem.h | 12 ++++++++++++
kernel/hung_task.c | 29 +++++++++++++++++++++++++----
kernel/locking/rwsem.c | 31 +++++++++++++++++++++++--------
4 files changed, 69 insertions(+), 21 deletions(-)
--
2.49.0
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH RFC 1/3] locking/rwsem: make owner helpers globally available
2025-06-12 4:19 [PATCH RFC 0/3] extend hung task blocker tracking to rwsems Lance Yang
@ 2025-06-12 4:19 ` Lance Yang
2025-06-24 0:17 ` Masami Hiramatsu
2025-06-12 4:19 ` [PATCH RFC 2/3] locking/rwsem: clear reader-owner on unlock to reduce false positives Lance Yang
` (2 subsequent siblings)
3 siblings, 1 reply; 15+ messages in thread
From: Lance Yang @ 2025-06-12 4:19 UTC (permalink / raw)
To: akpm
Cc: zi.li, anna.schumaker, boqun.feng, joel.granados, jstultz,
kent.overstreet, leonylgao, linux-kernel, longman, mhiramat,
mingo, mingzhe.yang, peterz, rostedt, senozhatsky, tfiga, will,
Lance Yang
From: Lance Yang <lance.yang@linux.dev>
In preparation for extending blocker tracking to support rwsems, make the
rwsem_owner() and is_rwsem_reader_owned() helpers globally available for
determining if the blocker is a writer or one of the readers.
Signed-off-by: Lance Yang <lance.yang@linux.dev>
---
include/linux/rwsem.h | 12 ++++++++++++
kernel/locking/rwsem.c | 8 +++++---
2 files changed, 17 insertions(+), 3 deletions(-)
diff --git a/include/linux/rwsem.h b/include/linux/rwsem.h
index c8b543d428b0..544853bed5b9 100644
--- a/include/linux/rwsem.h
+++ b/include/linux/rwsem.h
@@ -132,6 +132,18 @@ static inline int rwsem_is_contended(struct rw_semaphore *sem)
return !list_empty(&sem->wait_list);
}
+#if defined(CONFIG_DEBUG_RWSEMS) || defined(CONFIG_DETECT_HUNG_TASK_BLOCKER)
+/*
+ * Return just the real task structure pointer of the owner
+ */
+extern struct task_struct *rwsem_owner(struct rw_semaphore *sem);
+
+/*
+ * Return true if the rwsem is owned by a reader.
+ */
+extern bool is_rwsem_reader_owned(struct rw_semaphore *sem);
+#endif
+
#else /* !CONFIG_PREEMPT_RT */
#include <linux/rwbase_rt.h>
diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index 2ddb827e3bea..6cb29442d4fc 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -181,11 +181,11 @@ static inline void rwsem_set_reader_owned(struct rw_semaphore *sem)
__rwsem_set_reader_owned(sem, current);
}
-#ifdef CONFIG_DEBUG_RWSEMS
+#if defined(CONFIG_DEBUG_RWSEMS) || defined(CONFIG_DETECT_HUNG_TASK_BLOCKER)
/*
* Return just the real task structure pointer of the owner
*/
-static inline struct task_struct *rwsem_owner(struct rw_semaphore *sem)
+struct task_struct *rwsem_owner(struct rw_semaphore *sem)
{
return (struct task_struct *)
(atomic_long_read(&sem->owner) & ~RWSEM_OWNER_FLAGS_MASK);
@@ -194,7 +194,7 @@ static inline struct task_struct *rwsem_owner(struct rw_semaphore *sem)
/*
* Return true if the rwsem is owned by a reader.
*/
-static inline bool is_rwsem_reader_owned(struct rw_semaphore *sem)
+bool is_rwsem_reader_owned(struct rw_semaphore *sem)
{
/*
* Check the count to see if it is write-locked.
@@ -205,7 +205,9 @@ static inline bool is_rwsem_reader_owned(struct rw_semaphore *sem)
return false;
return rwsem_test_oflags(sem, RWSEM_READER_OWNED);
}
+#endif
+#ifdef CONFIG_DEBUG_RWSEMS
/*
* With CONFIG_DEBUG_RWSEMS configured, it will make sure that if there
* is a task pointer in owner of a reader-owned rwsem, it will be the
--
2.49.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH RFC 2/3] locking/rwsem: clear reader-owner on unlock to reduce false positives
2025-06-12 4:19 [PATCH RFC 0/3] extend hung task blocker tracking to rwsems Lance Yang
2025-06-12 4:19 ` [PATCH RFC 1/3] locking/rwsem: make owner helpers globally available Lance Yang
@ 2025-06-12 4:19 ` Lance Yang
2025-06-24 0:26 ` Masami Hiramatsu
2025-06-12 4:19 ` [PATCH RFC 3/3] hung_task: extend hung task blocker tracking to rwsems Lance Yang
2025-06-23 12:33 ` [PATCH RFC 0/3] " Lance Yang
3 siblings, 1 reply; 15+ messages in thread
From: Lance Yang @ 2025-06-12 4:19 UTC (permalink / raw)
To: akpm
Cc: zi.li, anna.schumaker, boqun.feng, joel.granados, jstultz,
kent.overstreet, leonylgao, linux-kernel, longman, mhiramat,
mingo, mingzhe.yang, peterz, rostedt, senozhatsky, tfiga, will,
Lance Yang
From: Lance Yang <lance.yang@linux.dev>
When CONFIG_DETECT_HUNG_TASK_BLOCKER is enabled, a stale owner pointer in a
reader-owned rwsem can lead to false positives in blocker tracking.
To mitigate this, let’s try to clear the owner field on unlock, as a NULL
owner is better than a stale one for diagnostics.
Signed-off-by: Lance Yang <lance.yang@linux.dev>
---
kernel/locking/rwsem.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index 6cb29442d4fc..a310eb9896de 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -205,14 +205,12 @@ bool is_rwsem_reader_owned(struct rw_semaphore *sem)
return false;
return rwsem_test_oflags(sem, RWSEM_READER_OWNED);
}
-#endif
-#ifdef CONFIG_DEBUG_RWSEMS
/*
- * With CONFIG_DEBUG_RWSEMS configured, it will make sure that if there
- * is a task pointer in owner of a reader-owned rwsem, it will be the
- * real owner or one of the real owners. The only exception is when the
- * unlock is done by up_read_non_owner().
+ * With CONFIG_DEBUG_RWSEMS or CONFIG_DETECT_HUNG_TASK_BLOCKER configured,
+ * it will make sure that the owner field of a reader-owned rwsem either
+ * points to a real reader-owner(s) or gets cleared. The only exception is
+ * when the unlock is done by up_read_non_owner().
*/
static inline void rwsem_clear_reader_owned(struct rw_semaphore *sem)
{
--
2.49.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH RFC 3/3] hung_task: extend hung task blocker tracking to rwsems
2025-06-12 4:19 [PATCH RFC 0/3] extend hung task blocker tracking to rwsems Lance Yang
2025-06-12 4:19 ` [PATCH RFC 1/3] locking/rwsem: make owner helpers globally available Lance Yang
2025-06-12 4:19 ` [PATCH RFC 2/3] locking/rwsem: clear reader-owner on unlock to reduce false positives Lance Yang
@ 2025-06-12 4:19 ` Lance Yang
2025-06-24 3:57 ` Masami Hiramatsu
2025-06-23 12:33 ` [PATCH RFC 0/3] " Lance Yang
3 siblings, 1 reply; 15+ messages in thread
From: Lance Yang @ 2025-06-12 4:19 UTC (permalink / raw)
To: akpm
Cc: zi.li, anna.schumaker, boqun.feng, joel.granados, jstultz,
kent.overstreet, leonylgao, linux-kernel, longman, mhiramat,
mingo, mingzhe.yang, peterz, rostedt, senozhatsky, tfiga, will,
Lance Yang
From: Lance Yang <lance.yang@linux.dev>
Inspired by mutex blocker tracking[1], and having already extended it to
semaphores, let's now add support for reader-writer semaphores (rwsems).
The approach is simple: when a task enters TASK_UNINTERRUPTIBLE while
waiting for an rwsem, we just call hung_task_set_blocker(). The hung task
detector can then query the rwsem's owner to identify the lock holder.
Tracking works reliably for writers, as there can only be a single writer
holding the lock, and its task struct is stored in the owner field.
The main challenge lies with readers. The owner field points to only one
of many concurrent readers, so we might lose track of the blocker if that
specific reader unlocks, even while others remain. This is not a
significant issue, however. In practice, long-lasting lock contention is
almost always caused by a writer. Therefore, reliably tracking the writer
is the primary goal of this patch series ;)
With this change, the hung task detector can now show blocker task's info
like below:
[Thu Jun 12 11:01:33 2025] INFO: task rw_sem_thread2:36526 blocked for more than 122 seconds.
[Thu Jun 12 11:01:33 2025] Tainted: G S O 6.16.0-rc1 #1
[Thu Jun 12 11:01:33 2025] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Thu Jun 12 11:01:33 2025] task:rw_sem_thread2 state:D stack:0 pid:36526 tgid:36526 ppid:2 task_flags:0x208040 flags:0x00004000
[Thu Jun 12 11:01:33 2025] Call Trace:
[Thu Jun 12 11:01:33 2025] <TASK>
[Thu Jun 12 11:01:33 2025] __schedule+0x7c7/0x1930
[Thu Jun 12 11:01:33 2025] ? __pfx___schedule+0x10/0x10
[Thu Jun 12 11:01:33 2025] ? _raw_spin_lock_irq+0x8a/0xe0
[Thu Jun 12 11:01:33 2025] ? __pfx__raw_spin_lock_irq+0x10/0x10
[Thu Jun 12 11:01:33 2025] schedule+0x6a/0x180
[Thu Jun 12 11:01:33 2025] schedule_preempt_disabled+0x15/0x30
[Thu Jun 12 11:01:33 2025] rwsem_down_write_slowpath+0x447/0x1090
[Thu Jun 12 11:01:33 2025] ? __pfx_rwsem_down_write_slowpath+0x10/0x10
[Thu Jun 12 11:01:33 2025] ? __pfx___schedule+0x10/0x10
[Thu Jun 12 11:01:33 2025] ? __pfx___might_resched+0x10/0x10
[Thu Jun 12 11:01:33 2025] ? __pfx_thread2_func+0x10/0x10 [rw_sem_test_2]
[Thu Jun 12 11:01:33 2025] down_write+0x125/0x140
[Thu Jun 12 11:01:33 2025] ? __pfx_down_write+0x10/0x10
[Thu Jun 12 11:01:33 2025] ? msleep+0x91/0xf0
[Thu Jun 12 11:01:33 2025] ? __raw_spin_lock_irqsave+0x8c/0xf0
[Thu Jun 12 11:01:33 2025] thread2_func+0x37/0x70 [rw_sem_test_2]
[Thu Jun 12 11:01:33 2025] kthread+0x39f/0x750
[Thu Jun 12 11:01:33 2025] ? __pfx_kthread+0x10/0x10
[Thu Jun 12 11:01:33 2025] ? __pfx__raw_spin_lock_irq+0x10/0x10
[Thu Jun 12 11:01:33 2025] ? __pfx_kthread+0x10/0x10
[Thu Jun 12 11:01:33 2025] ret_from_fork+0x25d/0x320
[Thu Jun 12 11:01:33 2025] ? __pfx_kthread+0x10/0x10
[Thu Jun 12 11:01:33 2025] ret_from_fork_asm+0x1a/0x30
[Thu Jun 12 11:01:33 2025] </TASK>
[Thu Jun 12 11:01:33 2025] INFO: task rw_sem_thread2:36526 <writer> blocked on an rw-semaphore likely owned by task rw_sem_thread1:36525 <writer>
[Thu Jun 12 11:01:33 2025] task:rw_sem_thread1 state:S stack:0 pid:36525 tgid:36525 ppid:2 task_flags:0x208040 flags:0x00004000
[Thu Jun 12 11:01:33 2025] Call Trace:
[Thu Jun 12 11:01:33 2025] <TASK>
[Thu Jun 12 11:01:33 2025] __schedule+0x7c7/0x1930
[Thu Jun 12 11:01:33 2025] ? __pfx___schedule+0x10/0x10
[Thu Jun 12 11:01:33 2025] ? __mod_timer+0x304/0xa80
[Thu Jun 12 11:01:33 2025] ? irq_work_queue+0x6a/0xa0
[Thu Jun 12 11:01:33 2025] ? __pfx_vprintk_emit+0x10/0x10
[Thu Jun 12 11:01:33 2025] schedule+0x6a/0x180
[Thu Jun 12 11:01:33 2025] schedule_timeout+0xfb/0x230
[Thu Jun 12 11:01:33 2025] ? __pfx_schedule_timeout+0x10/0x10
[Thu Jun 12 11:01:33 2025] ? __pfx_process_timeout+0x10/0x10
[Thu Jun 12 11:01:33 2025] ? down_write+0xc4/0x140
[Thu Jun 12 11:01:33 2025] msleep_interruptible+0xbe/0x150
[Thu Jun 12 11:01:33 2025] ? __pfx_thread1_func+0x10/0x10 [rw_sem_test_2]
[Thu Jun 12 11:01:33 2025] thread1_func+0x37/0x60 [rw_sem_test_2]
[Thu Jun 12 11:01:33 2025] kthread+0x39f/0x750
[Thu Jun 12 11:01:33 2025] ? __pfx_kthread+0x10/0x10
[Thu Jun 12 11:01:33 2025] ? __pfx__raw_spin_lock_irq+0x10/0x10
[Thu Jun 12 11:01:33 2025] ? __pfx_kthread+0x10/0x10
[Thu Jun 12 11:01:33 2025] ret_from_fork+0x25d/0x320
[Thu Jun 12 11:01:33 2025] ? __pfx_kthread+0x10/0x10
[Thu Jun 12 11:01:33 2025] ret_from_fork_asm+0x1a/0x30
[Thu Jun 12 11:01:33 2025] </TASK>
[1] https://lore.kernel.org/all/174046694331.2194069.15472952050240807469.stgit@mhiramat.tok.corp.google.com/
Suggested-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Lance Yang <lance.yang@linux.dev>
---
include/linux/hung_task.h | 18 +++++++++---------
kernel/hung_task.c | 29 +++++++++++++++++++++++++----
kernel/locking/rwsem.c | 17 ++++++++++++++++-
3 files changed, 50 insertions(+), 14 deletions(-)
diff --git a/include/linux/hung_task.h b/include/linux/hung_task.h
index 1bc2b3244613..34e615c76ca5 100644
--- a/include/linux/hung_task.h
+++ b/include/linux/hung_task.h
@@ -21,17 +21,17 @@
* type.
*
* Type encoding:
- * 00 - Blocked on mutex (BLOCKER_TYPE_MUTEX)
- * 01 - Blocked on semaphore (BLOCKER_TYPE_SEM)
- * 10 - Blocked on rt-mutex (BLOCKER_TYPE_RTMUTEX)
- * 11 - Blocked on rw-semaphore (BLOCKER_TYPE_RWSEM)
+ * 00 - Blocked on mutex (BLOCKER_TYPE_MUTEX)
+ * 01 - Blocked on semaphore (BLOCKER_TYPE_SEM)
+ * 10 - Blocked on rw-semaphore as READER (BLOCKER_TYPE_RWSEM_READER)
+ * 11 - Blocked on rw-semaphore as WRITER (BLOCKER_TYPE_RWSEM_WRITER)
*/
-#define BLOCKER_TYPE_MUTEX 0x00UL
-#define BLOCKER_TYPE_SEM 0x01UL
-#define BLOCKER_TYPE_RTMUTEX 0x02UL
-#define BLOCKER_TYPE_RWSEM 0x03UL
+#define BLOCKER_TYPE_MUTEX 0x00UL
+#define BLOCKER_TYPE_SEM 0x01UL
+#define BLOCKER_TYPE_RWSEM_READER 0x02UL
+#define BLOCKER_TYPE_RWSEM_WRITER 0x03UL
-#define BLOCKER_TYPE_MASK 0x03UL
+#define BLOCKER_TYPE_MASK 0x03UL
#ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
static inline void hung_task_set_blocker(void *lock, unsigned long type)
diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index d2432df2b905..8708a1205f82 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -23,6 +23,7 @@
#include <linux/sched/debug.h>
#include <linux/sched/sysctl.h>
#include <linux/hung_task.h>
+#include <linux/rwsem.h>
#include <trace/events/sched.h>
@@ -100,6 +101,7 @@ static void debug_show_blocker(struct task_struct *task)
{
struct task_struct *g, *t;
unsigned long owner, blocker, blocker_type;
+ const char *rwsem_blocked_by, *rwsem_blocked_as;
RCU_LOCKDEP_WARN(!rcu_read_lock_held(), "No rcu lock held");
@@ -111,12 +113,20 @@ static void debug_show_blocker(struct task_struct *task)
switch (blocker_type) {
case BLOCKER_TYPE_MUTEX:
- owner = mutex_get_owner(
- (struct mutex *)hung_task_blocker_to_lock(blocker));
+ owner = mutex_get_owner(hung_task_blocker_to_lock(blocker));
break;
case BLOCKER_TYPE_SEM:
- owner = sem_last_holder(
- (struct semaphore *)hung_task_blocker_to_lock(blocker));
+ owner = sem_last_holder(hung_task_blocker_to_lock(blocker));
+ break;
+ case BLOCKER_TYPE_RWSEM_READER:
+ case BLOCKER_TYPE_RWSEM_WRITER:
+ owner = (unsigned long)rwsem_owner(
+ hung_task_blocker_to_lock(blocker));
+ rwsem_blocked_as = (blocker_type == BLOCKER_TYPE_RWSEM_READER) ?
+ "reader" : "writer";
+ rwsem_blocked_by = is_rwsem_reader_owned(
+ hung_task_blocker_to_lock(blocker)) ?
+ "reader" : "writer";
break;
default:
WARN_ON_ONCE(1);
@@ -134,6 +144,11 @@ static void debug_show_blocker(struct task_struct *task)
pr_err("INFO: task %s:%d is blocked on a semaphore, but the last holder is not found.\n",
task->comm, task->pid);
break;
+ case BLOCKER_TYPE_RWSEM_READER:
+ case BLOCKER_TYPE_RWSEM_WRITER:
+ pr_err("INFO: task %s:%d is blocked on an rw-semaphore, but the owner is not found.\n",
+ task->comm, task->pid);
+ break;
}
return;
}
@@ -152,6 +167,12 @@ static void debug_show_blocker(struct task_struct *task)
pr_err("INFO: task %s:%d blocked on a semaphore likely last held by task %s:%d\n",
task->comm, task->pid, t->comm, t->pid);
break;
+ case BLOCKER_TYPE_RWSEM_READER:
+ case BLOCKER_TYPE_RWSEM_WRITER:
+ pr_err("INFO: task %s:%d <%s> blocked on an rw-semaphore likely owned by task %s:%d <%s>\n",
+ task->comm, task->pid, rwsem_blocked_as, t->comm,
+ t->pid, rwsem_blocked_by);
+ break;
}
sched_show_task(t);
return;
diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index a310eb9896de..92c6332da401 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -27,6 +27,7 @@
#include <linux/export.h>
#include <linux/rwsem.h>
#include <linux/atomic.h>
+#include <linux/hung_task.h>
#include <trace/events/lock.h>
#ifndef CONFIG_PREEMPT_RT
@@ -1065,10 +1066,13 @@ rwsem_down_read_slowpath(struct rw_semaphore *sem, long count, unsigned int stat
wake_up_q(&wake_q);
trace_contention_begin(sem, LCB_F_READ);
+ set_current_state(state);
+
+ if (state == TASK_UNINTERRUPTIBLE)
+ hung_task_set_blocker(sem, BLOCKER_TYPE_RWSEM_READER);
/* wait to be given the lock */
for (;;) {
- set_current_state(state);
if (!smp_load_acquire(&waiter.task)) {
/* Matches rwsem_mark_wake()'s smp_store_release(). */
break;
@@ -1083,8 +1087,12 @@ rwsem_down_read_slowpath(struct rw_semaphore *sem, long count, unsigned int stat
}
schedule_preempt_disabled();
lockevent_inc(rwsem_sleep_reader);
+ set_current_state(state);
}
+ if (state == TASK_UNINTERRUPTIBLE)
+ hung_task_clear_blocker();
+
__set_current_state(TASK_RUNNING);
lockevent_inc(rwsem_rlock);
trace_contention_end(sem, 0);
@@ -1146,6 +1154,9 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
set_current_state(state);
trace_contention_begin(sem, LCB_F_WRITE);
+ if (state == TASK_UNINTERRUPTIBLE)
+ hung_task_set_blocker(sem, BLOCKER_TYPE_RWSEM_WRITER);
+
for (;;) {
if (rwsem_try_write_lock(sem, &waiter)) {
/* rwsem_try_write_lock() implies ACQUIRE on success */
@@ -1179,6 +1190,10 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
trylock_again:
raw_spin_lock_irq(&sem->wait_lock);
}
+
+ if (state == TASK_UNINTERRUPTIBLE)
+ hung_task_clear_blocker();
+
__set_current_state(TASK_RUNNING);
raw_spin_unlock_irq(&sem->wait_lock);
lockevent_inc(rwsem_wlock);
--
2.49.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH RFC 0/3] extend hung task blocker tracking to rwsems
2025-06-12 4:19 [PATCH RFC 0/3] extend hung task blocker tracking to rwsems Lance Yang
` (2 preceding siblings ...)
2025-06-12 4:19 ` [PATCH RFC 3/3] hung_task: extend hung task blocker tracking to rwsems Lance Yang
@ 2025-06-23 12:33 ` Lance Yang
3 siblings, 0 replies; 15+ messages in thread
From: Lance Yang @ 2025-06-23 12:33 UTC (permalink / raw)
To: akpm
Cc: zi.li, anna.schumaker, boqun.feng, joel.granados, jstultz,
kent.overstreet, leonylgao, linux-kernel, longman, mhiramat,
mingo, mingzhe.yang, peterz, rostedt, senozhatsky, tfiga, will
Gently ping ;p
On Thu, Jun 12, 2025 at 12:20 PM Lance Yang <ioworker0@gmail.com> wrote:
>
> Hi all,
>
> Inspired by mutex blocker tracking[1], and having already extended it to
> semaphores, let's now add support for reader-writer semaphores (rwsems).
>
> The approach is simple: when a task enters TASK_UNINTERRUPTIBLE while
> waiting for an rwsem, we just call hung_task_set_blocker(). The hung task
> detector can then query the rwsem's owner to identify the lock holder.
>
> Tracking works reliably for writers, as there can only be a single writer
> holding the lock, and its task struct is stored in the owner field.
>
> The main challenge lies with readers. The owner field points to only one
> of many concurrent readers, so we might lose track of the blocker if that
> specific reader unlocks, even while others remain. This is not a
> significant issue, however. In practice, long-lasting lock contention is
> almost always caused by a writer. Therefore, reliably tracking the writer
> is the primary goal of this patch series ;)
>
> With this change, the hung task detector can now show blocker task's info
> like below:
>
> [Thu Jun 12 11:01:33 2025] INFO: task rw_sem_thread2:36526 blocked for more than 122 seconds.
> [Thu Jun 12 11:01:33 2025] Tainted: G S O 6.16.0-rc1 #1
> [Thu Jun 12 11:01:33 2025] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [Thu Jun 12 11:01:33 2025] task:rw_sem_thread2 state:D stack:0 pid:36526 tgid:36526 ppid:2 task_flags:0x208040 flags:0x00004000
> [Thu Jun 12 11:01:33 2025] Call Trace:
> [Thu Jun 12 11:01:33 2025] <TASK>
> [Thu Jun 12 11:01:33 2025] __schedule+0x7c7/0x1930
> [Thu Jun 12 11:01:33 2025] ? __pfx___schedule+0x10/0x10
> [Thu Jun 12 11:01:33 2025] ? _raw_spin_lock_irq+0x8a/0xe0
> [Thu Jun 12 11:01:33 2025] ? __pfx__raw_spin_lock_irq+0x10/0x10
> [Thu Jun 12 11:01:33 2025] schedule+0x6a/0x180
> [Thu Jun 12 11:01:33 2025] schedule_preempt_disabled+0x15/0x30
> [Thu Jun 12 11:01:33 2025] rwsem_down_write_slowpath+0x447/0x1090
> [Thu Jun 12 11:01:33 2025] ? __pfx_rwsem_down_write_slowpath+0x10/0x10
> [Thu Jun 12 11:01:33 2025] ? __pfx___schedule+0x10/0x10
> [Thu Jun 12 11:01:33 2025] ? __pfx___might_resched+0x10/0x10
> [Thu Jun 12 11:01:33 2025] ? __pfx_thread2_func+0x10/0x10 [rw_sem_test_2]
> [Thu Jun 12 11:01:33 2025] down_write+0x125/0x140
> [Thu Jun 12 11:01:33 2025] ? __pfx_down_write+0x10/0x10
> [Thu Jun 12 11:01:33 2025] ? msleep+0x91/0xf0
> [Thu Jun 12 11:01:33 2025] ? __raw_spin_lock_irqsave+0x8c/0xf0
> [Thu Jun 12 11:01:33 2025] thread2_func+0x37/0x70 [rw_sem_test_2]
> [Thu Jun 12 11:01:33 2025] kthread+0x39f/0x750
> [Thu Jun 12 11:01:33 2025] ? __pfx_kthread+0x10/0x10
> [Thu Jun 12 11:01:33 2025] ? __pfx__raw_spin_lock_irq+0x10/0x10
> [Thu Jun 12 11:01:33 2025] ? __pfx_kthread+0x10/0x10
> [Thu Jun 12 11:01:33 2025] ret_from_fork+0x25d/0x320
> [Thu Jun 12 11:01:33 2025] ? __pfx_kthread+0x10/0x10
> [Thu Jun 12 11:01:33 2025] ret_from_fork_asm+0x1a/0x30
> [Thu Jun 12 11:01:33 2025] </TASK>
> [Thu Jun 12 11:01:33 2025] INFO: task rw_sem_thread2:36526 <writer> blocked on an rw-semaphore likely owned by task rw_sem_thread1:36525 <writer>
> [Thu Jun 12 11:01:33 2025] task:rw_sem_thread1 state:S stack:0 pid:36525 tgid:36525 ppid:2 task_flags:0x208040 flags:0x00004000
> [Thu Jun 12 11:01:33 2025] Call Trace:
> [Thu Jun 12 11:01:33 2025] <TASK>
> [Thu Jun 12 11:01:33 2025] __schedule+0x7c7/0x1930
> [Thu Jun 12 11:01:33 2025] ? __pfx___schedule+0x10/0x10
> [Thu Jun 12 11:01:33 2025] ? __mod_timer+0x304/0xa80
> [Thu Jun 12 11:01:33 2025] ? irq_work_queue+0x6a/0xa0
> [Thu Jun 12 11:01:33 2025] ? __pfx_vprintk_emit+0x10/0x10
> [Thu Jun 12 11:01:33 2025] schedule+0x6a/0x180
> [Thu Jun 12 11:01:33 2025] schedule_timeout+0xfb/0x230
> [Thu Jun 12 11:01:33 2025] ? __pfx_schedule_timeout+0x10/0x10
> [Thu Jun 12 11:01:33 2025] ? __pfx_process_timeout+0x10/0x10
> [Thu Jun 12 11:01:33 2025] ? down_write+0xc4/0x140
> [Thu Jun 12 11:01:33 2025] msleep_interruptible+0xbe/0x150
> [Thu Jun 12 11:01:33 2025] ? __pfx_thread1_func+0x10/0x10 [rw_sem_test_2]
> [Thu Jun 12 11:01:33 2025] thread1_func+0x37/0x60 [rw_sem_test_2]
> [Thu Jun 12 11:01:33 2025] kthread+0x39f/0x750
> [Thu Jun 12 11:01:33 2025] ? __pfx_kthread+0x10/0x10
> [Thu Jun 12 11:01:33 2025] ? __pfx__raw_spin_lock_irq+0x10/0x10
> [Thu Jun 12 11:01:33 2025] ? __pfx_kthread+0x10/0x10
> [Thu Jun 12 11:01:33 2025] ret_from_fork+0x25d/0x320
> [Thu Jun 12 11:01:33 2025] ? __pfx_kthread+0x10/0x10
> [Thu Jun 12 11:01:33 2025] ret_from_fork_asm+0x1a/0x30
> [Thu Jun 12 11:01:33 2025] </TASK>
>
> [1] https://lore.kernel.org/all/174046694331.2194069.15472952050240807469.stgit@mhiramat.tok.corp.google.com/
>
> Thanks,
> Lance
>
> Lance Yang (3):
> locking/rwsem: make owner helpers globally available
> locking/rwsem: clear reader-owner on unlock to reduce false positives
> hung_task: extend hung task blocker tracking to rwsems
>
> include/linux/hung_task.h | 18 +++++++++---------
> include/linux/rwsem.h | 12 ++++++++++++
> kernel/hung_task.c | 29 +++++++++++++++++++++++++----
> kernel/locking/rwsem.c | 31 +++++++++++++++++++++++--------
> 4 files changed, 69 insertions(+), 21 deletions(-)
>
> --
> 2.49.0
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH RFC 1/3] locking/rwsem: make owner helpers globally available
2025-06-12 4:19 ` [PATCH RFC 1/3] locking/rwsem: make owner helpers globally available Lance Yang
@ 2025-06-24 0:17 ` Masami Hiramatsu
2025-06-24 1:35 ` Lance Yang
0 siblings, 1 reply; 15+ messages in thread
From: Masami Hiramatsu @ 2025-06-24 0:17 UTC (permalink / raw)
To: Lance Yang
Cc: akpm, zi.li, anna.schumaker, boqun.feng, joel.granados, jstultz,
kent.overstreet, leonylgao, linux-kernel, longman, mhiramat,
mingo, mingzhe.yang, peterz, rostedt, senozhatsky, tfiga, will,
Lance Yang
On Thu, 12 Jun 2025 12:19:24 +0800
Lance Yang <ioworker0@gmail.com> wrote:
> From: Lance Yang <lance.yang@linux.dev>
>
> In preparation for extending blocker tracking to support rwsems, make the
> rwsem_owner() and is_rwsem_reader_owned() helpers globally available for
> determining if the blocker is a writer or one of the readers.
>
Looks good to me.
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Thanks!
> Signed-off-by: Lance Yang <lance.yang@linux.dev>
> ---
> include/linux/rwsem.h | 12 ++++++++++++
> kernel/locking/rwsem.c | 8 +++++---
> 2 files changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/rwsem.h b/include/linux/rwsem.h
> index c8b543d428b0..544853bed5b9 100644
> --- a/include/linux/rwsem.h
> +++ b/include/linux/rwsem.h
> @@ -132,6 +132,18 @@ static inline int rwsem_is_contended(struct rw_semaphore *sem)
> return !list_empty(&sem->wait_list);
> }
>
> +#if defined(CONFIG_DEBUG_RWSEMS) || defined(CONFIG_DETECT_HUNG_TASK_BLOCKER)
> +/*
> + * Return just the real task structure pointer of the owner
> + */
> +extern struct task_struct *rwsem_owner(struct rw_semaphore *sem);
> +
> +/*
> + * Return true if the rwsem is owned by a reader.
> + */
> +extern bool is_rwsem_reader_owned(struct rw_semaphore *sem);
> +#endif
> +
> #else /* !CONFIG_PREEMPT_RT */
>
> #include <linux/rwbase_rt.h>
> diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
> index 2ddb827e3bea..6cb29442d4fc 100644
> --- a/kernel/locking/rwsem.c
> +++ b/kernel/locking/rwsem.c
> @@ -181,11 +181,11 @@ static inline void rwsem_set_reader_owned(struct rw_semaphore *sem)
> __rwsem_set_reader_owned(sem, current);
> }
>
> -#ifdef CONFIG_DEBUG_RWSEMS
> +#if defined(CONFIG_DEBUG_RWSEMS) || defined(CONFIG_DETECT_HUNG_TASK_BLOCKER)
> /*
> * Return just the real task structure pointer of the owner
> */
> -static inline struct task_struct *rwsem_owner(struct rw_semaphore *sem)
> +struct task_struct *rwsem_owner(struct rw_semaphore *sem)
> {
> return (struct task_struct *)
> (atomic_long_read(&sem->owner) & ~RWSEM_OWNER_FLAGS_MASK);
> @@ -194,7 +194,7 @@ static inline struct task_struct *rwsem_owner(struct rw_semaphore *sem)
> /*
> * Return true if the rwsem is owned by a reader.
> */
> -static inline bool is_rwsem_reader_owned(struct rw_semaphore *sem)
> +bool is_rwsem_reader_owned(struct rw_semaphore *sem)
> {
> /*
> * Check the count to see if it is write-locked.
> @@ -205,7 +205,9 @@ static inline bool is_rwsem_reader_owned(struct rw_semaphore *sem)
> return false;
> return rwsem_test_oflags(sem, RWSEM_READER_OWNED);
> }
> +#endif
>
> +#ifdef CONFIG_DEBUG_RWSEMS
> /*
> * With CONFIG_DEBUG_RWSEMS configured, it will make sure that if there
> * is a task pointer in owner of a reader-owned rwsem, it will be the
> --
> 2.49.0
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH RFC 2/3] locking/rwsem: clear reader-owner on unlock to reduce false positives
2025-06-12 4:19 ` [PATCH RFC 2/3] locking/rwsem: clear reader-owner on unlock to reduce false positives Lance Yang
@ 2025-06-24 0:26 ` Masami Hiramatsu
2025-06-24 1:44 ` Lance Yang
0 siblings, 1 reply; 15+ messages in thread
From: Masami Hiramatsu @ 2025-06-24 0:26 UTC (permalink / raw)
To: Lance Yang
Cc: akpm, zi.li, anna.schumaker, boqun.feng, joel.granados, jstultz,
kent.overstreet, leonylgao, linux-kernel, longman, mhiramat,
mingo, mingzhe.yang, peterz, rostedt, senozhatsky, tfiga, will,
Lance Yang
On Thu, 12 Jun 2025 12:19:25 +0800
Lance Yang <ioworker0@gmail.com> wrote:
> From: Lance Yang <lance.yang@linux.dev>
>
> When CONFIG_DETECT_HUNG_TASK_BLOCKER is enabled, a stale owner pointer in a
> reader-owned rwsem can lead to false positives in blocker tracking.
>
> To mitigate this, let’s try to clear the owner field on unlock, as a NULL
> owner is better than a stale one for diagnostics.
Can we merge this to [PATCH 1/3]? It seems that you removed #ifdef and
remove it. This means in anyway we need the feature enabled by DEBUG_RWSEMS.
Thanks,
>
> Signed-off-by: Lance Yang <lance.yang@linux.dev>
> ---
> kernel/locking/rwsem.c | 10 ++++------
> 1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
> index 6cb29442d4fc..a310eb9896de 100644
> --- a/kernel/locking/rwsem.c
> +++ b/kernel/locking/rwsem.c
> @@ -205,14 +205,12 @@ bool is_rwsem_reader_owned(struct rw_semaphore *sem)
> return false;
> return rwsem_test_oflags(sem, RWSEM_READER_OWNED);
> }
> -#endif
>
> -#ifdef CONFIG_DEBUG_RWSEMS
> /*
> - * With CONFIG_DEBUG_RWSEMS configured, it will make sure that if there
> - * is a task pointer in owner of a reader-owned rwsem, it will be the
> - * real owner or one of the real owners. The only exception is when the
> - * unlock is done by up_read_non_owner().
> + * With CONFIG_DEBUG_RWSEMS or CONFIG_DETECT_HUNG_TASK_BLOCKER configured,
> + * it will make sure that the owner field of a reader-owned rwsem either
> + * points to a real reader-owner(s) or gets cleared. The only exception is
> + * when the unlock is done by up_read_non_owner().
> */
> static inline void rwsem_clear_reader_owned(struct rw_semaphore *sem)
> {
> --
> 2.49.0
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH RFC 1/3] locking/rwsem: make owner helpers globally available
2025-06-24 0:17 ` Masami Hiramatsu
@ 2025-06-24 1:35 ` Lance Yang
0 siblings, 0 replies; 15+ messages in thread
From: Lance Yang @ 2025-06-24 1:35 UTC (permalink / raw)
To: Masami Hiramatsu (Google)
Cc: akpm, zi.li, anna.schumaker, boqun.feng, joel.granados, jstultz,
kent.overstreet, leonylgao, linux-kernel, longman, mingo,
mingzhe.yang, peterz, rostedt, senozhatsky, tfiga, will,
Lance Yang
On 2025/6/24 08:17, Masami Hiramatsu (Google) wrote:
> On Thu, 12 Jun 2025 12:19:24 +0800
> Lance Yang <ioworker0@gmail.com> wrote:
>
>> From: Lance Yang <lance.yang@linux.dev>
>>
>> In preparation for extending blocker tracking to support rwsems, make the
>> rwsem_owner() and is_rwsem_reader_owned() helpers globally available for
>> determining if the blocker is a writer or one of the readers.
>>
>
> Looks good to me.
>
> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
>
> Thanks!
Thanks for taking the time to review!
Lance
>
>> Signed-off-by: Lance Yang <lance.yang@linux.dev>
>> ---
>> include/linux/rwsem.h | 12 ++++++++++++
>> kernel/locking/rwsem.c | 8 +++++---
>> 2 files changed, 17 insertions(+), 3 deletions(-)
>>
>> diff --git a/include/linux/rwsem.h b/include/linux/rwsem.h
>> index c8b543d428b0..544853bed5b9 100644
>> --- a/include/linux/rwsem.h
>> +++ b/include/linux/rwsem.h
>> @@ -132,6 +132,18 @@ static inline int rwsem_is_contended(struct rw_semaphore *sem)
>> return !list_empty(&sem->wait_list);
>> }
>>
>> +#if defined(CONFIG_DEBUG_RWSEMS) || defined(CONFIG_DETECT_HUNG_TASK_BLOCKER)
>> +/*
>> + * Return just the real task structure pointer of the owner
>> + */
>> +extern struct task_struct *rwsem_owner(struct rw_semaphore *sem);
>> +
>> +/*
>> + * Return true if the rwsem is owned by a reader.
>> + */
>> +extern bool is_rwsem_reader_owned(struct rw_semaphore *sem);
>> +#endif
>> +
>> #else /* !CONFIG_PREEMPT_RT */
>>
>> #include <linux/rwbase_rt.h>
>> diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
>> index 2ddb827e3bea..6cb29442d4fc 100644
>> --- a/kernel/locking/rwsem.c
>> +++ b/kernel/locking/rwsem.c
>> @@ -181,11 +181,11 @@ static inline void rwsem_set_reader_owned(struct rw_semaphore *sem)
>> __rwsem_set_reader_owned(sem, current);
>> }
>>
>> -#ifdef CONFIG_DEBUG_RWSEMS
>> +#if defined(CONFIG_DEBUG_RWSEMS) || defined(CONFIG_DETECT_HUNG_TASK_BLOCKER)
>> /*
>> * Return just the real task structure pointer of the owner
>> */
>> -static inline struct task_struct *rwsem_owner(struct rw_semaphore *sem)
>> +struct task_struct *rwsem_owner(struct rw_semaphore *sem)
>> {
>> return (struct task_struct *)
>> (atomic_long_read(&sem->owner) & ~RWSEM_OWNER_FLAGS_MASK);
>> @@ -194,7 +194,7 @@ static inline struct task_struct *rwsem_owner(struct rw_semaphore *sem)
>> /*
>> * Return true if the rwsem is owned by a reader.
>> */
>> -static inline bool is_rwsem_reader_owned(struct rw_semaphore *sem)
>> +bool is_rwsem_reader_owned(struct rw_semaphore *sem)
>> {
>> /*
>> * Check the count to see if it is write-locked.
>> @@ -205,7 +205,9 @@ static inline bool is_rwsem_reader_owned(struct rw_semaphore *sem)
>> return false;
>> return rwsem_test_oflags(sem, RWSEM_READER_OWNED);
>> }
>> +#endif
>>
>> +#ifdef CONFIG_DEBUG_RWSEMS
>> /*
>> * With CONFIG_DEBUG_RWSEMS configured, it will make sure that if there
>> * is a task pointer in owner of a reader-owned rwsem, it will be the
>> --
>> 2.49.0
>>
>
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH RFC 2/3] locking/rwsem: clear reader-owner on unlock to reduce false positives
2025-06-24 0:26 ` Masami Hiramatsu
@ 2025-06-24 1:44 ` Lance Yang
2025-06-24 3:53 ` Masami Hiramatsu
0 siblings, 1 reply; 15+ messages in thread
From: Lance Yang @ 2025-06-24 1:44 UTC (permalink / raw)
To: Masami Hiramatsu (Google)
Cc: akpm, zi.li, anna.schumaker, boqun.feng, joel.granados, jstultz,
kent.overstreet, leonylgao, linux-kernel, longman, mingo,
mingzhe.yang, peterz, rostedt, senozhatsky, tfiga, will,
Lance Yang
On 2025/6/24 08:26, Masami Hiramatsu (Google) wrote:
> On Thu, 12 Jun 2025 12:19:25 +0800
> Lance Yang <ioworker0@gmail.com> wrote:
>
>> From: Lance Yang <lance.yang@linux.dev>
>>
>> When CONFIG_DETECT_HUNG_TASK_BLOCKER is enabled, a stale owner pointer in a
>> reader-owned rwsem can lead to false positives in blocker tracking.
>>
>> To mitigate this, let’s try to clear the owner field on unlock, as a NULL
>> owner is better than a stale one for diagnostics.
>
> Can we merge this to [PATCH 1/3]? It seems that you removed #ifdef and
> remove it. This means in anyway we need the feature enabled by DEBUG_RWSEMS.
Thanks for the feedback! I see your point about the dependency ;)
Personlly, I'd perfer to keep them separate. The reasoning is that
they addreess two distinct things, and I think splitting them makes
this series clearer and easier to review ;)
Patch #1 focuses on "ownership tracking": Its only job is to make
the existing owner-related helpers (rwsem_owner(), is_rwsem_reader_owned())
globally available when blocker tracking is enabled.
Patch #2, on the other hand, is about "reader-owner cleanup": It
introduces a functional change to the unlock path, trying to clear
the owner field for reader-owned rwsems.
Does this reasoning make sense to you?
Thanks,
Lance
>
> Thanks,
>
>>
>> Signed-off-by: Lance Yang <lance.yang@linux.dev>
>> ---
>> kernel/locking/rwsem.c | 10 ++++------
>> 1 file changed, 4 insertions(+), 6 deletions(-)
>>
>> diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
>> index 6cb29442d4fc..a310eb9896de 100644
>> --- a/kernel/locking/rwsem.c
>> +++ b/kernel/locking/rwsem.c
>> @@ -205,14 +205,12 @@ bool is_rwsem_reader_owned(struct rw_semaphore *sem)
>> return false;
>> return rwsem_test_oflags(sem, RWSEM_READER_OWNED);
>> }
>> -#endif
>>
>> -#ifdef CONFIG_DEBUG_RWSEMS
>> /*
>> - * With CONFIG_DEBUG_RWSEMS configured, it will make sure that if there
>> - * is a task pointer in owner of a reader-owned rwsem, it will be the
>> - * real owner or one of the real owners. The only exception is when the
>> - * unlock is done by up_read_non_owner().
>> + * With CONFIG_DEBUG_RWSEMS or CONFIG_DETECT_HUNG_TASK_BLOCKER configured,
>> + * it will make sure that the owner field of a reader-owned rwsem either
>> + * points to a real reader-owner(s) or gets cleared. The only exception is
>> + * when the unlock is done by up_read_non_owner().
>> */
>> static inline void rwsem_clear_reader_owned(struct rw_semaphore *sem)
>> {
>> --
>> 2.49.0
>>
>
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH RFC 2/3] locking/rwsem: clear reader-owner on unlock to reduce false positives
2025-06-24 1:44 ` Lance Yang
@ 2025-06-24 3:53 ` Masami Hiramatsu
2025-06-24 5:02 ` Lance Yang
0 siblings, 1 reply; 15+ messages in thread
From: Masami Hiramatsu @ 2025-06-24 3:53 UTC (permalink / raw)
To: Lance Yang
Cc: akpm, zi.li, anna.schumaker, boqun.feng, joel.granados, jstultz,
kent.overstreet, leonylgao, linux-kernel, longman, mingo,
mingzhe.yang, peterz, rostedt, senozhatsky, tfiga, will,
Lance Yang
On Tue, 24 Jun 2025 09:44:55 +0800
Lance Yang <lance.yang@linux.dev> wrote:
>
>
> On 2025/6/24 08:26, Masami Hiramatsu (Google) wrote:
> > On Thu, 12 Jun 2025 12:19:25 +0800
> > Lance Yang <ioworker0@gmail.com> wrote:
> >
> >> From: Lance Yang <lance.yang@linux.dev>
> >>
> >> When CONFIG_DETECT_HUNG_TASK_BLOCKER is enabled, a stale owner pointer in a
> >> reader-owned rwsem can lead to false positives in blocker tracking.
> >>
> >> To mitigate this, let’s try to clear the owner field on unlock, as a NULL
> >> owner is better than a stale one for diagnostics.
> >
> > Can we merge this to [PATCH 1/3]? It seems that you removed #ifdef and
> > remove it. This means in anyway we need the feature enabled by DEBUG_RWSEMS.
>
> Thanks for the feedback! I see your point about the dependency ;)
>
> Personlly, I'd perfer to keep them separate. The reasoning is that
> they addreess two distinct things, and I think splitting them makes
> this series clearer and easier to review ;)
>
> Patch #1 focuses on "ownership tracking": Its only job is to make
> the existing owner-related helpers (rwsem_owner(), is_rwsem_reader_owned())
> globally available when blocker tracking is enabled.
>
> Patch #2, on the other hand, is about "reader-owner cleanup": It
> introduces a functional change to the unlock path, trying to clear
> the owner field for reader-owned rwsems.
But without clearing the owner, the owner information can be
broken, right? Since CONFIG_DEBUG_RWSEMS is working as it is,
I think those cannot be decoupled. For example, comparing the
result of both DETECT_HUNG_TASK_BLOCKER and DEBUG_RWSEMS are
enabled and only DETECT_HUNG_TASK_BLOCKER is enabled, the
result is different.
>
> Does this reasoning make sense to you?
Sorry, no. I think "reader-owner cleanup" is a part of "ownership
tracking" as DEBUG_RWSEMS does (and that keeps consistency of
the ownership tracking behavior same as DEBUG_RWSEM).
Thank you,
>
> Thanks,
> Lance
>
> >
> > Thanks,
> >
> >>
> >> Signed-off-by: Lance Yang <lance.yang@linux.dev>
> >> ---
> >> kernel/locking/rwsem.c | 10 ++++------
> >> 1 file changed, 4 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
> >> index 6cb29442d4fc..a310eb9896de 100644
> >> --- a/kernel/locking/rwsem.c
> >> +++ b/kernel/locking/rwsem.c
> >> @@ -205,14 +205,12 @@ bool is_rwsem_reader_owned(struct rw_semaphore *sem)
> >> return false;
> >> return rwsem_test_oflags(sem, RWSEM_READER_OWNED);
> >> }
> >> -#endif
> >>
> >> -#ifdef CONFIG_DEBUG_RWSEMS
> >> /*
> >> - * With CONFIG_DEBUG_RWSEMS configured, it will make sure that if there
> >> - * is a task pointer in owner of a reader-owned rwsem, it will be the
> >> - * real owner or one of the real owners. The only exception is when the
> >> - * unlock is done by up_read_non_owner().
> >> + * With CONFIG_DEBUG_RWSEMS or CONFIG_DETECT_HUNG_TASK_BLOCKER configured,
> >> + * it will make sure that the owner field of a reader-owned rwsem either
> >> + * points to a real reader-owner(s) or gets cleared. The only exception is
> >> + * when the unlock is done by up_read_non_owner().
> >> */
> >> static inline void rwsem_clear_reader_owned(struct rw_semaphore *sem)
> >> {
> >> --
> >> 2.49.0
> >>
> >
> >
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH RFC 3/3] hung_task: extend hung task blocker tracking to rwsems
2025-06-12 4:19 ` [PATCH RFC 3/3] hung_task: extend hung task blocker tracking to rwsems Lance Yang
@ 2025-06-24 3:57 ` Masami Hiramatsu
2025-06-24 4:25 ` Lance Yang
0 siblings, 1 reply; 15+ messages in thread
From: Masami Hiramatsu @ 2025-06-24 3:57 UTC (permalink / raw)
To: Lance Yang
Cc: akpm, zi.li, anna.schumaker, boqun.feng, joel.granados, jstultz,
kent.overstreet, leonylgao, linux-kernel, longman, mhiramat,
mingo, mingzhe.yang, peterz, rostedt, senozhatsky, tfiga, will,
Lance Yang
On Thu, 12 Jun 2025 12:19:26 +0800
Lance Yang <ioworker0@gmail.com> wrote:
> From: Lance Yang <lance.yang@linux.dev>
>
> Inspired by mutex blocker tracking[1], and having already extended it to
> semaphores, let's now add support for reader-writer semaphores (rwsems).
>
> The approach is simple: when a task enters TASK_UNINTERRUPTIBLE while
> waiting for an rwsem, we just call hung_task_set_blocker(). The hung task
> detector can then query the rwsem's owner to identify the lock holder.
>
> Tracking works reliably for writers, as there can only be a single writer
> holding the lock, and its task struct is stored in the owner field.
>
> The main challenge lies with readers. The owner field points to only one
> of many concurrent readers, so we might lose track of the blocker if that
> specific reader unlocks, even while others remain. This is not a
> significant issue, however. In practice, long-lasting lock contention is
> almost always caused by a writer. Therefore, reliably tracking the writer
> is the primary goal of this patch series ;)
I think as far as it is reliable, it is better than nothing :) and that
can help us to debug some part of kernel crashes.
>
> With this change, the hung task detector can now show blocker task's info
> like below:
>
> [Thu Jun 12 11:01:33 2025] INFO: task rw_sem_thread2:36526 blocked for more than 122 seconds.
> [Thu Jun 12 11:01:33 2025] Tainted: G S O 6.16.0-rc1 #1
> [Thu Jun 12 11:01:33 2025] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [Thu Jun 12 11:01:33 2025] task:rw_sem_thread2 state:D stack:0 pid:36526 tgid:36526 ppid:2 task_flags:0x208040 flags:0x00004000
> [Thu Jun 12 11:01:33 2025] Call Trace:
> [Thu Jun 12 11:01:33 2025] <TASK>
> [Thu Jun 12 11:01:33 2025] __schedule+0x7c7/0x1930
> [Thu Jun 12 11:01:33 2025] ? __pfx___schedule+0x10/0x10
> [Thu Jun 12 11:01:33 2025] ? _raw_spin_lock_irq+0x8a/0xe0
> [Thu Jun 12 11:01:33 2025] ? __pfx__raw_spin_lock_irq+0x10/0x10
> [Thu Jun 12 11:01:33 2025] schedule+0x6a/0x180
> [Thu Jun 12 11:01:33 2025] schedule_preempt_disabled+0x15/0x30
> [Thu Jun 12 11:01:33 2025] rwsem_down_write_slowpath+0x447/0x1090
> [Thu Jun 12 11:01:33 2025] ? __pfx_rwsem_down_write_slowpath+0x10/0x10
> [Thu Jun 12 11:01:33 2025] ? __pfx___schedule+0x10/0x10
> [Thu Jun 12 11:01:33 2025] ? __pfx___might_resched+0x10/0x10
> [Thu Jun 12 11:01:33 2025] ? __pfx_thread2_func+0x10/0x10 [rw_sem_test_2]
> [Thu Jun 12 11:01:33 2025] down_write+0x125/0x140
> [Thu Jun 12 11:01:33 2025] ? __pfx_down_write+0x10/0x10
> [Thu Jun 12 11:01:33 2025] ? msleep+0x91/0xf0
> [Thu Jun 12 11:01:33 2025] ? __raw_spin_lock_irqsave+0x8c/0xf0
> [Thu Jun 12 11:01:33 2025] thread2_func+0x37/0x70 [rw_sem_test_2]
> [Thu Jun 12 11:01:33 2025] kthread+0x39f/0x750
> [Thu Jun 12 11:01:33 2025] ? __pfx_kthread+0x10/0x10
> [Thu Jun 12 11:01:33 2025] ? __pfx__raw_spin_lock_irq+0x10/0x10
> [Thu Jun 12 11:01:33 2025] ? __pfx_kthread+0x10/0x10
> [Thu Jun 12 11:01:33 2025] ret_from_fork+0x25d/0x320
> [Thu Jun 12 11:01:33 2025] ? __pfx_kthread+0x10/0x10
> [Thu Jun 12 11:01:33 2025] ret_from_fork_asm+0x1a/0x30
> [Thu Jun 12 11:01:33 2025] </TASK>
> [Thu Jun 12 11:01:33 2025] INFO: task rw_sem_thread2:36526 <writer> blocked on an rw-semaphore likely owned by task rw_sem_thread1:36525 <writer>
> [Thu Jun 12 11:01:33 2025] task:rw_sem_thread1 state:S stack:0 pid:36525 tgid:36525 ppid:2 task_flags:0x208040 flags:0x00004000
> [Thu Jun 12 11:01:33 2025] Call Trace:
> [Thu Jun 12 11:01:33 2025] <TASK>
> [Thu Jun 12 11:01:33 2025] __schedule+0x7c7/0x1930
> [Thu Jun 12 11:01:33 2025] ? __pfx___schedule+0x10/0x10
> [Thu Jun 12 11:01:33 2025] ? __mod_timer+0x304/0xa80
> [Thu Jun 12 11:01:33 2025] ? irq_work_queue+0x6a/0xa0
> [Thu Jun 12 11:01:33 2025] ? __pfx_vprintk_emit+0x10/0x10
> [Thu Jun 12 11:01:33 2025] schedule+0x6a/0x180
> [Thu Jun 12 11:01:33 2025] schedule_timeout+0xfb/0x230
> [Thu Jun 12 11:01:33 2025] ? __pfx_schedule_timeout+0x10/0x10
> [Thu Jun 12 11:01:33 2025] ? __pfx_process_timeout+0x10/0x10
> [Thu Jun 12 11:01:33 2025] ? down_write+0xc4/0x140
> [Thu Jun 12 11:01:33 2025] msleep_interruptible+0xbe/0x150
> [Thu Jun 12 11:01:33 2025] ? __pfx_thread1_func+0x10/0x10 [rw_sem_test_2]
> [Thu Jun 12 11:01:33 2025] thread1_func+0x37/0x60 [rw_sem_test_2]
> [Thu Jun 12 11:01:33 2025] kthread+0x39f/0x750
> [Thu Jun 12 11:01:33 2025] ? __pfx_kthread+0x10/0x10
> [Thu Jun 12 11:01:33 2025] ? __pfx__raw_spin_lock_irq+0x10/0x10
> [Thu Jun 12 11:01:33 2025] ? __pfx_kthread+0x10/0x10
> [Thu Jun 12 11:01:33 2025] ret_from_fork+0x25d/0x320
> [Thu Jun 12 11:01:33 2025] ? __pfx_kthread+0x10/0x10
> [Thu Jun 12 11:01:33 2025] ret_from_fork_asm+0x1a/0x30
> [Thu Jun 12 11:01:33 2025] </TASK>
>
> [1] https://lore.kernel.org/all/174046694331.2194069.15472952050240807469.stgit@mhiramat.tok.corp.google.com/
>
> Suggested-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> Signed-off-by: Lance Yang <lance.yang@linux.dev>
Looks good to me.
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
BTW, can you also add a patch to extend the test module
(samples/hung_task/hung_task_tests.c) to handle the rwsem
blocking case ? (reader and writer side)
Thank you,
> ---
> include/linux/hung_task.h | 18 +++++++++---------
> kernel/hung_task.c | 29 +++++++++++++++++++++++++----
> kernel/locking/rwsem.c | 17 ++++++++++++++++-
> 3 files changed, 50 insertions(+), 14 deletions(-)
>
> diff --git a/include/linux/hung_task.h b/include/linux/hung_task.h
> index 1bc2b3244613..34e615c76ca5 100644
> --- a/include/linux/hung_task.h
> +++ b/include/linux/hung_task.h
> @@ -21,17 +21,17 @@
> * type.
> *
> * Type encoding:
> - * 00 - Blocked on mutex (BLOCKER_TYPE_MUTEX)
> - * 01 - Blocked on semaphore (BLOCKER_TYPE_SEM)
> - * 10 - Blocked on rt-mutex (BLOCKER_TYPE_RTMUTEX)
> - * 11 - Blocked on rw-semaphore (BLOCKER_TYPE_RWSEM)
> + * 00 - Blocked on mutex (BLOCKER_TYPE_MUTEX)
> + * 01 - Blocked on semaphore (BLOCKER_TYPE_SEM)
> + * 10 - Blocked on rw-semaphore as READER (BLOCKER_TYPE_RWSEM_READER)
> + * 11 - Blocked on rw-semaphore as WRITER (BLOCKER_TYPE_RWSEM_WRITER)
> */
> -#define BLOCKER_TYPE_MUTEX 0x00UL
> -#define BLOCKER_TYPE_SEM 0x01UL
> -#define BLOCKER_TYPE_RTMUTEX 0x02UL
> -#define BLOCKER_TYPE_RWSEM 0x03UL
> +#define BLOCKER_TYPE_MUTEX 0x00UL
> +#define BLOCKER_TYPE_SEM 0x01UL
> +#define BLOCKER_TYPE_RWSEM_READER 0x02UL
> +#define BLOCKER_TYPE_RWSEM_WRITER 0x03UL
>
> -#define BLOCKER_TYPE_MASK 0x03UL
> +#define BLOCKER_TYPE_MASK 0x03UL
>
> #ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
> static inline void hung_task_set_blocker(void *lock, unsigned long type)
> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
> index d2432df2b905..8708a1205f82 100644
> --- a/kernel/hung_task.c
> +++ b/kernel/hung_task.c
> @@ -23,6 +23,7 @@
> #include <linux/sched/debug.h>
> #include <linux/sched/sysctl.h>
> #include <linux/hung_task.h>
> +#include <linux/rwsem.h>
>
> #include <trace/events/sched.h>
>
> @@ -100,6 +101,7 @@ static void debug_show_blocker(struct task_struct *task)
> {
> struct task_struct *g, *t;
> unsigned long owner, blocker, blocker_type;
> + const char *rwsem_blocked_by, *rwsem_blocked_as;
>
> RCU_LOCKDEP_WARN(!rcu_read_lock_held(), "No rcu lock held");
>
> @@ -111,12 +113,20 @@ static void debug_show_blocker(struct task_struct *task)
>
> switch (blocker_type) {
> case BLOCKER_TYPE_MUTEX:
> - owner = mutex_get_owner(
> - (struct mutex *)hung_task_blocker_to_lock(blocker));
> + owner = mutex_get_owner(hung_task_blocker_to_lock(blocker));
> break;
> case BLOCKER_TYPE_SEM:
> - owner = sem_last_holder(
> - (struct semaphore *)hung_task_blocker_to_lock(blocker));
> + owner = sem_last_holder(hung_task_blocker_to_lock(blocker));
> + break;
> + case BLOCKER_TYPE_RWSEM_READER:
> + case BLOCKER_TYPE_RWSEM_WRITER:
> + owner = (unsigned long)rwsem_owner(
> + hung_task_blocker_to_lock(blocker));
> + rwsem_blocked_as = (blocker_type == BLOCKER_TYPE_RWSEM_READER) ?
> + "reader" : "writer";
> + rwsem_blocked_by = is_rwsem_reader_owned(
> + hung_task_blocker_to_lock(blocker)) ?
> + "reader" : "writer";
> break;
> default:
> WARN_ON_ONCE(1);
> @@ -134,6 +144,11 @@ static void debug_show_blocker(struct task_struct *task)
> pr_err("INFO: task %s:%d is blocked on a semaphore, but the last holder is not found.\n",
> task->comm, task->pid);
> break;
> + case BLOCKER_TYPE_RWSEM_READER:
> + case BLOCKER_TYPE_RWSEM_WRITER:
> + pr_err("INFO: task %s:%d is blocked on an rw-semaphore, but the owner is not found.\n",
> + task->comm, task->pid);
> + break;
> }
> return;
> }
> @@ -152,6 +167,12 @@ static void debug_show_blocker(struct task_struct *task)
> pr_err("INFO: task %s:%d blocked on a semaphore likely last held by task %s:%d\n",
> task->comm, task->pid, t->comm, t->pid);
> break;
> + case BLOCKER_TYPE_RWSEM_READER:
> + case BLOCKER_TYPE_RWSEM_WRITER:
> + pr_err("INFO: task %s:%d <%s> blocked on an rw-semaphore likely owned by task %s:%d <%s>\n",
> + task->comm, task->pid, rwsem_blocked_as, t->comm,
> + t->pid, rwsem_blocked_by);
> + break;
> }
> sched_show_task(t);
> return;
> diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
> index a310eb9896de..92c6332da401 100644
> --- a/kernel/locking/rwsem.c
> +++ b/kernel/locking/rwsem.c
> @@ -27,6 +27,7 @@
> #include <linux/export.h>
> #include <linux/rwsem.h>
> #include <linux/atomic.h>
> +#include <linux/hung_task.h>
> #include <trace/events/lock.h>
>
> #ifndef CONFIG_PREEMPT_RT
> @@ -1065,10 +1066,13 @@ rwsem_down_read_slowpath(struct rw_semaphore *sem, long count, unsigned int stat
> wake_up_q(&wake_q);
>
> trace_contention_begin(sem, LCB_F_READ);
> + set_current_state(state);
> +
> + if (state == TASK_UNINTERRUPTIBLE)
> + hung_task_set_blocker(sem, BLOCKER_TYPE_RWSEM_READER);
>
> /* wait to be given the lock */
> for (;;) {
> - set_current_state(state);
> if (!smp_load_acquire(&waiter.task)) {
> /* Matches rwsem_mark_wake()'s smp_store_release(). */
> break;
> @@ -1083,8 +1087,12 @@ rwsem_down_read_slowpath(struct rw_semaphore *sem, long count, unsigned int stat
> }
> schedule_preempt_disabled();
> lockevent_inc(rwsem_sleep_reader);
> + set_current_state(state);
> }
>
> + if (state == TASK_UNINTERRUPTIBLE)
> + hung_task_clear_blocker();
> +
> __set_current_state(TASK_RUNNING);
> lockevent_inc(rwsem_rlock);
> trace_contention_end(sem, 0);
> @@ -1146,6 +1154,9 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
> set_current_state(state);
> trace_contention_begin(sem, LCB_F_WRITE);
>
> + if (state == TASK_UNINTERRUPTIBLE)
> + hung_task_set_blocker(sem, BLOCKER_TYPE_RWSEM_WRITER);
> +
> for (;;) {
> if (rwsem_try_write_lock(sem, &waiter)) {
> /* rwsem_try_write_lock() implies ACQUIRE on success */
> @@ -1179,6 +1190,10 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
> trylock_again:
> raw_spin_lock_irq(&sem->wait_lock);
> }
> +
> + if (state == TASK_UNINTERRUPTIBLE)
> + hung_task_clear_blocker();
> +
> __set_current_state(TASK_RUNNING);
> raw_spin_unlock_irq(&sem->wait_lock);
> lockevent_inc(rwsem_wlock);
> --
> 2.49.0
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH RFC 3/3] hung_task: extend hung task blocker tracking to rwsems
2025-06-24 3:57 ` Masami Hiramatsu
@ 2025-06-24 4:25 ` Lance Yang
0 siblings, 0 replies; 15+ messages in thread
From: Lance Yang @ 2025-06-24 4:25 UTC (permalink / raw)
To: Masami Hiramatsu (Google)
Cc: akpm, zi.li, anna.schumaker, boqun.feng, joel.granados, jstultz,
kent.overstreet, leonylgao, linux-kernel, longman, mingo,
mingzhe.yang, peterz, rostedt, senozhatsky, tfiga, will,
Lance Yang
On 2025/6/24 11:57, Masami Hiramatsu (Google) wrote:
> On Thu, 12 Jun 2025 12:19:26 +0800
> Lance Yang <ioworker0@gmail.com> wrote:
>
>> From: Lance Yang <lance.yang@linux.dev>
>>
>> Inspired by mutex blocker tracking[1], and having already extended it to
>> semaphores, let's now add support for reader-writer semaphores (rwsems).
>>
>> The approach is simple: when a task enters TASK_UNINTERRUPTIBLE while
>> waiting for an rwsem, we just call hung_task_set_blocker(). The hung task
>> detector can then query the rwsem's owner to identify the lock holder.
>>
>> Tracking works reliably for writers, as there can only be a single writer
>> holding the lock, and its task struct is stored in the owner field.
>>
>> The main challenge lies with readers. The owner field points to only one
>> of many concurrent readers, so we might lose track of the blocker if that
>> specific reader unlocks, even while others remain. This is not a
>> significant issue, however. In practice, long-lasting lock contention is
>> almost always caused by a writer. Therefore, reliably tracking the writer
>> is the primary goal of this patch series ;)
>
> I think as far as it is reliable, it is better than nothing :) and that
> can help us to debug some part of kernel crashes.
Yeah! That was the goal. Even if it's not a perfect solution for all
reader scenarios, it provides much-needed visibility for writes and
is a good improvement ;)
>
>>
>> With this change, the hung task detector can now show blocker task's info
>> like below:
>>
>> [Thu Jun 12 11:01:33 2025] INFO: task rw_sem_thread2:36526 blocked for more than 122 seconds.
>> [Thu Jun 12 11:01:33 2025] Tainted: G S O 6.16.0-rc1 #1
>> [Thu Jun 12 11:01:33 2025] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> [Thu Jun 12 11:01:33 2025] task:rw_sem_thread2 state:D stack:0 pid:36526 tgid:36526 ppid:2 task_flags:0x208040 flags:0x00004000
>> [Thu Jun 12 11:01:33 2025] Call Trace:
>> [Thu Jun 12 11:01:33 2025] <TASK>
>> [Thu Jun 12 11:01:33 2025] __schedule+0x7c7/0x1930
>> [Thu Jun 12 11:01:33 2025] ? __pfx___schedule+0x10/0x10
>> [Thu Jun 12 11:01:33 2025] ? _raw_spin_lock_irq+0x8a/0xe0
>> [Thu Jun 12 11:01:33 2025] ? __pfx__raw_spin_lock_irq+0x10/0x10
>> [Thu Jun 12 11:01:33 2025] schedule+0x6a/0x180
>> [Thu Jun 12 11:01:33 2025] schedule_preempt_disabled+0x15/0x30
>> [Thu Jun 12 11:01:33 2025] rwsem_down_write_slowpath+0x447/0x1090
>> [Thu Jun 12 11:01:33 2025] ? __pfx_rwsem_down_write_slowpath+0x10/0x10
>> [Thu Jun 12 11:01:33 2025] ? __pfx___schedule+0x10/0x10
>> [Thu Jun 12 11:01:33 2025] ? __pfx___might_resched+0x10/0x10
>> [Thu Jun 12 11:01:33 2025] ? __pfx_thread2_func+0x10/0x10 [rw_sem_test_2]
>> [Thu Jun 12 11:01:33 2025] down_write+0x125/0x140
>> [Thu Jun 12 11:01:33 2025] ? __pfx_down_write+0x10/0x10
>> [Thu Jun 12 11:01:33 2025] ? msleep+0x91/0xf0
>> [Thu Jun 12 11:01:33 2025] ? __raw_spin_lock_irqsave+0x8c/0xf0
>> [Thu Jun 12 11:01:33 2025] thread2_func+0x37/0x70 [rw_sem_test_2]
>> [Thu Jun 12 11:01:33 2025] kthread+0x39f/0x750
>> [Thu Jun 12 11:01:33 2025] ? __pfx_kthread+0x10/0x10
>> [Thu Jun 12 11:01:33 2025] ? __pfx__raw_spin_lock_irq+0x10/0x10
>> [Thu Jun 12 11:01:33 2025] ? __pfx_kthread+0x10/0x10
>> [Thu Jun 12 11:01:33 2025] ret_from_fork+0x25d/0x320
>> [Thu Jun 12 11:01:33 2025] ? __pfx_kthread+0x10/0x10
>> [Thu Jun 12 11:01:33 2025] ret_from_fork_asm+0x1a/0x30
>> [Thu Jun 12 11:01:33 2025] </TASK>
>> [Thu Jun 12 11:01:33 2025] INFO: task rw_sem_thread2:36526 <writer> blocked on an rw-semaphore likely owned by task rw_sem_thread1:36525 <writer>
>> [Thu Jun 12 11:01:33 2025] task:rw_sem_thread1 state:S stack:0 pid:36525 tgid:36525 ppid:2 task_flags:0x208040 flags:0x00004000
>> [Thu Jun 12 11:01:33 2025] Call Trace:
>> [Thu Jun 12 11:01:33 2025] <TASK>
>> [Thu Jun 12 11:01:33 2025] __schedule+0x7c7/0x1930
>> [Thu Jun 12 11:01:33 2025] ? __pfx___schedule+0x10/0x10
>> [Thu Jun 12 11:01:33 2025] ? __mod_timer+0x304/0xa80
>> [Thu Jun 12 11:01:33 2025] ? irq_work_queue+0x6a/0xa0
>> [Thu Jun 12 11:01:33 2025] ? __pfx_vprintk_emit+0x10/0x10
>> [Thu Jun 12 11:01:33 2025] schedule+0x6a/0x180
>> [Thu Jun 12 11:01:33 2025] schedule_timeout+0xfb/0x230
>> [Thu Jun 12 11:01:33 2025] ? __pfx_schedule_timeout+0x10/0x10
>> [Thu Jun 12 11:01:33 2025] ? __pfx_process_timeout+0x10/0x10
>> [Thu Jun 12 11:01:33 2025] ? down_write+0xc4/0x140
>> [Thu Jun 12 11:01:33 2025] msleep_interruptible+0xbe/0x150
>> [Thu Jun 12 11:01:33 2025] ? __pfx_thread1_func+0x10/0x10 [rw_sem_test_2]
>> [Thu Jun 12 11:01:33 2025] thread1_func+0x37/0x60 [rw_sem_test_2]
>> [Thu Jun 12 11:01:33 2025] kthread+0x39f/0x750
>> [Thu Jun 12 11:01:33 2025] ? __pfx_kthread+0x10/0x10
>> [Thu Jun 12 11:01:33 2025] ? __pfx__raw_spin_lock_irq+0x10/0x10
>> [Thu Jun 12 11:01:33 2025] ? __pfx_kthread+0x10/0x10
>> [Thu Jun 12 11:01:33 2025] ret_from_fork+0x25d/0x320
>> [Thu Jun 12 11:01:33 2025] ? __pfx_kthread+0x10/0x10
>> [Thu Jun 12 11:01:33 2025] ret_from_fork_asm+0x1a/0x30
>> [Thu Jun 12 11:01:33 2025] </TASK>
>>
>> [1] https://lore.kernel.org/all/174046694331.2194069.15472952050240807469.stgit@mhiramat.tok.corp.google.com/
>>
>> Suggested-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
>> Signed-off-by: Lance Yang <lance.yang@linux.dev>
>
> Looks good to me.
>
> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
>
> BTW, can you also add a patch to extend the test module
> (samples/hung_task/hung_task_tests.c) to handle the rwsem
> blocking case ? (reader and writer side)
Thanks! Will do.
Lance
>
> Thank you,
>
>
>> ---
>> include/linux/hung_task.h | 18 +++++++++---------
>> kernel/hung_task.c | 29 +++++++++++++++++++++++++----
>> kernel/locking/rwsem.c | 17 ++++++++++++++++-
>> 3 files changed, 50 insertions(+), 14 deletions(-)
>>
>> diff --git a/include/linux/hung_task.h b/include/linux/hung_task.h
>> index 1bc2b3244613..34e615c76ca5 100644
>> --- a/include/linux/hung_task.h
>> +++ b/include/linux/hung_task.h
>> @@ -21,17 +21,17 @@
>> * type.
>> *
>> * Type encoding:
>> - * 00 - Blocked on mutex (BLOCKER_TYPE_MUTEX)
>> - * 01 - Blocked on semaphore (BLOCKER_TYPE_SEM)
>> - * 10 - Blocked on rt-mutex (BLOCKER_TYPE_RTMUTEX)
>> - * 11 - Blocked on rw-semaphore (BLOCKER_TYPE_RWSEM)
>> + * 00 - Blocked on mutex (BLOCKER_TYPE_MUTEX)
>> + * 01 - Blocked on semaphore (BLOCKER_TYPE_SEM)
>> + * 10 - Blocked on rw-semaphore as READER (BLOCKER_TYPE_RWSEM_READER)
>> + * 11 - Blocked on rw-semaphore as WRITER (BLOCKER_TYPE_RWSEM_WRITER)
>> */
>> -#define BLOCKER_TYPE_MUTEX 0x00UL
>> -#define BLOCKER_TYPE_SEM 0x01UL
>> -#define BLOCKER_TYPE_RTMUTEX 0x02UL
>> -#define BLOCKER_TYPE_RWSEM 0x03UL
>> +#define BLOCKER_TYPE_MUTEX 0x00UL
>> +#define BLOCKER_TYPE_SEM 0x01UL
>> +#define BLOCKER_TYPE_RWSEM_READER 0x02UL
>> +#define BLOCKER_TYPE_RWSEM_WRITER 0x03UL
>>
>> -#define BLOCKER_TYPE_MASK 0x03UL
>> +#define BLOCKER_TYPE_MASK 0x03UL
>>
>> #ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
>> static inline void hung_task_set_blocker(void *lock, unsigned long type)
>> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
>> index d2432df2b905..8708a1205f82 100644
>> --- a/kernel/hung_task.c
>> +++ b/kernel/hung_task.c
>> @@ -23,6 +23,7 @@
>> #include <linux/sched/debug.h>
>> #include <linux/sched/sysctl.h>
>> #include <linux/hung_task.h>
>> +#include <linux/rwsem.h>
>>
>> #include <trace/events/sched.h>
>>
>> @@ -100,6 +101,7 @@ static void debug_show_blocker(struct task_struct *task)
>> {
>> struct task_struct *g, *t;
>> unsigned long owner, blocker, blocker_type;
>> + const char *rwsem_blocked_by, *rwsem_blocked_as;
>>
>> RCU_LOCKDEP_WARN(!rcu_read_lock_held(), "No rcu lock held");
>>
>> @@ -111,12 +113,20 @@ static void debug_show_blocker(struct task_struct *task)
>>
>> switch (blocker_type) {
>> case BLOCKER_TYPE_MUTEX:
>> - owner = mutex_get_owner(
>> - (struct mutex *)hung_task_blocker_to_lock(blocker));
>> + owner = mutex_get_owner(hung_task_blocker_to_lock(blocker));
>> break;
>> case BLOCKER_TYPE_SEM:
>> - owner = sem_last_holder(
>> - (struct semaphore *)hung_task_blocker_to_lock(blocker));
>> + owner = sem_last_holder(hung_task_blocker_to_lock(blocker));
>> + break;
>> + case BLOCKER_TYPE_RWSEM_READER:
>> + case BLOCKER_TYPE_RWSEM_WRITER:
>> + owner = (unsigned long)rwsem_owner(
>> + hung_task_blocker_to_lock(blocker));
>> + rwsem_blocked_as = (blocker_type == BLOCKER_TYPE_RWSEM_READER) ?
>> + "reader" : "writer";
>> + rwsem_blocked_by = is_rwsem_reader_owned(
>> + hung_task_blocker_to_lock(blocker)) ?
>> + "reader" : "writer";
>> break;
>> default:
>> WARN_ON_ONCE(1);
>> @@ -134,6 +144,11 @@ static void debug_show_blocker(struct task_struct *task)
>> pr_err("INFO: task %s:%d is blocked on a semaphore, but the last holder is not found.\n",
>> task->comm, task->pid);
>> break;
>> + case BLOCKER_TYPE_RWSEM_READER:
>> + case BLOCKER_TYPE_RWSEM_WRITER:
>> + pr_err("INFO: task %s:%d is blocked on an rw-semaphore, but the owner is not found.\n",
>> + task->comm, task->pid);
>> + break;
>> }
>> return;
>> }
>> @@ -152,6 +167,12 @@ static void debug_show_blocker(struct task_struct *task)
>> pr_err("INFO: task %s:%d blocked on a semaphore likely last held by task %s:%d\n",
>> task->comm, task->pid, t->comm, t->pid);
>> break;
>> + case BLOCKER_TYPE_RWSEM_READER:
>> + case BLOCKER_TYPE_RWSEM_WRITER:
>> + pr_err("INFO: task %s:%d <%s> blocked on an rw-semaphore likely owned by task %s:%d <%s>\n",
>> + task->comm, task->pid, rwsem_blocked_as, t->comm,
>> + t->pid, rwsem_blocked_by);
>> + break;
>> }
>> sched_show_task(t);
>> return;
>> diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
>> index a310eb9896de..92c6332da401 100644
>> --- a/kernel/locking/rwsem.c
>> +++ b/kernel/locking/rwsem.c
>> @@ -27,6 +27,7 @@
>> #include <linux/export.h>
>> #include <linux/rwsem.h>
>> #include <linux/atomic.h>
>> +#include <linux/hung_task.h>
>> #include <trace/events/lock.h>
>>
>> #ifndef CONFIG_PREEMPT_RT
>> @@ -1065,10 +1066,13 @@ rwsem_down_read_slowpath(struct rw_semaphore *sem, long count, unsigned int stat
>> wake_up_q(&wake_q);
>>
>> trace_contention_begin(sem, LCB_F_READ);
>> + set_current_state(state);
>> +
>> + if (state == TASK_UNINTERRUPTIBLE)
>> + hung_task_set_blocker(sem, BLOCKER_TYPE_RWSEM_READER);
>>
>> /* wait to be given the lock */
>> for (;;) {
>> - set_current_state(state);
>> if (!smp_load_acquire(&waiter.task)) {
>> /* Matches rwsem_mark_wake()'s smp_store_release(). */
>> break;
>> @@ -1083,8 +1087,12 @@ rwsem_down_read_slowpath(struct rw_semaphore *sem, long count, unsigned int stat
>> }
>> schedule_preempt_disabled();
>> lockevent_inc(rwsem_sleep_reader);
>> + set_current_state(state);
>> }
>>
>> + if (state == TASK_UNINTERRUPTIBLE)
>> + hung_task_clear_blocker();
>> +
>> __set_current_state(TASK_RUNNING);
>> lockevent_inc(rwsem_rlock);
>> trace_contention_end(sem, 0);
>> @@ -1146,6 +1154,9 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
>> set_current_state(state);
>> trace_contention_begin(sem, LCB_F_WRITE);
>>
>> + if (state == TASK_UNINTERRUPTIBLE)
>> + hung_task_set_blocker(sem, BLOCKER_TYPE_RWSEM_WRITER);
>> +
>> for (;;) {
>> if (rwsem_try_write_lock(sem, &waiter)) {
>> /* rwsem_try_write_lock() implies ACQUIRE on success */
>> @@ -1179,6 +1190,10 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
>> trylock_again:
>> raw_spin_lock_irq(&sem->wait_lock);
>> }
>> +
>> + if (state == TASK_UNINTERRUPTIBLE)
>> + hung_task_clear_blocker();
>> +
>> __set_current_state(TASK_RUNNING);
>> raw_spin_unlock_irq(&sem->wait_lock);
>> lockevent_inc(rwsem_wlock);
>> --
>> 2.49.0
>>
>
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH RFC 2/3] locking/rwsem: clear reader-owner on unlock to reduce false positives
2025-06-24 3:53 ` Masami Hiramatsu
@ 2025-06-24 5:02 ` Lance Yang
2025-06-24 6:13 ` Masami Hiramatsu
0 siblings, 1 reply; 15+ messages in thread
From: Lance Yang @ 2025-06-24 5:02 UTC (permalink / raw)
To: Masami Hiramatsu (Google)
Cc: akpm, zi.li, anna.schumaker, boqun.feng, joel.granados, jstultz,
kent.overstreet, leonylgao, linux-kernel, longman, mingo,
mingzhe.yang, peterz, rostedt, senozhatsky, tfiga, will,
Lance Yang
On 2025/6/24 11:53, Masami Hiramatsu (Google) wrote:
> On Tue, 24 Jun 2025 09:44:55 +0800
> Lance Yang <lance.yang@linux.dev> wrote:
>
>>
>>
>> On 2025/6/24 08:26, Masami Hiramatsu (Google) wrote:
>>> On Thu, 12 Jun 2025 12:19:25 +0800
>>> Lance Yang <ioworker0@gmail.com> wrote:
>>>
>>>> From: Lance Yang <lance.yang@linux.dev>
>>>>
>>>> When CONFIG_DETECT_HUNG_TASK_BLOCKER is enabled, a stale owner pointer in a
>>>> reader-owned rwsem can lead to false positives in blocker tracking.
>>>>
>>>> To mitigate this, let’s try to clear the owner field on unlock, as a NULL
>>>> owner is better than a stale one for diagnostics.
>>>
>>> Can we merge this to [PATCH 1/3]? It seems that you removed #ifdef and
>>> remove it. This means in anyway we need the feature enabled by DEBUG_RWSEMS.
>>
>> Thanks for the feedback! I see your point about the dependency ;)
>>
>> Personlly, I'd perfer to keep them separate. The reasoning is that
>> they addreess two distinct things, and I think splitting them makes
>> this series clearer and easier to review ;)
>>
>> Patch #1 focuses on "ownership tracking": Its only job is to make
>> the existing owner-related helpers (rwsem_owner(), is_rwsem_reader_owned())
>> globally available when blocker tracking is enabled.
>>
>> Patch #2, on the other hand, is about "reader-owner cleanup": It
>> introduces a functional change to the unlock path, trying to clear
>> the owner field for reader-owned rwsems.
>
> But without clearing the owner, the owner information can be
> broken, right? Since CONFIG_DEBUG_RWSEMS is working as it is,
You're right, the owner info would be broken without the cleanup logic
in patch #2. But ...
> I think those cannot be decoupled. For example, comparing the
> result of both DETECT_HUNG_TASK_BLOCKER and DEBUG_RWSEMS are
> enabled and only DETECT_HUNG_TASK_BLOCKER is enabled, the
> result is different.
The actual blocker tracking for rwsems is only turned on in patch #3.
So, there's no case where the feature is active without the cleanup
logic already being in place.
>
>>
>> Does this reasoning make sense to you?
>
> Sorry, no. I think "reader-owner cleanup" is a part of "ownership
> tracking" as DEBUG_RWSEMS does (and that keeps consistency of
> the ownership tracking behavior same as DEBUG_RWSEM).
I thought this step-by-step approach was a bit cleaner, since there are
currently only two users for these owner helpers (DEBUG_RWSEMS and
DETECT_HUNG_TASK_BLOCKER).
Anyway, if you still feel strongly that they should be merged, I'm happy
to rework the series as you suggested ;p
Thanks,
Lance
>
> Thank you,
>
>>
>> Thanks,
>> Lance
>>
>>>
>>> Thanks,
>>>
>>>>
>>>> Signed-off-by: Lance Yang <lance.yang@linux.dev>
>>>> ---
>>>> kernel/locking/rwsem.c | 10 ++++------
>>>> 1 file changed, 4 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
>>>> index 6cb29442d4fc..a310eb9896de 100644
>>>> --- a/kernel/locking/rwsem.c
>>>> +++ b/kernel/locking/rwsem.c
>>>> @@ -205,14 +205,12 @@ bool is_rwsem_reader_owned(struct rw_semaphore *sem)
>>>> return false;
>>>> return rwsem_test_oflags(sem, RWSEM_READER_OWNED);
>>>> }
>>>> -#endif
>>>>
>>>> -#ifdef CONFIG_DEBUG_RWSEMS
>>>> /*
>>>> - * With CONFIG_DEBUG_RWSEMS configured, it will make sure that if there
>>>> - * is a task pointer in owner of a reader-owned rwsem, it will be the
>>>> - * real owner or one of the real owners. The only exception is when the
>>>> - * unlock is done by up_read_non_owner().
>>>> + * With CONFIG_DEBUG_RWSEMS or CONFIG_DETECT_HUNG_TASK_BLOCKER configured,
>>>> + * it will make sure that the owner field of a reader-owned rwsem either
>>>> + * points to a real reader-owner(s) or gets cleared. The only exception is
>>>> + * when the unlock is done by up_read_non_owner().
>>>> */
>>>> static inline void rwsem_clear_reader_owned(struct rw_semaphore *sem)
>>>> {
>>>> --
>>>> 2.49.0
>>>>
>>>
>>>
>>
>
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH RFC 2/3] locking/rwsem: clear reader-owner on unlock to reduce false positives
2025-06-24 5:02 ` Lance Yang
@ 2025-06-24 6:13 ` Masami Hiramatsu
2025-06-24 7:38 ` Lance Yang
0 siblings, 1 reply; 15+ messages in thread
From: Masami Hiramatsu @ 2025-06-24 6:13 UTC (permalink / raw)
To: Lance Yang
Cc: akpm, zi.li, anna.schumaker, boqun.feng, joel.granados, jstultz,
kent.overstreet, leonylgao, linux-kernel, longman, mingo,
mingzhe.yang, peterz, rostedt, senozhatsky, tfiga, will,
Lance Yang
On Tue, 24 Jun 2025 13:02:31 +0800
Lance Yang <lance.yang@linux.dev> wrote:
>
>
> On 2025/6/24 11:53, Masami Hiramatsu (Google) wrote:
> > On Tue, 24 Jun 2025 09:44:55 +0800
> > Lance Yang <lance.yang@linux.dev> wrote:
> >
> >>
> >>
> >> On 2025/6/24 08:26, Masami Hiramatsu (Google) wrote:
> >>> On Thu, 12 Jun 2025 12:19:25 +0800
> >>> Lance Yang <ioworker0@gmail.com> wrote:
> >>>
> >>>> From: Lance Yang <lance.yang@linux.dev>
> >>>>
> >>>> When CONFIG_DETECT_HUNG_TASK_BLOCKER is enabled, a stale owner pointer in a
> >>>> reader-owned rwsem can lead to false positives in blocker tracking.
> >>>>
> >>>> To mitigate this, let’s try to clear the owner field on unlock, as a NULL
> >>>> owner is better than a stale one for diagnostics.
> >>>
> >>> Can we merge this to [PATCH 1/3]? It seems that you removed #ifdef and
> >>> remove it. This means in anyway we need the feature enabled by DEBUG_RWSEMS.
> >>
> >> Thanks for the feedback! I see your point about the dependency ;)
> >>
> >> Personlly, I'd perfer to keep them separate. The reasoning is that
> >> they addreess two distinct things, and I think splitting them makes
> >> this series clearer and easier to review ;)
> >>
> >> Patch #1 focuses on "ownership tracking": Its only job is to make
> >> the existing owner-related helpers (rwsem_owner(), is_rwsem_reader_owned())
> >> globally available when blocker tracking is enabled.
> >>
> >> Patch #2, on the other hand, is about "reader-owner cleanup": It
> >> introduces a functional change to the unlock path, trying to clear
> >> the owner field for reader-owned rwsems.
> >
> > But without clearing the owner, the owner information can be
> > broken, right? Since CONFIG_DEBUG_RWSEMS is working as it is,
>
> You're right, the owner info would be broken without the cleanup logic
> in patch #2. But ...
>
> > I think those cannot be decoupled. For example, comparing the
> > result of both DETECT_HUNG_TASK_BLOCKER and DEBUG_RWSEMS are
> > enabled and only DETECT_HUNG_TASK_BLOCKER is enabled, the
> > result is different.
>
> The actual blocker tracking for rwsems is only turned on in patch #3.
> So, there's no case where the feature is active without the cleanup
> logic already being in place.
>
> >
> >>
> >> Does this reasoning make sense to you?
> >
> > Sorry, no. I think "reader-owner cleanup" is a part of "ownership
> > tracking" as DEBUG_RWSEMS does (and that keeps consistency of
> > the ownership tracking behavior same as DEBUG_RWSEM).
>
> I thought this step-by-step approach was a bit cleaner, since there are
> currently only two users for these owner helpers (DEBUG_RWSEMS and
> DETECT_HUNG_TASK_BLOCKER).
I think the step-by-step approach fits better if the feature is evolving
(a working feature is already there.) I don't like the intermediate
state which does not work correctly, because if we have a unit test(
like kUnit) it should fail. If you can say "this finds the rwsem
owner as same as what the CONFIG_DEBUG_RWSEM is doing", it is simpler
to explain what you are doing, and easy to understand.
>
> Anyway, if you still feel strongly that they should be merged, I'm happy
> to rework the series as you suggested ;p
Thanks,
>
> Thanks,
> Lance
>
> >
> > Thank you,
> >
> >>
> >> Thanks,
> >> Lance
> >>
> >>>
> >>> Thanks,
> >>>
> >>>>
> >>>> Signed-off-by: Lance Yang <lance.yang@linux.dev>
> >>>> ---
> >>>> kernel/locking/rwsem.c | 10 ++++------
> >>>> 1 file changed, 4 insertions(+), 6 deletions(-)
> >>>>
> >>>> diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
> >>>> index 6cb29442d4fc..a310eb9896de 100644
> >>>> --- a/kernel/locking/rwsem.c
> >>>> +++ b/kernel/locking/rwsem.c
> >>>> @@ -205,14 +205,12 @@ bool is_rwsem_reader_owned(struct rw_semaphore *sem)
> >>>> return false;
> >>>> return rwsem_test_oflags(sem, RWSEM_READER_OWNED);
> >>>> }
> >>>> -#endif
> >>>>
> >>>> -#ifdef CONFIG_DEBUG_RWSEMS
> >>>> /*
> >>>> - * With CONFIG_DEBUG_RWSEMS configured, it will make sure that if there
> >>>> - * is a task pointer in owner of a reader-owned rwsem, it will be the
> >>>> - * real owner or one of the real owners. The only exception is when the
> >>>> - * unlock is done by up_read_non_owner().
> >>>> + * With CONFIG_DEBUG_RWSEMS or CONFIG_DETECT_HUNG_TASK_BLOCKER configured,
> >>>> + * it will make sure that the owner field of a reader-owned rwsem either
> >>>> + * points to a real reader-owner(s) or gets cleared. The only exception is
> >>>> + * when the unlock is done by up_read_non_owner().
> >>>> */
> >>>> static inline void rwsem_clear_reader_owned(struct rw_semaphore *sem)
> >>>> {
> >>>> --
> >>>> 2.49.0
> >>>>
> >>>
> >>>
> >>
> >
> >
>
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH RFC 2/3] locking/rwsem: clear reader-owner on unlock to reduce false positives
2025-06-24 6:13 ` Masami Hiramatsu
@ 2025-06-24 7:38 ` Lance Yang
0 siblings, 0 replies; 15+ messages in thread
From: Lance Yang @ 2025-06-24 7:38 UTC (permalink / raw)
To: Masami Hiramatsu (Google)
Cc: akpm, zi.li, anna.schumaker, boqun.feng, joel.granados, jstultz,
kent.overstreet, leonylgao, linux-kernel, longman, mingo,
mingzhe.yang, peterz, rostedt, senozhatsky, tfiga, will,
Lance Yang
On 2025/6/24 14:13, Masami Hiramatsu (Google) wrote:
> On Tue, 24 Jun 2025 13:02:31 +0800
> Lance Yang <lance.yang@linux.dev> wrote:
>
>>
>>
>> On 2025/6/24 11:53, Masami Hiramatsu (Google) wrote:
>>> On Tue, 24 Jun 2025 09:44:55 +0800
>>> Lance Yang <lance.yang@linux.dev> wrote:
>>>
>>>>
>>>>
>>>> On 2025/6/24 08:26, Masami Hiramatsu (Google) wrote:
>>>>> On Thu, 12 Jun 2025 12:19:25 +0800
>>>>> Lance Yang <ioworker0@gmail.com> wrote:
>>>>>
>>>>>> From: Lance Yang <lance.yang@linux.dev>
>>>>>>
>>>>>> When CONFIG_DETECT_HUNG_TASK_BLOCKER is enabled, a stale owner pointer in a
>>>>>> reader-owned rwsem can lead to false positives in blocker tracking.
>>>>>>
>>>>>> To mitigate this, let’s try to clear the owner field on unlock, as a NULL
>>>>>> owner is better than a stale one for diagnostics.
>>>>>
>>>>> Can we merge this to [PATCH 1/3]? It seems that you removed #ifdef and
>>>>> remove it. This means in anyway we need the feature enabled by DEBUG_RWSEMS.
>>>>
>>>> Thanks for the feedback! I see your point about the dependency ;)
>>>>
>>>> Personlly, I'd perfer to keep them separate. The reasoning is that
>>>> they addreess two distinct things, and I think splitting them makes
>>>> this series clearer and easier to review ;)
>>>>
>>>> Patch #1 focuses on "ownership tracking": Its only job is to make
>>>> the existing owner-related helpers (rwsem_owner(), is_rwsem_reader_owned())
>>>> globally available when blocker tracking is enabled.
>>>>
>>>> Patch #2, on the other hand, is about "reader-owner cleanup": It
>>>> introduces a functional change to the unlock path, trying to clear
>>>> the owner field for reader-owned rwsems.
>>>
>>> But without clearing the owner, the owner information can be
>>> broken, right? Since CONFIG_DEBUG_RWSEMS is working as it is,
>>
>> You're right, the owner info would be broken without the cleanup logic
>> in patch #2. But ...
>>
>>> I think those cannot be decoupled. For example, comparing the
>>> result of both DETECT_HUNG_TASK_BLOCKER and DEBUG_RWSEMS are
>>> enabled and only DETECT_HUNG_TASK_BLOCKER is enabled, the
>>> result is different.
>>
>> The actual blocker tracking for rwsems is only turned on in patch #3.
>> So, there's no case where the feature is active without the cleanup
>> logic already being in place.
>>
>>>
>>>>
>>>> Does this reasoning make sense to you?
>>>
>>> Sorry, no. I think "reader-owner cleanup" is a part of "ownership
>>> tracking" as DEBUG_RWSEMS does (and that keeps consistency of
>>> the ownership tracking behavior same as DEBUG_RWSEM).
>>
>> I thought this step-by-step approach was a bit cleaner, since there are
>> currently only two users for these owner helpers (DEBUG_RWSEMS and
>> DETECT_HUNG_TASK_BLOCKER).
>
> I think the step-by-step approach fits better if the feature is evolving
> (a working feature is already there.) I don't like the intermediate
Agreed.
> state which does not work correctly, because if we have a unit test(
> like kUnit) it should fail. If you can say "this finds the rwsem
Ah, I missed that ...
> owner as same as what the CONFIG_DEBUG_RWSEM is doing", it is simpler
> to explain what you are doing, and easy to understand.
Thanks for the lesson! Will rework the series as you suggested ;)
Lance
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2025-06-24 7:38 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-12 4:19 [PATCH RFC 0/3] extend hung task blocker tracking to rwsems Lance Yang
2025-06-12 4:19 ` [PATCH RFC 1/3] locking/rwsem: make owner helpers globally available Lance Yang
2025-06-24 0:17 ` Masami Hiramatsu
2025-06-24 1:35 ` Lance Yang
2025-06-12 4:19 ` [PATCH RFC 2/3] locking/rwsem: clear reader-owner on unlock to reduce false positives Lance Yang
2025-06-24 0:26 ` Masami Hiramatsu
2025-06-24 1:44 ` Lance Yang
2025-06-24 3:53 ` Masami Hiramatsu
2025-06-24 5:02 ` Lance Yang
2025-06-24 6:13 ` Masami Hiramatsu
2025-06-24 7:38 ` Lance Yang
2025-06-12 4:19 ` [PATCH RFC 3/3] hung_task: extend hung task blocker tracking to rwsems Lance Yang
2025-06-24 3:57 ` Masami Hiramatsu
2025-06-24 4:25 ` Lance Yang
2025-06-23 12:33 ` [PATCH RFC 0/3] " Lance Yang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).