* [PATCH] hung_task: Skip hung task detection during core dump operations
@ 2025-08-13 18:30 Nanji Parmar (he/him)
2025-08-13 22:01 ` Andrew Morton
0 siblings, 1 reply; 5+ messages in thread
From: Nanji Parmar (he/him) @ 2025-08-13 18:30 UTC (permalink / raw)
To: akpm, lance.yang, mhiramat; +Cc: linux-kernel
[-- Attachment #1.1: Type: text/plain, Size: 1981 bytes --]
Hi,
This patch fixes false positive hung task warnings during core dump
operations for processes with large memory footprints.
During testing with processes using from several GB to more than 1 TB
of memory, core dump generation takes many minutes, causing the hung
task detector to incorrectly flag threads as hung. The fix checks for
either the PF_DUMPCORE or the PF_POSTCOREDUMP flag before reporting a
task as hung.
Tested on systems with large-memory processes.
Best regards,
Nanji
---
From 45460c6882b602669b25a57f3a2f7ea8a8ea0f84 Mon Sep 17 00:00:00 2001
From: Nanji Parmar <nparmar@purestorage.com>
Date: Wed, 13 Aug 2025 12:14:35 -0600
Subject: [PATCH] hung_task: Exclude core dump tasks from hung task detection
Tasks involved in core dump operations can legitimately block for
extended periods, especially for large memory processes. The hung
task detector should skip tasks with PF_DUMPCORE (main dumping
thread) or PF_POSTCOREDUMP (other threads in the group) flags to
avoid false positive warnings.
This prevents incorrect hung task reports during legitimate core
dump generation that can take xx minutes for large processes.
Signed-off-by: Nanji Parmar <nparmar@purestorage.com>
---
kernel/hung_task.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index 8708a1205f82..0fc3352d0f0e 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -217,6 +217,13 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
 	 */
 	sysctl_hung_task_detect_count++;
+	/* Skip hung task detection for tasks involved in core dump operations */
+	if (t->flags & (PF_DUMPCORE | PF_POSTCOREDUMP)) {
+		pr_info("Skipping hung task check for coredump-related task %s:%d (blocked %ld seconds)\n",
+			t->comm, t->pid, (jiffies - t->last_switch_time) / HZ);
+		return;
+	}
+
 	trace_sched_process_hang(t);
 	if (sysctl_hung_task_panic) {
--
2.50.1
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] hung_task: Skip hung task detection during core dump operations
2025-08-13 18:30 [PATCH] hung_task: Skip hung task detection during core dump operations Nanji Parmar (he/him)
@ 2025-08-13 22:01 ` Andrew Morton
2025-08-14 3:12 ` Lance Yang
2025-08-14 6:58 ` Masami Hiramatsu
0 siblings, 2 replies; 5+ messages in thread
From: Andrew Morton @ 2025-08-13 22:01 UTC (permalink / raw)
To: Nanji Parmar (he/him); +Cc: lance.yang, mhiramat, linux-kernel
On Wed, 13 Aug 2025 11:30:36 -0700 "Nanji Parmar (he/him)" <nparmar@purestorage.com> wrote:
> Tasks involved in core dump operations can legitimately block for
> extended periods, especially for large memory processes. The hung
> task detector should skip tasks with PF_DUMPCORE (main dumping
> thread) or PF_POSTCOREDUMP (other threads in the group) flags to
> avoid false positive warnings.
>
> This prevents incorrect hung task reports during legitimate core
> dump generation that can take xx minutes for large processes.
It isn't pleasing to be putting coredump special cases into the core of
the hung-task detector. Perhaps the hung task detector should get an
equivalent to touch_softlockup_watchdog(). I'm surprised it doesn't
already have such a thing. Maybe it does and I've forgotten where it is.
Please provide a full description of the problem, mainly the relevant
dmesg output. Please always provide this full description when
addressing kernel issues, thanks.
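For readers unfamiliar with the pattern being suggested here:
touch_softlockup_watchdog() lets code that knowingly runs for a long
time tell the soft-lockup watchdog that it is still making progress.
No hung-task equivalent is shown anywhere in this thread, so what
follows is only a userspace C model of what such a hook might look
like. Every name in it (struct hung_state, touch_hung_task(),
would_report(), dump_model()) is hypothetical, and time is a plain
tick counter rather than jiffies.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-task detector state; not a real kernel structure. */
struct hung_state {
	unsigned long last_progress; /* tick of the last recorded progress */
};

/*
 * Hypothetical hook: a long-running operation calls this to say
 * "still working", in the spirit of touch_softlockup_watchdog().
 */
static void touch_hung_task(struct hung_state *s, unsigned long now)
{
	s->last_progress = now;
}

/* True if no progress has been recorded for at least 'timeout' ticks. */
static bool would_report(const struct hung_state *s, unsigned long now,
			 unsigned long timeout)
{
	return now - s->last_progress >= timeout;
}

/*
 * Model a dump written in 'chunks' chunks of 'chunk_ticks' ticks each.
 * The detector checks on every tick. If 'touch' is set, the writer
 * touches the detector after finishing each chunk.
 * Returns how many checks would have reported a hang.
 */
static int dump_model(unsigned int chunks, unsigned long chunk_ticks,
		      unsigned long timeout, bool touch)
{
	struct hung_state s = { .last_progress = 0 };
	unsigned long now = 0;
	int reports = 0;

	assert(timeout > 0);
	for (unsigned int i = 0; i < chunks; i++) {
		for (unsigned long k = 0; k < chunk_ticks; k++) {
			now++;
			if (would_report(&s, now, timeout))
				reports++;
		}
		if (touch)
			touch_hung_task(&s, now);
	}
	return reports;
}
```

In this model, dump_model(10, 3, 5, true) produces no reports because
each chunk finishes inside the timeout, while the same run with
touch == false does. A single chunk longer than the timeout is still
reported even with touching enabled, so a hook of this kind would only
hide waits that are broken up by real progress.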
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] hung_task: Skip hung task detection during core dump operations
2025-08-13 22:01 ` Andrew Morton
@ 2025-08-14 3:12 ` Lance Yang
[not found] ` <CAEK+-Od=j88QND5pZ-K_23fwmacy9enxogzNLxH4PjPYotDh9A@mail.gmail.com>
2025-08-14 6:58 ` Masami Hiramatsu
1 sibling, 1 reply; 5+ messages in thread
From: Lance Yang @ 2025-08-14 3:12 UTC (permalink / raw)
To: Nanji Parmar (he/him); +Cc: mhiramat, linux-kernel, Andrew Morton
Hi Nanji,
Thanks for your patch!
On 2025/8/14 06:01, Andrew Morton wrote:
> On Wed, 13 Aug 2025 11:30:36 -0700 "Nanji Parmar (he/him)" <nparmar@purestorage.com> wrote:
>
>> Tasks involved in core dump operations can legitimately block for
>> extended periods, especially for large memory processes. The hung
>> task detector should skip tasks with PF_DUMPCORE (main dumping
>> thread) or PF_POSTCOREDUMP (other threads in the group) flags to
>> avoid false positive warnings.
>>
>> This prevents incorrect hung task reports during legitimate core
>> dump generation that can take xx minutes for large processes.
>
> It isn't pleasing to be putting coredump special cases into the core of
> the hung-task detector. Perhaps the hung task detector should get an
Yeah, adding a special case for coredumps is not a good design ;)
> equivalent to touch_softlockup_watchdog(). I'm surprised it doesn't
> already have such a thing. Maybe it does and I've forgotten where it is.
>
> Please provide a full description of the problem, mainly the relevant
> dmesg output. Please always provide this full description when
> addressing kernel issues, thanks.
Interestingly, I wasn't able to reproduce the hung task warning on my
machine with an SSD, even when generating a 100 GiB coredump. The process
switches between R and D states so fast that it never hits the timeout,
even with hung_task_timeout_secs set as low as 5s ;)
So it seems this isn't a general problem for all coredumps. It looks like
it only happens on systems with slow I/O, which can cause a process to
stay in D state for a long time.
Anyway, any task *actually* blocked on I/O for that long should be flagged;
that is the hung task detector's job, IMHO.
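To make the R/D observation above concrete, here is a minimal
userspace sketch in C of the per-task check. It is a simplified model,
not kernel code: struct model_task, model_check(), model_run() and the
tick-based clock are made up for illustration. The comparison itself
is modeled on how check_hung_task() compares nvcsw + nivcsw with the
saved last_switch_count; the real function has further checks that
this model leaves out.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-in for the few task fields the check looks at. */
struct model_task {
	unsigned long nvcsw;             /* voluntary context switches */
	unsigned long nivcsw;            /* involuntary context switches */
	unsigned long last_switch_count; /* count seen at the previous check */
	unsigned long last_switch_time;  /* tick of the last observed change */
};

/* Returns true if the task would be reported as hung at tick 'now'. */
static bool model_check(struct model_task *t, unsigned long now,
			unsigned long timeout)
{
	unsigned long switch_count = t->nvcsw + t->nivcsw;

	if (switch_count != t->last_switch_count) {
		/* The task was scheduled since the last check: restart the clock. */
		t->last_switch_count = switch_count;
		t->last_switch_time = now;
		return false;
	}
	return now - t->last_switch_time >= timeout;
}

/*
 * Check the task at ticks 1..ticks. The task context switches every
 * 'period' ticks (0 means it never switches, i.e. one long block).
 * Returns how many checks reported a hang.
 */
static int model_run(unsigned long ticks, unsigned long period,
		     unsigned long timeout)
{
	struct model_task t = { 0 };
	int reports = 0;

	assert(timeout > 0);
	for (unsigned long now = 1; now <= ticks; now++) {
		if (period && now % period == 0)
			t.nvcsw++;
		if (model_check(&t, now, timeout))
			reports++;
	}
	return reports;
}
```

With a timeout of 5 ticks, model_run(60, 2, 5) never reports, matching
the fast-storage case, whereas model_run(60, 0, 5) and
model_run(60, 10, 5) both do, which corresponds to a task sitting in
D state for longer than the timeout on slow I/O.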
Thanks,
Lance
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] hung_task: Skip hung task detection during core dump operations
2025-08-13 22:01 ` Andrew Morton
2025-08-14 3:12 ` Lance Yang
@ 2025-08-14 6:58 ` Masami Hiramatsu
1 sibling, 0 replies; 5+ messages in thread
From: Masami Hiramatsu @ 2025-08-14 6:58 UTC (permalink / raw)
To: Andrew Morton; +Cc: Nanji Parmar (he/him), lance.yang, mhiramat, linux-kernel
On Wed, 13 Aug 2025 15:01:55 -0700
Andrew Morton <akpm@linux-foundation.org> wrote:
> On Wed, 13 Aug 2025 11:30:36 -0700 "Nanji Parmar (he/him)" <nparmar@purestorage.com> wrote:
>
> > Tasks involved in core dump operations can legitimately block for
> > extended periods, especially for large memory processes. The hung
> > task detector should skip tasks with PF_DUMPCORE (main dumping
> > thread) or PF_POSTCOREDUMP (other threads in the group) flags to
> > avoid false positive warnings.
> >
> > This prevents incorrect hung task reports during legitimate core
> > dump generation that can take xx minutes for large processes.
>
> It isn't pleasing to be putting coredump special cases into the core of
> the hung-task detector. Perhaps the hung task detector should get an
> equivalent to touch_softlockup_watchdog(). I'm surprised it doesn't
> already have such a thing. Maybe it does and I've forgotten where it is.
Hmm, maybe we can bump nvcsw/nivcsw to reset the hung task checker.
But normally that would mean the task did a context switch during the
core dump.
>
> Please provide a full description of the problem, mainly the relevant
> dmesg output. Please always provide this full description when
> addressing kernel issues, thanks.
+1, dmesg will show where (in the kernel) we hit the hung_task warning
during the core dump.
Thanks,
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 5+ messages in thread
Thread overview: 5+ messages
2025-08-13 18:30 [PATCH] hung_task: Skip hung task detection during core dump operations Nanji Parmar (he/him)
2025-08-13 22:01 ` Andrew Morton
2025-08-14 3:12 ` Lance Yang
[not found] ` <CAEK+-Od=j88QND5pZ-K_23fwmacy9enxogzNLxH4PjPYotDh9A@mail.gmail.com>
2025-08-14 4:30 ` Lance Yang
2025-08-14 6:58 ` Masami Hiramatsu