From: Lance Yang <lance.yang@linux.dev>
To: "Nanji Parmar (he/him)" <nparmar@purestorage.com>
Cc: mhiramat@kernel.org, linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH] hung_task: Skip hung task detection during core dump operations
Date: Thu, 14 Aug 2025 11:12:52 +0800
Message-ID: <33f995c6-4db7-4e4c-ba12-eb5d05e8521c@linux.dev>
In-Reply-To: <20250813150155.81680178704c4652fd454a80@linux-foundation.org>

Hi Nanji,

Thanks for your patch!

On 2025/8/14 06:01, Andrew Morton wrote:
> On Wed, 13 Aug 2025 11:30:36 -0700 "Nanji Parmar (he/him)" <nparmar@purestorage.com> wrote:
> 
>> Tasks involved in core dump operations can legitimately block for
>> extended periods, especially for large memory processes. The hung
>> task detector should skip tasks with PF_DUMPCORE (main dumping
>> thread) or PF_POSTCOREDUMP (other threads in the group) flags to
>> avoid false positive warnings.
>>
>> This prevents incorrect hung task reports during legitimate core
>> dump generation that can take xx minutes for large processes.
> 
> It isn't pleasing to be putting coredump special cases into the core of
> the hung-task detector.  Perhaps the hung task detector should get an

Yeah, adding a special case for coredumps is not a good design ;)

> equivalent to touch_softlockup_watchdog().  I'm surprised it doesn't
> already have such a thing.  Maybe it does and I've forgotten where it is.
> 
> Please provide a full description of the problem, mainly the relevant
> dmesg output.  Please always provide this full description when
> addressing kernel issues, thanks.

Interestingly, I wasn't able to reproduce the hung task warning on my
machine with an SSD, even when generating a 100 GiB coredump. The process
switches between R and D states so fast that it never hits the timeout,
even with hung_task_timeout_secs set as low as 5s ;)

So it seems this isn't a general problem for all coredumps. It looks like
it only happens on systems with slow I/O, which can cause a process to
stay in D state for a long time.

Anyway, any task *actually* blocked on I/O for that long should be flagged;
that is the hung task detector's job, IMHO.

Thanks,
Lance



