From: Mateusz Guzik <mjguzik@gmail.com>
To: Jan Kara <jack@suse.cz>
Cc: Julian Sun <sunjunchao@bytedance.com>,
linux-fsdevel@vger.kernel.org, brauner@kernel.org,
viro@zeniv.linux.org.uk, peterz@infradead.org,
akpm@linux-foundation.org, Lance Yang <lance.yang@linux.dev>
Subject: Re: [PATCH] write-back: Wake up waiting tasks when finishing the writeback of a chunk.
Date: Fri, 26 Sep 2025 18:14:33 +0200 [thread overview]
Message-ID: <CAGudoHEpCYaA9omAXoJWraDTL5hMSfrj3Cufe_W5sEcSuonX_g@mail.gmail.com> (raw)
In-Reply-To: <dvfobm24dgl4hhvirwabai47toypkvrimv7rthevrcvig6xmjf@scpmbv7qucme>
On Fri, Sep 26, 2025 at 5:55 PM Jan Kara <jack@suse.cz> wrote:
>
> On Fri 26-09-25 17:50:43, Mateusz Guzik wrote:
> > On Fri, Sep 26, 2025 at 5:48 PM Jan Kara <jack@suse.cz> wrote:
> > >
> > > On Fri 26-09-25 14:05:59, Mateusz Guzik wrote:
> > > > On Fri, Sep 26, 2025 at 1:43 PM Julian Sun <sunjunchao@bytedance.com> wrote:
> > > > >
> > > > > On 9/26/25 7:17 PM, Mateusz Guzik wrote:
> > > > > > On Fri, Sep 26, 2025 at 4:26 AM Julian Sun <sunjunchao@bytedance.com> wrote:
> > > > > >>
> > > > > >> On 9/26/25 1:25 AM, Mateusz Guzik wrote:
> > > > > >>> On Thu, Sep 25, 2025 at 09:22:39PM +0800, Julian Sun wrote:
> > > > > >>>> Writing back a large number of pages can take a lots of time.
> > > > > >>>> This issue is exacerbated when the underlying device is slow or
> > > > > >>>> subject to block layer rate limiting, which in turn triggers
> > > > > >>>> unexpected hung task warnings.
> > > > > >>>>
> > > > > >>>> We can trigger a wake-up once a chunk has been written back and the
> > > > > >>>> waiting time for writeback exceeds half of
> > > > > >>>> sysctl_hung_task_timeout_secs.
> > > > > >>>> This action allows the hung task detector to be aware of the writeback
> > > > > >>>> progress, thereby eliminating these unexpected hung task warnings.
> > > > > >>>>
> > > > > >>>
> > > > > >>> If I'm reading correctly this is also messing with stats how long the
> > > > > >>> thread was stuck to begin with.
> > > > > >>
> > > > > >> IMO, it will not mess up the time. Since it only updates the time when
> > > > > >> we can see progress (which is not a hang). If the task really hangs for
> > > > > >> a long time, then we can't perform the time update—so it will not mess
> > > > > >> up the time.
> > > > > >>
> > > > > >
> > > > > > My point is that if you are stuck in the kernel for so long for the
> > > > > > hung task detector to take notice, that's still something worth
> > > > > > reporting in some way, even if you are making progress. I presume with
> > > > > > the patch at hand this information is lost.
> > > > > >
> > > > > > For example the detector could be extended to drop a one-liner about
> > > > > > encountering a thread which was unable to leave the kernel for a long
> > > > > > time, even though it is making progress. Bonus points if the message
> > > > > > contained info this is i/o and for which device.
> > > > >
> > > > > Let me understand: you want to print logs when writeback is making
> > > > > progress but is so slow that the task can't exit, correct?
> > > > > I see this as a new requirement different from the existing hung task
> > > > > detector: needing to print info when writeback is slow.
> > > > > Indeed, the existing detector prints warnings in two cases: 1) no
> > > > > writeback progress; 2) progress is made but writeback is so slow it will
> > > > > take too long.
> > > >
> > > > I am saying it would be a nice improvement to extend the htd like that.
> > > >
> > > > And that your patch as proposed would avoidably make it harder -- you
> > > > can still get what you are aiming for without the wakeups.
> > > >
> > > > Also note that when looking at a kernel crashdump it may be beneficial
> > > > to know when a particular thread got first stuck in the kernel, which
> > > > is again gone with your patch.
> > >
> > > I understand your concerns but I think it's stretching the goals for this
> > > patch a bit too much. I'm fine with the patch going in as is and if Julian
> > > is willing to work on this additional debug features, then great!
> > >
> >
> > I am not asking the patch does all that work, merely that it gets
> > implemented in a way which wont require a rewrite should the above
> > work get done. Which boils down to storing the timestamp somewhere in
> > task_struct.
>
> Well, but that doesn't really make much sense without the debug patch
> itself, does it? And it could be a potential discussion point. So I think
> the debug patch should just move the timestamp from the wb completion to
> task_struct if that's needed...
>
I don't follow your e-mail, so I'm going to restate how I see this.
There is a timestamp stored somewhere indicating when was the last
time the thread went off CPU.
There are cases where it went off CPU for a long time, despite the
thing it is waiting for making progress.
This can trigger false-positives in hung task detector.
The proposed patch forces wake ups before this happens. This works
around the problem, but also perturbs a datum which would be useful in
debugging. It also seems weirdly disruptive for what it is aiming for.
When looking at a kernel crashdump it may be useful to know how long
the particular thread was off CPU. This will no longer be accessible.
Even for cases where progress is being made, but takes a long time, it
may still be useful to report it as this does not look like the
expected condition. With the change at hand there is no way to do it.
The reported issue can be worked around by adding a timestamp of 'last
known progress' to one of the holes in task_struct. Then hung task
detector can be trivially patched to bugger off based on it.
Later an interested party can extend the current functionality with
the thing I mentioned.
> Honza
>
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR
prev parent reply other threads:[~2025-09-26 16:14 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-09-25 13:22 [PATCH] write-back: Wake up waiting tasks when finishing the writeback of a chunk Julian Sun
2025-09-25 16:31 ` Jan Kara
2025-09-25 17:25 ` Mateusz Guzik
2025-09-26 2:26 ` Julian Sun
2025-09-26 11:17 ` Mateusz Guzik
2025-09-26 11:43 ` Julian Sun
2025-09-26 12:05 ` Mateusz Guzik
2025-09-26 15:48 ` Jan Kara
2025-09-26 15:50 ` Mateusz Guzik
2025-09-26 15:55 ` Jan Kara
2025-09-26 16:14 ` Mateusz Guzik [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAGudoHEpCYaA9omAXoJWraDTL5hMSfrj3Cufe_W5sEcSuonX_g@mail.gmail.com \
--to=mjguzik@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=brauner@kernel.org \
--cc=jack@suse.cz \
--cc=lance.yang@linux.dev \
--cc=linux-fsdevel@vger.kernel.org \
--cc=peterz@infradead.org \
--cc=sunjunchao@bytedance.com \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).