From: Julian Sun <sunjunchao@bytedance.com>
To: Mateusz Guzik <mjguzik@gmail.com>
Cc: linux-fsdevel@vger.kernel.org, jack@suse.cz, brauner@kernel.org,
viro@zeniv.linux.org.uk, peterz@infradead.org,
akpm@linux-foundation.org, Lance Yang <lance.yang@linux.dev>
Subject: Re: [PATCH] write-back: Wake up waiting tasks when finishing the writeback of a chunk.
Date: Fri, 26 Sep 2025 10:26:34 +0800 [thread overview]
Message-ID: <5622443b-b5b4-4b19-8a7b-f3923f822dda@bytedance.com> (raw)
In-Reply-To: <fylfqtj5wob72574qjkm7zizc7y4ieb2tanzqdexy4wcgtgov4@h25bh2fsklfn>
On 9/26/25 1:25 AM, Mateusz Guzik wrote:
> On Thu, Sep 25, 2025 at 09:22:39PM +0800, Julian Sun wrote:
>> Writing back a large number of pages can take a lots of time.
>> This issue is exacerbated when the underlying device is slow or
>> subject to block layer rate limiting, which in turn triggers
>> unexpected hung task warnings.
>>
>> We can trigger a wake-up once a chunk has been written back and the
>> waiting time for writeback exceeds half of
>> sysctl_hung_task_timeout_secs.
>> This action allows the hung task detector to be aware of the writeback
>> progress, thereby eliminating these unexpected hung task warnings.
>>
>
> If I'm reading correctly this is also messing with stats how long the
> thread was stuck to begin with.
IMO, it will not mess up the time. Since it only updates the time when
we can see progress (which is not a hang). If the task really hangs for
a long time, then we can't perform the time update—so it will not mess
up the time.
cc Lance and Andrew.
>
> Perhaps it would be better to have a var in task_struct which would
> serve as an indicator of progress being made (e.g., last time stamp of
> said progress).
>
> task_struct already has numerous holes so this would not have to grow it
> above what it is now.
>
>
>> This patch has passed the xfstests 'check -g quick' test based on ext4,
>> with no additional failures introduced.
>>
>> Signed-off-by: Julian Sun <sunjunchao@bytedance.com>
>> Suggested-by: Peter Zijlstra <peterz@infradead.org>
>> ---
>> fs/fs-writeback.c | 13 +++++++++++--
>> include/linux/backing-dev-defs.h | 1 +
>> 2 files changed, 12 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
>> index a07b8cf73ae2..475d52abfb3e 100644
>> --- a/fs/fs-writeback.c
>> +++ b/fs/fs-writeback.c
>> @@ -14,6 +14,7 @@
>> * Additions for address_space-based writeback
>> */
>>
>> +#include <linux/sched/sysctl.h>
>> #include <linux/kernel.h>
>> #include <linux/export.h>
>> #include <linux/spinlock.h>
>> @@ -174,9 +175,12 @@ static void finish_writeback_work(struct wb_writeback_work *work)
>> kfree(work);
>> if (done) {
>> wait_queue_head_t *waitq = done->waitq;
>> + /* Report progress to inform the hung task detector of the progress. */
>> + bool force_wake = (jiffies - done->stamp) >
>> + sysctl_hung_task_timeout_secs * HZ / 2;
>>
>> /* @done can't be accessed after the following dec */
>> - if (atomic_dec_and_test(&done->cnt))
>> + if (atomic_dec_and_test(&done->cnt) || force_wake)
>> wake_up_all(waitq);
>> }
>> }
>> @@ -213,7 +217,7 @@ static void wb_queue_work(struct bdi_writeback *wb,
>> void wb_wait_for_completion(struct wb_completion *done)
>> {
>> atomic_dec(&done->cnt); /* put down the initial count */
>> - wait_event(*done->waitq, !atomic_read(&done->cnt));
>> + wait_event(*done->waitq, ({ done->stamp = jiffies; !atomic_read(&done->cnt); }));
>> }
>>
>> #ifdef CONFIG_CGROUP_WRITEBACK
>> @@ -1975,6 +1979,11 @@ static long writeback_sb_inodes(struct super_block *sb,
>> */
>> __writeback_single_inode(inode, &wbc);
>>
>> + /* Report progress to inform the hung task detector of the progress. */
>> + if (work->done && (jiffies - work->done->stamp) >
>> + HZ * sysctl_hung_task_timeout_secs / 2)
>> + wake_up_all(work->done->waitq);
>> +
>> wbc_detach_inode(&wbc);
>> work->nr_pages -= write_chunk - wbc.nr_to_write;
>> wrote = write_chunk - wbc.nr_to_write - wbc.pages_skipped;
>> diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h
>> index 2ad261082bba..c37c6bd5ef5c 100644
>> --- a/include/linux/backing-dev-defs.h
>> +++ b/include/linux/backing-dev-defs.h
>> @@ -63,6 +63,7 @@ enum wb_reason {
>> struct wb_completion {
>> atomic_t cnt;
>> wait_queue_head_t *waitq;
>> + unsigned long stamp;
>> };
>>
>> #define __WB_COMPLETION_INIT(_waitq) \
>> --
>> 2.39.5
>>
Thanks,
--
Julian Sun <sunjunchao@bytedance.com>
next prev parent reply other threads:[~2025-09-26 2:26 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-09-25 13:22 [PATCH] write-back: Wake up waiting tasks when finishing the writeback of a chunk Julian Sun
2025-09-25 16:31 ` Jan Kara
2025-09-25 17:25 ` Mateusz Guzik
2025-09-26 2:26 ` Julian Sun [this message]
2025-09-26 11:17 ` Mateusz Guzik
2025-09-26 11:43 ` Julian Sun
2025-09-26 12:05 ` Mateusz Guzik
2025-09-26 15:48 ` Jan Kara
2025-09-26 15:50 ` Mateusz Guzik
2025-09-26 15:55 ` Jan Kara
2025-09-26 16:14 ` Mateusz Guzik
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5622443b-b5b4-4b19-8a7b-f3923f822dda@bytedance.com \
--to=sunjunchao@bytedance.com \
--cc=akpm@linux-foundation.org \
--cc=brauner@kernel.org \
--cc=jack@suse.cz \
--cc=lance.yang@linux.dev \
--cc=linux-fsdevel@vger.kernel.org \
--cc=mjguzik@gmail.com \
--cc=peterz@infradead.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).