From: Yu Kuai <yukuai@kernel.org>
To: Yizhou Tang <tangyeechou@gmail.com>,
Julian Sun <sunjunchao2870@gmail.com>
Cc: linux-block@vger.kernel.org, axboe@kernel.dk,
stable@vger.kernel.org, Julian Sun <sunjunchao@bytedance.com>
Subject: Re: [PATCH] blk-wbt: Fix io starvation in wbt_rqw_done()
Date: Fri, 1 Aug 2025 01:12:55 +0800
Message-ID: <2ca5109a-341c-497a-9da7-422d56620348@kernel.org>
In-Reply-To: <CAOB9oOZV5ObqvgNxr9m0ztm7ruM9N9RMi8QHmiG5WL4sNbLxuw@mail.gmail.com>
Hi,
On 2025/7/31 23:40, Yizhou Tang wrote:
> Hi Julian,
>
> On Thu, Jul 31, 2025 at 8:33 PM Julian Sun <sunjunchao2870@gmail.com> wrote:
>> Recently, we encountered the following hungtask:
>>
>> INFO: task kworker/11:2:2981147 blocked for more than 6266 seconds
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> kworker/11:2 D 0 2981147 2 0x80004000
>> Workqueue: cgroup_destroy css_free_rwork_fn
>> Call Trace:
>> __schedule+0x934/0xe10
>> schedule+0x40/0xb0
>> wb_wait_for_completion+0x52/0x80
> I don’t see __wbt_wait() or rq_qos_wait() here, so I suspect this call
> stack is not directly related to wbt.
>
>
>> ? finish_wait+0x80/0x80
>> mem_cgroup_css_free+0x3a/0x1b0
>> css_free_rwork_fn+0x42/0x380
>> process_one_work+0x1a2/0x360
>> worker_thread+0x30/0x390
>> ? create_worker+0x1a0/0x1a0
>> kthread+0x110/0x130
>> ? __kthread_cancel_work+0x40/0x40
>> ret_from_fork+0x1f/0x30
This shows the writeback cgroup waiting for writeback to complete. If you
have determined that the writeback is being throttled by wbt, you need to
explain that clearly, and it is very important to provide evidence to
support your analysis. However, the following analysis is a mess :(
>>
>> This is because the writeback thread has been continuously and repeatedly
>> throttled by wbt, but at the same time, the writes of another thread
>> proceed quite smoothly.
>> After debugging, I believe it is caused by the following reasons.
>>
>> When thread A is blocked by wbt, the I/O issued by thread B will
>> use a deeper queue depth(rwb->rq_depth.max_depth) because it
>> meets the conditions of wb_recent_wait(), thus allowing thread B's
>> I/O to be issued smoothly and resulting in the inflight I/O of wbt
>> remaining relatively high.
>>
>> However, when I/O completes, due to the high inflight I/O of wbt,
>> the condition "limit - inflight >= rwb->wb_background / 2"
>> in wbt_rqw_done() cannot be satisfied, causing thread A's I/O
>> to remain unable to be woken up.
> From your description above, it seems you're suggesting that if A is
> throttled by wbt, then a writer B on the same device could
> continuously starve A.
> This situation is not possible — please refer to rq_qos_wait(): if A
> is already sleeping, then when B calls wq_has_sleeper(), it will
> detect A’s presence, meaning B will also be throttled.
Yes, there are three rq_wait queues in wbt, and each one is FIFO. However,
starvation is still possible if A is background and B is swap.
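To illustrate the per-queue FIFO point, here is a minimal userspace sketch,
not kernel code. All names (model_rwq, model_rq_wait, model_must_wait) are
hypothetical stand-ins, and the model assumes that a submitter only looks at
sleepers and inflight counts on its own queue:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the three wbt wait queues. */
enum model_rwq {
	MODEL_RWQ_BG,
	MODEL_RWQ_SWAP,
	MODEL_RWQ_DISCARD,
	MODEL_NUM_RWQ
};

struct model_rq_wait {
	int inflight;     /* requests currently accounted to this queue */
	bool has_sleeper; /* true if some task is already waiting here */
};

/*
 * Returns true if a new submitter on queue `idx` must sleep.
 * Only the submitter's own queue is consulted (an assumption of this
 * model), so a sleeper on another queue does not force it to wait.
 */
static bool model_must_wait(const struct model_rq_wait q[MODEL_NUM_RWQ],
			    enum model_rwq idx, int limit)
{
	assert(idx < MODEL_NUM_RWQ);
	if (q[idx].has_sleeper)
		return true;             /* FIFO: line up behind the sleeper */
	return q[idx].inflight >= limit; /* otherwise only the depth limit applies */
}
```

With A asleep on the background queue, a second background writer is forced
to wait as well, while a submitter on the swap queue is checked only against
its own queue and limit.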
>
> Thanks,
> Yi
>
>> Some on-site information:
>>
>>>>> rwb.rq_depth.max_depth
>> (unsigned int)48
>>>>> rqw.inflight.counter.value_()
>> 44
>>>>> rqw.inflight.counter.value_()
>> 35
>>>>> prog['jiffies'] - rwb.rqos.q.backing_dev_info.last_bdp_sleep
>> (unsigned long)3
>>>>> prog['jiffies'] - rwb.rqos.q.backing_dev_info.last_bdp_sleep
>> (unsigned long)2
>>>>> prog['jiffies'] - rwb.rqos.q.backing_dev_info.last_bdp_sleep
>> (unsigned long)20
>>>>> prog['jiffies'] - rwb.rqos.q.backing_dev_info.last_bdp_sleep
>> (unsigned long)12
>>
>> cat wb_normal
>> 24
>> cat wb_background
>> 12
>>
>> To fix this issue, we can use max_depth in wbt_rqw_done(), so that
>> the handling of wb_recent_wait by wbt_rqw_done() and get_limit()
>> will also be consistent, which is more reasonable.
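For reference, the wake-up condition quoted above can be checked against the
reported numbers with a small userspace sketch. This is a simplified model
based on a reading of wbt_rqw_done(), not the kernel function itself:
model_should_wake is a hypothetical name, a sleeper is assumed to be present,
and the unpatched case assumes the wb_normal branch was taken.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Simplified model of the wake-up decision at the end of wbt_rqw_done().
 * `inflight` is the count after the completing request was subtracted.
 * Assumes at least one task is sleeping on the queue.
 */
static bool model_should_wake(int inflight, int limit, int wb_background)
{
	assert(inflight >= 0 && wb_background >= 0);
	if (inflight && inflight >= limit)
		return false; /* still at or above the limit: no wake-up */
	/* wake when idle, or when enough headroom has opened up */
	return !inflight || (limit - inflight) >= wb_background / 2;
}
```

With wb_background = 12 the threshold is 6. For limit = wb_normal = 24, both
reported inflight values (44 and 35) are above the limit, so the model never
wakes the sleeper. For limit = max_depth = 48, inflight = 44 leaves a
difference of 4 (no wake-up), while inflight = 35 leaves 13 (wake-up).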
Are you able to reproduce this problem, and did you test this patch before
sending it?

Thanks,
Kuai
>>
>> Signed-off-by: Julian Sun <sunjunchao@bytedance.com>
>> Fixes: e34cbd307477 ("blk-wbt: add general throttling mechanism")
>> ---
>> block/blk-wbt.c | 2 ++
>> 1 file changed, 2 insertions(+)
>>
>> diff --git a/block/blk-wbt.c b/block/blk-wbt.c
>> index a50d4cd55f41..d6a2782d442f 100644
>> --- a/block/blk-wbt.c
>> +++ b/block/blk-wbt.c
>> @@ -210,6 +210,8 @@ static void wbt_rqw_done(struct rq_wb *rwb, struct rq_wait *rqw,
>> else if (blk_queue_write_cache(rwb->rqos.disk->queue) &&
>> !wb_recent_wait(rwb))
>> limit = 0;
>> + else if (wb_recent_wait(rwb))
>> + limit = rwb->rq_depth.max_depth;
>> else
>> limit = rwb->wb_normal;
>>
>> --
>> 2.20.1
>>
>>