public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Muchun Song <muchun.song@linux.dev>
To: axboe@kernel.dk
Cc: ming.lei@redhat.com, Muchun Song <songmuchun@bytedance.com>,
	linux-block@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 0/3] Fix some starvation problems in block layer
Date: Tue, 24 Sep 2024 14:10:06 +0800	[thread overview]
Message-ID: <C70283DE-699F-41BF-8C64-4155DD00D976@linux.dev> (raw)
In-Reply-To: <20240914072844.18150-1-songmuchun@bytedance.com>



> On Sep 14, 2024, at 15:28, Muchun Song <songmuchun@bytedance.com> wrote:
> 
> We encounter a problem on our servers where hundreds of UNINTERRUPTED
> processes are all waiting in the WBT wait queue. And the IO hung detector
> logged so many messages about "blocked for more than 122 seconds". The
> call trace is as follows:
> 
>    Call Trace:
>        __schedule+0x959/0xee0
>        schedule+0x40/0xb0
>        io_schedule+0x12/0x40
>        rq_qos_wait+0xaf/0x140
>        wbt_wait+0x92/0xc0
>        __rq_qos_throttle+0x20/0x30
>        blk_mq_make_request+0x12a/0x5c0
>        generic_make_request_nocheck+0x172/0x3f0
>        submit_bio+0x42/0x1c0
>        ...
> 
> The WBT module is used to throttle buffered writeback, which will block
> any buffered writeback IO request until the previous inflight IOs have
> been completed. So I checked the inflight IO counter. That was one meaning
> one IO request was submitted to the downstream interface like block core
> layer or device driver (virtio_blk driver in our case). We need to figure
> out why the inflight IO is not completed in time. I confirmed that all
> the virtio ring buffers of virtio_blk are empty and the hardware dispatch
> list had one IO request, so the root cause is not related to the block
> device or the virtio_blk driver since the driver has never received that
> IO request.
> 
> We know that block core layer could submit IO requests to the driver
> through kworker (the callback function is blk_mq_run_work_fn). I thought
> maybe the kworker was blocked by some other resources causing the callback
> to not be evoked in time. So I checked all the kworkers and workqueues and
> confirmed there was no pending work on any kworker or workqueue.
> 
> Integrate all the investigation information, the problem should be in the
> block core layer missing a chance to submit that IO request. After
> some investigation of code, I found some scenarios which could cause the
> problem.
> 
> Changes in v3:
>  - Collect RB tag from Ming Lei.
>  - Adjust text to fit maximum 74 chars per line from Jens Axboe.

Hi Jens,

Friendly ping... Do you have any concerns regarding this version?

Muchun,
Thanks.

> 
> Changes in v2:
>  - Collect RB tag from Ming Lei.
>  - Use barrier-less approach to fix QUEUE_FLAG_QUIESCED ordering problem
>    suggested by Ming Lei.
>  - Apply new approach to fix BLK_MQ_S_STOPPED ordering for easier
>    maintenance.
>  - Add Fixes tag to each patch.
> 
> Muchun Song (3):
>  block: fix missing dispatching request when queue is started or
>    unquiesced
>  block: fix ordering between checking QUEUE_FLAG_QUIESCED and adding
>    requests
>  block: fix ordering between checking BLK_MQ_S_STOPPED and adding
>    requests
> 
> block/blk-mq.c | 55 ++++++++++++++++++++++++++++++++++++++------------
> block/blk-mq.h | 13 ++++++++++++
> 2 files changed, 55 insertions(+), 13 deletions(-)
> 
> -- 
> 2.20.1
> 


      parent reply	other threads:[~2024-09-24  6:10 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-09-14  7:28 [PATCH v3 0/3] Fix some starvation problems in block layer Muchun Song
2024-09-14  7:28 ` [PATCH v3 1/3] block: fix missing dispatching request when queue is started or unquiesced Muchun Song
2024-09-14  7:28 ` [PATCH v3 2/3] block: fix ordering between checking QUEUE_FLAG_QUIESCED and adding requests Muchun Song
2024-09-14  7:28 ` [PATCH v3 3/3] block: fix ordering between checking BLK_MQ_S_STOPPED " Muchun Song
2024-09-24  6:10 ` Muchun Song [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=C70283DE-699F-41BF-8C64-4155DD00D976@linux.dev \
    --to=muchun.song@linux.dev \
    --cc=axboe@kernel.dk \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ming.lei@redhat.com \
    --cc=songmuchun@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox