public inbox for linux-kernel@vger.kernel.org
* [PATCH 0/4] Fix some starvation problems
@ 2024-08-11 10:19 Muchun Song
  2024-08-11 10:19 ` [PATCH 1/4] block: fix request starvation when queue is stopped or quiesced Muchun Song
                   ` (3 more replies)
  0 siblings, 4 replies; 21+ messages in thread
From: Muchun Song @ 2024-08-11 10:19 UTC
  To: axboe; +Cc: linux-block, linux-kernel, Muchun Song

We encountered a problem on our servers where hundreds of processes in
the UNINTERRUPTIBLE state were all waiting in the WBT wait queue, and
the hung task detector logged many "blocked for more than 122 seconds"
messages. The call trace is as follows:

    Call Trace:
        __schedule+0x959/0xee0
        schedule+0x40/0xb0
        io_schedule+0x12/0x40
        rq_qos_wait+0xaf/0x140
        wbt_wait+0x92/0xc0
        __rq_qos_throttle+0x20/0x30
        blk_mq_make_request+0x12a/0x5c0
        generic_make_request_nocheck+0x172/0x3f0
        submit_bio+0x42/0x1c0
        ...

The WBT module is used to throttle buffered writeback: it blocks any
buffered writeback IO request until the previous inflight IOs have
completed. So I checked the inflight IO counter. It was one, meaning
that one IO request had been submitted to the downstream layers, such
as the block core layer or the device driver (the virtio_blk driver in
our case). We needed to figure out why that inflight IO was not
completed in time. I confirmed that all the virtio ring buffers of
virtio_blk were empty, so the root cause is not related to the block
device or the virtio_blk driver, since the driver never received that
IO request.
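
To make the symptom concrete, here is a minimal user-space sketch of an
inflight-counter throttle. It is not the kernel's WBT code: the struct,
the function names and the limit of one are all assumptions chosen for
illustration. It only shows why a single request that never completes
is enough to stall every later writer.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not the kernel's WBT implementation. */
struct throttle_model {
	unsigned int inflight;	/* requests submitted but not yet completed */
	unsigned int limit;	/* assumed cap on inflight requests */
};

/* Returns true if the writer may submit, false if it would have to sleep. */
bool model_try_submit(struct throttle_model *t)
{
	if (t->inflight >= t->limit)
		return false;	/* a real writer would sleep in the wait queue */
	t->inflight++;
	return true;
}

/* Completion path: a real throttle would also wake one waiter here. */
void model_complete(struct throttle_model *t)
{
	assert(t->inflight > 0);
	t->inflight--;
}

/*
 * Count how many of nr_writers would stall when no submitted request
 * ever completes, i.e. model_complete() is never called.
 */
unsigned int model_count_stalled(unsigned int limit, unsigned int nr_writers)
{
	struct throttle_model t = { .inflight = 0, .limit = limit };
	unsigned int stalled = 0;

	for (unsigned int i = 0; i < nr_writers; i++) {
		if (!model_try_submit(&t))
			stalled++;
	}
	return stalled;
}
```

With a limit of one, model_count_stalled(1, 100) reports that every
writer after the first one stalls, which matches the picture of an
inflight counter stuck at one with many tasks sleeping behind it.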

We know that the block core layer can submit IO requests to the driver
through a kworker (the callback function is blk_mq_run_work_fn). I
thought the kworker might be blocked on some other resource, so that
the callback was not invoked in time. So I checked all the kworkers
and workqueues and confirmed there was no pending work on any kworker
or workqueue.

Putting all of these findings together, I suspect that the block core
layer missed a chance to submit an IO request. After reading the code,
I found the following scenarios which could cause similar symptoms. I
am not sure whether they are the root cause, but they are reasonable
suspects.
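
The patch titles in this series mention the ordering between checking a
stopped/quiesced flag and adding requests to hctx->dispatch. The sketch
below is a user-space model of that kind of race using C11 atomics. It
is an assumption about the general pattern (each side stores, then
loads what the other side stored), not a copy of the patches: the
struct and all model_* names are hypothetical, and
atomic_thread_fence() merely stands in for the kernel's smp_mb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of one hardware queue; names are illustrative. */
struct hctx_model {
	atomic_bool stopped;	/* stands in for BLK_MQ_S_STOPPED */
	atomic_int pending;	/* stands in for a non-empty hctx->dispatch */
	atomic_int dispatched;	/* requests handed to the driver */
};

void model_init(struct hctx_model *h, bool stopped)
{
	atomic_init(&h->stopped, stopped);
	atomic_init(&h->pending, 0);
	atomic_init(&h->dispatched, 0);
}

/* Hand over pending requests; the exchange lets only one caller win. */
static void model_run_queue(struct hctx_model *h)
{
	int n = atomic_exchange(&h->pending, 0);

	assert(n >= 0);
	atomic_fetch_add(&h->dispatched, n);
}

/* Submitter: publish the request first, then check the stopped flag. */
void model_insert_and_run(struct hctx_model *h)
{
	atomic_fetch_add_explicit(&h->pending, 1, memory_order_relaxed);
	/* Full fence: the store above is ordered before the load below. */
	atomic_thread_fence(memory_order_seq_cst);
	if (!atomic_load_explicit(&h->stopped, memory_order_relaxed))
		model_run_queue(h);
}

/* Restarter: clear the stopped flag first, then check for pending work. */
void model_start_stopped(struct hctx_model *h)
{
	atomic_store_explicit(&h->stopped, false, memory_order_relaxed);
	/* Full fence, pairing with the one in model_insert_and_run(). */
	atomic_thread_fence(memory_order_seq_cst);
	if (atomic_load_explicit(&h->pending, memory_order_relaxed) > 0)
		model_run_queue(h);
}
```

With both fences in place, at least one side must observe the other
side's store, so the request is dispatched by the submitter or by the
restarter. If either fence is dropped, both loads may see stale values
on a weakly ordered machine, and the request would sit on the list with
nobody left to run the queue, which would look like the hang described
above.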

Muchun Song (4):
  block: fix request starvation when queue is stopped or quiesced
  block: fix ordering between checking BLK_MQ_S_STOPPED and adding
    requests to hctx->dispatch
  block: fix missing smp_mb in blk_mq_{delay_}run_hw_queues
  block: fix fix ordering between checking QUEUE_FLAG_QUIESCED and
    adding requests to hctx->dispatch

 block/blk-mq.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 67 insertions(+)

-- 
2.20.1



Thread overview: 21+ messages
2024-08-11 10:19 [PATCH 0/4] Fix some starvation problems Muchun Song
2024-08-11 10:19 ` [PATCH 1/4] block: fix request starvation when queue is stopped or quiesced Muchun Song
2024-08-16  9:14   ` Ming Lei
2024-08-11 10:19 ` [PATCH 2/4] block: fix ordering between checking BLK_MQ_S_STOPPED and adding requests to hctx->dispatch Muchun Song
2024-08-19  2:27   ` Ming Lei
2024-08-19  3:49     ` Muchun Song
2024-08-22  3:54       ` Yu Kuai
2024-08-26  8:35         ` Muchun Song
2024-08-26  8:53           ` Yu Kuai
2024-08-27  7:31             ` Muchun Song
2024-08-29  7:57               ` Yu Kuai
2024-08-11 10:19 ` [PATCH 3/4] block: fix missing smp_mb in blk_mq_{delay_}run_hw_queues Muchun Song
2024-08-11 10:19 ` [PATCH 4/4] block: fix fix ordering between checking QUEUE_FLAG_QUIESCED and adding requests to hctx->dispatch Muchun Song
2024-08-23 11:27   ` Ming Lei
2024-08-26  7:06     ` Muchun Song
2024-08-26  7:33       ` Muchun Song
2024-08-26  9:20         ` Ming Lei
2024-08-27  7:24           ` Muchun Song
2024-08-27  8:16             ` Muchun Song
2024-08-29  2:51               ` Ming Lei
2024-08-29  3:40                 ` Muchun Song
