From: Jens Axboe <axboe@kernel.dk>
To: Aaron Tomlin <atomlin@atomlin.com>,
rostedt@goodmis.org, mhiramat@kernel.org,
mathieu.desnoyers@efficios.com
Cc: bvanassche@acm.org, johannes.thumshirn@wdc.com, kch@nvidia.com,
dlemoal@kernel.org, ritesh.list@gmail.com, loberman@redhat.com,
neelx@suse.com, sean@ashe.io, mproche@gmail.com,
chjohnst@gmail.com, linux-block@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org
Subject: Re: [PATCH v6 0/2] blk-mq: introduce tag starvation observability
Date: Mon, 18 May 2026 07:31:45 -0600
Message-ID: <189ddfd7-f579-4c86-bcfc-334cf574bdfc@kernel.dk>
In-Reply-To: <20260517213614.350367-1-atomlin@atomlin.com>

On 5/17/26 3:36 PM, Aaron Tomlin wrote:
> Hi Jens, Steve, Masami,
>
> In high-performance storage environments, particularly when utilising RAID
> controllers with shared tag sets (BLK_MQ_F_TAG_HCTX_SHARED), severe latency
> spikes can occur when fast devices are starved of available tags.
> Currently, diagnosing this specific queue contention requires deploying
> dynamic kprobes or inferring sleep states, which lacks a simple,
> out-of-the-box diagnostic path.
>
> This short series introduces dedicated, low-overhead observability for tag
> exhaustion events in the block layer:
>
> - Patch 1 introduces the "block_rq_tag_wait" tracepoint in the tag
> allocation slow-path to capture precise, event-based starvation.
>
> - Patch 2 complements this by exposing "wait_on_hw_tag" and
> "wait_on_sched_tag" per-CPU counters via debugfs for quick,
> point-in-time cumulative polling.
>
> Together, these provide storage engineers with zero-configuration
> mechanisms to definitively identify shared-tag bottlenecks.

Why not just issue the trace points? Then there's close to zero
overhead, rather than needing added counters for this and having the
kernel keep track. If you just issue the get/put tag kind of traces,
then userspace can keep track. That's what blktrace has done for decades
for things like inflight/queue depth accounting.

IOW, it seems to me this could be done with basically zero kernel
additions outside of perhaps a trace point or two.
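
As a rough illustration of the userspace-side accounting suggested above, here is a minimal sketch. It assumes a stream of per-tag "get" and "put" trace events, one per tag allocation and release. The event names, the `(timestamp, kind, queue)` tuple layout, and the `track_inflight` helper are all hypothetical and do not correspond to existing tracepoints or to blktrace's format.

```python
from collections import defaultdict


def track_inflight(events):
    """Track per-queue inflight tag counts from a stream of trace events.

    `events` is an iterable of (timestamp, kind, queue) tuples, where
    `kind` is "get" or "put". This layout is an assumption made for the
    sketch, not the format of any real tracepoint.
    """
    inflight = defaultdict(int)  # tags currently held, per queue
    peak = defaultdict(int)      # highest inflight count seen, per queue

    for _ts, kind, queue in events:
        if kind == "get":
            inflight[queue] += 1
            peak[queue] = max(peak[queue], inflight[queue])
        elif kind == "put":
            # A put without a matching get can occur if tracing started
            # while requests were already in flight; clamp at zero.
            inflight[queue] = max(0, inflight[queue] - 1)

    return dict(inflight), dict(peak)


if __name__ == "__main__":
    sample = [
        (0.0, "get", "q0"),
        (0.1, "get", "q0"),
        (0.2, "put", "q0"),
        (0.3, "get", "q1"),
    ]
    print(track_inflight(sample))
```

Comparing the peak count against the configured tag depth for a queue would then indicate when the queue ran out of tags, with no counters kept in the kernel.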
--
Jens Axboe
Thread overview: 5+ messages
2026-05-17 21:36 [PATCH v6 0/2] blk-mq: introduce tag starvation observability Aaron Tomlin
2026-05-17 21:36 ` [PATCH v6 1/2] blk-mq: add tracepoint block_rq_tag_wait Aaron Tomlin
2026-05-17 21:36 ` [PATCH v6 2/2] blk-mq: expose tag starvation counts via debugfs Aaron Tomlin
2026-05-18 8:14 ` John Garry
2026-05-18 13:31 ` Jens Axboe [this message]