From: Damien Le Moal <dlemoal@kernel.org>
To: Bart Van Assche <bvanassche@acm.org>,
Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org
Subject: Re: [PATCH 0/8] Improve zoned (SMR) HDD write throughput
Date: Tue, 24 Feb 2026 10:07:38 +0900
Message-ID: <99c22bd8-2898-4b72-91bb-e80847cda065@kernel.org>
In-Reply-To: <3ea1b7da-0639-4cf7-a8b4-132b26eedba8@acm.org>
On 2/24/26 2:03 AM, Bart Van Assche wrote:
> On 2/20/26 4:44 PM, Damien Le Moal wrote:
>> This patch series cleans up the zone write plugging code and introduces
>> the ability to issue all write BIOs from a single context (a kthread)
>> instead of allowing multiple zones to be written at the same time using
>> a per zone work. As shown in patch 6, raw block device tests and XFS
>> tests with an SMR HDD show that this can significantly increase write
>> throughput (up to 40% over the current zone write plugging).
> Hi Damien,
>
> Is a new kthread necessary? Has the following approach been considered?
> * Make the dm drivers that support rotational zoned storage devices
> request-based instead of bio-based.
What you are suggesting would be an enormous amount of work: dm-linear,
dm-flakey, dm-error and dm-crypt would all have to change from generic code
into DM targets that are very specialized for just SMR HDDs. I do not
understand why you think that would be a good idea.
> * Modify blk_mq_get_tag() such that only one tag can be allocated at a
> time for zoned write requests.
Sure, doing that would limit zone write requests to at most one in flight at
any time, but it would also result in a total loss of control over which zone
write BIO work gets that single tag, meaning that the writes would in the end
be mostly random again, like they are now. So with this solution, I can say
goodbye to the +40% write throughput increase that I am seeing with the kthread.
Also note that this idea of limiting write tags combined with your idea of
using req based DM targets would likely negatively impact dm-crypt performance
as we would lose the ability to encrypt multiple writes in parallel on
different CPUs.
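
To illustrate the ordering argument above, here is a toy model, not kernel
code. It assumes 8 zones with 16 pending writes each and uses the number of
zone switches as a rough proxy for head movement. Both policies keep exactly
one write in flight. The function names are hypothetical, and the "drain one
zone, then the next" order is an assumption made for the sketch, not a
description of what patch 6 implements.

```python
import random


def count_zone_switches(order):
    """Count how often two consecutive writes target different zones."""
    return sum(1 for prev, cur in zip(order, order[1:]) if prev != cur)


def single_tag_random_winner(pending, rng):
    """One write in flight, but an arbitrary zone wins the tag each time."""
    remaining = dict(pending)  # work on a copy, leave the caller's dict alone
    order = []
    while remaining:
        zone = rng.choice(sorted(remaining))
        order.append(zone)
        remaining[zone] -= 1
        if remaining[zone] == 0:
            del remaining[zone]
    return order


def single_context_in_order(pending):
    """One write in flight, issued by one dispatcher that drains each zone
    before moving to the next (an assumed policy for this sketch)."""
    order = []
    for zone in sorted(pending):
        order.extend([zone] * pending[zone])
    return order


if __name__ == "__main__":
    pending = {zone: 16 for zone in range(8)}  # zone number -> pending writes
    rng = random.Random(0)  # fixed seed so runs are repeatable
    print("random tag winner:", count_zone_switches(
        single_tag_random_winner(pending, rng)), "zone switches")
    print("single context:   ", count_zone_switches(
        single_context_in_order(pending)), "zone switches")
```

In this model the single context needs only 7 zone switches for 128 writes,
while the arbitrary tag winner switches zones on most writes, even though
both serialize writes to QD=1.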
> I think that would be sufficient to serialize zoned writes. Additionally,
> this approach doesn't increase request processing latency by forcing a
> context switch to a kthread.
This point is in my opinion moot because we currently use work items to issue
the write commands. We are scheduling the zone write plug BIO works in the
submission and completion path, so the context switch overhead is already
there. And I would argue that using work items is potentially even more
overhead than using a fixed kthread since the work items need to be assigned
to CPUs and worker threads.
Granted, your point is valid for a QD=1 workload. In this case, I am indeed
introducing a context switch where there is none now. But that is not really
the use case we are looking at here. File system writeback does not happen at
QD=1 per zone. Also, this added overhead does not really matter for HDDs
anyway, and if that is really an issue, the user can enable the legacy zone
write plugging behavior with "echo 0 > /sys/block/sdX/queue/zoned_qd1_writes".
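
For scripted setups, the echo command above can be wrapped in a small helper.
This is a minimal sketch: the attribute name comes from this message, the
helper names are hypothetical, and it is an assumption that writing 1
restores the single-context behavior. The demo runs against a scratch
directory so it needs neither root nor an SMR disk; on a real system
queue_dir would be /sys/block/sdX/queue.

```python
import tempfile
from pathlib import Path

ATTR_NAME = "zoned_qd1_writes"  # attribute name as given in this message


def read_qd1_writes(queue_dir):
    """Return the current attribute value as a stripped string."""
    return (Path(queue_dir) / ATTR_NAME).read_text().strip()


def set_qd1_writes(queue_dir, enabled):
    """Write "1" or "0" to the attribute, like `echo N > .../zoned_qd1_writes`.

    Assumption: "1" selects the single-context behavior; the message only
    states that "0" selects the legacy zone write plugging behavior.
    """
    (Path(queue_dir) / ATTR_NAME).write_text("1\n" if enabled else "0\n")


if __name__ == "__main__":
    # Scratch directory standing in for /sys/block/sdX/queue.
    with tempfile.TemporaryDirectory() as scratch:
        (Path(scratch) / ATTR_NAME).write_text("1\n")
        set_qd1_writes(scratch, enabled=False)
        print(read_qd1_writes(scratch))  # prints 0
```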
--
Damien Le Moal
Western Digital Research