From: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
To: Vincent Fu <vincentfu@gmail.com>
Cc: "fio@vger.kernel.org" <fio@vger.kernel.org>,
Jens Axboe <axboe@kernel.dk>, Damien Le Moal <dlemoal@kernel.org>
Subject: Re: [PATCH 02/12] zbd: set norandommap=1 when zonemode=zbd is specified
Date: Tue, 27 Jan 2026 05:05:53 +0000
Message-ID: <aXgleDJ44YFYDQeF@shinmob>
In-Reply-To: <CAOp=CXnX9zorei7vwO54OvxBeFEjsKzgKXZaLqDOmu24-X9bbw@mail.gmail.com>
Vincent, thanks for the comments.
On Jan 26, 2026 / 20:39, Vincent Fu wrote:
[...]
> As you know, we try to avoid breaking existing job files. This change
> breaks 3 test cases.
To be precise, only 1 of the 3 test cases in t/zbd was broken due to the
norandommap=1 restriction (test case 14). The other two were broken due to the
change from zone finish operation to simple writes. (Still your point is valid
for the single test case.)
> Fixing a genuine bug or clarifying previously undefined behavior can
> justify this sort of breaking change.
>
> It seems heavy handed to suddenly prohibit running a test on a zoned
> block device with a random map. Can you provide a stronger
> justification for this change?
There are two justifications for prohibiting the random map:
1) The random map is not accurate for zoned block devices, since zoned block
   devices require that writes be done at write pointer positions. The random
   map feature manages write offsets, but the offsets are modified to the
   write pointer positions before the write io_u is issued. The random map
   still has the effect of balancing the write amounts across zones, but it
   cannot ensure that no sector is written more than once.
2) Ensuring norandommap=1 is required in order to write to zone ends that have
   remainders smaller than min_bs. Before this change, fio handled such
   remainders with zone finish operations, but it turned out that simple
   writes for the remainders are much faster than zone finish operations. This
   suggests that users will not use zone finish, and will instead use writes
   in their systems to fill the remainders. Based on this understanding, the
   current implementation with zone finish cannot show the performance that
   users expect (this is the reason I started working on this series). The
   major use case of fio is performance measurement, so in that sense, I would
   say the current implementation has a performance measurement bug. I thought
   that the norandommap=1 restriction is acceptable as a way to fix this bug.
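To illustrate what "remainder smaller than min_bs" means in 2), here is a minimal arithmetic sketch. It is not fio code: the function name, its parameters, and the numbers are hypothetical and chosen only for illustration. It assumes that writes always start at the write pointer and that the remainder is whatever is left at the zone end after as many min_bs-sized writes as fit.

```python
KIB = 1024


def zone_remainder(zone_start, zone_capacity, write_pointer, min_bs):
    """Return the bytes left at the zone end that cannot hold one more
    min_bs-sized write when writing sequentially from write_pointer.

    Hypothetical helper for illustration only, not a fio function.
    """
    writable = zone_start + zone_capacity - write_pointer
    return writable % min_bs


# Hypothetical example: a 1 MiB zone written from its start in 48 KiB blocks.
# 1024 KiB = 21 * 48 KiB + 16 KiB, so 16 KiB is left over at the zone end.
print(zone_remainder(0, 1024 * KIB, 0, 48 * KIB) // KIB)  # prints 16

# With a 64 KiB block size the zone fills exactly and no remainder exists.
print(zone_remainder(0, 1024 * KIB, 0, 64 * KIB) // KIB)  # prints 0
```

The 16 KiB tail in the first case is the kind of area that was previously handled with a zone finish operation and that this series fills with a smaller write instead.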
>
> Perhaps you could instead emit a warning when a random job is run with
> a random map and then condition the relevant changes in later patches
> on the absence of a random map.
Actually, I considered the following alternatives in search of a better
solution:
1) Leave the current remainder handling with the zone finish operation, and add
   the new handling with simple writes. Choose between the two with a new
   option, and ensure norandommap=1 only for the handling with simple writes.
   -> This keeps the current behavior for norandommap=0 workloads, but it
      comes with the zone finish operation that shows bad performance. I
      thought this leaves complexity for users and in the code.
2) Do not always set norandommap=1, and set it only when it is required. To be
   precise, set norandommap=1 when,
   - bs is not aligned to zone size, or,
   - the initial write pointer position of a zone is not aligned to bs
   -> I guess this can be implemented, and it can keep the behavior with
      norandommap=0 for some workloads. However, the norandommap=1 override
      is still required when the alignment conditions are not satisfied. I
      thought this complexity could confuse users (users may need to move
      write pointers to run their workloads with norandommap=0).
3) Modify axmap and randmap handling to support block sizes smaller than
   min_bs.
   -> This would be a fundamental change in the fio and axmap designs: min_bs
      is referred to in many places. I don't think this approach is feasible.
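For reference, the two conditions in option 2) can be written as a small predicate. This is only a sketch under assumptions, not the implementation in fio or in this series: the function name and parameters are hypothetical, and "bs is not aligned to zone size" is read here as "the zone size is not a multiple of bs".

```python
def needs_norandommap(bs, zone_size, zone_start, write_pointer):
    """Return True when the random map would have to be disabled for a zone
    under option 2).

    Hypothetical helper for illustration only, not a fio function.
    """
    # Assumption: "bs is not aligned to zone size" means the zone size is not
    # an exact multiple of the block size.
    bs_unaligned = zone_size % bs != 0
    # The initial write pointer, relative to the zone start, must fall on a
    # block-size boundary.
    wp_unaligned = (write_pointer - zone_start) % bs != 0
    return bs_unaligned or wp_unaligned


MIB = 1024 * 1024

# Aligned block size and write pointer at the zone start: random map is fine.
print(needs_norandommap(4096, 256 * MIB, 0, 0))      # prints False
# 48 KiB does not divide a 1 MiB zone: the override would be required.
print(needs_norandommap(48 * 1024, 1 * MIB, 0, 0))   # prints True
# Write pointer 512 bytes into the zone with bs=4096: override required.
print(needs_norandommap(4096, 1 * MIB, 0, 512))      # prints True
```

The third case shows the source of the confusion mentioned for option 2): the same job file would behave differently depending on where the write pointers happen to be before the run.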
I think this patch is better than the other solutions above, but I'm open to
other options. Let me know your thoughts on them. I guess your opinion is
similar to option 2) above.