From: Alexey Dobriyan <adobriyan@gmail.com>
To: Damien Le Moal <Damien.LeMoal@wdc.com>
Cc: "axboe@kernel.dk" <axboe@kernel.dk>,
"fio@vger.kernel.org" <fio@vger.kernel.org>
Subject: Re: [PATCH 3/7] zbd: introduce per-device "max_open_zones" limit
Date: Mon, 4 May 2020 15:15:58 +0300 [thread overview]
Message-ID: <20200504121558.GA19268@avx2> (raw)
In-Reply-To: <BY5PR04MB690028920E402F350BD5FDD6E7A60@BY5PR04MB6900.namprd04.prod.outlook.com>
On Mon, May 04, 2020 at 01:41:14AM +0000, Damien Le Moal wrote:
> Alexey,
>
> On 2020/05/02 3:52, Alexey Dobriyan wrote:
> > On Fri, May 01, 2020 at 01:34:32AM +0000, Damien Le Moal wrote:
> >> On 2020/04/30 21:41, Alexey Dobriyan wrote:
> >>> It is not possible to maintain equal per-thread iodepth. The way code
> >>> is written, "max_open_zones" acts as a global limit, and one thread
> >>> opens all "max_open_zones" for itself and others starve for available
> >>> zones and _exit_ prematurely.
> >>>
> >>> This config is guaranteed to make equal number of zone resets/IO now:
> >>> each thread generates identical pattern and doesn't intersect with other
> >>> threads:
> >>>
> >>> zonemode=zbd
> >>> zonesize=...
> >>> rw=write
> >>>
> >>> numjobs=N
> >>> offset_increment=M*zonesize
> >>>
> >>> [j]
> >>> size=M*zonesize
> >>>
> > > The patch introduces "global_max_open_zones", which is a per-device
> > > config option. "max_open_zones" becomes a per-thread limit. Both limits
> > > are checked on each zone open so one thread can't starve the others.
> >>
> >> It makes sense. Nice one.
> >>
> >> But the change as is will break existing test scripts (e.g. lots of SMR drives
> >> are being tested with this).
> >
> > It won't break single-threaded ones, that's for sure.
>
> Yes, but things like:
>
> fio --ioengine=psync --rw=randwrite --max_open_zones=128 --numjobs=32
>
> will change behavior. With your change, instead of 32 threads writing randomly
> to a total of 128 zones, you will get 32 threads each writing randomly to 128
> zones, with a total of 32*128=4096 zones.
>
> SMR drives and zonemode=zbd have now been around for a while and there are a lot
> of fio scripts deployed in production for system validation/tests, as well as in
> drive development for testing. If we can avoid breaking that, we absolutely must.
>
> My proposal to keep max_open_zones as the per device maximum and introducing a
> thread_max_open_zones limit keeps backward compatibility with existing scripts
> while still allowing your change.
>
> >
> >> I think we can avoid this breakage simply: leave
> >> max_open_zones option definition as is and add "job_max_open_zones" or
> >> "thread_max_open_zones" option (no strong feelings about the name here, as long
> >> as it is explicit) to define the per thread maximum number of open zones. This
> >> new option could actually default to max_open_zones / numjobs if that is not 0.
> >
> > I'd argue that such scripts are broken.
>
> See the above example. It is a perfectly valid script, not broken at all.
It is broken in the sense that the script doesn't test what its author thinks
it tests. max_open_zones= + numjobs= can only be used as a random stress smoke
test, nothing more. The patch actually increases the stress level :-)

I assume that if an open-zone command fails due to hardware limitations, a
thread can and will exit just as easily.
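The dual-limit check the patch describes ("Both limits are checked for each
open zone") can be sketched roughly as below. This is a hypothetical
illustration with made-up names (`zbd_limits`, `can_open_zone`), not fio's
actual zbd code:

```c
#include <stdbool.h>

/*
 * Hypothetical sketch of a dual-limit check: a zone may be opened only
 * if both the per-device budget and the per-job budget allow it, so no
 * single job can consume the whole device-wide budget.
 */
struct zbd_limits {
	unsigned int max_open_zones;     /* per-device limit, 0 = unlimited */
	unsigned int job_max_open_zones; /* per-job limit, 0 = unlimited */
};

static bool can_open_zone(const struct zbd_limits *l,
			  unsigned int device_open, unsigned int job_open)
{
	if (l->max_open_zones && device_open >= l->max_open_zones)
		return false;	/* device-wide budget exhausted */
	if (l->job_max_open_zones && job_open >= l->job_max_open_zones)
		return false;	/* this job's budget exhausted */
	return true;
}
```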
> Varying the number of max_open_zones allows measuring the performance variation
> of a drive with the number of implicitly open zones. It is a common one that I
> have seen a lot in drive development and production. There are likely other
> valid ones too. Assuming that all current uses of max_open_zones with multi-jobs
> workloads are broken would be a mistake.
>
> >
> > If sustained numjobs*max_open_zones QD is desired then it is not
> > guaranteed as threads will simply exit at indeterminate times,
> > which breaks LBA space coverage as well.
> >
> > Right now, numjobs= + max_open_zones= means "max open zones by at most
> > "numjobs" threads.
>
> I understand that. And we should keep it that way for the reasons mentioned
> above. Modifying your change with the option thread_max_open_zones will nicely
> enhance it. E.g.
>
> fio --ioengine=libaio --iodepth=8 --rw=randwrite --thread_max_open_zones=1 --numjobs=8
>
> will result in 8 threads, each writing to a single randomly chosen zone at
> QD=8. And that is the same as your proposed:
>
> fio --ioengine=libaio --iodepth=8 --rw=randwrite --max_open_zones=1 --numjobs=8
>
> but without breaking the existing meaning of max_open_zones as a per drive/file
> limit.
>
> I totally agree with your change. It is a nice one. But let's preserve
> max_open_zones meaning as the per device limit. No need to change it.
OK, I'll resend, but I'll call it "job_max_open_zones".

It doesn't help that fio doesn't have a notion of a per-file/per-device option.
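For reference, a job-file sketch of how the two limits would combine under the
agreed naming (values and the device path are purely illustrative; the
job_max_open_zones semantics are as proposed in this thread):

```ini
; Hypothetical job file: per-device budget of 128 open zones,
; each of the 32 jobs capped at 4 open zones.
[global]
zonemode=zbd
rw=randwrite
; per-device limit (existing meaning preserved)
max_open_zones=128
; per-job limit (new option from this thread)
job_max_open_zones=4
numjobs=32

[j]
filename=/dev/sdX
```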
Thread overview: 22+ messages
2020-04-30 12:40 [PATCH 1/7] zbd: bump ZBD_MAX_OPEN_ZONES Alexey Dobriyan
2020-04-30 12:40 ` [PATCH 2/7] zbd: don't lock zones outside working area Alexey Dobriyan
2020-05-01 1:27 ` Damien Le Moal
2020-04-30 12:40 ` [PATCH 3/7] zbd: introduce per-device "max_open_zones" limit Alexey Dobriyan
2020-05-01 1:34 ` Damien Le Moal
2020-05-01 18:52 ` Alexey Dobriyan
2020-05-04 1:41 ` Damien Le Moal
2020-05-04 12:15 ` Alexey Dobriyan [this message]
2020-04-30 12:40 ` [PATCH 4/7] zbd: make zbd_info->mutex non-recursive Alexey Dobriyan
2020-05-01 1:36 ` Damien Le Moal
2020-04-30 12:40 ` [PATCH 5/7] zbd: consolidate zone mutex initialisation Alexey Dobriyan
2020-05-01 1:44 ` Damien Le Moal
2020-05-01 18:37 ` Alexey Dobriyan
2020-05-02 4:39 ` Damien Le Moal
2020-04-30 12:40 ` [PATCH 6/7] fio: parse "io_size=1%" Alexey Dobriyan
2020-05-01 1:51 ` Damien Le Moal
2020-05-01 6:00 ` Sitsofe Wheeler
2020-04-30 12:40 ` [PATCH 7/7] verify: decouple seed generation from buffer fill Alexey Dobriyan
2020-05-01 1:59 ` Damien Le Moal
2020-05-01 1:19 ` [PATCH 1/7] zbd: bump ZBD_MAX_OPEN_ZONES Damien Le Moal
2020-05-01 14:47 ` Alexey Dobriyan
2020-05-02 4:37 ` Damien Le Moal