From: Alexey Dobriyan <adobriyan@gmail.com>
To: Damien Le Moal <Damien.LeMoal@wdc.com>
Cc: "axboe@kernel.dk" <axboe@kernel.dk>,
"fio@vger.kernel.org" <fio@vger.kernel.org>
Subject: Re: [PATCH 3/7] zbd: introduce per-device "max_open_zones" limit
Date: Mon, 4 May 2020 15:15:58 +0300 [thread overview]
Message-ID: <20200504121558.GA19268@avx2> (raw)
In-Reply-To: <BY5PR04MB690028920E402F350BD5FDD6E7A60@BY5PR04MB6900.namprd04.prod.outlook.com>
On Mon, May 04, 2020 at 01:41:14AM +0000, Damien Le Moal wrote:
> Alexey,
>
> On 2020/05/02 3:52, Alexey Dobriyan wrote:
> > On Fri, May 01, 2020 at 01:34:32AM +0000, Damien Le Moal wrote:
> >> On 2020/04/30 21:41, Alexey Dobriyan wrote:
> >>> It is not possible to maintain equal per-thread iodepth. The way code
> >>> is written, "max_open_zones" acts as a global limit, and one thread
> >>> opens all "max_open_zones" for itself and others starve for available
> >>> zones and _exit_ prematurely.
> >>>
> >>> This config is guaranteed to make equal number of zone resets/IO now:
> >>> each thread generates identical pattern and doesn't intersect with other
> >>> threads:
> >>>
> >>> zonemode=zbd
> >>> zonesize=...
> >>> rw=write
> >>>
> >>> numjobs=N
> >>> offset_increment=M*zonesize
> >>>
> >>> [j]
> >>> size=M*zonesize
> >>>
> >>> Patch introduces "global_max_open_zones" which is per-device config
> >>> option. "max_open_zones" becomes per-thread limit. Both limits are
> >>> checked for each open zone so one thread can't starve others.
> >>
> >> It makes sense. Nice one.
> >>
> >> But the change as is will break existing test scripts (e.g. lots of SMR drives
> >> are being tested with this).
> >
> > It won't break single-threaded ones, that's for sure.
>
> Yes, but things like:
>
> fio --ioengine=psync --rw=randwr --max_open_zones=128 --numjobs=32
>
> will change behavior. With your change, instead of 32 threads writing randomly
> to a total of 128 zones, you will get 32 threads each writing randomly to 128
> zones, with a total of 32*128=4096 zones.
>
> SMR drives and zonemode=zbd have now been around for a while and there are a lot
> of fio scripts deployed in production for system validation/tests, as well as in
> drive development for testing. If we can avoid breaking that, we absolutely must.
>
> My proposal to keep max_open_zones as the per device maximum and introducing a
> thread_max_open_zones limit keeps backward compatibility with existing scripts
> while still allowing your change.
>
> >
> >> I think we can avoid this breakage simply: leave
> >> max_open_zones option definition as is and add "job_max_open_zones" or
> >> "thread_max_open_zones" option (no strong feelings about the name here, as long
> >> as it is explicit) to define the per thread maximum number of open zones. This
> >> new option could actually default to max_open_zones / numjobs if that is not 0.
> >
> > I'd argue that such scripts are broken.
>
> See the above example. It is a perfectly valid script, not broken at all.
It is broken in the sense that script doesn't test what's author thinks it tests.
max_open_zones= + numjobs= can only be used as random stress smoke test, nothing
more. Patch actually increases stress level :-)
I assume that if open zone command fails due to hardware limitations, thread can
and will exit just as easily.
> Varying the number of max_open_zones allows measuring the performance variation
> of a drive with the number of implicitly open zones. It is a common one that I
> have seen a lot in drive development and production. There are likely other
> valid ones too. Assuming that all current uses of max_open_zones with multi-jobs
> workloads are broken would be a mistake.
>
> >
> > If sustained numjobs*max_open_zones QD is desired than it is not
> > guaranteed as threads will simply exit at indeterminate times,
> > which break LBA space coverage as well.
> >
> > Right now, numjobs= + max_open_zones= means "max open zones by at most
> > "numjobs" threads.
>
> I understand that. And we should keep it that way for the reasons mentioned
> above. Modifying your change with the option thread_max_open_zones will nicely
> enhance. E.g.
>
> fio --ioengine=libaio --iodepth=8 --rw=randwr --thread_max_open_zones=1 --numjobs=8
>
> Will result in 8 threads writing a single randomly chosen zone at QD=8. And that
> is the same as your proposed:
>
> fio --ioengine=libaio --iodepth=8 --rw=randwr --max_open_zones=1 --numjobs=8
>
> but without breaking the existing meaning of max_open_zones as a per drive/file
> limit.
>
> I totally agree with your change. It is a nice one. But let's preserve
> max_open_zones meaning as the per device limit. No need to change it.
OK I'll resend but I'll call it "job_max_open_zones".
It doesn't help that fio doesn't have a notion of per-file/device option.
next prev parent reply other threads:[~2020-05-04 12:15 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-04-30 12:40 [PATCH 1/7] zbd: bump ZBD_MAX_OPEN_ZONES Alexey Dobriyan
2020-04-30 12:40 ` [PATCH 2/7] zbd: don't lock zones outside working area Alexey Dobriyan
2020-05-01 1:27 ` Damien Le Moal
2020-04-30 12:40 ` [PATCH 3/7] zbd: introduce per-device "max_open_zones" limit Alexey Dobriyan
2020-05-01 1:34 ` Damien Le Moal
2020-05-01 18:52 ` Alexey Dobriyan
2020-05-04 1:41 ` Damien Le Moal
2020-05-04 12:15 ` Alexey Dobriyan [this message]
2020-04-30 12:40 ` [PATCH 4/7] zbd: make zbd_info->mutex non-recursive Alexey Dobriyan
2020-05-01 1:36 ` Damien Le Moal
2020-04-30 12:40 ` [PATCH 5/7] zbd: consolidate zone mutex initialisation Alexey Dobriyan
2020-05-01 1:44 ` Damien Le Moal
2020-05-01 18:37 ` Alexey Dobriyan
2020-05-02 4:39 ` Damien Le Moal
2020-04-30 12:40 ` [PATCH 6/7] fio: parse "io_size=1%" Alexey Dobriyan
2020-05-01 1:51 ` Damien Le Moal
2020-05-01 6:00 ` Sitsofe Wheeler
2020-04-30 12:40 ` [PATCH 7/7] verify: decouple seed generation from buffer fill Alexey Dobriyan
2020-05-01 1:59 ` Damien Le Moal
2020-05-01 1:19 ` [PATCH 1/7] zbd: bump ZBD_MAX_OPEN_ZONES Damien Le Moal
2020-05-01 14:47 ` Alexey Dobriyan
2020-05-02 4:37 ` Damien Le Moal
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200504121558.GA19268@avx2 \
--to=adobriyan@gmail.com \
--cc=Damien.LeMoal@wdc.com \
--cc=axboe@kernel.dk \
--cc=fio@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.