From: Alexey Dobriyan <adobriyan@gmail.com>
To: Damien Le Moal <Damien.LeMoal@wdc.com>
Cc: "axboe@kernel.dk" <axboe@kernel.dk>,
"fio@vger.kernel.org" <fio@vger.kernel.org>
Subject: Re: [PATCH 3/7] zbd: introduce per-device "max_open_zones" limit
Date: Mon, 4 May 2020 15:15:58 +0300 [thread overview]
Message-ID: <20200504121558.GA19268@avx2> (raw)
In-Reply-To: <BY5PR04MB690028920E402F350BD5FDD6E7A60@BY5PR04MB6900.namprd04.prod.outlook.com>
On Mon, May 04, 2020 at 01:41:14AM +0000, Damien Le Moal wrote:
> Alexey,
>
> On 2020/05/02 3:52, Alexey Dobriyan wrote:
> > On Fri, May 01, 2020 at 01:34:32AM +0000, Damien Le Moal wrote:
> >> On 2020/04/30 21:41, Alexey Dobriyan wrote:
> >>> It is not possible to maintain equal per-thread iodepth. The way code
> >>> is written, "max_open_zones" acts as a global limit, and one thread
> >>> opens all "max_open_zones" for itself and others starve for available
> >>> zones and _exit_ prematurely.
> >>>
> >>> This config is guaranteed to make equal number of zone resets/IO now:
> >>> each thread generates identical pattern and doesn't intersect with other
> >>> threads:
> >>>
> >>> zonemode=zbd
> >>> zonesize=...
> >>> rw=write
> >>>
> >>> numjobs=N
> >>> offset_increment=M*zonesize
> >>>
> >>> [j]
> >>> size=M*zonesize
> >>>
> > > The patch introduces "global_max_open_zones", which is a per-device
> > > config option. "max_open_zones" becomes a per-thread limit. Both limits
> > > are checked on each zone open so one thread can't starve the others.
> >>
> >> It makes sense. Nice one.
> >>
> >> But the change as is will break existing test scripts (e.g. lots of SMR drives
> >> are being tested with this).
> >
> > It won't break single-threaded ones, that's for sure.
>
> Yes, but things like:
>
> fio --ioengine=psync --rw=randwrite --max_open_zones=128 --numjobs=32
>
> will change behavior. With your change, instead of 32 threads writing randomly
> to a total of 128 zones, you will get 32 threads each writing randomly to 128
> zones, with a total of 32*128=4096 zones.
>
> SMR drives and zonemode=zbd have now been around for a while and there are a lot
> of fio scripts deployed in production for system validation/tests, as well as in
> drive development for testing. If we can avoid breaking that, we absolutely must.
>
> My proposal to keep max_open_zones as the per device maximum and introducing a
> thread_max_open_zones limit keeps backward compatibility with existing scripts
> while still allowing your change.
>
> >
> >> I think we can avoid this breakage simply: leave
> >> max_open_zones option definition as is and add "job_max_open_zones" or
> >> "thread_max_open_zones" option (no strong feelings about the name here, as long
> >> as it is explicit) to define the per thread maximum number of open zones. This
> >> new option could actually default to max_open_zones / numjobs if that is not 0.
> >
> > I'd argue that such scripts are broken.
>
> See the above example. It is a perfectly valid script, not broken at all.
It is broken in the sense that the script doesn't test what its author thinks
it tests. max_open_zones= + numjobs= can only be used as a random stress smoke
test, nothing more. The patch actually increases the stress level :-)

I assume that if an open-zone command fails due to hardware limitations, a
thread can and will exit just as easily.
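The dual-limit check the patch describes ("Both limits are checked for each
open zone") can be sketched roughly as below. This is a hypothetical
illustration with made-up names (`zbd_limits`, `can_open_zone`), not fio's
actual zbd code:

```c
#include <stdbool.h>

/*
 * Hypothetical sketch of a dual-limit check: a zone may be opened only
 * if both the per-device budget and the per-job budget allow it, so no
 * single job can consume the whole device-wide budget.
 */
struct zbd_limits {
	unsigned int max_open_zones;     /* per-device limit, 0 = unlimited */
	unsigned int job_max_open_zones; /* per-job limit, 0 = unlimited */
};

static bool can_open_zone(const struct zbd_limits *l,
			  unsigned int device_open, unsigned int job_open)
{
	if (l->max_open_zones && device_open >= l->max_open_zones)
		return false;	/* device-wide budget exhausted */
	if (l->job_max_open_zones && job_open >= l->job_max_open_zones)
		return false;	/* this job's budget exhausted */
	return true;
}
```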
> Varying the number of max_open_zones allows measuring the performance variation
> of a drive with the number of implicitly open zones. It is a common one that I
> have seen a lot in drive development and production. There are likely other
> valid ones too. Assuming that all current uses of max_open_zones with multi-jobs
> workloads are broken would be a mistake.
>
> >
> > If sustained numjobs*max_open_zones QD is desired then it is not
> > guaranteed as threads will simply exit at indeterminate times,
> > which breaks LBA space coverage as well.
> >
> > Right now, numjobs= + max_open_zones= means "max open zones by at most
> > "numjobs" threads.
>
> I understand that. And we should keep it that way for the reasons mentioned
> above. Modifying your change with the option thread_max_open_zones will nicely
> enhance it. E.g.
>
> fio --ioengine=libaio --iodepth=8 --rw=randwrite --thread_max_open_zones=1 --numjobs=8
>
> will result in 8 threads, each writing to a single randomly chosen zone at
> QD=8. And that is the same as your proposed:
>
> fio --ioengine=libaio --iodepth=8 --rw=randwrite --max_open_zones=1 --numjobs=8
>
> but without breaking the existing meaning of max_open_zones as a per drive/file
> limit.
>
> I totally agree with your change. It is a nice one. But let's preserve
> max_open_zones meaning as the per device limit. No need to change it.
OK, I'll resend, but I'll call it "job_max_open_zones".

It doesn't help that fio doesn't have a notion of a per-file/per-device option.
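For reference, a job-file sketch of how the two limits would combine under the
agreed naming (values and the device path are purely illustrative; the
job_max_open_zones semantics are as proposed in this thread):

```ini
; Hypothetical job file: per-device budget of 128 open zones,
; each of the 32 jobs capped at 4 open zones.
[global]
zonemode=zbd
rw=randwrite
; per-device limit (existing meaning preserved)
max_open_zones=128
; per-job limit (new option from this thread)
job_max_open_zones=4
numjobs=32

[j]
filename=/dev/sdX
```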
Thread overview: 22+ messages
2020-04-30 12:40 [PATCH 1/7] zbd: bump ZBD_MAX_OPEN_ZONES Alexey Dobriyan
2020-04-30 12:40 ` [PATCH 2/7] zbd: don't lock zones outside working area Alexey Dobriyan
2020-05-01 1:27 ` Damien Le Moal
2020-04-30 12:40 ` [PATCH 3/7] zbd: introduce per-device "max_open_zones" limit Alexey Dobriyan
2020-05-01 1:34 ` Damien Le Moal
2020-05-01 18:52 ` Alexey Dobriyan
2020-05-04 1:41 ` Damien Le Moal
2020-05-04 12:15 ` Alexey Dobriyan [this message]
2020-04-30 12:40 ` [PATCH 4/7] zbd: make zbd_info->mutex non-recursive Alexey Dobriyan
2020-05-01 1:36 ` Damien Le Moal
2020-04-30 12:40 ` [PATCH 5/7] zbd: consolidate zone mutex initialisation Alexey Dobriyan
2020-05-01 1:44 ` Damien Le Moal
2020-05-01 18:37 ` Alexey Dobriyan
2020-05-02 4:39 ` Damien Le Moal
2020-04-30 12:40 ` [PATCH 6/7] fio: parse "io_size=1%" Alexey Dobriyan
2020-05-01 1:51 ` Damien Le Moal
2020-05-01 6:00 ` Sitsofe Wheeler
2020-04-30 12:40 ` [PATCH 7/7] verify: decouple seed generation from buffer fill Alexey Dobriyan
2020-05-01 1:59 ` Damien Le Moal
2020-05-01 1:19 ` [PATCH 1/7] zbd: bump ZBD_MAX_OPEN_ZONES Damien Le Moal
2020-05-01 14:47 ` Alexey Dobriyan
2020-05-02 4:37 ` Damien Le Moal