Flexible I/O Tester development
 help / color / mirror / Atom feed
From: Niklas Cassel <Niklas.Cassel@wdc.com>
To: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Cc: "axboe@kernel.dk" <axboe@kernel.dk>,
	"fio@vger.kernel.org" <fio@vger.kernel.org>,
	Damien Le Moal <Damien.LeMoal@wdc.com>
Subject: Re: [PATCH 7/8] cmdprio: add mode to make the logic easier to reason about
Date: Wed, 10 Nov 2021 11:34:10 +0000	[thread overview]
Message-ID: <YYuuMU4fte20JA/l@x1-carbon> (raw)
In-Reply-To: <1e4cfd69-2e8a-99c7-a654-d784056c47f8@opensource.wdc.com>

On Wed, Nov 10, 2021 at 02:57:17PM +0900, Damien Le Moal wrote:
> On 2021/11/09 22:29, Niklas Cassel wrote:
> >>> -int fio_cmdprio_init(struct thread_data *td, struct cmdprio *cmdprio,
> >>> -		     bool *has_cmdprio)
> >>> +int fio_cmdprio_init(struct thread_data *td, struct cmdprio *cmdprio)
> >>>  {
> >>>	struct thread_options *to = &td->o;
> >>>	bool has_cmdprio_percentage = false;
> >>> @@ -140,16 +144,12 @@ int fio_cmdprio_init(struct thread_data *td, struct cmdprio *cmdprio,
> >>>	 * is not set, default to RT priority class.
> >>>	 */
> >>>	for (i = 0; i < CMDPRIO_RWDIR_CNT; i++) {
> >>> -		if (cmdprio->percentage[i]) {
> >>> -			if (!cmdprio->class[i])
> >>> -				cmdprio->class[i] = IOPRIO_CLASS_RT;
> >>
> >> Why do you change this ? If cmdprio->percentage[i] is 0, you do not want to set
> >> the calss to RT. Or is it that cmdprio->percentage[i] cannot be 0 ? (I did not
> >> check the range of the option).
> >
> > option cmdprio_percentage has minval 0.
>
> Which could be changed to 1, since having percentage == 0 is (should be)
> equivalent to a "do not set a priority".

We could change the option .minval, but it wouldn't change anything at all.

Since the option is an FIO_OPT_INT, the off1 value will be 0 if the user
didn't specify anything, or if the user specified 0.

Either way, the code has to handle 0, so I propose that we leave it as is.


>
> > option cmdprio_class has minval 1, however it can still be 0 if the user
> > omitted the option and only used option cmdprio to specify a priority level.
>
> Hmmm... class 0 is "NONE", so no priority. In the kernel, that will be changed
> to the default BE class. So maybe we should do the same here to avoid this weird
> case ? (setting a prio level with class NONE does not make any sense).

For os/os-linux.h in fio, this is how ioprio_value() looks:

static inline int ioprio_value(int ioprio_class, int ioprio)
{
	/*
	 * If no class is set, assume BE
	 */
	if (!ioprio_class)
		ioprio_class = IOPRIO_CLASS_BE;

	return (ioprio_class << IOPRIO_CLASS_SHIFT) | ioprio;
}

So it will already use BE class when you don't specify a class.

However, since the man page says that if you omit a class for
cmdprio_percentage/cmdprio_bssplit, the highest priority class (RT)
should be used. So for cmdprio_percentage/cmdprio_bssplit, ioprio_value()
will therefore never get in class==0. (cmdprio has already changed it to RT.)
Therefore, the ioprio_value() function will only change class to BE when
using option prio, without specifying option prioclass.

Either case, since ioprio_value() is used, for prio/prioclass or an IO
overloaded with a cmdprio priority value, fio will never send class==0
to the kernel, so I think that what you are suggesting is already there.


>
>
> >
> > I was thinking that cmdprio_value is only used when the percentage != 0,
> > which is the case in my cmdprio branch that allows you to have different
> > cmdprio class + levels, but you are right, in this specific patch
> > fio_cmdprio_ioprio_was_updated()/fio_cmdprio_set_ioprio() still uses
> > cmdprio->class to determine if IO_U_F_HIGH_PRIO flag should be set,
> > even when fio_cmdprio_percentage() returned zero..
> >
> > The question is, if percentage == 0, for e.g. writes,
> > but the user specified cmdprio=3
> > (which sets cmdprio->level[] for both DDIR read and write),
> > should we set the HIGH_PRIO / LOW_PRIO flag?
>
> That is a non question... If percentage == 0, fio_cmdprio_set_ioprio() should
> NOT try to change the io_u prio at all. Then the io_u HIGH/LOW flag will depend
> on the default priority. That is the same as for the 0 < percentage < 100 case
> when an io_u does not get a "hit" on the percentage and fio_cmdprio_set_ioprio()
> does not change its priority. With percentage == 0, you will never get a "hit".

I agree, as that is exactly what I wrote in the next sentence :)
Here:

> > I think the best way is to simply not set any of the flags,
> > If cmdprio percentage is 0 for a certain ddir, then everything
> > will be sent with the default prio.


>
> For libaio (io_uring is similar), the code in fio_libaio_prio_prep() is:
>
>         if (p && rand_between(&td->prio_state, 0, 99) < p) {
>                 io_u->ioprio = cmdprio_value;
>                 io_u->iocb.aio_reqprio = cmdprio_value;
>                 io_u->iocb.u.c.flags |= IOCB_FLAG_IOPRIO;
>                 if (!td->ioprio || cmdprio_value < td->ioprio) {
>                         /*
>                          * The async IO priority is higher (has a lower value)
>                          * than the default context priority.
>                          */
>                         io_u->flags |= IO_U_F_HIGH_PRIO;
>                 }
>         } else if (td->ioprio && td->ioprio < cmdprio_value) {
>                 /*
>                  * The IO will be executed with the default context priority,
>                  * and this priority is higher (has a lower value) than the
>                  * async IO priority.
>                  */
>                 io_u->flags |= IO_U_F_HIGH_PRIO;
>         }
>
> So if percentage == 0, p is 0 and the io_u flag is set according to the default
> prio. What you are suggesting is already there and should not change.

Agreed, and this is how I decided to do things in my V2 series,
which is already on the fio mailing list and on your opensource.wdc.com email.
Please review! :)


What I don't like with the existing behavior is that even when
fio_cmdprio_percentage() returns zero (p == 0), you will still
perform } else if (td->ioprio && td->ioprio < cmdprio_value) {
and potentially set the HIGH_PRIO flag.

E.g. if you have a bssplit with both reads and writes, and
cmdprio_bssplit for read DDIR has entries with non-zero values,
but write DDIR only has entries with zero values.
Then for write DDIR, you could still set the HIGH_PRIO flag..
even though fio_cmdprio_percentage() returns 0 for all blocksizes
for write DDIR.. yucky..

Anyway, since stat.c only prints something if there is both HIGH
and LOW entries, this shouldn't produce incorrect output.

Anyway, I kept this existing (albeit a bit ugly, but working)
behavior in my V2, but in the upcoming series, where we have a
clat_prio array, we will only ever touch stuff, flags whatever,
if p != 0.


Kind regards,
Niklas

  reply	other threads:[~2021-11-10 11:34 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-09  0:28 [PATCH 0/8] cmdprio cleanup series Niklas Cassel
2021-11-09  0:28 ` [PATCH 1/8] docs: update cmdprio_percentage documentation Niklas Cassel
2021-11-09  6:09   ` Damien Le Moal
2021-11-09  0:28 ` [PATCH 2/8] cmdprio: move cmdprio function definitions to a new cmdprio.c file Niklas Cassel
2021-11-09  6:11   ` Damien Le Moal
2021-11-09  0:28 ` [PATCH 3/8] cmdprio: do not allocate memory for unused data direction Niklas Cassel
2021-11-09  6:12   ` Damien Le Moal
2021-11-09  0:28 ` [PATCH 4/8] io_uring: set async IO priority to td->ioprio in fio_ioring_prep() Niklas Cassel
2021-11-09  6:16   ` Damien Le Moal
2021-11-09  0:28 ` [PATCH 5/8] libaio,io_uring: rename prio_prep() to include cmdprio in the name Niklas Cassel
2021-11-09  6:19   ` Damien Le Moal
2021-11-09  0:28 ` [PATCH 6/8] libaio,io_uring: move common cmdprio_prep() code to cmdprio Niklas Cassel
2021-11-09  6:29   ` Damien Le Moal
2021-11-09 10:43     ` Niklas Cassel
2021-11-09  0:28 ` [PATCH 7/8] cmdprio: add mode to make the logic easier to reason about Niklas Cassel
2021-11-09  6:38   ` Damien Le Moal
2021-11-09 13:29     ` Niklas Cassel
2021-11-09 23:17       ` Niklas Cassel
2021-11-10  5:57       ` Damien Le Moal
2021-11-10 11:34         ` Niklas Cassel [this message]
2021-11-09  0:28 ` [PATCH 8/8] libaio,io_uring: make it possible to cleanup cmdprio malloced data Niklas Cassel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YYuuMU4fte20JA/l@x1-carbon \
    --to=niklas.cassel@wdc.com \
    --cc=Damien.LeMoal@wdc.com \
    --cc=axboe@kernel.dk \
    --cc=damien.lemoal@opensource.wdc.com \
    --cc=fio@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox