From: Damien Le Moal <damien.lemoal@opensource.wdc.com>
To: Hannes Reinecke <hare@suse.de>,
"Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Bart Van Assche <bvanassche@acm.org>,
Niklas Cassel <Niklas.Cassel@wdc.com>,
Paolo Valente <paolo.valente@linaro.org>,
Jens Axboe <axboe@kernel.dk>, Christoph Hellwig <hch@lst.de>,
"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
"linux-ide@vger.kernel.org" <linux-ide@vger.kernel.org>,
"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>
Subject: Re: [PATCH v3 01/18] block: introduce duration-limits priority class
Date: Mon, 30 Jan 2023 23:55:06 +0900 [thread overview]
Message-ID: <0a2b9ba7-cc3f-400b-6a8a-c38ba269af75@opensource.wdc.com> (raw)
In-Reply-To: <f58200e2-1652-c726-ceaf-be78feaab1bc@suse.de>
On 1/30/23 22:44, Hannes Reinecke wrote:
> On 1/29/23 04:52, Damien Le Moal wrote:
>> On 1/29/23 05:25, Martin K. Petersen wrote:
> [ .. ]
>>>
>>> As such, I don't like the "just customize your settings with
>>> cdltools" approach. I'd much rather see us try to define a few QoS
>>> classes that make sense that would apply to every app and use those
>>> to define the application interface. And then have the kernel program
>>> those CDL classes into SCSI/ATA devices by default.
>>
>> Makes sense. Though I think it will be hard to define a set of QoS hints that
>> are useful for a wide range of applications, and even harder to convert the
>> defined hint classes to CDL descriptors. I fear that we may end up with the same
>> issues as IO hints/streams.
>>
>>> Having the kernel provide an abstract interface for bio QoS and
>>> configuring a new disk with a sane handful of classes does not
>>> prevent $CLOUD_VENDOR from overriding what Linux configured. But at
>>> least we'd have a generic approach to block QoS in Linux. Similar to
>>> the existing I/O priority infrastructure which is also not tied to
>>> any particular hardware feature.
>>
>> OK. See below about this.
>>
>>> A generic implementation also allows us to do fancy things in the
>>> hypervisor where we would like to be able to do QoS across multiple
>>> devices as well. Without having ATA or SCSI with CDL involved. Or
>>> whatever things might look like in NVMe.
>>
>> Fair point, especially given that virtio actually already forwards a guest
>> ioprio to the host through the virtio block command. Thinking of that particular
>> point together with what you said, I came up with the change show below as a
>> replacement for this patch 1/18.
>>
>> This changes the 13-bits ioprio data into a 3-bits QOS hint + 3-bits of IO prio
>> level. This is consistent with the IO prio interface since IO priority levels
>> have to be between 0 and 7 (otherwise, errors are returned). So in fact, the
>> upper 10-bits of the ioprio data are ignored and we can safely use 3 of these
>> bits for an IO hint.
>>
>> This hint applies to all priority classes and levels, that is, for the CDL case,
>> we can enrich any priority with a hint that specifies the CDL index to use for
>> an IO.
>>
>> This falls short of actually defining generic IO hints, but this has the
>> advantage to not break anything for current applications using IO priorities,
>> not require any change to existing IO schedulers, while still allowing to pass
>> CDL indexes for IOs down to the scsi & ATA layers (which for now would be the
>> only layers in the kernel acting on the ioprio qos hints).
>>
>> I think that this approach still allows us to enable CDL support, and on top of
>> it, go further and define generic QOS hints that IO scheduler can use and that
>> also potentially map to CDL for scsi & ata (similarly to the RT class IOs
>> mapping to the NCQ priority feature if the user enabled that feature).
>>
>> As mentioned above, I think that defining generic IO hint classes will be
>> difficult. But the change below is I think a good a starting point that should
>> not prevent working on that.
>>
>> Thoughts ?
>>
> I like the idea.
> QoS is one of the recurring topic always coming up sooner or later when
> talking of storage networks, so having _some_ concept of QoS in the
> linux kernel (for storage) would be beneficial.
>
> Maybe time for a topic at LSF?
Yes. I was hoping for a quicker resolution so that we can get the CDL
"mechanical" bits in, but without a nice API for it, we cannot :)
Trying to compile something with Niklas. So far, we are thinking of having
QOS flags + QOS data, the flags determining how (and if) the QOS data is used
and what it means.
Ex of things We could have:
* IOPRIO_QOS_FAILFAST: do not retry the IO if it fails the first time
* IOPRIO_QOS_DURATION_LIMIT: then the QOS data indicates the limit to use
(number). That can be implemented in schedulers and also map to CDL on drives
that support that feature.
That is the difficult part: what else ? For now, considering only our target of
adding scsi & ata CDL support, the above is enough. But is that enough in
general for most users/apps ?
>
> Cheers,
>
> Hannes
>
--
Damien Le Moal
Western Digital Research
next prev parent reply other threads:[~2023-01-30 14:55 UTC|newest]
Thread overview: 82+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-01-24 19:02 [PATCH v3 00/18] Add Command Duration Limits support Niklas Cassel
2023-01-24 19:02 ` [PATCH v3 01/18] block: introduce duration-limits priority class Niklas Cassel
2023-01-24 19:27 ` Bart Van Assche
2023-01-24 20:36 ` Bart Van Assche
2023-01-24 21:48 ` Damien Le Moal
2023-01-24 21:29 ` Damien Le Moal
2023-01-24 22:43 ` Bart Van Assche
2023-01-24 22:59 ` Damien Le Moal
2023-01-25 0:05 ` Bart Van Assche
2023-01-25 1:19 ` Damien Le Moal
2023-01-25 18:37 ` Bart Van Assche
2023-01-25 21:23 ` Niklas Cassel
2023-01-26 0:24 ` Damien Le Moal
2023-01-26 13:53 ` Niklas Cassel
2023-01-26 17:33 ` Bart Van Assche
2023-01-27 0:18 ` Damien Le Moal
2023-01-27 1:40 ` Damien Le Moal
2023-01-27 17:23 ` Bart Van Assche
2023-01-28 0:40 ` Damien Le Moal
2023-01-28 0:47 ` Bart Van Assche
2023-01-28 0:59 ` Damien Le Moal
2023-01-28 20:25 ` Martin K. Petersen
2023-01-29 3:52 ` Damien Le Moal
2023-01-30 13:44 ` Hannes Reinecke
2023-01-30 14:55 ` Damien Le Moal [this message]
2023-01-30 19:24 ` Bart Van Assche
2023-01-30 20:40 ` Bart Van Assche
2023-01-31 2:49 ` Martin K. Petersen
2023-01-31 3:10 ` Damien Le Moal
2023-01-30 19:13 ` Bart Van Assche
2023-01-31 2:58 ` Martin K. Petersen
2023-01-31 3:03 ` Damien Le Moal
2023-01-25 23:11 ` Keith Busch
2023-01-26 0:08 ` Damien Le Moal
2023-01-26 5:26 ` Christoph Hellwig
2023-01-25 6:33 ` Christoph Hellwig
2023-01-27 12:43 ` Hannes Reinecke
2023-01-24 19:02 ` [PATCH v3 02/18] block: introduce BLK_STS_DURATION_LIMIT Niklas Cassel
2023-01-24 19:29 ` Bart Van Assche
2023-01-24 19:59 ` Keith Busch
2023-01-24 20:32 ` Bart Van Assche
2023-01-24 21:39 ` Damien Le Moal
2023-01-24 21:36 ` Damien Le Moal
2023-01-24 21:34 ` Damien Le Moal
2023-01-24 19:02 ` [PATCH v3 03/18] scsi: core: allow libata to complete successful commands via EH Niklas Cassel
2023-01-24 19:02 ` [PATCH v3 04/18] scsi: rename and move get_scsi_ml_byte() Niklas Cassel
2023-01-24 19:32 ` Bart Van Assche
2023-01-24 19:02 ` [PATCH v3 05/18] scsi: support retrieving sub-pages of mode pages Niklas Cassel
2023-01-24 19:34 ` Bart Van Assche
2023-01-24 19:02 ` [PATCH v3 06/18] scsi: support service action in scsi_report_opcode() Niklas Cassel
2023-01-24 19:36 ` Bart Van Assche
2023-01-24 19:02 ` [PATCH v3 07/18] scsi: sd: detect support for command duration limits Niklas Cassel
2023-01-24 19:39 ` Bart Van Assche
2023-01-27 13:00 ` Hannes Reinecke
2023-01-28 0:51 ` Damien Le Moal
2023-01-28 2:52 ` Bart Van Assche
2023-01-29 2:05 ` Damien Le Moal
2023-01-24 19:02 ` [PATCH v3 08/18] scsi: sd: set read/write commands CDL index Niklas Cassel
2023-01-27 15:30 ` Hannes Reinecke
2023-01-28 0:03 ` Damien Le Moal
2023-01-30 18:13 ` Hannes Reinecke
2023-01-24 19:02 ` [PATCH v3 09/18] scsi: sd: handle read/write CDL timeout failures Niklas Cassel
2023-01-27 15:34 ` Hannes Reinecke
2023-01-28 0:06 ` Damien Le Moal
2023-02-03 16:49 ` Niklas Cassel
2023-01-24 19:02 ` [PATCH v3 10/18] ata: libata-scsi: remove unnecessary !cmd checks Niklas Cassel
2023-01-27 15:35 ` Hannes Reinecke
2023-01-24 19:02 ` [PATCH v3 11/18] ata: libata: change ata_eh_request_sense() to not set CHECK_CONDITION Niklas Cassel
2023-01-27 15:36 ` Hannes Reinecke
2023-01-24 19:02 ` [PATCH v3 12/18] ata: libata: detect support for command duration limits Niklas Cassel
2023-01-24 19:02 ` [PATCH v3 13/18] ata: libata-scsi: handle CDL bits in ata_scsiop_maint_in() Niklas Cassel
2023-01-27 15:37 ` Hannes Reinecke
2023-01-24 19:03 ` [PATCH v3 14/18] ata: libata-scsi: add support for CDL pages mode sense Niklas Cassel
2023-01-27 15:38 ` Hannes Reinecke
2023-01-24 19:03 ` [PATCH v3 15/18] ata: libata: add ATA feature control sub-page translation Niklas Cassel
2023-01-27 15:40 ` Hannes Reinecke
2023-01-24 19:03 ` [PATCH v3 16/18] ata: libata: set read/write commands CDL index Niklas Cassel
2023-01-27 15:41 ` Hannes Reinecke
2023-01-24 19:03 ` [PATCH v3 17/18] ata: libata: handle completion of CDL commands using policy 0xD Niklas Cassel
2023-01-27 15:43 ` Hannes Reinecke
2023-01-24 19:03 ` [PATCH v3 18/18] Documentation: sysfs-block-device: document command duration limits Niklas Cassel
2023-01-27 15:43 ` Hannes Reinecke
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=0a2b9ba7-cc3f-400b-6a8a-c38ba269af75@opensource.wdc.com \
--to=damien.lemoal@opensource.wdc.com \
--cc=Niklas.Cassel@wdc.com \
--cc=axboe@kernel.dk \
--cc=bvanassche@acm.org \
--cc=hare@suse.de \
--cc=hch@lst.de \
--cc=linux-block@vger.kernel.org \
--cc=linux-ide@vger.kernel.org \
--cc=linux-scsi@vger.kernel.org \
--cc=martin.petersen@oracle.com \
--cc=paolo.valente@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox