linux-ide.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/2] Enabling ATA Command Priorities
@ 2016-10-11 20:40 Adam Manzanares
  2016-10-11 20:40 ` [PATCH v3 1/2] block: Add iocontext priority to request Adam Manzanares
  2016-10-11 20:40 ` [PATCH v3 2/2] ata: Enabling ATA Command Priorities Adam Manzanares
  0 siblings, 2 replies; 6+ messages in thread
From: Adam Manzanares @ 2016-10-11 20:40 UTC (permalink / raw)
  To: axboe, tj; +Cc: linux-block, linux-ide, Adam Manzanares

This patch builds ATA commands with high priority if the iocontext
of a process is set to real time. The goal of the patch is to
improve tail latencies of workloads that use higher queue depths.

This patch has been tested with an Ultrastar HE8 HDD and cuts the 
the p99.99 tail latency of foreground IO from 2s down to 72ms when
using the deadline scheduler. This patch works independently of the
scheduler so it can be used with all of the currently available 
request based schedulers. 

Foreground IO, for the previously described results, is an async fio job 
submitting 4K read requests at a QD of 1 to the HDD. The foreground IO is set 
with the iopriority class of real time. The background workload is another fio
job submitting read requests at a QD of 32 to the same HDD with default 
iopriority.

This feature is enabled by setting a queue flag that is exposed as a sysfs
entry named rq_ioc_prio. If this feature is enabled, and the submission 
iocontext exists, and the bio_prio is not valid then the request ioprio is
set to the iocontext prio.

v3:
 - Removed null dereference issue in blk-core
 - Renamed queue sysfs entries for clarity
 - Added documentation for sysfs queue entry

v2:
 - Add queue flag to set iopriority going to the request
 - If queue flag set, send iopriority class to ata_build_rw_tf
 - Remove redundant code in ata_ncq_prio_enabled function.


Adam Manzanares (2):
  block: Add iocontext priority to request
  ata: Enabling ATA Command Priorities

 Documentation/block/queue-sysfs.txt | 12 ++++++++++++
 block/blk-core.c                    |  5 +++++
 block/blk-sysfs.c                   | 32 ++++++++++++++++++++++++++++++++
 drivers/ata/libata-core.c           | 35 ++++++++++++++++++++++++++++++++++-
 drivers/ata/libata-scsi.c           | 10 +++++++++-
 drivers/ata/libata.h                |  2 +-
 include/linux/ata.h                 |  6 ++++++
 include/linux/blkdev.h              |  3 +++
 include/linux/libata.h              | 18 ++++++++++++++++++
 9 files changed, 120 insertions(+), 3 deletions(-)

-- 
2.1.4


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v3 1/2] block: Add iocontext priority to request
  2016-10-11 20:40 [PATCH v3 0/2] Enabling ATA Command Priorities Adam Manzanares
@ 2016-10-11 20:40 ` Adam Manzanares
  2016-10-12 14:49   ` Jens Axboe
  2016-10-11 20:40 ` [PATCH v3 2/2] ata: Enabling ATA Command Priorities Adam Manzanares
  1 sibling, 1 reply; 6+ messages in thread
From: Adam Manzanares @ 2016-10-11 20:40 UTC (permalink / raw)
  To: axboe, tj; +Cc: linux-block, linux-ide, Adam Manzanares

Patch adds an association between iocontext ioprio and the ioprio of
a request. This feature is only enabled if a queue flag is set to
indicate that requests should have ioprio associated with them. The
queue flag is exposed as the req_prio queue sysfs entry.

Signed-off-by: Adam Mananzanares <adam.manzanares@hgst.com>
---
 Documentation/block/queue-sysfs.txt | 12 ++++++++++++
 block/blk-core.c                    |  5 +++++
 block/blk-sysfs.c                   | 32 ++++++++++++++++++++++++++++++++
 include/linux/blkdev.h              |  3 +++
 4 files changed, 52 insertions(+)

diff --git a/Documentation/block/queue-sysfs.txt b/Documentation/block/queue-sysfs.txt
index 2a39040..3ca4e8f 100644
--- a/Documentation/block/queue-sysfs.txt
+++ b/Documentation/block/queue-sysfs.txt
@@ -144,6 +144,18 @@ For storage configurations that need to maximize distribution of completion
 processing setting this option to '2' forces the completion to run on the
 requesting cpu (bypassing the "group" aggregation logic).
 
+rq_ioc_prio (RW)
+----------------
+If this option is '1', and there is a valid iocontext associated with the
+issuing context, and the bio we are processing does not have a valid
+prio, then we save the prio value from the iocontext with the request.
+
+This feature can be combined with device drivers that are aware of prio
+values in order to handle prio accordingly. An example would be if the ata
+layer recognizes prio and creates ata commands with high priority and sends
+them to the device. If the hardware supports priorities for commands then
+this has the potential to speed up response times for high priority IO.
+
 scheduler (RW)
 --------------
 When read, this file will display the current and available IO schedulers
diff --git a/block/blk-core.c b/block/blk-core.c
index 14d7c07..2e740c4 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -33,6 +33,7 @@
 #include <linux/ratelimit.h>
 #include <linux/pm_runtime.h>
 #include <linux/blk-cgroup.h>
+#include <linux/ioprio.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/block.h>
@@ -1648,6 +1649,7 @@ unsigned int blk_plug_queued_count(struct request_queue *q)
 
 void init_request_from_bio(struct request *req, struct bio *bio)
 {
+	struct io_context *ioc = rq_ioc(bio);
 	req->cmd_type = REQ_TYPE_FS;
 
 	req->cmd_flags |= bio->bi_opf & REQ_COMMON_MASK;
@@ -1657,6 +1659,9 @@ void init_request_from_bio(struct request *req, struct bio *bio)
 	req->errors = 0;
 	req->__sector = bio->bi_iter.bi_sector;
 	req->ioprio = bio_prio(bio);
+	if (blk_queue_rq_ioc_prio(req->q) && !ioprio_valid(req->ioprio) && ioc)
+		req->ioprio = ioc->ioprio;
+
 	blk_rq_bio_prep(req->q, req, bio);
 }
 
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 9cc8d7c..a9c5105 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -384,6 +384,31 @@ static ssize_t queue_dax_show(struct request_queue *q, char *page)
 	return queue_var_show(blk_queue_dax(q), page);
 }
 
+static ssize_t queue_rq_ioc_prio_show(struct request_queue *q, char *page)
+{
+	return queue_var_show(blk_queue_rq_ioc_prio(q), page);
+}
+
+static ssize_t queue_rq_ioc_prio_store(struct request_queue *q,
+				       const char *page, size_t count)
+{
+	unsigned long rq_ioc_prio_on;
+	ssize_t ret;
+
+	ret = queue_var_store(&rq_ioc_prio_on, page, count);
+	if (ret < 0)
+		return ret;
+
+	spin_lock_irq(q->queue_lock);
+	if (rq_ioc_prio_on)
+		queue_flag_set(QUEUE_FLAG_RQ_IOC_PRIO, q);
+	else
+		queue_flag_clear(QUEUE_FLAG_RQ_IOC_PRIO, q);
+	spin_unlock_irq(q->queue_lock);
+
+	return ret;
+}
+
 static struct queue_sysfs_entry queue_requests_entry = {
 	.attr = {.name = "nr_requests", .mode = S_IRUGO | S_IWUSR },
 	.show = queue_requests_show,
@@ -526,6 +551,12 @@ static struct queue_sysfs_entry queue_dax_entry = {
 	.show = queue_dax_show,
 };
 
+static struct queue_sysfs_entry queue_rq_ioc_prio_entry = {
+	.attr = {.name = "rq_ioc_prio", .mode = S_IRUGO | S_IWUSR },
+	.show = queue_rq_ioc_prio_show,
+	.store = queue_rq_ioc_prio_store,
+};
+
 static struct attribute *default_attrs[] = {
 	&queue_requests_entry.attr,
 	&queue_ra_entry.attr,
@@ -553,6 +584,7 @@ static struct attribute *default_attrs[] = {
 	&queue_poll_entry.attr,
 	&queue_wc_entry.attr,
 	&queue_dax_entry.attr,
+	&queue_rq_ioc_prio_entry.attr,
 	NULL,
 };
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index c47c358..63b842a 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -505,6 +505,7 @@ struct request_queue {
 #define QUEUE_FLAG_FUA	       24	/* device supports FUA writes */
 #define QUEUE_FLAG_FLUSH_NQ    25	/* flush not queueuable */
 #define QUEUE_FLAG_DAX         26	/* device supports DAX */
+#define QUEUE_FLAG_RQ_IOC_PRIO 27	/* Use iocontext ioprio */
 
 #define QUEUE_FLAG_DEFAULT	((1 << QUEUE_FLAG_IO_STAT) |		\
 				 (1 << QUEUE_FLAG_STACKABLE)	|	\
@@ -595,6 +596,8 @@ static inline void queue_flag_clear(unsigned int flag, struct request_queue *q)
 #define blk_queue_secure_erase(q) \
 	(test_bit(QUEUE_FLAG_SECERASE, &(q)->queue_flags))
 #define blk_queue_dax(q)	test_bit(QUEUE_FLAG_DAX, &(q)->queue_flags)
+#define blk_queue_rq_ioc_prio(q) \
+	test_bit(QUEUE_FLAG_RQ_IOC_PRIO, &(q)->queue_flags)
 
 #define blk_noretry_request(rq) \
 	((rq)->cmd_flags & (REQ_FAILFAST_DEV|REQ_FAILFAST_TRANSPORT| \
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v3 2/2] ata: Enabling ATA Command Priorities
  2016-10-11 20:40 [PATCH v3 0/2] Enabling ATA Command Priorities Adam Manzanares
  2016-10-11 20:40 ` [PATCH v3 1/2] block: Add iocontext priority to request Adam Manzanares
@ 2016-10-11 20:40 ` Adam Manzanares
  1 sibling, 0 replies; 6+ messages in thread
From: Adam Manzanares @ 2016-10-11 20:40 UTC (permalink / raw)
  To: axboe, tj; +Cc: linux-block, linux-ide, Adam Manzanares

This patch checks to see if an ATA device supports NCQ command priorities.
If so and the user has specified an iocontext that indicates IO_PRIO_CLASS_RT
and also enables request priorities in the block queue then we build a tf
with a high priority command.

This patch depends on patch block-Add-iocontext-priority-to-request

Signed-off-by: Adam Manzanares <adam.manzanares@hgst.com>
---
 drivers/ata/libata-core.c | 35 ++++++++++++++++++++++++++++++++++-
 drivers/ata/libata-scsi.c | 10 +++++++++-
 drivers/ata/libata.h      |  2 +-
 include/linux/ata.h       |  6 ++++++
 include/linux/libata.h    | 18 ++++++++++++++++++
 5 files changed, 68 insertions(+), 3 deletions(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 223a770..181b530 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -739,6 +739,7 @@ u64 ata_tf_read_block(const struct ata_taskfile *tf, struct ata_device *dev)
  *	@n_block: Number of blocks
  *	@tf_flags: RW/FUA etc...
  *	@tag: tag
+ *	@class: IO priority class
  *
  *	LOCKING:
  *	None.
@@ -753,7 +754,7 @@ u64 ata_tf_read_block(const struct ata_taskfile *tf, struct ata_device *dev)
  */
 int ata_build_rw_tf(struct ata_taskfile *tf, struct ata_device *dev,
 		    u64 block, u32 n_block, unsigned int tf_flags,
-		    unsigned int tag)
+		    unsigned int tag, int class)
 {
 	tf->flags |= ATA_TFLAG_ISADDR | ATA_TFLAG_DEVICE;
 	tf->flags |= tf_flags;
@@ -785,6 +786,12 @@ int ata_build_rw_tf(struct ata_taskfile *tf, struct ata_device *dev,
 		tf->device = ATA_LBA;
 		if (tf->flags & ATA_TFLAG_FUA)
 			tf->device |= 1 << 7;
+
+		if (ata_ncq_prio_enabled(dev)) {
+			if (class == IOPRIO_CLASS_RT)
+				tf->hob_nsect |= ATA_PRIO_HIGH <<
+						 ATA_SHIFT_PRIO;
+		}
 	} else if (dev->flags & ATA_DFLAG_LBA) {
 		tf->flags |= ATA_TFLAG_LBA;
 
@@ -2156,6 +2163,30 @@ static void ata_dev_config_ncq_non_data(struct ata_device *dev)
 	}
 }
 
+static void ata_dev_config_ncq_prio(struct ata_device *dev)
+{
+	struct ata_port *ap = dev->link->ap;
+	unsigned int err_mask;
+
+	err_mask = ata_read_log_page(dev,
+				     ATA_LOG_SATA_ID_DEV_DATA,
+				     ATA_LOG_SATA_SETTINGS,
+				     ap->sector_buf,
+				     1);
+	if (err_mask) {
+		ata_dev_dbg(dev,
+			    "failed to get Identify Device data, Emask 0x%x\n",
+			    err_mask);
+		return;
+	}
+
+	if (ap->sector_buf[ATA_LOG_NCQ_PRIO_OFFSET] & BIT(3))
+		dev->flags |= ATA_DFLAG_NCQ_PRIO;
+	else
+		ata_dev_dbg(dev, "SATA page does not support priority\n");
+
+}
+
 static int ata_dev_config_ncq(struct ata_device *dev,
 			       char *desc, size_t desc_sz)
 {
@@ -2205,6 +2236,8 @@ static int ata_dev_config_ncq(struct ata_device *dev,
 			ata_dev_config_ncq_send_recv(dev);
 		if (ata_id_has_ncq_non_data(dev->id))
 			ata_dev_config_ncq_non_data(dev);
+		if (ata_id_has_ncq_prio(dev->id))
+			ata_dev_config_ncq_prio(dev);
 	}
 
 	return 0;
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index e207b33..4304694 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -50,6 +50,7 @@
 #include <linux/uaccess.h>
 #include <linux/suspend.h>
 #include <asm/unaligned.h>
+#include <linux/ioprio.h>
 
 #include "libata.h"
 #include "libata-transport.h"
@@ -1757,6 +1758,8 @@ static unsigned int ata_scsi_rw_xlat(struct ata_queued_cmd *qc)
 {
 	struct scsi_cmnd *scmd = qc->scsicmd;
 	const u8 *cdb = scmd->cmnd;
+	struct request *rq = scmd->request;
+	int class = 0;
 	unsigned int tf_flags = 0;
 	u64 block;
 	u32 n_block;
@@ -1822,8 +1825,13 @@ static unsigned int ata_scsi_rw_xlat(struct ata_queued_cmd *qc)
 	qc->flags |= ATA_QCFLAG_IO;
 	qc->nbytes = n_block * scmd->device->sector_size;
 
+	/* If queue supports req prio pass it onto the task file */
+	if (blk_queue_rq_ioc_prio(rq->q))
+		class = IOPRIO_PRIO_CLASS(req_get_ioprio(rq));
+
 	rc = ata_build_rw_tf(&qc->tf, qc->dev, block, n_block, tf_flags,
-			     qc->tag);
+			     qc->tag, class);
+
 	if (likely(rc == 0))
 		return 0;
 
diff --git a/drivers/ata/libata.h b/drivers/ata/libata.h
index 3b301a4..8f3a559 100644
--- a/drivers/ata/libata.h
+++ b/drivers/ata/libata.h
@@ -66,7 +66,7 @@ extern u64 ata_tf_to_lba48(const struct ata_taskfile *tf);
 extern struct ata_queued_cmd *ata_qc_new_init(struct ata_device *dev, int tag);
 extern int ata_build_rw_tf(struct ata_taskfile *tf, struct ata_device *dev,
 			   u64 block, u32 n_block, unsigned int tf_flags,
-			   unsigned int tag);
+			   unsigned int tag, int class);
 extern u64 ata_tf_read_block(const struct ata_taskfile *tf,
 			     struct ata_device *dev);
 extern unsigned ata_exec_internal(struct ata_device *dev,
diff --git a/include/linux/ata.h b/include/linux/ata.h
index adbc812..ebe4c3b 100644
--- a/include/linux/ata.h
+++ b/include/linux/ata.h
@@ -347,6 +347,7 @@ enum {
 	ATA_LOG_DEVSLP_DETO	  = 0x01,
 	ATA_LOG_DEVSLP_VALID	  = 0x07,
 	ATA_LOG_DEVSLP_VALID_MASK = 0x80,
+	ATA_LOG_NCQ_PRIO_OFFSET   = 0x09,
 
 	/* NCQ send and receive log */
 	ATA_LOG_NCQ_SEND_RECV_SUBCMDS_OFFSET	= 0x00,
@@ -897,6 +898,11 @@ static inline bool ata_id_has_ncq_non_data(const u16 *id)
 	return id[ATA_ID_SATA_CAPABILITY_2] & BIT(5);
 }
 
+static inline bool ata_id_has_ncq_prio(const u16 *id)
+{
+	return id[ATA_ID_SATA_CAPABILITY] & BIT(12);
+}
+
 static inline bool ata_id_has_trim(const u16 *id)
 {
 	if (ata_id_major_version(id) >= 7 &&
diff --git a/include/linux/libata.h b/include/linux/libata.h
index e37d4f9..244f261 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -165,6 +165,7 @@ enum {
 	ATA_DFLAG_NO_UNLOAD	= (1 << 17), /* device doesn't support unload */
 	ATA_DFLAG_UNLOCK_HPA	= (1 << 18), /* unlock HPA */
 	ATA_DFLAG_NCQ_SEND_RECV = (1 << 19), /* device supports NCQ SEND and RECV */
+	ATA_DFLAG_NCQ_PRIO	= (1 << 20), /* device supports NCQ priority */
 	ATA_DFLAG_INIT_MASK	= (1 << 24) - 1,
 
 	ATA_DFLAG_DETACH	= (1 << 24),
@@ -341,7 +342,9 @@ enum {
 	ATA_SHIFT_PIO		= 0,
 	ATA_SHIFT_MWDMA		= ATA_SHIFT_PIO + ATA_NR_PIO_MODES,
 	ATA_SHIFT_UDMA		= ATA_SHIFT_MWDMA + ATA_NR_MWDMA_MODES,
+	ATA_SHIFT_PRIO		= 6,
 
+	ATA_PRIO_HIGH		= 2,
 	/* size of buffer to pad xfers ending on unaligned boundaries */
 	ATA_DMA_PAD_SZ		= 4,
 
@@ -1609,6 +1612,21 @@ static inline int ata_ncq_enabled(struct ata_device *dev)
 			      ATA_DFLAG_NCQ)) == ATA_DFLAG_NCQ;
 }
 
+/**
+ *	ata_ncq_prio_enabled - Test whether NCQ prio is enabled
+ *	@dev: ATA device to test for
+ *
+ *	LOCKING:
+ *	spin_lock_irqsave(host lock)
+ *
+ *	RETURNS:
+ *	1 if NCQ prio is enabled for @dev, 0 otherwise.
+ */
+static inline int ata_ncq_prio_enabled(struct ata_device *dev)
+{
+	return (dev->flags & ATA_DFLAG_NCQ_PRIO) == ATA_DFLAG_NCQ_PRIO;
+}
+
 static inline bool ata_fpdma_dsm_supported(struct ata_device *dev)
 {
 	return (dev->flags & ATA_DFLAG_NCQ_SEND_RECV) &&
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v3 1/2] block: Add iocontext priority to request
  2016-10-11 20:40 ` [PATCH v3 1/2] block: Add iocontext priority to request Adam Manzanares
@ 2016-10-12 14:49   ` Jens Axboe
  2016-10-12 17:58     ` Adam Manzanares
  0 siblings, 1 reply; 6+ messages in thread
From: Jens Axboe @ 2016-10-12 14:49 UTC (permalink / raw)
  To: Adam Manzanares, tj; +Cc: linux-block, linux-ide

On 10/11/2016 02:40 PM, Adam Manzanares wrote:
> Patch adds an association between iocontext ioprio and the ioprio of
> a request. This feature is only enabled if a queue flag is set to
> indicate that requests should have ioprio associated with them. The
> queue flag is exposed as the req_prio queue sysfs entry.

Honestly, I don't get this patchset. For the normal file system path, we
inherit the iocontext priority into the bio. That in turns gets
inherited by the request. Why is this any different?

It'd be much cleaner to just have 'rq' inherit the IO priority from the
io context when the 'rq' is allocated, and ensuring that we inherit and
override this if the bio has one set instead. It should already work
like that.

And in no way should we add some sysfs file to control this, that is
nuts.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v3 1/2] block: Add iocontext priority to request
  2016-10-12 14:49   ` Jens Axboe
@ 2016-10-12 17:58     ` Adam Manzanares
  2016-10-12 21:05       ` Jens Axboe
  0 siblings, 1 reply; 6+ messages in thread
From: Adam Manzanares @ 2016-10-12 17:58 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Adam Manzanares, tj, linux-block, linux-ide

The 10/12/2016 08:49, Jens Axboe wrote:
> On 10/11/2016 02:40 PM, Adam Manzanares wrote:
> >Patch adds an association between iocontext ioprio and the ioprio of
> >a request. This feature is only enabled if a queue flag is set to
> >indicate that requests should have ioprio associated with them. The
> >queue flag is exposed as the req_prio queue sysfs entry.
> 
> Honestly, I don't get this patchset. For the normal file system path, we
> inherit the iocontext priority into the bio. That in turns gets
> inherited by the request. Why is this any different?
>
I was hoping this was true before I started looking into this, but the 
iocontext priority is not set on the bio. I did look into setting the
iocontext priority on the bio, and this would have do be done close 
in the call stack to init_request_from_bio so I chose to set it here.

If I missed where this occurs I would appreciate it if you pointed me to the 
code where the bio gets the iopriority from the iocontext. 

> It'd be much cleaner to just have 'rq' inherit the IO priority from the
> io context when the 'rq' is allocated, and ensuring that we inherit and
> override this if the bio has one set instead. It should already work
> like that.

I looked into the request allocation code and the only place I see where 
the iocontext is associated with the request is through a icq. Looking at 
the documentation of the icq it states that the icq_size of the elevator 
has to be set in order for block core to manage this. The only scheduler 
using this currently is the cfq scheduler and I think prioritized requests 
should be independent of the scheduler used. 

I agree that this should make it into the code where the rq is allocated.

Again if I have missed something please point me to the relevant code and 
I will modify the patch as necessary.

> 
> And in no way should we add some sysfs file to control this, that is
> nuts.

My concern is that we will now be exposing priority information to lower 
layers and if there happens to be code that uses the priority now it will 
actually make an impact. My example being the fusion mptsas driver. 

The second issue I foresee is that lower level drivers will need a sysfs 
file to control whether or not we send prioritized commands to the device.
We are wary of sending prioritized commands by default when we are unsure
of how the device will respond to these prioritized commands.

The reason I propose a sysfs file in the queue is that it solves these two 
issues at the same time. 

I would appreciate it if you could suggest an alternative fix for these issues
or an explanation of why these concerns are not valid.

> 
> -- 
> Jens Axboe
> 

Take care,
Adam

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v3 1/2] block: Add iocontext priority to request
  2016-10-12 17:58     ` Adam Manzanares
@ 2016-10-12 21:05       ` Jens Axboe
  0 siblings, 0 replies; 6+ messages in thread
From: Jens Axboe @ 2016-10-12 21:05 UTC (permalink / raw)
  To: Adam Manzanares; +Cc: Adam Manzanares, tj, linux-block, linux-ide

On 10/12/2016 11:58 AM, Adam Manzanares wrote:
> The 10/12/2016 08:49, Jens Axboe wrote:
>> On 10/11/2016 02:40 PM, Adam Manzanares wrote:
>>> Patch adds an association between iocontext ioprio and the ioprio of
>>> a request. This feature is only enabled if a queue flag is set to
>>> indicate that requests should have ioprio associated with them. The
>>> queue flag is exposed as the req_prio queue sysfs entry.
>>
>> Honestly, I don't get this patchset. For the normal file system path, we
>> inherit the iocontext priority into the bio. That in turns gets
>> inherited by the request. Why is this any different?
>>
> I was hoping this was true before I started looking into this, but the
> iocontext priority is not set on the bio. I did look into setting the
> iocontext priority on the bio, and this would have do be done close
> in the call stack to init_request_from_bio so I chose to set it here.
>
> If I missed where this occurs I would appreciate it if you pointed me
> to the code where the bio gets the iopriority from the iocontext.

It currently happens in CFQ, since that is what deals with priorities.
But we should just make that the case, generically. The rule is that if
priority is set in a bio, that is what we'll use. But if not, we'll
use what the application has set in general, and that is available in
the io context.

>> It'd be much cleaner to just have 'rq' inherit the IO priority from the
>> io context when the 'rq' is allocated, and ensuring that we inherit and
>> override this if the bio has one set instead. It should already work
>> like that.
>
> I looked into the request allocation code and the only place I see where
> the iocontext is associated with the request is through a icq. Looking at
> the documentation of the icq it states that the icq_size of the elevator
> has to be set in order for block core to manage this. The only scheduler
> using this currently is the cfq scheduler and I think prioritized requests
> should be independent of the scheduler used.

Agree, see above, it should just happen generically.

>> And in no way should we add some sysfs file to control this, that is
>> nuts.
>
> My concern is that we will now be exposing priority information to lower
> layers and if there happens to be code that uses the priority now it will
> actually make an impact. My example being the fusion mptsas driver.
>
> The second issue I foresee is that lower level drivers will need a sysfs
> file to control whether or not we send prioritized commands to the device.
> We are wary of sending prioritized commands by default when we are unsure
> of how the device will respond to these prioritized commands.
>
> The reason I propose a sysfs file in the queue is that it solves these two
> issues at the same time.
>
> I would appreciate it if you could suggest an alternative fix for
> these issues or an explanation of why these concerns are not valid.

At that point it should be a driver knob, it's not something that should
be part of the generic block layer.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2016-10-12 21:05 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-10-11 20:40 [PATCH v3 0/2] Enabling ATA Command Priorities Adam Manzanares
2016-10-11 20:40 ` [PATCH v3 1/2] block: Add iocontext priority to request Adam Manzanares
2016-10-12 14:49   ` Jens Axboe
2016-10-12 17:58     ` Adam Manzanares
2016-10-12 21:05       ` Jens Axboe
2016-10-11 20:40 ` [PATCH v3 2/2] ata: Enabling ATA Command Priorities Adam Manzanares

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).