All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/6] Improve the performance of F2FS on zoned UFS
@ 2023-07-26  0:57 Bart Van Assche
  2023-07-26  0:57 ` [PATCH v3 1/6] block: Introduce the flag REQ_NO_WRITE_LOCK Bart Van Assche
                   ` (5 more replies)
  0 siblings, 6 replies; 17+ messages in thread
From: Bart Van Assche @ 2023-07-26  0:57 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Christoph Hellwig, Jaegeuk Kim, Bart Van Assche

Hi Jens,

This patch series improves write performance for zoned UFS devices. Please
consider these patches for the next merge window.

Thank you,

Bart.

Changes compared to v2:
 - Renamed the request queue flag for disabling zone write locking.
 - Introduced a new request flag for disabling zone write locking.
 - Modified the mq-deadline scheduler such that zone write locking is only
   disabled if both flags are set.
 - Added an F2FS patch that sets the request flag for disabling zone write
   locking.
 - Only disable zone write locking in the UFS driver if auto-hibernation is
   disabled.

Changes compared to v1:
 - Left out the patches that are already upstream.
 - Switched the approach in patch "scsi: Retry unaligned zoned writes" from
   retrying immediately to sending unaligned write commands to the SCSI error
   handler.

Bart Van Assche (5):
  block: Introduce a request queue flag for pipelining zoned writes
  block/mq-deadline: Only use zone locking if necessary
  block/null_blk: Add support for pipelining zoned writes
  scsi: Retry unaligned zoned writes
  scsi: ufs: Enable zoned write pipelining

Bart Van Assche (6):
  block: Introduce the flag REQ_NO_WRITE_LOCK
  block/mq-deadline: Only use zone locking if necessary
  block/null_blk: Support disabling zone write locking
  scsi: Retry unaligned zoned writes
  scsi: ufs: Disable zone write locking
  fs/f2fs: Disable zone write locking

 block/blk-flush.c                 |  3 ++-
 block/mq-deadline.c               | 28 ++++++++++++++-----
 drivers/block/null_blk/main.c     |  2 ++
 drivers/block/null_blk/null_blk.h |  1 +
 drivers/block/null_blk/zoned.c    |  3 +++
 drivers/scsi/scsi_error.c         | 37 +++++++++++++++++++++++++
 drivers/scsi/scsi_lib.c           |  1 +
 drivers/scsi/sd.c                 |  3 +++
 drivers/ufs/core/ufshcd.c         | 45 ++++++++++++++++++++++++++++---
 fs/f2fs/data.c                    |  1 +
 include/linux/blk_types.h         |  8 ++++++
 include/scsi/scsi.h               |  1 +
 12 files changed, 123 insertions(+), 10 deletions(-)


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v3 1/6] block: Introduce the flag REQ_NO_WRITE_LOCK
  2023-07-26  0:57 [PATCH v3 0/6] Improve the performance of F2FS on zoned UFS Bart Van Assche
@ 2023-07-26  0:57 ` Bart Van Assche
  2023-07-26  8:37   ` Damien Le Moal
  2023-07-26  0:57 ` [PATCH v3 2/6] block/mq-deadline: Only use zone locking if necessary Bart Van Assche
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Bart Van Assche @ 2023-07-26  0:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Jaegeuk Kim, Bart Van Assche,
	Damien Le Moal, Ming Lei

Not all software that supports zoned storage allocates and submits zoned
writes in LBA order per zone. Introduce the REQ_NO_WRITE_LOCK flag such
that submitters of zoned writes can indicate that zoned writes are
allocated and submitted in LBA order per zone.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/blk-flush.c         | 3 ++-
 include/linux/blk_types.h | 8 ++++++++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/block/blk-flush.c b/block/blk-flush.c
index e73dc22d05c1..038350ed7cce 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -336,7 +336,8 @@ static void blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq,
 		flush_rq->internal_tag = first_rq->internal_tag;
 
 	flush_rq->cmd_flags = REQ_OP_FLUSH | REQ_PREFLUSH;
-	flush_rq->cmd_flags |= (flags & REQ_DRV) | (flags & REQ_FAILFAST_MASK);
+	flush_rq->cmd_flags |= flags &
+		(REQ_FAILFAST_MASK | REQ_NO_ZONE_WRITE_LOCK | REQ_DRV);
 	flush_rq->rq_flags |= RQF_FLUSH_SEQ;
 	flush_rq->end_io = flush_end_io;
 	/*
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 0bad62cca3d0..68f7934d8fa6 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -416,6 +416,12 @@ enum req_flag_bits {
 	__REQ_PREFLUSH,		/* request for cache flush */
 	__REQ_RAHEAD,		/* read ahead, can fail anytime */
 	__REQ_BACKGROUND,	/* background IO */
+	/*
+	 * Do not use zone write locking. Setting this flag is only safe if
+	 * the request submitter allocates and submit requests in LBA order per
+	 * zone.
+	 */
+	__REQ_NO_ZONE_WRITE_LOCK,
 	__REQ_NOWAIT,           /* Don't wait if request will block */
 	__REQ_POLLED,		/* caller polls for completion using bio_poll */
 	__REQ_ALLOC_CACHE,	/* allocate IO from cache if available */
@@ -448,6 +454,8 @@ enum req_flag_bits {
 #define REQ_PREFLUSH	(__force blk_opf_t)(1ULL << __REQ_PREFLUSH)
 #define REQ_RAHEAD	(__force blk_opf_t)(1ULL << __REQ_RAHEAD)
 #define REQ_BACKGROUND	(__force blk_opf_t)(1ULL << __REQ_BACKGROUND)
+#define REQ_NO_ZONE_WRITE_LOCK	\
+			(__force blk_opf_t)(1ULL << __REQ_NO_ZONE_WRITE_LOCK)
 #define REQ_NOWAIT	(__force blk_opf_t)(1ULL << __REQ_NOWAIT)
 #define REQ_POLLED	(__force blk_opf_t)(1ULL << __REQ_POLLED)
 #define REQ_ALLOC_CACHE	(__force blk_opf_t)(1ULL << __REQ_ALLOC_CACHE)

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 2/6] block/mq-deadline: Only use zone locking if necessary
  2023-07-26  0:57 [PATCH v3 0/6] Improve the performance of F2FS on zoned UFS Bart Van Assche
  2023-07-26  0:57 ` [PATCH v3 1/6] block: Introduce the flag REQ_NO_WRITE_LOCK Bart Van Assche
@ 2023-07-26  0:57 ` Bart Van Assche
  2023-07-26  8:41   ` Damien Le Moal
  2023-07-26 10:32   ` kernel test robot
  2023-07-26  0:57 ` [PATCH v3 3/6] block/null_blk: Support disabling zone write locking Bart Van Assche
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 17+ messages in thread
From: Bart Van Assche @ 2023-07-26  0:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Jaegeuk Kim, Bart Van Assche,
	Damien Le Moal, Ming Lei

Measurements have shown that limiting the queue depth to one for zoned
writes has a significant negative performance impact on zoned UFS devices.
Hence this patch that disables zone locking by the mq-deadline scheduler
if zoned writes are submitted in order and if the storage controller
preserves the command order. This patch is based on the following
assumptions:
- It happens infrequently that zoned write requests are reordered by the
  block layer.
- The I/O priority of all write requests is the same per zone.
- Either no I/O scheduler is used or an I/O scheduler is used that
  submits write requests per zone in LBA order.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/mq-deadline.c | 28 ++++++++++++++++++++++------
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index 02a916ba62ee..ce5b5048935e 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -338,6 +338,18 @@ static struct request *deadline_skip_seq_writes(struct deadline_data *dd,
 	return rq;
 }
 
+/*
+ * Use write locking if either QUEUE_FLAG_NO_ZONE_WRITE_LOCK or
+ * REQ_NO_ZONE_WRITE_LOCK has not been set. Not using zone write locking is
+ * only safe if the submitter allocates and submit requests in LBA order per
+ * zone and if the block driver preserves the request order.
+ */
+static bool dd_use_write_locking(struct request *rq)
+{
+	return !blk_queue_no_zone_write_lock(rq->q) ||
+		!(rq->cmd_flags & REQ_NO_ZONE_WRITE_LOCK);
+}
+
 /*
  * For the specified data direction, return the next request to
  * dispatch using arrival ordered lists.
@@ -353,7 +365,8 @@ deadline_fifo_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
 		return NULL;
 
 	rq = rq_entry_fifo(per_prio->fifo_list[data_dir].next);
-	if (data_dir == DD_READ || !blk_queue_is_zoned(rq->q))
+	if (data_dir == DD_READ || !blk_queue_is_zoned(rq->q) ||
+	    !dd_use_write_locking(rq))
 		return rq;
 
 	/*
@@ -398,7 +411,8 @@ deadline_next_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
 	if (!rq)
 		return NULL;
 
-	if (data_dir == DD_READ || !blk_queue_is_zoned(rq->q))
+	if (data_dir == DD_READ || !blk_queue_is_zoned(rq->q) ||
+	    !dd_use_write_locking(rq))
 		return rq;
 
 	/*
@@ -526,8 +540,9 @@ static struct request *__dd_dispatch_request(struct deadline_data *dd,
 	}
 
 	/*
-	 * For a zoned block device, if we only have writes queued and none of
-	 * them can be dispatched, rq will be NULL.
+	 * For a zoned block device that requires write serialization, if we
+	 * only have writes queued and none of them can be dispatched, rq will
+	 * be NULL.
 	 */
 	if (!rq)
 		return NULL;
@@ -552,7 +567,8 @@ static struct request *__dd_dispatch_request(struct deadline_data *dd,
 	/*
 	 * If the request needs its target zone locked, do it.
 	 */
-	blk_req_zone_write_lock(rq);
+	if (dd_use_write_locking(rq))
+		blk_req_zone_write_lock(rq);
 	rq->rq_flags |= RQF_STARTED;
 	return rq;
 }
@@ -933,7 +949,7 @@ static void dd_finish_request(struct request *rq)
 
 	atomic_inc(&per_prio->stats.completed);
 
-	if (blk_queue_is_zoned(q)) {
+	if (blk_queue_is_zoned(rq->q) && dd_use_write_locking(rq)) {
 		unsigned long flags;
 
 		spin_lock_irqsave(&dd->zone_lock, flags);

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 3/6] block/null_blk: Support disabling zone write locking
  2023-07-26  0:57 [PATCH v3 0/6] Improve the performance of F2FS on zoned UFS Bart Van Assche
  2023-07-26  0:57 ` [PATCH v3 1/6] block: Introduce the flag REQ_NO_WRITE_LOCK Bart Van Assche
  2023-07-26  0:57 ` [PATCH v3 2/6] block/mq-deadline: Only use zone locking if necessary Bart Van Assche
@ 2023-07-26  0:57 ` Bart Van Assche
  2023-07-26  8:43   ` Damien Le Moal
  2023-07-26  0:57 ` [PATCH v3 4/6] scsi: Retry unaligned zoned writes Bart Van Assche
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Bart Van Assche @ 2023-07-26  0:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Jaegeuk Kim, Bart Van Assche,
	Damien Le Moal, Ming Lei, Chaitanya Kulkarni, Damien Le Moal,
	Nitesh Shetty, Vincent Fu, Dan Carpenter, Akinobu Mita,
	Shin'ichiro Kawasaki, Martin K. Petersen, Johannes Thumshirn

Add a new configfs attribute for disabling zone write locking. The test
script below reports 250 K IOPS with no I/O scheduler, 6 K IOPS with
mq-deadline and write locking enabled and 123 K IOPS with mq-deadline
and write locking disabled. This shows that disabling write locking
results in about 20 times more IOPS for this particular test case.

    #!/bin/bash

    for mode in "none 0" "mq-deadline 0" "mq-deadline 1"; do
        set +e
        for d in /sys/kernel/config/nullb/*; do
            [ -d "$d" ] && rmdir "$d"
        done
        modprobe -r null_blk
        set -e
        read -r iosched no_write_locking <<<"$mode"
        modprobe null_blk nr_devices=0
        (
            cd /sys/kernel/config/nullb
            mkdir nullb0
            cd nullb0
            params=(
                completion_nsec=100000
                hw_queue_depth=64
                irqmode=2                # NULL_IRQ_TIMER
                max_sectors=$((4096/512))
                memory_backed=1
                no_zone_write_lock="${no_write_locking}"
                size=1
                submit_queues=1
                zone_size=1
                zoned=1
                power=1
            )
            for p in "${params[@]}"; do
                echo "${p//*=}" > "${p//=*}"
            done
        )
        udevadm settle
        dev=/dev/nullb0
        [ -b "${dev}" ]
        params=(
            --direct=1
            --filename="${dev}"
            --iodepth=64
            --iodepth_batch=16
            --ioengine=io_uring
            --ioscheduler="${iosched}"
            --gtod_reduce=1
            --hipri=0
            --name=nullb0
            --runtime=30
            --rw=write
            --time_based=1
            --zonemode=zbd
        )
        fio "${params[@]}"
    done

Cc: Christoph Hellwig <hch@lst.de>
Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/block/null_blk/main.c     | 2 ++
 drivers/block/null_blk/null_blk.h | 1 +
 drivers/block/null_blk/zoned.c    | 3 +++
 3 files changed, 6 insertions(+)

diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c
index 864013019d6b..5c0578137f51 100644
--- a/drivers/block/null_blk/main.c
+++ b/drivers/block/null_blk/main.c
@@ -424,6 +424,7 @@ NULLB_DEVICE_ATTR(zone_capacity, ulong, NULL);
 NULLB_DEVICE_ATTR(zone_nr_conv, uint, NULL);
 NULLB_DEVICE_ATTR(zone_max_open, uint, NULL);
 NULLB_DEVICE_ATTR(zone_max_active, uint, NULL);
+NULLB_DEVICE_ATTR(no_zone_write_lock, bool, NULL);
 NULLB_DEVICE_ATTR(virt_boundary, bool, NULL);
 NULLB_DEVICE_ATTR(no_sched, bool, NULL);
 NULLB_DEVICE_ATTR(shared_tag_bitmap, bool, NULL);
@@ -569,6 +570,7 @@ static struct configfs_attribute *nullb_device_attrs[] = {
 	&nullb_device_attr_zone_max_active,
 	&nullb_device_attr_zone_readonly,
 	&nullb_device_attr_zone_offline,
+	&nullb_device_attr_no_zone_write_lock,
 	&nullb_device_attr_virt_boundary,
 	&nullb_device_attr_no_sched,
 	&nullb_device_attr_shared_tag_bitmap,
diff --git a/drivers/block/null_blk/null_blk.h b/drivers/block/null_blk/null_blk.h
index 929f659dd255..b521096bcc3f 100644
--- a/drivers/block/null_blk/null_blk.h
+++ b/drivers/block/null_blk/null_blk.h
@@ -117,6 +117,7 @@ struct nullb_device {
 	bool memory_backed; /* if data is stored in memory */
 	bool discard; /* if support discard */
 	bool zoned; /* if device is zoned */
+	bool no_zone_write_lock;
 	bool virt_boundary; /* virtual boundary on/off for the device */
 	bool no_sched; /* no IO scheduler for the device */
 	bool shared_tag_bitmap; /* use hostwide shared tags */
diff --git a/drivers/block/null_blk/zoned.c b/drivers/block/null_blk/zoned.c
index 55c5b48bc276..31c8364a63e9 100644
--- a/drivers/block/null_blk/zoned.c
+++ b/drivers/block/null_blk/zoned.c
@@ -96,6 +96,9 @@ int null_init_zoned_dev(struct nullb_device *dev, struct request_queue *q)
 
 	spin_lock_init(&dev->zone_res_lock);
 
+	if (dev->no_zone_write_lock)
+		blk_queue_flag_set(QUEUE_FLAG_NO_ZONE_WRITE_LOCK, q);
+
 	if (dev->zone_nr_conv >= dev->nr_zones) {
 		dev->zone_nr_conv = dev->nr_zones - 1;
 		pr_info("changed the number of conventional zones to %u",

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 4/6] scsi: Retry unaligned zoned writes
  2023-07-26  0:57 [PATCH v3 0/6] Improve the performance of F2FS on zoned UFS Bart Van Assche
                   ` (2 preceding siblings ...)
  2023-07-26  0:57 ` [PATCH v3 3/6] block/null_blk: Support disabling zone write locking Bart Van Assche
@ 2023-07-26  0:57 ` Bart Van Assche
  2023-07-26  8:47   ` Damien Le Moal
  2023-07-26 11:23   ` kernel test robot
  2023-07-26  0:57 ` [PATCH v3 5/6] scsi: ufs: Disable zone write locking Bart Van Assche
  2023-07-26  0:57 ` [PATCH v3 6/6] fs/f2fs: " Bart Van Assche
  5 siblings, 2 replies; 17+ messages in thread
From: Bart Van Assche @ 2023-07-26  0:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Jaegeuk Kim, Bart Van Assche,
	Martin K . Petersen, Damien Le Moal, Ming Lei,
	James E.J. Bottomley

From ZBC-2: "The device server terminates with CHECK CONDITION status, with
the sense key set to ILLEGAL REQUEST, and the additional sense code set to
UNALIGNED WRITE COMMAND a write command, other than an entire medium write
same command, that specifies: a) the starting LBA in a sequential write
required zone set to a value that is not equal to the write pointer for
that sequential write required zone; or b) an ending LBA that is not equal
to the last logical block within a physical block (see SBC-5)."

Send commands that failed with an unaligned write error to the SCSI error
handler. Let the SCSI error handler sort SCSI commands per LBA before
resubmitting these.

Increase the number of retries for write commands sent to a sequential
zone to the maximum number of outstanding commands.

Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/scsi_error.c | 37 +++++++++++++++++++++++++++++++++++++
 drivers/scsi/scsi_lib.c   |  1 +
 drivers/scsi/sd.c         |  3 +++
 include/scsi/scsi.h       |  1 +
 4 files changed, 42 insertions(+)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index c67cdcdc3ba8..2b9aec05dc36 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -27,6 +27,7 @@
 #include <linux/blkdev.h>
 #include <linux/delay.h>
 #include <linux/jiffies.h>
+#include <linux/list_sort.h>
 
 #include <scsi/scsi.h>
 #include <scsi/scsi_cmnd.h>
@@ -698,6 +699,17 @@ enum scsi_disposition scsi_check_sense(struct scsi_cmnd *scmd)
 		fallthrough;
 
 	case ILLEGAL_REQUEST:
+		/*
+		 * Unaligned write command. This indicates that zoned writes
+		 * have been received by the device in the wrong order. If zone
+		 * write locking is disabled, retry after all pending commands
+		 * have completed.
+		 */
+		if (sshdr.asc == 0x21 && sshdr.ascq == 0x04 &&
+		    blk_queue_no_zone_write_lock(sdev->request_queue) &&
+		    !scsi_noretry_cmd(scmd) && scsi_cmd_retry_allowed(scmd))
+			return NEEDS_DELAYED_RETRY;
+
 		if (sshdr.asc == 0x20 || /* Invalid command operation code */
 		    sshdr.asc == 0x21 || /* Logical block address out of range */
 		    sshdr.asc == 0x22 || /* Invalid function */
@@ -2223,6 +2235,25 @@ void scsi_eh_flush_done_q(struct list_head *done_q)
 }
 EXPORT_SYMBOL(scsi_eh_flush_done_q);
 
+/*
+ * Returns a negative value if @_a has a lower LBA than @_b, zero if
+ * both have the same LBA and a positive value otherwise.
+ */
+static int scsi_cmp_lba(void *priv, const struct list_head *_a,
+			const struct list_head *_b)
+{
+	struct scsi_cmnd *a = list_entry(_a, typeof(*a), eh_entry);
+	struct scsi_cmnd *b = list_entry(_b, typeof(*b), eh_entry);
+	const sector_t pos_a = blk_rq_pos(scsi_cmd_to_rq(a));
+	const sector_t pos_b = blk_rq_pos(scsi_cmd_to_rq(b));
+
+	if (pos_a < pos_b)
+		return -1;
+	if (pos_a > pos_b)
+		return 1;
+	return 0;
+}
+
 /**
  * scsi_unjam_host - Attempt to fix a host which has a cmd that failed.
  * @shost:	Host to unjam.
@@ -2258,6 +2289,12 @@ static void scsi_unjam_host(struct Scsi_Host *shost)
 
 	SCSI_LOG_ERROR_RECOVERY(1, scsi_eh_prt_fail_stats(shost, &eh_work_q));
 
+	/*
+	 * Sort pending SCSI commands in LBA order. This is important if zone
+	 * write locking is disabled for a zoned SCSI device.
+	 */
+	list_sort(NULL, &eh_work_q, scsi_cmp_lba);
+
 	if (!scsi_eh_get_sense(&eh_work_q, &eh_done_q))
 		scsi_eh_ready_devs(shost, &eh_work_q, &eh_done_q);
 
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 59176946ab56..69da8aee13df 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1443,6 +1443,7 @@ static void scsi_complete(struct request *rq)
 	case ADD_TO_MLQUEUE:
 		scsi_queue_insert(cmd, SCSI_MLQUEUE_DEVICE_BUSY);
 		break;
+	case NEEDS_DELAYED_RETRY:
 	default:
 		scsi_eh_scmd_add(cmd);
 		break;
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 68b12afa0721..27b9ebe05b90 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1235,6 +1235,9 @@ static blk_status_t sd_setup_read_write_cmnd(struct scsi_cmnd *cmd)
 	cmd->transfersize = sdp->sector_size;
 	cmd->underflow = nr_blocks << 9;
 	cmd->allowed = sdkp->max_retries;
+	if (blk_queue_no_zone_write_lock(rq->q) &&
+	    blk_rq_is_seq_zoned_write(rq))
+		cmd->allowed += rq->q->nr_requests;
 	cmd->sdb.length = nr_blocks * sdp->sector_size;
 
 	SCSI_LOG_HLQUEUE(1,
diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h
index ec093594ba53..6600db046227 100644
--- a/include/scsi/scsi.h
+++ b/include/scsi/scsi.h
@@ -93,6 +93,7 @@ static inline int scsi_status_is_check_condition(int status)
  * Internal return values.
  */
 enum scsi_disposition {
+	NEEDS_DELAYED_RETRY	= 0x2000,
 	NEEDS_RETRY		= 0x2001,
 	SUCCESS			= 0x2002,
 	FAILED			= 0x2003,

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 5/6] scsi: ufs: Disable zone write locking
  2023-07-26  0:57 [PATCH v3 0/6] Improve the performance of F2FS on zoned UFS Bart Van Assche
                   ` (3 preceding siblings ...)
  2023-07-26  0:57 ` [PATCH v3 4/6] scsi: Retry unaligned zoned writes Bart Van Assche
@ 2023-07-26  0:57 ` Bart Van Assche
  2023-07-26 12:04   ` kernel test robot
  2023-07-26  0:57 ` [PATCH v3 6/6] fs/f2fs: " Bart Van Assche
  5 siblings, 1 reply; 17+ messages in thread
From: Bart Van Assche @ 2023-07-26  0:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Jaegeuk Kim, Bart Van Assche,
	Martin K . Petersen, Avri Altman, Damien Le Moal, Ming Lei,
	James E.J. Bottomley, Stanley Chu, Can Guo, Asutosh Das,
	Bao D. Nguyen, Bean Huo, Arthur Simchaev

From the UFSHCI 4.0 specification, about the legacy (single queue) mode:
"The host controller always process transfer requests in-order according
to the order submitted to the list. In case of multiple commands with
single doorbell register ringing (batch mode), The dispatch order for
these transfer requests by host controller will base on their index in
the List. A transfer request with lower index value will be executed
before a transfer request with higher index value."

From the UFSHCI 4.0 specification, about the MCQ mode:
"Command Submission
1. Host SW writes an Entry to SQ
2. Host SW updates SQ doorbell tail pointer

Command Processing
3. After fetching the Entry, Host Controller updates SQ doorbell head
   pointer
4. Host controller sends COMMAND UPIU to UFS device"

In other words, for both legacy and MCQ mode, UFS controllers are
required to forward commands to the UFS device in the order these
commands have been received from the host.

Notes:
- For legacy mode this is only correct if the host submits one
  command at a time. The UFS driver does this.
- Also in legacy mode, the command order is not preserved if
  auto-hibernation is enabled in the UFS controller.

This patch improves small write IOPS with a factor four on my test
setup.

Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Avri Altman <avri.altman@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/ufs/core/ufshcd.c | 45 ++++++++++++++++++++++++++++++++++++---
 1 file changed, 42 insertions(+), 3 deletions(-)

diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index 129446775796..0f7f91e2cda9 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -4337,29 +4337,67 @@ int ufshcd_uic_hibern8_exit(struct ufs_hba *hba)
 }
 EXPORT_SYMBOL_GPL(ufshcd_uic_hibern8_exit);
 
+void ufshcd_update_no_zone_write_lock(struct ufs_hba *hba,
+				      bool set_no_zone_write_lock)
+{
+	struct scsi_device *sdev;
+
+	shost_for_each_device(sdev, hba->host)
+		blk_freeze_queue_start(sdev->request_queue);
+	shost_for_each_device(sdev, hba->host) {
+		struct request_queue *q = sdev->request_queue;
+
+		blk_mq_freeze_queue_wait(q);
+		if (set_no_zone_write_lock)
+			blk_queue_flag_set(QUEUE_FLAG_NO_ZONE_WRITE_LOCK, q);
+		else
+			blk_queue_flag_clear(QUEUE_FLAG_NO_ZONE_WRITE_LOCK, q);
+		blk_mq_unfreeze_queue(q);
+	}
+}
+
 void ufshcd_auto_hibern8_update(struct ufs_hba *hba, u32 ahit)
 {
 	unsigned long flags;
-	bool update = false;
+	bool prev_state, new_state, update = false;
 
 	if (!ufshcd_is_auto_hibern8_supported(hba))
 		return;
 
 	spin_lock_irqsave(hba->host->host_lock, flags);
+	prev_state = ufshcd_is_auto_hibern8_enabled(hba);
 	if (hba->ahit != ahit) {
 		hba->ahit = ahit;
 		update = true;
 	}
+	new_state = ufshcd_is_auto_hibern8_enabled(hba);
 	spin_unlock_irqrestore(hba->host->host_lock, flags);
 
-	if (update &&
-	    !pm_runtime_suspended(&hba->ufs_device_wlun->sdev_gendev)) {
+	if (!update)
+		return;
+	if (!is_mcq_enabled(hba) && !prev_state && new_state) {
+		/*
+		 * Auto-hibernation will be enabled. Enable write locking for
+		 * zoned writes since auto-hibernation may cause reordering of
+		 * zoned writes when using the legacy mode of the UFS host
+		 * controller.
+		 */
+		ufshcd_update_no_zone_write_lock(hba, false);
+	}
+	if (!pm_runtime_suspended(&hba->ufs_device_wlun->sdev_gendev)) {
 		ufshcd_rpm_get_sync(hba);
 		ufshcd_hold(hba);
 		ufshcd_auto_hibern8_enable(hba);
 		ufshcd_release(hba);
 		ufshcd_rpm_put_sync(hba);
 	}
+	if (!is_mcq_enabled(hba) && prev_state && !new_state) {
+		/*
+		 * Auto-hibernation has been disabled. Disable write locking
+		 * for zoned writes.
+		 */
+		ufshcd_update_no_zone_write_lock(hba, true);
+	}
 }
 EXPORT_SYMBOL_GPL(ufshcd_auto_hibern8_update);
 
@@ -5139,6 +5177,7 @@ static int ufshcd_slave_configure(struct scsi_device *sdev)
 
 	ufshcd_hpb_configure(hba, sdev);
 
+	blk_queue_flag_set(QUEUE_FLAG_NO_ZONE_WRITE_LOCK, q);
 	blk_queue_update_dma_pad(q, PRDT_DATA_BYTE_COUNT_PAD - 1);
 	if (hba->quirks & UFSHCD_QUIRK_4KB_DMA_ALIGNMENT)
 		blk_queue_update_dma_alignment(q, SZ_4K - 1);

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 6/6] fs/f2fs: Disable zone write locking
  2023-07-26  0:57 [PATCH v3 0/6] Improve the performance of F2FS on zoned UFS Bart Van Assche
                   ` (4 preceding siblings ...)
  2023-07-26  0:57 ` [PATCH v3 5/6] scsi: ufs: Disable zone write locking Bart Van Assche
@ 2023-07-26  0:57 ` Bart Van Assche
  5 siblings, 0 replies; 17+ messages in thread
From: Bart Van Assche @ 2023-07-26  0:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Jaegeuk Kim, Bart Van Assche,
	Damien Le Moal, Ming Lei, Chao Yu

Set the REQ_NO_ZONE_WRITE_LOCK flag to inform the block layer that F2FS
allocates and submits zoned writes in LBA order per zone.

Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 fs/f2fs/data.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 5882afe71d82..6361553f4ab1 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -569,6 +569,7 @@ static void f2fs_submit_write_bio(struct f2fs_sb_info *sbi, struct bio *bio,
 		}
 	}
 
+	bio->bi_opf |= REQ_NO_ZONE_WRITE_LOCK;
 	trace_f2fs_submit_write_bio(sbi->sb, type, bio);
 	iostat_update_submit_ctx(bio, type);
 	submit_bio(bio);

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 1/6] block: Introduce the flag REQ_NO_WRITE_LOCK
  2023-07-26  0:57 ` [PATCH v3 1/6] block: Introduce the flag REQ_NO_WRITE_LOCK Bart Van Assche
@ 2023-07-26  8:37   ` Damien Le Moal
  2023-07-26 14:58     ` Bart Van Assche
  0 siblings, 1 reply; 17+ messages in thread
From: Damien Le Moal @ 2023-07-26  8:37 UTC (permalink / raw)
  To: Bart Van Assche, Jens Axboe
  Cc: linux-block, Christoph Hellwig, Jaegeuk Kim, Ming Lei

On 7/26/23 09:57, Bart Van Assche wrote:
> Not all software that supports zoned storage allocates and submits zoned
> writes in LBA order per zone. Introduce the REQ_NO_WRITE_LOCK flag such
> that submitters of zoned writes can indicate that zoned writes are
> allocated and submitted in LBA order per zone.

That is absolutely NOT true. *ALL* in-kernel and application software supporting
zoned block devices issue writes sequentially. Otherwise, they are considered
buggy. The justification for zone write locking is to preserve sequential write
streams in the presence of storage adapters that do not preserve command orders,
or if the low level driver requeue write commands. If both of these conditions
are false (e.g. UFS), with the help of proper scsi requeue handling, zone write
locking can be disabled. REQ_NO_WRITE_LOCK indicates that a write request does
not need zone write locking when the underlying device signaled that it can
preserve command ordering. So we also need a queue flag for that...

> 
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
> Cc: Ming Lei <ming.lei@redhat.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  block/blk-flush.c         | 3 ++-
>  include/linux/blk_types.h | 8 ++++++++
>  2 files changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/block/blk-flush.c b/block/blk-flush.c
> index e73dc22d05c1..038350ed7cce 100644
> --- a/block/blk-flush.c
> +++ b/block/blk-flush.c
> @@ -336,7 +336,8 @@ static void blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq,
>  		flush_rq->internal_tag = first_rq->internal_tag;
>  
>  	flush_rq->cmd_flags = REQ_OP_FLUSH | REQ_PREFLUSH;
> -	flush_rq->cmd_flags |= (flags & REQ_DRV) | (flags & REQ_FAILFAST_MASK);
> +	flush_rq->cmd_flags |= flags &
> +		(REQ_FAILFAST_MASK | REQ_NO_ZONE_WRITE_LOCK | REQ_DRV);

Why ? I do not see why this change is necessary. If it is, the commit message
should mention it and this change should probably be its own patch.

>  	flush_rq->rq_flags |= RQF_FLUSH_SEQ;
>  	flush_rq->end_io = flush_end_io;
>  	/*
> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
> index 0bad62cca3d0..68f7934d8fa6 100644
> --- a/include/linux/blk_types.h
> +++ b/include/linux/blk_types.h
> @@ -416,6 +416,12 @@ enum req_flag_bits {
>  	__REQ_PREFLUSH,		/* request for cache flush */
>  	__REQ_RAHEAD,		/* read ahead, can fail anytime */
>  	__REQ_BACKGROUND,	/* background IO */
> +	/*
> +	 * Do not use zone write locking. Setting this flag is only safe if
> +	 * the request submitter allocates and submit requests in LBA order per
> +	 * zone.

...if the request submitter submits write requests sequentially per zone.

> +	 */
> +	__REQ_NO_ZONE_WRITE_LOCK,
>  	__REQ_NOWAIT,           /* Don't wait if request will block */
>  	__REQ_POLLED,		/* caller polls for completion using bio_poll */
>  	__REQ_ALLOC_CACHE,	/* allocate IO from cache if available */
> @@ -448,6 +454,8 @@ enum req_flag_bits {
>  #define REQ_PREFLUSH	(__force blk_opf_t)(1ULL << __REQ_PREFLUSH)
>  #define REQ_RAHEAD	(__force blk_opf_t)(1ULL << __REQ_RAHEAD)
>  #define REQ_BACKGROUND	(__force blk_opf_t)(1ULL << __REQ_BACKGROUND)
> +#define REQ_NO_ZONE_WRITE_LOCK	\
> +			(__force blk_opf_t)(1ULL << __REQ_NO_ZONE_WRITE_LOCK)
>  #define REQ_NOWAIT	(__force blk_opf_t)(1ULL << __REQ_NOWAIT)
>  #define REQ_POLLED	(__force blk_opf_t)(1ULL << __REQ_POLLED)
>  #define REQ_ALLOC_CACHE	(__force blk_opf_t)(1ULL << __REQ_ALLOC_CACHE)

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 2/6] block/mq-deadline: Only use zone locking if necessary
  2023-07-26  0:57 ` [PATCH v3 2/6] block/mq-deadline: Only use zone locking if necessary Bart Van Assche
@ 2023-07-26  8:41   ` Damien Le Moal
  2023-07-26 10:32   ` kernel test robot
  1 sibling, 0 replies; 17+ messages in thread
From: Damien Le Moal @ 2023-07-26  8:41 UTC (permalink / raw)
  To: Bart Van Assche, Jens Axboe
  Cc: linux-block, Christoph Hellwig, Jaegeuk Kim, Ming Lei

On 7/26/23 09:57, Bart Van Assche wrote:
> Measurements have shown that limiting the queue depth to one for zoned
> writes has a significant negative performance impact on zoned UFS devices.
> Hence this patch that disables zone locking by the mq-deadline scheduler
> if zoned writes are submitted in order and if the storage controller
> preserves the command order. This patch is based on the following
> assumptions:

if zoned writes are submitted in order and if the storage... -> if the storage
controller preserves...

> - It happens infrequently that zoned write requests are reordered by the
>   block layer.
> - The I/O priority of all write requests is the same per zone.
> - Either no I/O scheduler is used or an I/O scheduler is used that
>   submits write requests per zone in LBA order.
> 
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>

Please use dlemoal@kernel.org

> Cc: Ming Lei <ming.lei@redhat.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  block/mq-deadline.c | 28 ++++++++++++++++++++++------
>  1 file changed, 22 insertions(+), 6 deletions(-)
> 
> diff --git a/block/mq-deadline.c b/block/mq-deadline.c
> index 02a916ba62ee..ce5b5048935e 100644
> --- a/block/mq-deadline.c
> +++ b/block/mq-deadline.c
> @@ -338,6 +338,18 @@ static struct request *deadline_skip_seq_writes(struct deadline_data *dd,
>  	return rq;
>  }
>  
> +/*
> + * Use write locking if either QUEUE_FLAG_NO_ZONE_WRITE_LOCK or
> + * REQ_NO_ZONE_WRITE_LOCK has not been set. Not using zone write locking is
> + * only safe if the submitter allocates and submit requests in LBA order per
> + * zone and if the block driver preserves the request order.
> + */
> +static bool dd_use_write_locking(struct request *rq)
> +{
> +	return !blk_queue_no_zone_write_lock(rq->q) ||
> +		!(rq->cmd_flags & REQ_NO_ZONE_WRITE_LOCK);
> +}
> +
>  /*
>   * For the specified data direction, return the next request to
>   * dispatch using arrival ordered lists.
> @@ -353,7 +365,8 @@ deadline_fifo_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
>  		return NULL;
>  
>  	rq = rq_entry_fifo(per_prio->fifo_list[data_dir].next);
> -	if (data_dir == DD_READ || !blk_queue_is_zoned(rq->q))
> +	if (data_dir == DD_READ || !blk_queue_is_zoned(rq->q) ||
> +	    !dd_use_write_locking(rq))

The test for !blk_queue_is_zoned(rq->q) can be moved inside
dd_use_write_locking() I think. That will avoid repeating it again below.

>  		return rq;
>  
>  	/*
> @@ -398,7 +411,8 @@ deadline_next_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
>  	if (!rq)
>  		return NULL;
>  
> -	if (data_dir == DD_READ || !blk_queue_is_zoned(rq->q))
> +	if (data_dir == DD_READ || !blk_queue_is_zoned(rq->q) ||
> +	    !dd_use_write_locking(rq))
>  		return rq;
>  
>  	/*
> @@ -526,8 +540,9 @@ static struct request *__dd_dispatch_request(struct deadline_data *dd,
>  	}
>  
>  	/*
> -	 * For a zoned block device, if we only have writes queued and none of
> -	 * them can be dispatched, rq will be NULL.
> +	 * For a zoned block device that requires write serialization, if we
> +	 * only have writes queued and none of them can be dispatched, rq will
> +	 * be NULL.
>  	 */
>  	if (!rq)
>  		return NULL;
> @@ -552,7 +567,8 @@ static struct request *__dd_dispatch_request(struct deadline_data *dd,
>  	/*
>  	 * If the request needs its target zone locked, do it.
>  	 */
> -	blk_req_zone_write_lock(rq);
> +	if (dd_use_write_locking(rq))
> +		blk_req_zone_write_lock(rq);
>  	rq->rq_flags |= RQF_STARTED;
>  	return rq;
>  }
> @@ -933,7 +949,7 @@ static void dd_finish_request(struct request *rq)
>  
>  	atomic_inc(&per_prio->stats.completed);
>  
> -	if (blk_queue_is_zoned(q)) {
> +	if (blk_queue_is_zoned(rq->q) && dd_use_write_locking(rq)) {
>  		unsigned long flags;
>  
>  		spin_lock_irqsave(&dd->zone_lock, flags);

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 3/6] block/null_blk: Support disabling zone write locking
  2023-07-26  0:57 ` [PATCH v3 3/6] block/null_blk: Support disabling zone write locking Bart Van Assche
@ 2023-07-26  8:43   ` Damien Le Moal
  0 siblings, 0 replies; 17+ messages in thread
From: Damien Le Moal @ 2023-07-26  8:43 UTC (permalink / raw)
  To: Bart Van Assche, Jens Axboe
  Cc: linux-block, Christoph Hellwig, Jaegeuk Kim, Damien Le Moal,
	Ming Lei, Chaitanya Kulkarni, Nitesh Shetty, Vincent Fu,
	Dan Carpenter, Akinobu Mita, Shin'ichiro Kawasaki,
	Martin K. Petersen, Johannes Thumshirn

On 7/26/23 09:57, Bart Van Assche wrote:
> Add a new configfs attribute for disabling zone write locking. The test
> script below reports 250 K IOPS with no I/O scheduler, 6 K IOPS with
> mq-deadline and write locking enabled and 123 K IOPS with mq-deadline
> and write locking disabled. This shows that disabling write locking
> results in about 20 times more IOPS for this particular test case.
> 
>     #!/bin/bash
> 
>     for mode in "none 0" "mq-deadline 0" "mq-deadline 1"; do
>         set +e
>         for d in /sys/kernel/config/nullb/*; do
>             [ -d "$d" ] && rmdir "$d"
>         done
>         modprobe -r null_blk
>         set -e
>         read -r iosched no_write_locking <<<"$mode"
>         modprobe null_blk nr_devices=0
>         (
>             cd /sys/kernel/config/nullb
>             mkdir nullb0
>             cd nullb0
>             params=(
>                 completion_nsec=100000
>                 hw_queue_depth=64
>                 irqmode=2                # NULL_IRQ_TIMER
>                 max_sectors=$((4096/512))
>                 memory_backed=1
>                 no_zone_write_lock="${no_write_locking}"
>                 size=1
>                 submit_queues=1
>                 zone_size=1
>                 zoned=1
>                 power=1
>             )
>             for p in "${params[@]}"; do
>                 echo "${p//*=}" > "${p//=*}"
>             done
>         )
>         udevadm settle
>         dev=/dev/nullb0
>         [ -b "${dev}" ]
>         params=(
>             --direct=1
>             --filename="${dev}"
>             --iodepth=64
>             --iodepth_batch=16
>             --ioengine=io_uring
>             --ioscheduler="${iosched}"
>             --gtod_reduce=1
>             --hipri=0
>             --name=nullb0
>             --runtime=30
>             --rw=write
>             --time_based=1
>             --zonemode=zbd
>         )
>         fio "${params[@]}"
>     done
> 
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
> Cc: Ming Lei <ming.lei@redhat.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  drivers/block/null_blk/main.c     | 2 ++
>  drivers/block/null_blk/null_blk.h | 1 +
>  drivers/block/null_blk/zoned.c    | 3 +++
>  3 files changed, 6 insertions(+)
> 
> diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c
> index 864013019d6b..5c0578137f51 100644
> --- a/drivers/block/null_blk/main.c
> +++ b/drivers/block/null_blk/main.c
> @@ -424,6 +424,7 @@ NULLB_DEVICE_ATTR(zone_capacity, ulong, NULL);
>  NULLB_DEVICE_ATTR(zone_nr_conv, uint, NULL);
>  NULLB_DEVICE_ATTR(zone_max_open, uint, NULL);
>  NULLB_DEVICE_ATTR(zone_max_active, uint, NULL);
> +NULLB_DEVICE_ATTR(no_zone_write_lock, bool, NULL);
>  NULLB_DEVICE_ATTR(virt_boundary, bool, NULL);
>  NULLB_DEVICE_ATTR(no_sched, bool, NULL);
>  NULLB_DEVICE_ATTR(shared_tag_bitmap, bool, NULL);
> @@ -569,6 +570,7 @@ static struct configfs_attribute *nullb_device_attrs[] = {
>  	&nullb_device_attr_zone_max_active,
>  	&nullb_device_attr_zone_readonly,
>  	&nullb_device_attr_zone_offline,
> +	&nullb_device_attr_no_zone_write_lock,
>  	&nullb_device_attr_virt_boundary,
>  	&nullb_device_attr_no_sched,
>  	&nullb_device_attr_shared_tag_bitmap,
> diff --git a/drivers/block/null_blk/null_blk.h b/drivers/block/null_blk/null_blk.h
> index 929f659dd255..b521096bcc3f 100644
> --- a/drivers/block/null_blk/null_blk.h
> +++ b/drivers/block/null_blk/null_blk.h
> @@ -117,6 +117,7 @@ struct nullb_device {
>  	bool memory_backed; /* if data is stored in memory */
>  	bool discard; /* if support discard */
>  	bool zoned; /* if device is zoned */
> +	bool no_zone_write_lock;
>  	bool virt_boundary; /* virtual boundary on/off for the device */
>  	bool no_sched; /* no IO scheduler for the device */
>  	bool shared_tag_bitmap; /* use hostwide shared tags */
> diff --git a/drivers/block/null_blk/zoned.c b/drivers/block/null_blk/zoned.c
> index 55c5b48bc276..31c8364a63e9 100644
> --- a/drivers/block/null_blk/zoned.c
> +++ b/drivers/block/null_blk/zoned.c
> @@ -96,6 +96,9 @@ int null_init_zoned_dev(struct nullb_device *dev, struct request_queue *q)
>  
>  	spin_lock_init(&dev->zone_res_lock);
>  
> +	if (dev->no_zone_write_lock)
> +		blk_queue_flag_set(QUEUE_FLAG_NO_ZONE_WRITE_LOCK, q);

This patch look OK, but there is no patch introducing
QUEUE_FLAG_NO_ZONE_WRITE_LOCK... Patch 1 only introduced REQ_NO_ZONE_WRITE_LOCK.

So the series as is will not compile.

> +
>  	if (dev->zone_nr_conv >= dev->nr_zones) {
>  		dev->zone_nr_conv = dev->nr_zones - 1;
>  		pr_info("changed the number of conventional zones to %u",

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 4/6] scsi: Retry unaligned zoned writes
  2023-07-26  0:57 ` [PATCH v3 4/6] scsi: Retry unaligned zoned writes Bart Van Assche
@ 2023-07-26  8:47   ` Damien Le Moal
  2023-07-26 15:02     ` Bart Van Assche
  2023-07-26 11:23   ` kernel test robot
  1 sibling, 1 reply; 17+ messages in thread
From: Damien Le Moal @ 2023-07-26  8:47 UTC (permalink / raw)
  To: Bart Van Assche, Jens Axboe
  Cc: linux-block, Christoph Hellwig, Jaegeuk Kim, Martin K . Petersen,
	Damien Le Moal, Ming Lei, James E.J. Bottomley

On 7/26/23 09:57, Bart Van Assche wrote:
> From ZBC-2: "The device server terminates with CHECK CONDITION status, with
> the sense key set to ILLEGAL REQUEST, and the additional sense code set to
> UNALIGNED WRITE COMMAND a write command, other than an entire medium write
> same command, that specifies: a) the starting LBA in a sequential write
> required zone set to a value that is not equal to the write pointer for
> that sequential write required zone; or b) an ending LBA that is not equal
> to the last logical block within a physical block (see SBC-5)."
> 
> Send commands that failed with an unaligned write error to the SCSI error
> handler. Let the SCSI error handler sort SCSI commands per LBA before
> resubmitting these.
> 
> Increase the number of retries for write commands sent to a sequential
> zone to the maximum number of outstanding commands.
> 
> Cc: Martin K. Petersen <martin.petersen@oracle.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
> Cc: Ming Lei <ming.lei@redhat.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  drivers/scsi/scsi_error.c | 37 +++++++++++++++++++++++++++++++++++++
>  drivers/scsi/scsi_lib.c   |  1 +
>  drivers/scsi/sd.c         |  3 +++
>  include/scsi/scsi.h       |  1 +
>  4 files changed, 42 insertions(+)
> 
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index c67cdcdc3ba8..2b9aec05dc36 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -27,6 +27,7 @@
>  #include <linux/blkdev.h>
>  #include <linux/delay.h>
>  #include <linux/jiffies.h>
> +#include <linux/list_sort.h>
>  
>  #include <scsi/scsi.h>
>  #include <scsi/scsi_cmnd.h>
> @@ -698,6 +699,17 @@ enum scsi_disposition scsi_check_sense(struct scsi_cmnd *scmd)
>  		fallthrough;
>  
>  	case ILLEGAL_REQUEST:
> +		/*
> +		 * Unaligned write command. This indicates that zoned writes
> +		 * have been received by the device in the wrong order. If zone
> +		 * write locking is disabled, retry after all pending commands
> +		 * have completed.
> +		 */
> +		if (sshdr.asc == 0x21 && sshdr.ascq == 0x04 &&
> +		    blk_queue_no_zone_write_lock(sdev->request_queue) &&
> +		    !scsi_noretry_cmd(scmd) && scsi_cmd_retry_allowed(scmd))
> +			return NEEDS_DELAYED_RETRY;
> +
>  		if (sshdr.asc == 0x20 || /* Invalid command operation code */
>  		    sshdr.asc == 0x21 || /* Logical block address out of range */
>  		    sshdr.asc == 0x22 || /* Invalid function */
> @@ -2223,6 +2235,25 @@ void scsi_eh_flush_done_q(struct list_head *done_q)
>  }
>  EXPORT_SYMBOL(scsi_eh_flush_done_q);
>  
> +/*
> + * Returns a negative value if @_a has a lower LBA than @_b, zero if
> + * both have the same LBA and a positive value otherwise.
> + */
> +static int scsi_cmp_lba(void *priv, const struct list_head *_a,
> +			const struct list_head *_b)

The argument priv is unused.

> +{
> +	struct scsi_cmnd *a = list_entry(_a, typeof(*a), eh_entry);
> +	struct scsi_cmnd *b = list_entry(_b, typeof(*b), eh_entry);
> +	const sector_t pos_a = blk_rq_pos(scsi_cmd_to_rq(a));
> +	const sector_t pos_b = blk_rq_pos(scsi_cmd_to_rq(b));
> +
> +	if (pos_a < pos_b)
> +		return -1;
> +	if (pos_a > pos_b)
> +		return 1;
> +	return 0;
> +}
> +
>  /**
>   * scsi_unjam_host - Attempt to fix a host which has a cmd that failed.
>   * @shost:	Host to unjam.
> @@ -2258,6 +2289,12 @@ static void scsi_unjam_host(struct Scsi_Host *shost)
>  
>  	SCSI_LOG_ERROR_RECOVERY(1, scsi_eh_prt_fail_stats(shost, &eh_work_q));
>  
> +	/*
> +	 * Sort pending SCSI commands in LBA order. This is important if zone
> +	 * write locking is disabled for a zoned SCSI device.
> +	 */
> +	list_sort(NULL, &eh_work_q, scsi_cmp_lba);

Should we do this only for zoned devices ?

> +
>  	if (!scsi_eh_get_sense(&eh_work_q, &eh_done_q))
>  		scsi_eh_ready_devs(shost, &eh_work_q, &eh_done_q);
>  
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 59176946ab56..69da8aee13df 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1443,6 +1443,7 @@ static void scsi_complete(struct request *rq)
>  	case ADD_TO_MLQUEUE:
>  		scsi_queue_insert(cmd, SCSI_MLQUEUE_DEVICE_BUSY);
>  		break;
> +	case NEEDS_DELAYED_RETRY:
>  	default:
>  		scsi_eh_scmd_add(cmd);
>  		break;
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index 68b12afa0721..27b9ebe05b90 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -1235,6 +1235,9 @@ static blk_status_t sd_setup_read_write_cmnd(struct scsi_cmnd *cmd)
>  	cmd->transfersize = sdp->sector_size;
>  	cmd->underflow = nr_blocks << 9;
>  	cmd->allowed = sdkp->max_retries;
> +	if (blk_queue_no_zone_write_lock(rq->q) &&
> +	    blk_rq_is_seq_zoned_write(rq))
> +		cmd->allowed += rq->q->nr_requests;

Aouch... that could be a lot...

>  	cmd->sdb.length = nr_blocks * sdp->sector_size;
>  
>  	SCSI_LOG_HLQUEUE(1,
> diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h
> index ec093594ba53..6600db046227 100644
> --- a/include/scsi/scsi.h
> +++ b/include/scsi/scsi.h
> @@ -93,6 +93,7 @@ static inline int scsi_status_is_check_condition(int status)
>   * Internal return values.
>   */
>  enum scsi_disposition {
> +	NEEDS_DELAYED_RETRY	= 0x2000,
>  	NEEDS_RETRY		= 0x2001,
>  	SUCCESS			= 0x2002,
>  	FAILED			= 0x2003,

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 2/6] block/mq-deadline: Only use zone locking if necessary
  2023-07-26  0:57 ` [PATCH v3 2/6] block/mq-deadline: Only use zone locking if necessary Bart Van Assche
  2023-07-26  8:41   ` Damien Le Moal
@ 2023-07-26 10:32   ` kernel test robot
  1 sibling, 0 replies; 17+ messages in thread
From: kernel test robot @ 2023-07-26 10:32 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: llvm, oe-kbuild-all

Hi Bart,

kernel test robot noticed the following build errors:

[auto build test ERROR on axboe-block/for-next]
[also build test ERROR on mkp-scsi/for-next jejb-scsi/for-next linus/master v6.5-rc3 next-20230726]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Bart-Van-Assche/block-Introduce-the-flag-REQ_NO_WRITE_LOCK/20230726-085936
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
patch link:    https://lore.kernel.org/r/20230726005742.303865-3-bvanassche%40acm.org
patch subject: [PATCH v3 2/6] block/mq-deadline: Only use zone locking if necessary
config: s390-randconfig-r044-20230726 (https://download.01.org/0day-ci/archive/20230726/202307261811.g5Zde96H-lkp@intel.com/config)
compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project.git 4a5ac14ee968ff0ad5d2cc1ffa0299048db4c88a)
reproduce: (https://download.01.org/0day-ci/archive/20230726/202307261811.g5Zde96H-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202307261811.g5Zde96H-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from block/mq-deadline.c:19:
   In file included from include/trace/events/block.h:8:
   In file included from include/linux/blktrace_api.h:5:
   In file included from include/linux/blk-mq.h:8:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:75:
   include/asm-generic/io.h:547:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     547 |         val = __raw_readb(PCI_IOBASE + addr);
         |                           ~~~~~~~~~~ ^
   include/asm-generic/io.h:560:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     560 |         val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
         |                                                         ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:37:59: note: expanded from macro '__le16_to_cpu'
      37 | #define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))
         |                                                           ^
   include/uapi/linux/swab.h:102:54: note: expanded from macro '__swab16'
     102 | #define __swab16(x) (__u16)__builtin_bswap16((__u16)(x))
         |                                                      ^
   In file included from block/mq-deadline.c:19:
   In file included from include/trace/events/block.h:8:
   In file included from include/linux/blktrace_api.h:5:
   In file included from include/linux/blk-mq.h:8:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:75:
   include/asm-generic/io.h:573:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     573 |         val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
         |                                                         ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:35:59: note: expanded from macro '__le32_to_cpu'
      35 | #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
         |                                                           ^
   include/uapi/linux/swab.h:115:54: note: expanded from macro '__swab32'
     115 | #define __swab32(x) (__u32)__builtin_bswap32((__u32)(x))
         |                                                      ^
   In file included from block/mq-deadline.c:19:
   In file included from include/trace/events/block.h:8:
   In file included from include/linux/blktrace_api.h:5:
   In file included from include/linux/blk-mq.h:8:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:75:
   include/asm-generic/io.h:584:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     584 |         __raw_writeb(value, PCI_IOBASE + addr);
         |                             ~~~~~~~~~~ ^
   include/asm-generic/io.h:594:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     594 |         __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
         |                                                       ~~~~~~~~~~ ^
   include/asm-generic/io.h:604:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     604 |         __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
         |                                                       ~~~~~~~~~~ ^
   include/asm-generic/io.h:692:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     692 |         readsb(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:700:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     700 |         readsw(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:708:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     708 |         readsl(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:717:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     717 |         writesb(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
   include/asm-generic/io.h:726:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     726 |         writesw(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
   include/asm-generic/io.h:735:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     735 |         writesl(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
>> block/mq-deadline.c:349:10: error: call to undeclared function 'blk_queue_no_zone_write_lock'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
     349 |         return !blk_queue_no_zone_write_lock(rq->q) ||
         |                 ^
   block/mq-deadline.c:349:10: note: did you mean 'blk_req_zone_write_lock'?
   include/linux/blk-mq.h:1213:20: note: 'blk_req_zone_write_lock' declared here
    1213 | static inline void blk_req_zone_write_lock(struct request *rq)
         |                    ^
   12 warnings and 1 error generated.


vim +/blk_queue_no_zone_write_lock +349 block/mq-deadline.c

   340	
   341	/*
   342	 * Use write locking if either QUEUE_FLAG_NO_ZONE_WRITE_LOCK or
   343	 * REQ_NO_ZONE_WRITE_LOCK has not been set. Not using zone write locking is
   344	 * only safe if the submitter allocates and submit requests in LBA order per
   345	 * zone and if the block driver preserves the request order.
   346	 */
   347	static bool dd_use_write_locking(struct request *rq)
   348	{
 > 349		return !blk_queue_no_zone_write_lock(rq->q) ||
   350			!(rq->cmd_flags & REQ_NO_ZONE_WRITE_LOCK);
   351	}
   352	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 4/6] scsi: Retry unaligned zoned writes
  2023-07-26  0:57 ` [PATCH v3 4/6] scsi: Retry unaligned zoned writes Bart Van Assche
  2023-07-26  8:47   ` Damien Le Moal
@ 2023-07-26 11:23   ` kernel test robot
  1 sibling, 0 replies; 17+ messages in thread
From: kernel test robot @ 2023-07-26 11:23 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: llvm, oe-kbuild-all

Hi Bart,

kernel test robot noticed the following build errors:

[auto build test ERROR on axboe-block/for-next]
[also build test ERROR on mkp-scsi/for-next jejb-scsi/for-next linus/master v6.5-rc3 next-20230726]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Bart-Van-Assche/block-Introduce-the-flag-REQ_NO_WRITE_LOCK/20230726-085936
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
patch link:    https://lore.kernel.org/r/20230726005742.303865-5-bvanassche%40acm.org
patch subject: [PATCH v3 4/6] scsi: Retry unaligned zoned writes
config: s390-randconfig-r044-20230726 (https://download.01.org/0day-ci/archive/20230726/202307261938.3I7MPEJs-lkp@intel.com/config)
compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project.git 4a5ac14ee968ff0ad5d2cc1ffa0299048db4c88a)
reproduce: (https://download.01.org/0day-ci/archive/20230726/202307261938.3I7MPEJs-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202307261938.3I7MPEJs-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from drivers/scsi/scsi_error.c:32:
   In file included from include/scsi/scsi.h:10:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:75:
   include/asm-generic/io.h:547:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     547 |         val = __raw_readb(PCI_IOBASE + addr);
         |                           ~~~~~~~~~~ ^
   include/asm-generic/io.h:560:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     560 |         val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
         |                                                         ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:37:59: note: expanded from macro '__le16_to_cpu'
      37 | #define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))
         |                                                           ^
   include/uapi/linux/swab.h:102:54: note: expanded from macro '__swab16'
     102 | #define __swab16(x) (__u16)__builtin_bswap16((__u16)(x))
         |                                                      ^
   In file included from drivers/scsi/scsi_error.c:32:
   In file included from include/scsi/scsi.h:10:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:75:
   include/asm-generic/io.h:573:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     573 |         val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
         |                                                         ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:35:59: note: expanded from macro '__le32_to_cpu'
      35 | #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
         |                                                           ^
   include/uapi/linux/swab.h:115:54: note: expanded from macro '__swab32'
     115 | #define __swab32(x) (__u32)__builtin_bswap32((__u32)(x))
         |                                                      ^
   In file included from drivers/scsi/scsi_error.c:32:
   In file included from include/scsi/scsi.h:10:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:75:
   include/asm-generic/io.h:584:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     584 |         __raw_writeb(value, PCI_IOBASE + addr);
         |                             ~~~~~~~~~~ ^
   include/asm-generic/io.h:594:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     594 |         __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
         |                                                       ~~~~~~~~~~ ^
   include/asm-generic/io.h:604:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     604 |         __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
         |                                                       ~~~~~~~~~~ ^
   include/asm-generic/io.h:692:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     692 |         readsb(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:700:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     700 |         readsw(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:708:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     708 |         readsl(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:717:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     717 |         writesb(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
   include/asm-generic/io.h:726:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     726 |         writesw(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
   include/asm-generic/io.h:735:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     735 |         writesl(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
>> drivers/scsi/scsi_error.c:709:7: error: call to undeclared function 'blk_queue_no_zone_write_lock'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
     709 |                     blk_queue_no_zone_write_lock(sdev->request_queue) &&
         |                     ^
   drivers/scsi/scsi_error.c:709:7: note: did you mean 'blk_req_zone_write_lock'?
   include/linux/blk-mq.h:1213:20: note: 'blk_req_zone_write_lock' declared here
    1213 | static inline void blk_req_zone_write_lock(struct request *rq)
         |                    ^
   12 warnings and 1 error generated.


vim +/blk_queue_no_zone_write_lock +709 drivers/scsi/scsi_error.c

   526	
   527	/**
   528	 * scsi_check_sense - Examine scsi cmd sense
   529	 * @scmd:	Cmd to have sense checked.
   530	 *
   531	 * Return value:
   532	 *	SUCCESS or FAILED or NEEDS_RETRY or ADD_TO_MLQUEUE
   533	 *
   534	 * Notes:
   535	 *	When a deferred error is detected the current command has
   536	 *	not been executed and needs retrying.
   537	 */
   538	enum scsi_disposition scsi_check_sense(struct scsi_cmnd *scmd)
   539	{
   540		struct request *req = scsi_cmd_to_rq(scmd);
   541		struct scsi_device *sdev = scmd->device;
   542		struct scsi_sense_hdr sshdr;
   543	
   544		if (! scsi_command_normalize_sense(scmd, &sshdr))
   545			return FAILED;	/* no valid sense data */
   546	
   547		scsi_report_sense(sdev, &sshdr);
   548	
   549		if (scsi_sense_is_deferred(&sshdr))
   550			return NEEDS_RETRY;
   551	
   552		if (sdev->handler && sdev->handler->check_sense) {
   553			enum scsi_disposition rc;
   554	
   555			rc = sdev->handler->check_sense(sdev, &sshdr);
   556			if (rc != SCSI_RETURN_NOT_HANDLED)
   557				return rc;
   558			/* handler does not care. Drop down to default handling */
   559		}
   560	
   561		if (scmd->cmnd[0] == TEST_UNIT_READY &&
   562		    scmd->submitter != SUBMITTED_BY_SCSI_ERROR_HANDLER)
   563			/*
   564			 * nasty: for mid-layer issued TURs, we need to return the
   565			 * actual sense data without any recovery attempt.  For eh
   566			 * issued ones, we need to try to recover and interpret
   567			 */
   568			return SUCCESS;
   569	
   570		/*
   571		 * Previous logic looked for FILEMARK, EOM or ILI which are
   572		 * mainly associated with tapes and returned SUCCESS.
   573		 */
   574		if (sshdr.response_code == 0x70) {
   575			/* fixed format */
   576			if (scmd->sense_buffer[2] & 0xe0)
   577				return SUCCESS;
   578		} else {
   579			/*
   580			 * descriptor format: look for "stream commands sense data
   581			 * descriptor" (see SSC-3). Assume single sense data
   582			 * descriptor. Ignore ILI from SBC-2 READ LONG and WRITE LONG.
   583			 */
   584			if ((sshdr.additional_length > 3) &&
   585			    (scmd->sense_buffer[8] == 0x4) &&
   586			    (scmd->sense_buffer[11] & 0xe0))
   587				return SUCCESS;
   588		}
   589	
   590		switch (sshdr.sense_key) {
   591		case NO_SENSE:
   592			return SUCCESS;
   593		case RECOVERED_ERROR:
   594			return /* soft_error */ SUCCESS;
   595	
   596		case ABORTED_COMMAND:
   597			if (sshdr.asc == 0x10) /* DIF */
   598				return SUCCESS;
   599	
   600			/*
   601			 * Check aborts due to command duration limit policy:
   602			 * ABORTED COMMAND additional sense code with the
   603			 * COMMAND TIMEOUT BEFORE PROCESSING or
   604			 * COMMAND TIMEOUT DURING PROCESSING or
   605			 * COMMAND TIMEOUT DURING PROCESSING DUE TO ERROR RECOVERY
   606			 * additional sense code qualifiers.
   607			 */
   608			if (sshdr.asc == 0x2e &&
   609			    sshdr.ascq >= 0x01 && sshdr.ascq <= 0x03) {
   610				set_scsi_ml_byte(scmd, SCSIML_STAT_DL_TIMEOUT);
   611				req->cmd_flags |= REQ_FAILFAST_DEV;
   612				req->rq_flags |= RQF_QUIET;
   613				return SUCCESS;
   614			}
   615	
   616			if (sshdr.asc == 0x44 && sdev->sdev_bflags & BLIST_RETRY_ITF)
   617				return ADD_TO_MLQUEUE;
   618			if (sshdr.asc == 0xc1 && sshdr.ascq == 0x01 &&
   619			    sdev->sdev_bflags & BLIST_RETRY_ASC_C1)
   620				return ADD_TO_MLQUEUE;
   621	
   622			return NEEDS_RETRY;
   623		case NOT_READY:
   624		case UNIT_ATTENTION:
   625			/*
   626			 * if we are expecting a cc/ua because of a bus reset that we
   627			 * performed, treat this just as a retry.  otherwise this is
   628			 * information that we should pass up to the upper-level driver
   629			 * so that we can deal with it there.
   630			 */
   631			if (scmd->device->expecting_cc_ua) {
   632				/*
   633				 * Because some device does not queue unit
   634				 * attentions correctly, we carefully check
   635				 * additional sense code and qualifier so as
   636				 * not to squash media change unit attention.
   637				 */
   638				if (sshdr.asc != 0x28 || sshdr.ascq != 0x00) {
   639					scmd->device->expecting_cc_ua = 0;
   640					return NEEDS_RETRY;
   641				}
   642			}
   643			/*
   644			 * we might also expect a cc/ua if another LUN on the target
   645			 * reported a UA with an ASC/ASCQ of 3F 0E -
   646			 * REPORTED LUNS DATA HAS CHANGED.
   647			 */
   648			if (scmd->device->sdev_target->expecting_lun_change &&
   649			    sshdr.asc == 0x3f && sshdr.ascq == 0x0e)
   650				return NEEDS_RETRY;
   651			/*
   652			 * if the device is in the process of becoming ready, we
   653			 * should retry.
   654			 */
   655			if ((sshdr.asc == 0x04) && (sshdr.ascq == 0x01))
   656				return NEEDS_RETRY;
   657			/*
   658			 * if the device is not started, we need to wake
   659			 * the error handler to start the motor
   660			 */
   661			if (scmd->device->allow_restart &&
   662			    (sshdr.asc == 0x04) && (sshdr.ascq == 0x02))
   663				return FAILED;
   664			/*
   665			 * Pass the UA upwards for a determination in the completion
   666			 * functions.
   667			 */
   668			return SUCCESS;
   669	
   670			/* these are not supported */
   671		case DATA_PROTECT:
   672			if (sshdr.asc == 0x27 && sshdr.ascq == 0x07) {
   673				/* Thin provisioning hard threshold reached */
   674				set_scsi_ml_byte(scmd, SCSIML_STAT_NOSPC);
   675				return SUCCESS;
   676			}
   677			fallthrough;
   678		case COPY_ABORTED:
   679		case VOLUME_OVERFLOW:
   680		case MISCOMPARE:
   681		case BLANK_CHECK:
   682			set_scsi_ml_byte(scmd, SCSIML_STAT_TGT_FAILURE);
   683			return SUCCESS;
   684	
   685		case MEDIUM_ERROR:
   686			if (sshdr.asc == 0x11 || /* UNRECOVERED READ ERR */
   687			    sshdr.asc == 0x13 || /* AMNF DATA FIELD */
   688			    sshdr.asc == 0x14) { /* RECORD NOT FOUND */
   689				set_scsi_ml_byte(scmd, SCSIML_STAT_MED_ERROR);
   690				return SUCCESS;
   691			}
   692			return NEEDS_RETRY;
   693	
   694		case HARDWARE_ERROR:
   695			if (scmd->device->retry_hwerror)
   696				return ADD_TO_MLQUEUE;
   697			else
   698				set_scsi_ml_byte(scmd, SCSIML_STAT_TGT_FAILURE);
   699			fallthrough;
   700	
   701		case ILLEGAL_REQUEST:
   702			/*
   703			 * Unaligned write command. This indicates that zoned writes
   704			 * have been received by the device in the wrong order. If zone
   705			 * write locking is disabled, retry after all pending commands
   706			 * have completed.
   707			 */
   708			if (sshdr.asc == 0x21 && sshdr.ascq == 0x04 &&
 > 709			    blk_queue_no_zone_write_lock(sdev->request_queue) &&
   710			    !scsi_noretry_cmd(scmd) && scsi_cmd_retry_allowed(scmd))
   711				return NEEDS_DELAYED_RETRY;
   712	
   713			if (sshdr.asc == 0x20 || /* Invalid command operation code */
   714			    sshdr.asc == 0x21 || /* Logical block address out of range */
   715			    sshdr.asc == 0x22 || /* Invalid function */
   716			    sshdr.asc == 0x24 || /* Invalid field in cdb */
   717			    sshdr.asc == 0x26 || /* Parameter value invalid */
   718			    sshdr.asc == 0x27) { /* Write protected */
   719				set_scsi_ml_byte(scmd, SCSIML_STAT_TGT_FAILURE);
   720			}
   721			return SUCCESS;
   722	
   723		case COMPLETED:
   724			if (sshdr.asc == 0x55 && sshdr.ascq == 0x0a) {
   725				set_scsi_ml_byte(scmd, SCSIML_STAT_DL_TIMEOUT);
   726				req->cmd_flags |= REQ_FAILFAST_DEV;
   727				req->rq_flags |= RQF_QUIET;
   728			}
   729			return SUCCESS;
   730	
   731		default:
   732			return SUCCESS;
   733		}
   734	}
   735	EXPORT_SYMBOL_GPL(scsi_check_sense);
   736	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 5/6] scsi: ufs: Disable zone write locking
  2023-07-26  0:57 ` [PATCH v3 5/6] scsi: ufs: Disable zone write locking Bart Van Assche
@ 2023-07-26 12:04   ` kernel test robot
  0 siblings, 0 replies; 17+ messages in thread
From: kernel test robot @ 2023-07-26 12:04 UTC (permalink / raw)
  To: Bart Van Assche, Jens Axboe
  Cc: llvm, oe-kbuild-all, linux-block, Christoph Hellwig, Jaegeuk Kim,
	Bart Van Assche, Martin K . Petersen, Avri Altman, Damien Le Moal,
	Ming Lei, James E.J. Bottomley, Stanley Chu, Can Guo, Asutosh Das,
	Bao D. Nguyen, Bean Huo, Arthur Simchaev

Hi Bart,

kernel test robot noticed the following build errors:

[auto build test ERROR on axboe-block/for-next]
[also build test ERROR on jejb-scsi/for-next linus/master v6.5-rc3]
[cannot apply to mkp-scsi/for-next next-20230726]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Bart-Van-Assche/block-Introduce-the-flag-REQ_NO_WRITE_LOCK/20230726-085936
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
patch link:    https://lore.kernel.org/r/20230726005742.303865-6-bvanassche%40acm.org
patch subject: [PATCH v3 5/6] scsi: ufs: Disable zone write locking
config: s390-randconfig-r044-20230726 (https://download.01.org/0day-ci/archive/20230726/202307261940.mazVXKtc-lkp@intel.com/config)
compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project.git 4a5ac14ee968ff0ad5d2cc1ffa0299048db4c88a)
reproduce: (https://download.01.org/0day-ci/archive/20230726/202307261940.mazVXKtc-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202307261940.mazVXKtc-lkp@intel.com/

All error/warnings (new ones prefixed by >>):

   In file included from drivers/ufs/core/ufshcd.c:25:
   In file included from include/scsi/scsi_cmnd.h:5:
   In file included from include/linux/dma-mapping.h:11:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:75:
   include/asm-generic/io.h:547:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     547 |         val = __raw_readb(PCI_IOBASE + addr);
         |                           ~~~~~~~~~~ ^
   include/asm-generic/io.h:560:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     560 |         val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
         |                                                         ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:37:59: note: expanded from macro '__le16_to_cpu'
      37 | #define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))
         |                                                           ^
   include/uapi/linux/swab.h:102:54: note: expanded from macro '__swab16'
     102 | #define __swab16(x) (__u16)__builtin_bswap16((__u16)(x))
         |                                                      ^
   In file included from drivers/ufs/core/ufshcd.c:25:
   In file included from include/scsi/scsi_cmnd.h:5:
   In file included from include/linux/dma-mapping.h:11:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:75:
   include/asm-generic/io.h:573:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     573 |         val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
         |                                                         ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:35:59: note: expanded from macro '__le32_to_cpu'
      35 | #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
         |                                                           ^
   include/uapi/linux/swab.h:115:54: note: expanded from macro '__swab32'
     115 | #define __swab32(x) (__u32)__builtin_bswap32((__u32)(x))
         |                                                      ^
   In file included from drivers/ufs/core/ufshcd.c:25:
   In file included from include/scsi/scsi_cmnd.h:5:
   In file included from include/linux/dma-mapping.h:11:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:75:
   include/asm-generic/io.h:584:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     584 |         __raw_writeb(value, PCI_IOBASE + addr);
         |                             ~~~~~~~~~~ ^
   include/asm-generic/io.h:594:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     594 |         __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
         |                                                       ~~~~~~~~~~ ^
   include/asm-generic/io.h:604:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     604 |         __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
         |                                                       ~~~~~~~~~~ ^
   include/asm-generic/io.h:692:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     692 |         readsb(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:700:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     700 |         readsw(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:708:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     708 |         readsl(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:717:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     717 |         writesb(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
   include/asm-generic/io.h:726:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     726 |         writesw(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
   include/asm-generic/io.h:735:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     735 |         writesl(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
>> drivers/ufs/core/ufshcd.c:4352:23: error: use of undeclared identifier 'QUEUE_FLAG_NO_ZONE_WRITE_LOCK'; did you mean '__REQ_NO_ZONE_WRITE_LOCK'?
    4352 |                         blk_queue_flag_set(QUEUE_FLAG_NO_ZONE_WRITE_LOCK, q);
         |                                            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         |                                            __REQ_NO_ZONE_WRITE_LOCK
   include/linux/blk_types.h:424:2: note: '__REQ_NO_ZONE_WRITE_LOCK' declared here
     424 |         __REQ_NO_ZONE_WRITE_LOCK,
         |         ^
   drivers/ufs/core/ufshcd.c:4354:25: error: use of undeclared identifier 'QUEUE_FLAG_NO_ZONE_WRITE_LOCK'; did you mean '__REQ_NO_ZONE_WRITE_LOCK'?
    4354 |                         blk_queue_flag_clear(QUEUE_FLAG_NO_ZONE_WRITE_LOCK, q);
         |                                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         |                                              __REQ_NO_ZONE_WRITE_LOCK
   include/linux/blk_types.h:424:2: note: '__REQ_NO_ZONE_WRITE_LOCK' declared here
     424 |         __REQ_NO_ZONE_WRITE_LOCK,
         |         ^
>> drivers/ufs/core/ufshcd.c:4340:6: warning: no previous prototype for function 'ufshcd_update_no_zone_write_lock' [-Wmissing-prototypes]
    4340 | void ufshcd_update_no_zone_write_lock(struct ufs_hba *hba,
         |      ^
   drivers/ufs/core/ufshcd.c:4340:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
    4340 | void ufshcd_update_no_zone_write_lock(struct ufs_hba *hba,
         | ^
         | static 
   drivers/ufs/core/ufshcd.c:5180:21: error: use of undeclared identifier 'QUEUE_FLAG_NO_ZONE_WRITE_LOCK'; did you mean '__REQ_NO_ZONE_WRITE_LOCK'?
    5180 |         blk_queue_flag_set(QUEUE_FLAG_NO_ZONE_WRITE_LOCK, q);
         |                            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         |                            __REQ_NO_ZONE_WRITE_LOCK
   include/linux/blk_types.h:424:2: note: '__REQ_NO_ZONE_WRITE_LOCK' declared here
     424 |         __REQ_NO_ZONE_WRITE_LOCK,
         |         ^
   drivers/ufs/core/ufshcd.c:10277:44: warning: shift count >= width of type [-Wshift-count-overflow]
    10277 |                 if (!dma_set_mask_and_coherent(hba->dev, DMA_BIT_MASK(64)))
          |                                                          ^~~~~~~~~~~~~~~~
   include/linux/dma-mapping.h:77:54: note: expanded from macro 'DMA_BIT_MASK'
      77 | #define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL<<(n))-1))
         |                                                      ^ ~~~
   14 warnings and 3 errors generated.


vim +4352 drivers/ufs/core/ufshcd.c

  4339	
> 4340	void ufshcd_update_no_zone_write_lock(struct ufs_hba *hba,
  4341					      bool set_no_zone_write_lock)
  4342	{
  4343		struct scsi_device *sdev;
  4344	
  4345		shost_for_each_device(sdev, hba->host)
  4346			blk_freeze_queue_start(sdev->request_queue);
  4347		shost_for_each_device(sdev, hba->host) {
  4348			struct request_queue *q = sdev->request_queue;
  4349	
  4350			blk_mq_freeze_queue_wait(q);
  4351			if (set_no_zone_write_lock)
> 4352				blk_queue_flag_set(QUEUE_FLAG_NO_ZONE_WRITE_LOCK, q);
  4353			else
  4354				blk_queue_flag_clear(QUEUE_FLAG_NO_ZONE_WRITE_LOCK, q);
  4355			blk_mq_unfreeze_queue(q);
  4356		}
  4357	}
  4358	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 1/6] block: Introduce the flag REQ_NO_WRITE_LOCK
  2023-07-26  8:37   ` Damien Le Moal
@ 2023-07-26 14:58     ` Bart Van Assche
  0 siblings, 0 replies; 17+ messages in thread
From: Bart Van Assche @ 2023-07-26 14:58 UTC (permalink / raw)
  To: Damien Le Moal, Jens Axboe
  Cc: linux-block, Christoph Hellwig, Jaegeuk Kim, Ming Lei

On 7/26/23 01:37, Damien Le Moal wrote:
> REQ_NO_WRITE_LOCK indicates that a write request does
> not need zone write locking when the underlying device signaled that it can
> preserve command ordering. So we also need a queue flag for that...

My apologies: I accidentally left out the patch "block: Introduce the 
flag QUEUE_FLAG_NO_ZONE_WRITE_LOCK" from this series. I will repost this 
series.

Bart.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 4/6] scsi: Retry unaligned zoned writes
  2023-07-26  8:47   ` Damien Le Moal
@ 2023-07-26 15:02     ` Bart Van Assche
  2023-07-26 23:46       ` Damien Le Moal
  0 siblings, 1 reply; 17+ messages in thread
From: Bart Van Assche @ 2023-07-26 15:02 UTC (permalink / raw)
  To: Damien Le Moal, Jens Axboe
  Cc: linux-block, Christoph Hellwig, Jaegeuk Kim, Martin K . Petersen,
	Damien Le Moal, Ming Lei, James E.J. Bottomley

On 7/26/23 01:47, Damien Le Moal wrote:
> On 7/26/23 09:57, Bart Van Assche wrote:
>> +/*
>> + * Returns a negative value if @_a has a lower LBA than @_b, zero if
>> + * both have the same LBA and a positive value otherwise.
>> + */
>> +static int scsi_cmp_lba(void *priv, const struct list_head *_a,
>> +			const struct list_head *_b)
> 
> The argument priv is unused.

I cannot remove the 'priv' argument. From include/linux/list_sort.h:

typedef int __attribute__((nonnull(2,3))) (*list_cmp_func_t)(void *,
		const struct list_head *, const struct list_head *);

__attribute__((nonnull(2,3)))
void list_sort(void *priv, struct list_head *head, list_cmp_func_t cmp);

>>   /**
>>    * scsi_unjam_host - Attempt to fix a host which has a cmd that failed.
>>    * @shost:	Host to unjam.
>> @@ -2258,6 +2289,12 @@ static void scsi_unjam_host(struct Scsi_Host *shost)
>>   
>>   	SCSI_LOG_ERROR_RECOVERY(1, scsi_eh_prt_fail_stats(shost, &eh_work_q));
>>   
>> +	/*
>> +	 * Sort pending SCSI commands in LBA order. This is important if zone
>> +	 * write locking is disabled for a zoned SCSI device.
>> +	 */
>> +	list_sort(NULL, &eh_work_q, scsi_cmp_lba);
> 
> Should we do this only for zoned devices ?

I'm not sure this is possible. Error handling happens per SCSI host. 
Some of the logical units associated with a host may be zoned while 
others may represent conventional storage or have no storage associated 
with them (WLUN).

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 4/6] scsi: Retry unaligned zoned writes
  2023-07-26 15:02     ` Bart Van Assche
@ 2023-07-26 23:46       ` Damien Le Moal
  0 siblings, 0 replies; 17+ messages in thread
From: Damien Le Moal @ 2023-07-26 23:46 UTC (permalink / raw)
  To: Bart Van Assche, Jens Axboe
  Cc: linux-block, Christoph Hellwig, Jaegeuk Kim, Martin K . Petersen,
	Ming Lei, James E.J. Bottomley

On 7/27/23 00:02, Bart Van Assche wrote:
> On 7/26/23 01:47, Damien Le Moal wrote:
>> On 7/26/23 09:57, Bart Van Assche wrote:
>>> +/*
>>> + * Returns a negative value if @_a has a lower LBA than @_b, zero if
>>> + * both have the same LBA and a positive value otherwise.
>>> + */
>>> +static int scsi_cmp_lba(void *priv, const struct list_head *_a,
>>> +            const struct list_head *_b)
>>
>> The argument priv is unused.
> 
> I cannot remove the 'priv' argument. From include/linux/list_sort.h:
> 
> typedef int __attribute__((nonnull(2,3))) (*list_cmp_func_t)(void *,
>         const struct list_head *, const struct list_head *);
> 
> __attribute__((nonnull(2,3)))
> void list_sort(void *priv, struct list_head *head, list_cmp_func_t cmp);

Yes. I missed that this is a callbalck. Sorry for the noise.

> 
>>>   /**
>>>    * scsi_unjam_host - Attempt to fix a host which has a cmd that failed.
>>>    * @shost:    Host to unjam.
>>> @@ -2258,6 +2289,12 @@ static void scsi_unjam_host(struct Scsi_Host *shost)
>>>         SCSI_LOG_ERROR_RECOVERY(1, scsi_eh_prt_fail_stats(shost, &eh_work_q));
>>>   +    /*
>>> +     * Sort pending SCSI commands in LBA order. This is important if zone
>>> +     * write locking is disabled for a zoned SCSI device.
>>> +     */
>>> +    list_sort(NULL, &eh_work_q, scsi_cmp_lba);
>>
>> Should we do this only for zoned devices ?
> 
> I'm not sure this is possible. Error handling happens per SCSI host. Some of
> the logical units associated with a host may be zoned while others may
> represent conventional storage or have no storage associated with them (WLUN).
> 
> Thanks,
> 
> Bart.

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2023-07-26 23:46 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-07-26  0:57 [PATCH v3 0/6] Improve the performance of F2FS on zoned UFS Bart Van Assche
2023-07-26  0:57 ` [PATCH v3 1/6] block: Introduce the flag REQ_NO_WRITE_LOCK Bart Van Assche
2023-07-26  8:37   ` Damien Le Moal
2023-07-26 14:58     ` Bart Van Assche
2023-07-26  0:57 ` [PATCH v3 2/6] block/mq-deadline: Only use zone locking if necessary Bart Van Assche
2023-07-26  8:41   ` Damien Le Moal
2023-07-26 10:32   ` kernel test robot
2023-07-26  0:57 ` [PATCH v3 3/6] block/null_blk: Support disabling zone write locking Bart Van Assche
2023-07-26  8:43   ` Damien Le Moal
2023-07-26  0:57 ` [PATCH v3 4/6] scsi: Retry unaligned zoned writes Bart Van Assche
2023-07-26  8:47   ` Damien Le Moal
2023-07-26 15:02     ` Bart Van Assche
2023-07-26 23:46       ` Damien Le Moal
2023-07-26 11:23   ` kernel test robot
2023-07-26  0:57 ` [PATCH v3 5/6] scsi: ufs: Disable zone write locking Bart Van Assche
2023-07-26 12:04   ` kernel test robot
2023-07-26  0:57 ` [PATCH v3 6/6] fs/f2fs: " Bart Van Assche

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.