linux-block.vger.kernel.org archive mirror
* [PATCH v3 0/3] Block layer support ZAC/ZBC commands
@ 2016-06-10  7:10 Shaun Tancheff
  2016-06-10  7:10 ` [PATCH v3 1/3] Add bio/request flags for using ZBC/ZAC commands Shaun Tancheff
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Shaun Tancheff @ 2016-06-10  7:10 UTC (permalink / raw)
  To: linux-ide, linux-block, linux-scsi
  Cc: Shaun Tancheff, Jens Axboe, James E . J . Bottomley,
	Martin K . Petersen, Jeff Layton, J . Bruce Fields

Hi Jens,

This series is on your for-next branch.

As Host Aware drives are becoming available we would like to be able
to make use of such drives. This series is also intended to be suitable
for use by Host Managed drives.

ZAC/ZBC drives add new commands for discovering and working with Zones.

This extends the ZAC/ZBC support up to the block layer.

Patches for util-linux can be found here:
https://github.com/Seagate/ZDM-Device-Mapper/tree/master/patches/util-linux

Using BIOs to issue ZBC commands allows DM targets (such as ZDM) or
file-systems such as btrfs or nilfs2 to extend their block allocation
schemes and issue discards that are zone aware.

A perhaps non-obvious design choice is that a conventional drive will
return a report containing a single large conventional zone.
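
As a rough illustration of the single-zone report described above, the
following userspace sketch fills a buffer with one conventional zone
spanning the whole device. The byte offsets follow struct bdev_zone_report
and struct bdev_zone_descriptor from patch 1/3; the helper names
(put_be32, put_be64, fill_conventional_report) are hypothetical and are
not part of this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sizes and values taken from the uapi header added in patch 1/3. */
enum { REPORT_HDR_SIZE = 64, ZONE_DESC_SIZE = 64 };
enum { ZTYP_CONVENTIONAL = 1, ZCOND_CONVENTIONAL = 0, ZS_ALL_SAME = 1 };

/* Hypothetical helpers: store/load big-endian integers byte by byte. */
static inline void put_be32(uint8_t *p, uint32_t v)
{
	for (int i = 0; i < 4; i++)
		p[i] = (uint8_t)(v >> (24 - 8 * i));
}

static inline void put_be64(uint8_t *p, uint64_t v)
{
	for (int i = 0; i < 8; i++)
		p[i] = (uint8_t)(v >> (56 - 8 * i));
}

static inline uint32_t get_be32(const uint8_t *p)
{
	uint32_t v = 0;

	for (int i = 0; i < 4; i++)
		v = (v << 8) | p[i];
	return v;
}

static inline uint64_t get_be64(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/*
 * Build a report with one conventional zone covering nr_sects sectors.
 * Returns 0 on success, -1 if the buffer cannot hold header + descriptor.
 */
int fill_conventional_report(uint8_t *buf, size_t len, uint64_t nr_sects)
{
	uint8_t *desc;

	assert(buf != NULL);
	if (len < REPORT_HDR_SIZE + ZONE_DESC_SIZE)
		return -1;

	memset(buf, 0, REPORT_HDR_SIZE + ZONE_DESC_SIZE);
	put_be32(buf + 0, 1);			/* descriptor_count */
	buf[4] = ZS_ALL_SAME;			/* same_field */
	put_be64(buf + 8, nr_sects);		/* maximum_lba, as in the patch */

	desc = buf + REPORT_HDR_SIZE;
	desc[0] = ZTYP_CONVENTIONAL;		/* type */
	desc[1] = ZCOND_CONVENTIONAL << 4;	/* flags: condition in bits 4-7 */
	put_be64(desc + 8, nr_sects);		/* length */
	put_be64(desc + 16, 0);			/* lba_start */
	put_be64(desc + 24, nr_sects);		/* lba_wptr */
	return 0;
}
```

A caller that only understands zone reports can then treat a conventional
drive like any other device, with one zone that never needs a reset.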

The last patch, which deals with ata16 pass-through, works around SAS HBA
controllers that don't support ZBC. It will be dropped now that firmware
updates are starting to appear.

V3:
 - Rebase on Mike Christie's separate bio operations
 - Update blkzoned_api.h to include report zones PARTIAL bit.
 - Use zoned report reserved bit for ata-passthrough flag.

V2:
 - Changed bi_rw to op_flags to clarify separation of bio op from flags.
 - Fixed memory leak in blkdev_issue_zone_report failing to bio_put().
 - Documented opt in blkdev_issue_zone_report.
 - Moved include/uapi/linux/fs.h changes to patch 3
 - Fixed commit message for first patch in series.

Shaun Tancheff (3):
  Add bio/request flags for using ZBC/ZAC commands
  Add ioctl to issue ZBC/ZAC commands via block layer
  Add ata pass-through path for ZAC commands.

 MAINTAINERS                       |   9 ++
 block/blk-lib.c                   |  98 +++++++++++++++++
 block/ioctl.c                     | 142 ++++++++++++++++++++++++
 drivers/scsi/sd.c                 | 139 ++++++++++++++++++++++-
 drivers/scsi/sd.h                 |   1 +
 include/linux/ata.h               |  15 +++
 include/linux/bio.h               |   4 +-
 include/linux/blk_types.h         |  16 ++-
 include/linux/blkzoned_api.h      |  25 +++++
 include/uapi/linux/Kbuild         |   1 +
 include/uapi/linux/blkzoned_api.h | 224 ++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/fs.h           |   1 +
 12 files changed, 671 insertions(+), 4 deletions(-)
 create mode 100644 include/linux/blkzoned_api.h
 create mode 100644 include/uapi/linux/blkzoned_api.h

-- 
2.8.1


* [PATCH v3 1/3] Add bio/request flags for using ZBC/ZAC commands
  2016-06-10  7:10 [PATCH v3 0/3] Block layer support ZAC/ZBC commands Shaun Tancheff
@ 2016-06-10  7:10 ` Shaun Tancheff
  2016-06-10  7:10 ` [PATCH v3 2/3] Add ioctl to issue ZBC/ZAC commands via block layer Shaun Tancheff
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Shaun Tancheff @ 2016-06-10  7:10 UTC (permalink / raw)
  To: linux-ide, linux-block, linux-scsi
  Cc: Shaun Tancheff, Jens Axboe, James E . J . Bottomley,
	Martin K . Petersen, Jeff Layton, J . Bruce Fields,
	Shaun Tancheff

T10 ZBC and T13 ZAC specify operations for Zoned devices.

To access zone information and to open and close zones, add flags for
the report zones command (REQ_REPORT_ZONES) and for Open and Close zone
(REQ_OPEN_ZONE and REQ_CLOSE_ZONE) for use by struct bio's bi_rw and by
struct request's cmd_flags.

To reduce the number of additional flags needed REQ_RESET_ZONE shares
the same flag as REQ_REPORT_ZONES and is differentiated by direction.
Report zones is a device read that requires a buffer. Reset is a device
command (WRITE) that has no associated data transfer.
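
The flag sharing described above can be sketched in userspace as follows.
This is only an illustration: the bit positions are placeholders (the real
ones come from enum rq_flag_bits), and enum zone_cmd and
classify_zone_cmd() are hypothetical names that do not exist in the
series. The precedence mirrors sd_setup_zoned_cmnd() in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit positions; the kernel assigns these in enum rq_flag_bits. */
#define REQ_REPORT_ZONES	(1ULL << 0)
#define REQ_OPEN_ZONE		(1ULL << 1)
#define REQ_CLOSE_ZONE		(1ULL << 2)
#define REQ_RESET_ZONE		(REQ_REPORT_ZONES)	/* shared bit */
#define REQ_ZONED_CMDS \
	(REQ_OPEN_ZONE | REQ_CLOSE_ZONE | REQ_RESET_ZONE | REQ_REPORT_ZONES)

enum dir { DIR_READ = 0, DIR_WRITE = 1 };

/* Hypothetical result type, for illustration only. */
enum zone_cmd { ZCMD_NONE, ZCMD_REPORT, ZCMD_RESET, ZCMD_OPEN, ZCMD_CLOSE };

enum zone_cmd classify_zone_cmd(uint64_t flags, enum dir dir)
{
	if (!(flags & REQ_ZONED_CMDS))
		return ZCMD_NONE;

	/* The shared bit means "report" only when data flows from the device. */
	if ((flags & REQ_REPORT_ZONES) && dir == DIR_READ)
		return ZCMD_REPORT;

	/* Otherwise: open is the default, close overrides it, reset wins. */
	if (flags & REQ_RESET_ZONE)
		return ZCMD_RESET;
	if (flags & REQ_CLOSE_ZONE)
		return ZCMD_CLOSE;
	return ZCMD_OPEN;
}
```

The cost of saving one flag bit is that every consumer must check the data
direction before interpreting the shared bit.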

The Finish zone command is intentionally not implemented as there is no
current use case for that operation.

Report zones currently defaults to reporting on all zones. It is expected
that support for the zone option flag will piggyback on streamid
support. The report option is useful, as it can reduce the number of
zones in each report, but it is not critical.

Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
---
V3:
 - Rebase on Mike Christie's separate bio operations
 - Update blkzoned_api.h to include report zones PARTIAL bit.

V2:
 - Changed bi_rw to op_flags to clarify separation of bio op from flags.
 - Fixed memory leak in blkdev_issue_zone_report failing to bio_put().
 - Documented opt in blkdev_issue_zone_report.
 - Removed include/uapi/linux/fs.h from this patch.
---
 MAINTAINERS                       |   9 ++
 block/blk-lib.c                   |  98 +++++++++++++++++
 drivers/scsi/sd.c                 |  99 ++++++++++++++++++
 drivers/scsi/sd.h                 |   1 +
 include/linux/bio.h               |   4 +-
 include/linux/blk_types.h         |  16 ++-
 include/linux/blkzoned_api.h      |  25 +++++
 include/uapi/linux/Kbuild         |   1 +
 include/uapi/linux/blkzoned_api.h | 214 ++++++++++++++++++++++++++++++++++++++
 9 files changed, 464 insertions(+), 3 deletions(-)
 create mode 100644 include/linux/blkzoned_api.h
 create mode 100644 include/uapi/linux/blkzoned_api.h

diff --git a/MAINTAINERS b/MAINTAINERS
index ed42cb6..d9fafa2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -12662,6 +12662,15 @@ F:	Documentation/networking/z8530drv.txt
 F:	drivers/net/hamradio/*scc.c
 F:	drivers/net/hamradio/z8530.h
 
+ZBC AND ZAC BLOCK DEVICES
+M:	Shaun Tancheff <shaun.tancheff@seagate.com>
+W:	http://seagate.com
+W:	https://github.com/Seagate/ZDM-Device-Mapper
+L:	linux-block@vger.kernel.org
+S:	Maintained
+F:	include/linux/blkzoned_api.h
+F:	include/uapi/linux/blkzoned_api.h
+
 ZBUD COMPRESSED PAGE ALLOCATOR
 M:	Seth Jennings <sjenning@redhat.com>
 L:	linux-mm@kvack.org
diff --git a/block/blk-lib.c b/block/blk-lib.c
index ff2a7f0..eda0071 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -6,6 +6,7 @@
 #include <linux/bio.h>
 #include <linux/blkdev.h>
 #include <linux/scatterlist.h>
+#include <linux/blkzoned_api.h>
 
 #include "blk.h"
 
@@ -252,3 +253,100 @@ int blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
 	return __blkdev_issue_zeroout(bdev, sector, nr_sects, gfp_mask);
 }
 EXPORT_SYMBOL(blkdev_issue_zeroout);
+
+/**
+ * blkdev_issue_zone_report - queue a report zones operation
+ * @bdev:	target blockdev
+ * @op_flags:	extra bio rw flags. If unsure, use 0.
+ * @sector:	starting sector (report will include this sector).
+ * @opt:	See: zone_report_option, default is 0 (all zones).
+ * @page:	one or more contiguous pages.
+ * @pgsz:	up to size of page in bytes, size of report.
+ * @gfp_mask:	memory allocation flags (for bio_alloc)
+ *
+ * Description:
+ *    Issue a zone report request for the sectors in question.
+ */
+int blkdev_issue_zone_report(struct block_device *bdev, unsigned int op_flags,
+			     sector_t sector, u8 opt, struct page *page,
+			     size_t pgsz, gfp_t gfp_mask)
+{
+	struct bdev_zone_report *conv = page_address(page);
+	struct bio *bio;
+	unsigned int nr_iovecs = 1;
+	int ret = 0;
+
+	if (pgsz < (sizeof(struct bdev_zone_report) +
+		    sizeof(struct bdev_zone_descriptor)))
+		return -EINVAL;
+
+	bio = bio_alloc(gfp_mask, nr_iovecs);
+	if (!bio)
+		return -ENOMEM;
+
+	conv->descriptor_count = 0;
+	bio->bi_iter.bi_sector = sector;
+	bio->bi_bdev = bdev;
+	bio->bi_vcnt = 0;
+	bio->bi_iter.bi_size = 0;
+
+	op_flags |= REQ_REPORT_ZONES;
+
+	/* FUTURE ... when streamid is available: */
+	/* bio_set_streamid(bio, opt); */
+
+	bio_add_page(bio, page, pgsz, 0);
+	bio_set_op_attrs(bio, REQ_OP_READ, op_flags);
+	ret = submit_bio_wait(bio);
+
+	/*
+	 * When our request is nak'd the underlying device may be conventional
+	 * so ... report a single conventional zone the size of the device.
+	 */
+	if (ret == -EIO && conv->descriptor_count) {
+		/* Adjust the conventional to the size of the partition ... */
+		__be64 blksz = cpu_to_be64(bdev->bd_part->nr_sects);
+
+		conv->maximum_lba = blksz;
+		conv->descriptors[0].type = ZTYP_CONVENTIONAL;
+		conv->descriptors[0].flags = ZCOND_CONVENTIONAL << 4;
+		conv->descriptors[0].length = blksz;
+		conv->descriptors[0].lba_start = 0;
+		conv->descriptors[0].lba_wptr = blksz;
+		ret = 0;
+	}
+	bio_put(bio);
+	return ret;
+}
+EXPORT_SYMBOL(blkdev_issue_zone_report);
+
+/**
+ * blkdev_issue_zone_action - queue a zone action (open/close/reset) operation
+ * @bdev:	target blockdev
+ * @op_flags:	REQ_OPEN_ZONE, REQ_CLOSE_ZONE, or REQ_RESET_ZONE.
+ * @sector:	starting lba of sector
+ * @gfp_mask:	memory allocation flags (for bio_alloc)
+ *
+ * Description:
+ *    Issue a zone action request for the zone starting at the given sector.
+ */
+int blkdev_issue_zone_action(struct block_device *bdev, unsigned int op_flags,
+			     sector_t sector, gfp_t gfp_mask)
+{
+	int ret;
+	struct bio *bio;
+
+	bio = bio_alloc(gfp_mask, 1);
+	if (!bio)
+		return -ENOMEM;
+
+	bio->bi_iter.bi_sector = sector;
+	bio->bi_bdev = bdev;
+	bio->bi_vcnt = 0;
+	bio->bi_iter.bi_size = 0;
+	bio_set_op_attrs(bio, REQ_OP_WRITE, op_flags);
+	ret = submit_bio_wait(bio);
+	bio_put(bio);
+	return ret;
+}
+EXPORT_SYMBOL(blkdev_issue_zone_action);
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 5a9db0f..241faf5 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -52,6 +52,7 @@
 #include <linux/slab.h>
 #include <linux/pm_runtime.h>
 #include <linux/pr.h>
+#include <linux/blkzoned_api.h>
 #include <asm/uaccess.h>
 #include <asm/unaligned.h>
 
@@ -1134,6 +1135,100 @@ static int sd_setup_read_write_cmnd(struct scsi_cmnd *SCpnt)
 	return ret;
 }
 
+static int sd_setup_zoned_cmnd(struct scsi_cmnd *cmd)
+{
+	struct request *rq = cmd->request;
+	struct scsi_device *sdp = cmd->device;
+	struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
+	struct bio *bio = rq->bio;
+	sector_t sector = blk_rq_pos(rq);
+	struct gendisk *disk = rq->rq_disk;
+	unsigned int nr_bytes = blk_rq_bytes(rq);
+	int ret = BLKPREP_KILL;
+	u8 allbit = 0;
+
+	if (rq->cmd_flags & REQ_REPORT_ZONES && rq_data_dir(rq) == READ) {
+		WARN_ON(nr_bytes == 0);
+
+		/*
+		 * For conventional drives generate a report that shows a
+		 * large single conventional zone the size of the block device
+		 */
+		if (sdkp->zoned != 1 && sdkp->device->type != TYPE_ZBC) {
+			void *src;
+			struct bdev_zone_report *conv;
+
+			if (nr_bytes < sizeof(struct bdev_zone_report))
+				goto out;
+
+			src = kmap_atomic(bio->bi_io_vec->bv_page);
+			conv = src + bio->bi_io_vec->bv_offset;
+			conv->descriptor_count = cpu_to_be32(1);
+			conv->same_field = ZS_ALL_SAME;
+			conv->maximum_lba = cpu_to_be64(disk->part0.nr_sects);
+			kunmap_atomic(src);
+			goto out;
+		}
+
+		ret = scsi_init_io(cmd);
+		if (ret != BLKPREP_OK)
+			goto out;
+
+		cmd = rq->special;
+		if (sdp->changed) {
+			pr_err("SCSI disk has been changed or is not present.");
+			ret = BLKPREP_KILL;
+			goto out;
+		}
+
+		cmd->cmd_len = 16;
+		memset(cmd->cmnd, 0, cmd->cmd_len);
+		cmd->cmnd[0] = ZBC_IN;
+		cmd->cmnd[1] = ZI_REPORT_ZONES;
+		put_unaligned_be64(sector, &cmd->cmnd[2]);
+		put_unaligned_be32(nr_bytes, &cmd->cmnd[10]);
+		/* FUTURE ... when streamid is available */
+		/* cmd->cmnd[14] = bio_get_streamid(bio); */
+
+		cmd->sc_data_direction = DMA_FROM_DEVICE;
+		cmd->sdb.length = nr_bytes;
+		cmd->transfersize = sdp->sector_size;
+		cmd->underflow = 0;
+		cmd->allowed = SD_MAX_RETRIES;
+		ret = BLKPREP_OK;
+		goto out;
+	}
+
+	if (sdkp->zoned != 1 && sdkp->device->type != TYPE_ZBC)
+		goto out;
+
+	if (sector == ~0ul) {
+		allbit = 1;
+		sector = 0;
+	}
+
+	cmd->cmd_len = 16;
+	memset(cmd->cmnd, 0, cmd->cmd_len);
+	memset(&cmd->sdb, 0, sizeof(cmd->sdb));
+	cmd->cmnd[0] = ZBC_OUT;
+	cmd->cmnd[1] = ZO_OPEN_ZONE;
+	if (rq->cmd_flags & REQ_CLOSE_ZONE)
+		cmd->cmnd[1] = ZO_CLOSE_ZONE;
+	if (rq->cmd_flags & REQ_RESET_ZONE)
+		cmd->cmnd[1] = ZO_RESET_WRITE_POINTER;
+	cmd->cmnd[14] = allbit;
+	put_unaligned_be64(sector, &cmd->cmnd[2]);
+
+	cmd->transfersize = 0;
+	cmd->underflow = 0;
+	cmd->allowed = SD_MAX_RETRIES;
+	cmd->sc_data_direction = DMA_NONE;
+
+	ret = BLKPREP_OK;
+out:
+	return ret;
+}
+
 static int sd_init_command(struct scsi_cmnd *cmd)
 {
 	struct request *rq = cmd->request;
@@ -1147,6 +1242,8 @@ static int sd_init_command(struct scsi_cmnd *cmd)
 		return sd_setup_flush_cmnd(cmd);
 	case REQ_OP_READ:
 	case REQ_OP_WRITE:
+		if (rq->cmd_flags & REQ_ZONED_CMDS)
+			return sd_setup_zoned_cmnd(cmd);
 		return sd_setup_read_write_cmnd(cmd);
 	default:
 		BUG();
@@ -2738,6 +2835,8 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
 		queue_flag_clear_unlocked(QUEUE_FLAG_ADD_RANDOM, sdkp->disk->queue);
 	}
 
+	sdkp->zoned = (buffer[8] >> 4) & 3;
+
  out:
 	kfree(buffer);
 }
diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
index 654630b..e012175 100644
--- a/drivers/scsi/sd.h
+++ b/drivers/scsi/sd.h
@@ -94,6 +94,7 @@ struct scsi_disk {
 	unsigned	lbpvpd : 1;
 	unsigned	ws10 : 1;
 	unsigned	ws16 : 1;
+	unsigned	zoned: 2;
 };
 #define to_scsi_disk(obj) container_of(obj,struct scsi_disk,dev)
 
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 0bbb2e3..d428218 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -104,7 +104,9 @@ static inline bool bio_has_data(struct bio *bio)
 
 static inline bool bio_no_advance_iter(struct bio *bio)
 {
-	return bio_op(bio) == REQ_OP_DISCARD || bio_op(bio) == REQ_OP_WRITE_SAME;
+	return bio_op(bio) == REQ_OP_DISCARD ||
+	       bio_op(bio) == REQ_OP_WRITE_SAME ||
+	       bio->bi_rw & REQ_ZONED_CMDS;
 }
 
 static inline bool bio_is_rw(struct bio *bio)
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 562ab83..f4c84ec 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -170,6 +170,10 @@ enum rq_flag_bits {
 	__REQ_FUA,		/* forced unit access */
 	__REQ_PREFLUSH,		/* request for cache flush */
 
+	__REQ_REPORT_ZONES,	/* Zoned device: Report Zones */
+	__REQ_OPEN_ZONE,	/* Zoned device: Open Zone */
+	__REQ_CLOSE_ZONE,	/* Zoned device: Close Zone */
+
 	/* bio only flags */
 	__REQ_RAHEAD,		/* read ahead, can fail anytime */
 	__REQ_THROTTLED,	/* This bio has already been subjected to
@@ -207,17 +211,25 @@ enum rq_flag_bits {
 #define REQ_PRIO		(1ULL << __REQ_PRIO)
 #define REQ_NOIDLE		(1ULL << __REQ_NOIDLE)
 #define REQ_INTEGRITY		(1ULL << __REQ_INTEGRITY)
+#define REQ_REPORT_ZONES	(1ULL << __REQ_REPORT_ZONES)
+#define REQ_OPEN_ZONE		(1ULL << __REQ_OPEN_ZONE)
+#define REQ_CLOSE_ZONE		(1ULL << __REQ_CLOSE_ZONE)
+#define REQ_RESET_ZONE		(REQ_REPORT_ZONES)
+#define REQ_ZONED_CMDS \
+	(REQ_OPEN_ZONE | REQ_CLOSE_ZONE | REQ_RESET_ZONE | REQ_REPORT_ZONES)
 
 #define REQ_FAILFAST_MASK \
 	(REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT | REQ_FAILFAST_DRIVER)
 #define REQ_COMMON_MASK \
 	(REQ_FAILFAST_MASK | REQ_SYNC | REQ_META | REQ_PRIO | REQ_NOIDLE | \
-	 REQ_PREFLUSH | REQ_FUA | REQ_SECURE | REQ_INTEGRITY | REQ_NOMERGE)
+	 REQ_PREFLUSH | REQ_FUA | REQ_SECURE | REQ_INTEGRITY | REQ_NOMERGE | \
+	 REQ_ZONED_CMDS)
 #define REQ_CLONE_MASK		REQ_COMMON_MASK
 
 /* This mask is used for both bio and request merge checking */
 #define REQ_NOMERGE_FLAGS \
-	(REQ_NOMERGE | REQ_STARTED | REQ_SOFTBARRIER | REQ_PREFLUSH | REQ_FUA | REQ_FLUSH_SEQ)
+	(REQ_NOMERGE | REQ_STARTED | REQ_SOFTBARRIER | REQ_PREFLUSH | \
+	 REQ_FUA | REQ_FLUSH_SEQ | REQ_ZONED_CMDS)
 
 #define REQ_RAHEAD		(1ULL << __REQ_RAHEAD)
 #define REQ_THROTTLED		(1ULL << __REQ_THROTTLED)
diff --git a/include/linux/blkzoned_api.h b/include/linux/blkzoned_api.h
new file mode 100644
index 0000000..9fc2373
--- /dev/null
+++ b/include/linux/blkzoned_api.h
@@ -0,0 +1,25 @@
+/*
+ * Functions for zone based SMR devices.
+ *
+ * Copyright (C) 2015 Seagate Technology PLC
+ *
+ * Written by:
+ * Shaun Tancheff <shaun.tancheff@seagate.com>
+ *
+ * This file is licensed under  the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+
+#ifndef _BLKZONED_API_H
+#define _BLKZONED_API_H
+
+#include <uapi/linux/blkzoned_api.h>
+
+extern int blkdev_issue_zone_action(struct block_device *, unsigned int,
+				    sector_t, gfp_t);
+extern int blkdev_issue_zone_report(struct block_device *, unsigned int,
+				    sector_t, u8 opt, struct page *, size_t,
+				    gfp_t);
+
+#endif /* _BLKZONED_API_H */
diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
index 8bdae34..5152fa4 100644
--- a/include/uapi/linux/Kbuild
+++ b/include/uapi/linux/Kbuild
@@ -70,6 +70,7 @@ header-y += bfs_fs.h
 header-y += binfmts.h
 header-y += blkpg.h
 header-y += blktrace_api.h
+header-y += blkzoned_api.h
 header-y += bpf_common.h
 header-y += bpf.h
 header-y += bpqether.h
diff --git a/include/uapi/linux/blkzoned_api.h b/include/uapi/linux/blkzoned_api.h
new file mode 100644
index 0000000..48c17ad
--- /dev/null
+++ b/include/uapi/linux/blkzoned_api.h
@@ -0,0 +1,214 @@
+/*
+ * Functions for zone based SMR devices.
+ *
+ * Copyright (C) 2015 Seagate Technology PLC
+ *
+ * Written by:
+ * Shaun Tancheff <shaun.tancheff@seagate.com>
+ *
+ * This file is licensed under  the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+
+#ifndef _UAPI_BLKZONED_API_H
+#define _UAPI_BLKZONED_API_H
+
+#include <linux/types.h>
+
+/**
+ * enum zone_report_option - Report Zones types to be included.
+ *
+ * @ZOPT_NON_SEQ_AND_RESET: Default (all zones).
+ * @ZOPT_ZC1_EMPTY: Zones which are empty.
+ * @ZOPT_ZC2_OPEN_IMPLICIT: Zones open but not explicitly opened
+ * @ZOPT_ZC3_OPEN_EXPLICIT: Zones opened explicitly
+ * @ZOPT_ZC4_CLOSED: Zones closed for writing.
+ * @ZOPT_ZC5_FULL: Zones that are full.
+ * @ZOPT_ZC6_READ_ONLY: Zones that are read-only
+ * @ZOPT_ZC7_OFFLINE: Zones that are offline
+ * @ZOPT_RESET: Zones with the reset (write pointer) recommended bit set
+ * @ZOPT_NON_SEQ: Zones that have HA media-cache writes pending
+ * @ZOPT_NON_WP_ZONES: Zones that do not have Write Pointers (conventional)
+ * @ZOPT_PARTIAL_FLAG: Modifies the definition of the Zone List Length field.
+ *
+ * Used by Report Zones in bdev_zone_get_report: report_option
+ */
+enum zone_report_option {
+	ZOPT_NON_SEQ_AND_RESET   = 0x00,
+	ZOPT_ZC1_EMPTY,
+	ZOPT_ZC2_OPEN_IMPLICIT,
+	ZOPT_ZC3_OPEN_EXPLICIT,
+	ZOPT_ZC4_CLOSED,
+	ZOPT_ZC5_FULL,
+	ZOPT_ZC6_READ_ONLY,
+	ZOPT_ZC7_OFFLINE,
+	ZOPT_RESET               = 0x10,
+	ZOPT_NON_SEQ             = 0x11,
+	ZOPT_NON_WP_ZONES        = 0x3f,
+	ZOPT_PARTIAL_FLAG        = 0x80,
+};
+
+/**
+ * enum bdev_zone_type - Type of zone in descriptor
+ *
+ * @ZTYP_RESERVED: Reserved
+ * @ZTYP_CONVENTIONAL: Conventional random write zone (No Write Pointer)
+ * @ZTYP_SEQ_WRITE_REQUIRED: Non-sequential writes are rejected.
+ * @ZTYP_SEQ_WRITE_PREFERRED: Non-sequential writes allowed but discouraged.
+ *
+ * Returned from Report Zones. See bdev_zone_descriptor* type.
+ */
+enum bdev_zone_type {
+	ZTYP_RESERVED            = 0,
+	ZTYP_CONVENTIONAL        = 1,
+	ZTYP_SEQ_WRITE_REQUIRED  = 2,
+	ZTYP_SEQ_WRITE_PREFERRED = 3,
+};
+
+
+/**
+ * enum bdev_zone_condition - Condition of zone in descriptor
+ *
+ * @ZCOND_CONVENTIONAL: N/A
+ * @ZCOND_ZC1_EMPTY: Empty
+ * @ZCOND_ZC2_OPEN_IMPLICIT: Opened via write to zone.
+ * @ZCOND_ZC3_OPEN_EXPLICIT: Opened via open zone command.
+ * @ZCOND_ZC4_CLOSED: Closed
+ * @ZCOND_ZC6_READ_ONLY: Read only
+ * @ZCOND_ZC5_FULL: No remaining space in zone.
+ * @ZCOND_ZC7_OFFLINE: Offline
+ *
+ * Returned from Report Zones. See bdev_zone_descriptor* flags.
+ */
+enum bdev_zone_condition {
+	ZCOND_CONVENTIONAL       = 0,
+	ZCOND_ZC1_EMPTY          = 1,
+	ZCOND_ZC2_OPEN_IMPLICIT  = 2,
+	ZCOND_ZC3_OPEN_EXPLICIT  = 3,
+	ZCOND_ZC4_CLOSED         = 4,
+	/* 0x5 to 0xC are reserved */
+	ZCOND_ZC6_READ_ONLY      = 0xd,
+	ZCOND_ZC5_FULL           = 0xe,
+	ZCOND_ZC7_OFFLINE        = 0xf,
+};
+
+
+/**
+ * enum bdev_zone_same - Report Zones same code.
+ *
+ * @ZS_ALL_DIFFERENT: All zones differ in type and size.
+ * @ZS_ALL_SAME: All zones are the same size and type.
+ * @ZS_LAST_DIFFERS: All zones are the same size and type except the last zone.
+ * @ZS_SAME_LEN_DIFF_TYPES: All zones are the same length but types differ.
+ *
+ * Returned from Report Zones. See bdev_zone_report* same_field.
+ */
+enum bdev_zone_same {
+	ZS_ALL_DIFFERENT        = 0,
+	ZS_ALL_SAME             = 1,
+	ZS_LAST_DIFFERS         = 2,
+	ZS_SAME_LEN_DIFF_TYPES  = 3,
+};
+
+
+/**
+ * struct bdev_zone_get_report - ioctl: Report Zones request
+ *
+ * @zone_locator_lba: starting lba for first [reported] zone
+ * @return_page_count: number of *bytes* allocated for result
+ * @report_option: see: zone_report_option enum
+ *
+ * Used to issue report zones command to connected device
+ */
+struct bdev_zone_get_report {
+	__u64 zone_locator_lba;
+	__u32 return_page_count;
+	__u8  report_option;
+} __packed;
+
+/**
+ * struct bdev_zone_descriptor_le - See: bdev_zone_descriptor
+ */
+struct bdev_zone_descriptor_le {
+	__u8 type;
+	__u8 flags;
+	__u8 reserved1[6];
+	__le64 length;
+	__le64 lba_start;
+	__le64 lba_wptr;
+	__u8 reserved[32];
+} __packed;
+
+
+/**
+ * struct bdev_zone_report_le - See: bdev_zone_report
+ */
+struct bdev_zone_report_le {
+	__le32 descriptor_count;
+	__u8 same_field;
+	__u8 reserved1[3];
+	__le64 maximum_lba;
+	__u8 reserved2[48];
+	struct bdev_zone_descriptor_le descriptors[0];
+} __packed;
+
+
+/**
+ * struct bdev_zone_descriptor - A Zone descriptor entry from report zones
+ *
+ * @type: see bdev_zone_type enum
+ * @flags: Bits 0:reset, 1:non-seq, 2-3: resv, 4-7: see bdev_zone_condition enum
+ * @reserved1: padding
+ * @length: length of zone in sectors
+ * @lba_start: lba where the zone starts.
+ * @lba_wptr: lba of the current write pointer.
+ * @reserved: padding
+ *
+ */
+struct bdev_zone_descriptor {
+	__u8 type;
+	__u8 flags;
+	__u8  reserved1[6];
+	__be64 length;
+	__be64 lba_start;
+	__be64 lba_wptr;
+	__u8 reserved[32];
+} __packed;
+
+
+/**
+ * struct bdev_zone_report - Report Zones result
+ *
+ * @descriptor_count: Number of descriptor entries that follow
+ * @same_field: bits 0-3: enum bdev_zone_same (MASK: 0x0F)
+ * @reserved1: padding
+ * @maximum_lba: LBA of the last logical sector on the device, inclusive
+ *               of all logical sectors in all zones.
+ * @reserved2: padding
+ * @descriptors: array of descriptors follows.
+ */
+struct bdev_zone_report {
+	__be32 descriptor_count;
+	__u8 same_field;
+	__u8 reserved1[3];
+	__be64 maximum_lba;
+	__u8 reserved2[48];
+	struct bdev_zone_descriptor descriptors[0];
+} __packed;
+
+
+/**
+ * struct bdev_zone_report_io - Report Zones ioctl argument.
+ *
+ * @in: Report Zones inputs
+ * @out: Report Zones output
+ */
+struct bdev_zone_report_io {
+	union {
+		struct bdev_zone_get_report in;
+		struct bdev_zone_report out;
+	} data;
+} __packed;
+
+#endif /* _UAPI_BLKZONED_API_H */
-- 
2.8.1


* [PATCH v3 2/3] Add ioctl to issue ZBC/ZAC commands via block layer
  2016-06-10  7:10 [PATCH v3 0/3] Block layer support ZAC/ZBC commands Shaun Tancheff
  2016-06-10  7:10 ` [PATCH v3 1/3] Add bio/request flags for using ZBC/ZAC commands Shaun Tancheff
@ 2016-06-10  7:10 ` Shaun Tancheff
  2016-06-10  7:10 ` [PATCH v3 3/3] Add ata pass-through path for ZAC commands Shaun Tancheff
  2016-06-10  7:13 ` [PATCH v3 1/3] Add bio/request flags for using ZBC/ZAC commands Shaun Tancheff
  3 siblings, 0 replies; 8+ messages in thread
From: Shaun Tancheff @ 2016-06-10  7:10 UTC (permalink / raw)
  To: linux-ide, linux-block, linux-scsi
  Cc: Shaun Tancheff, Jens Axboe, James E . J . Bottomley,
	Martin K . Petersen, Jeff Layton, J . Bruce Fields,
	Shaun Tancheff

Add new ioctl types:
    BLKREPORT    - Issue Report Zones to device.
    BLKOPENZONE  - Issue a Zone Action: Open Zone command.
    BLKCLOSEZONE - Issue a Zone Action: Close Zone command.
    BLKRESETZONE - Issue a Zone Action: Reset Zone command.
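
A minimal userspace sketch of calling the report ioctl follows, assuming a
kernel carrying this series. The request layout and the ioctl number are
copied from the patches; zone_report() and struct zone_report_io_standin
are hypothetical names used only here, and the stand-in merely reproduces
the 64-byte size of struct bdev_zone_report_io. Because the handler in
this patch copies back at least PAGE_SIZE bytes, the sketch rejects
buffers smaller than one page.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Request layout copied from the uapi header in patch 1/3. */
struct bdev_zone_get_report {
	uint64_t zone_locator_lba;	/* first LBA to report on */
	uint32_t return_page_count;	/* result buffer size in *bytes* */
	uint8_t  report_option;		/* see enum zone_report_option */
} __attribute__((packed));

/* Stand-in with the same size as struct bdev_zone_report_io (64 bytes). */
struct zone_report_io_standin {
	uint8_t raw[64];
};

#ifndef BLKREPORT
#define BLKREPORT	_IOWR(0x12, 130, struct zone_report_io_standin)
#endif

/*
 * Hypothetical wrapper: place the request at the start of buf, then let
 * the kernel overwrite buf with the report. Returns the ioctl() result,
 * or -1 with errno set to EINVAL for an unusable buffer.
 */
int zone_report(int fd, uint64_t start_lba, uint8_t option,
		void *buf, uint32_t buf_len)
{
	struct bdev_zone_get_report req;
	long page = sysconf(_SC_PAGESIZE);

	if (buf == NULL || page <= 0 || buf_len < (uint32_t)page) {
		errno = EINVAL;
		return -1;
	}

	memset(buf, 0, buf_len);
	memset(&req, 0, sizeof(req));
	req.zone_locator_lba = start_lba;
	req.return_page_count = buf_len;
	req.report_option = option;
	memcpy(buf, &req, sizeof(req));

	return ioctl(fd, BLKREPORT, buf);
}
```

On success the buffer holds a struct bdev_zone_report header followed by
the zone descriptors, in big-endian byte order unless the ATA pass-through
path from patch 3/3 was requested.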

Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
---
 block/ioctl.c                     | 110 ++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/blkzoned_api.h |   6 +++
 include/uapi/linux/fs.h           |   1 +
 3 files changed, 117 insertions(+)

diff --git a/block/ioctl.c b/block/ioctl.c
index ed2397f..1e89721 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -7,6 +7,7 @@
 #include <linux/backing-dev.h>
 #include <linux/fs.h>
 #include <linux/blktrace_api.h>
+#include <linux/blkzoned_api.h>
 #include <linux/pr.h>
 #include <asm/uaccess.h>
 
@@ -194,6 +195,109 @@ int blkdev_reread_part(struct block_device *bdev)
 }
 EXPORT_SYMBOL(blkdev_reread_part);
 
+static int blk_zoned_report_ioctl(struct block_device *bdev, fmode_t mode,
+		void __user *parg)
+{
+	int error = -EFAULT;
+	gfp_t gfp = GFP_KERNEL;
+	struct bdev_zone_report_io *zone_iodata = NULL;
+	int order = 0;
+	struct page *pgs = NULL;
+	u32 alloc_size = PAGE_SIZE;
+	unsigned long op_flags = 0;
+	u8 opt = 0;
+
+	if (!(mode & FMODE_READ))
+		return -EBADF;
+
+	zone_iodata = (void *)get_zeroed_page(gfp);
+	if (!zone_iodata) {
+		error = -ENOMEM;
+		goto report_zones_out;
+	}
+	if (copy_from_user(zone_iodata, parg, sizeof(*zone_iodata))) {
+		error = -EFAULT;
+		goto report_zones_out;
+	}
+	if (zone_iodata->data.in.return_page_count > alloc_size) {
+		int npages;
+
+		alloc_size = zone_iodata->data.in.return_page_count;
+		npages = (alloc_size + PAGE_SIZE - 1) / PAGE_SIZE;
+		order =  ilog2(roundup_pow_of_two(npages));
+		pgs = alloc_pages(gfp, order);
+		if (pgs) {
+			void *mem = page_address(pgs);
+
+			if (!mem) {
+				error = -ENOMEM;
+				goto report_zones_out;
+			}
+			memset(mem, 0, alloc_size);
+			memcpy(mem, zone_iodata, sizeof(*zone_iodata));
+			free_page((unsigned long)zone_iodata);
+			zone_iodata = mem;
+		} else {
+			/* Result requires DMA capable memory */
+			pr_err("Not enough memory available for request.\n");
+			error = -ENOMEM;
+			goto report_zones_out;
+		}
+	}
+	opt = zone_iodata->data.in.report_option;
+	error = blkdev_issue_zone_report(bdev, op_flags,
+			zone_iodata->data.in.zone_locator_lba, opt,
+			pgs ? pgs : virt_to_page(zone_iodata),
+			alloc_size, GFP_KERNEL);
+
+	if (error)
+		goto report_zones_out;
+
+	if (copy_to_user(parg, zone_iodata, alloc_size))
+		error = -EFAULT;
+
+report_zones_out:
+	if (pgs)
+		__free_pages(pgs, order);
+	else if (zone_iodata)
+		free_page((unsigned long)zone_iodata);
+	return error;
+}
+
+static int blk_zoned_action_ioctl(struct block_device *bdev, fmode_t mode,
+				  unsigned int cmd, unsigned long arg)
+{
+	unsigned long op_flags = 0;
+
+	if (!(mode & FMODE_WRITE))
+		return -EBADF;
+
+	/*
+	 * When acting on zones we explicitly disallow using a partition.
+	 */
+	if (bdev != bdev->bd_contains) {
+		pr_err("%s: All zone operations disallowed on this device\n",
+			__func__);
+		return -EFAULT;
+	}
+
+	switch (cmd) {
+	case BLKOPENZONE:
+		op_flags |= REQ_OPEN_ZONE;
+		break;
+	case BLKCLOSEZONE:
+		op_flags |= REQ_CLOSE_ZONE;
+		break;
+	case BLKRESETZONE:
+		op_flags |= REQ_RESET_ZONE;
+		break;
+	default:
+		pr_err("%s: Unknown action: %u\n", __func__, cmd);
+		WARN_ON(1);
+	}
+	return blkdev_issue_zone_action(bdev, op_flags, arg, GFP_KERNEL);
+}
+
 static int blk_ioctl_discard(struct block_device *bdev, fmode_t mode,
 		unsigned long arg, unsigned long flags)
 {
@@ -568,6 +672,12 @@ int blkdev_ioctl(struct block_device *bdev, fmode_t mode, unsigned cmd,
 	case BLKTRACESETUP:
 	case BLKTRACETEARDOWN:
 		return blk_trace_ioctl(bdev, cmd, argp);
+	case BLKREPORT:
+		return blk_zoned_report_ioctl(bdev, mode, argp);
+	case BLKOPENZONE:
+	case BLKCLOSEZONE:
+	case BLKRESETZONE:
+		return blk_zoned_action_ioctl(bdev, mode, cmd, arg);
 	case IOC_PR_REGISTER:
 		return blkdev_pr_register(bdev, argp);
 	case IOC_PR_RESERVE:
diff --git a/include/uapi/linux/blkzoned_api.h b/include/uapi/linux/blkzoned_api.h
index 48c17ad..3566de0 100644
--- a/include/uapi/linux/blkzoned_api.h
+++ b/include/uapi/linux/blkzoned_api.h
@@ -211,4 +211,10 @@ struct bdev_zone_report_io {
 	} data;
 } __packed;
 
+/* continuing from uapi/linux/fs.h: */
+#define BLKREPORT	_IOWR(0x12, 130, struct bdev_zone_report_io)
+#define BLKOPENZONE	_IO(0x12, 131)
+#define BLKCLOSEZONE	_IO(0x12, 132)
+#define BLKRESETZONE	_IO(0x12, 133)
+
 #endif /* _UAPI_BLKZONED_API_H */
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index 3b00f7c..c0b565b 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -222,6 +222,7 @@ struct fsxattr {
 #define BLKSECDISCARD _IO(0x12,125)
 #define BLKROTATIONAL _IO(0x12,126)
 #define BLKZEROOUT _IO(0x12,127)
+/* A jump here: See blkzoned_api.h, Reserving 130 to 133. */
 
 #define BMAP_IOCTL 1		/* obsolete - kept for compatibility */
 #define FIBMAP	   _IO(0x00,1)	/* bmap access */
-- 
2.8.1


* [PATCH v3 3/3] Add ata pass-through path for ZAC commands.
  2016-06-10  7:10 [PATCH v3 0/3] Block layer support ZAC/ZBC commands Shaun Tancheff
  2016-06-10  7:10 ` [PATCH v3 1/3] Add bio/request flags for using ZBC/ZAC commands Shaun Tancheff
  2016-06-10  7:10 ` [PATCH v3 2/3] Add ioctl to issue ZBC/ZAC commands via block layer Shaun Tancheff
@ 2016-06-10  7:10 ` Shaun Tancheff
  2016-06-10  7:19   ` Hannes Reinecke
  2016-06-10  7:13 ` [PATCH v3 1/3] Add bio/request flags for using ZBC/ZAC commands Shaun Tancheff
  3 siblings, 1 reply; 8+ messages in thread
From: Shaun Tancheff @ 2016-06-10  7:10 UTC (permalink / raw)
  To: linux-ide, linux-block, linux-scsi
  Cc: Shaun Tancheff, Jens Axboe, James E . J . Bottomley,
	Martin K . Petersen, Jeff Layton, J . Bruce Fields,
	Shaun Tancheff

The current generation of SAS HBA adapters supports connecting SATA
drives and performs SCSI<->ATA translations in hardware.
Unfortunately the ZBC commands are not being translated (yet).

Currently users of SAS controllers can only send ZAC commands via
ata pass-through.

This method overloads the meaning of REQ_META to direct ZBC commands
to construct ZAC equivalent ATA pass through commands.
Note also that this approach expects the initiator to deal with the
little endian result due to bypassing the normal translation layers.
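
To illustrate what "dealing with the little endian result" might look like
for an initiator, here is a small userspace sketch that decodes one 64-byte
zone descriptor in either byte order. The field offsets follow struct
bdev_zone_descriptor and struct bdev_zone_descriptor_le from patch 1/3;
struct zone_info, decode_zone_descriptor() and the via_ata_pass parameter
are hypothetical names introduced for this example only.

```c
#include <assert.h>
#include <stdint.h>

static inline uint64_t load_be64(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

static inline uint64_t load_le64(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/* Hypothetical host-order view of one zone descriptor. */
struct zone_info {
	uint8_t  type;		/* enum bdev_zone_type */
	uint8_t  cond;		/* enum bdev_zone_condition, flags bits 4-7 */
	uint64_t length;
	uint64_t start;
	uint64_t wptr;
};

/*
 * desc points at a 64-byte descriptor. via_ata_pass is non-zero when the
 * report came back through the ATA pass-through path (little-endian),
 * zero for the normal ZBC path (big-endian).
 */
void decode_zone_descriptor(const uint8_t *desc, int via_ata_pass,
			    struct zone_info *out)
{
	uint64_t (*load64)(const uint8_t *) =
		via_ata_pass ? load_le64 : load_be64;

	out->type   = desc[0];
	out->cond   = desc[1] >> 4;
	out->length = load64(desc + 8);
	out->start  = load64(desc + 16);
	out->wptr   = load64(desc + 24);
}
```

Only the multi-byte fields differ between the two paths; the single-byte
type and flags fields are read the same way in both cases.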

Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
---
Although this patch isn't the right way to work around hardware that is
missing features (it mixes ATA commands into SCSI interface code), it
may be useful in the near term for end users who have SAS HBA
controllers that don't support ZBC <-> ZAC translations.

V3:
 - Use zoned report reserved bit for ata-passthrough flag.
v2:
 - Added REQ_META to op_flags if high bit is set in opt.
---
 block/ioctl.c                     | 34 +++++++++++++++++++-
 drivers/scsi/sd.c                 | 68 ++++++++++++++++++++++++++++++---------
 include/linux/ata.h               | 15 +++++++++
 include/uapi/linux/blkzoned_api.h |  4 +++
 4 files changed, 105 insertions(+), 16 deletions(-)

diff --git a/block/ioctl.c b/block/ioctl.c
index 1e89721..b9dea29 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -244,7 +244,10 @@ static int blk_zoned_report_ioctl(struct block_device *bdev, fmode_t mode,
 			goto report_zones_out;
 		}
 	}
-	opt = zone_iodata->data.in.report_option;
+	opt = zone_iodata->data.in.report_option & ~(ZOPT_USE_ATA_PASS);
+	if (zone_iodata->data.in.report_option & ZOPT_USE_ATA_PASS)
+		op_flags |= REQ_META;
+
 	error = blkdev_issue_zone_report(bdev, op_flags,
 			zone_iodata->data.in.zone_locator_lba, opt,
 			pgs ? pgs : virt_to_page(zone_iodata),
@@ -281,6 +284,35 @@ static int blk_zoned_action_ioctl(struct block_device *bdev, fmode_t mode,
 		return -EFAULT;
 	}
 
+	/*
+	 * When the low bit is set, force ATA passthrough to work around
+	 * older SAS HBA controllers that don't support ZBC to ZAC translation.
+	 *
+	 * When the low bit is clear, follow the normal path but also map
+	 * ~1ul back to ~0ul, the LBA that means 'all LBAs'.
+	 *
+	 * NB: We should do extra checking here to see if the user specified
+	 *     the entire block device as opposed to a partition of the
+	 *     device....
+	 */
+	if (arg & 1) {
+		op_flags |= REQ_META;
+		if (arg != ~0ul)
+			arg &= ~1ul; /* ~1 :: 0xFF...FE */
+	} else {
+		if (arg == ~1ul)
+			arg = ~0ul;
+	}
+
+	/*
+	 * When acting on zones we explicitly disallow using a partition.
+	 */
+	if (bdev != bdev->bd_contains) {
+		pr_err("%s: All zone operations disallowed on this device\n",
+			__func__);
+		return -EFAULT;
+	}
+
 	switch (cmd) {
 	case BLKOPENZONE:
 		op_flags |= REQ_OPEN_ZONE;
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 241faf5..1a6c5b3 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -53,6 +53,7 @@
 #include <linux/pm_runtime.h>
 #include <linux/pr.h>
 #include <linux/blkzoned_api.h>
+#include <linux/ata.h>
 #include <asm/uaccess.h>
 #include <asm/unaligned.h>
 
@@ -1183,12 +1184,28 @@ static int sd_setup_zoned_cmnd(struct scsi_cmnd *cmd)
 
 		cmd->cmd_len = 16;
 		memset(cmd->cmnd, 0, cmd->cmd_len);
-		cmd->cmnd[0] = ZBC_IN;
-		cmd->cmnd[1] = ZI_REPORT_ZONES;
-		put_unaligned_be64(sector, &cmd->cmnd[2]);
-		put_unaligned_be32(nr_bytes, &cmd->cmnd[10]);
-		/* FUTURE ... when streamid is available */
-		/* cmd->cmnd[14] = bio_get_streamid(bio); */
+		if (rq->cmd_flags & REQ_META) {
+			cmd->cmnd[0] = ATA_16;
+			cmd->cmnd[1] = (0x6 << 1) | 1;
+			cmd->cmnd[2] = 0x0e;
+			/* FUTURE ... when streamid is available */
+			/* cmd->cmnd[3] = bio_get_streamid(bio); */
+			cmd->cmnd[4] = ATA_SUBCMD_ZAC_MGMT_IN_REPORT_ZONES;
+			cmd->cmnd[5] = ((nr_bytes / 512) >> 8) & 0xff;
+			cmd->cmnd[6] = (nr_bytes / 512) & 0xff;
+
+			_lba_to_cmd_ata(&cmd->cmnd[7], sector);
+
+			cmd->cmnd[13] = 1 << 6;
+			cmd->cmnd[14] = ATA_CMD_ZAC_MGMT_IN;
+		} else {
+			cmd->cmnd[0] = ZBC_IN;
+			cmd->cmnd[1] = ZI_REPORT_ZONES;
+			put_unaligned_be64(sector, &cmd->cmnd[2]);
+			put_unaligned_be32(nr_bytes, &cmd->cmnd[10]);
+			/* FUTURE ... when streamid is available */
+			/* cmd->cmnd[14] = bio_get_streamid(bio); */
+		}
 
 		cmd->sc_data_direction = DMA_FROM_DEVICE;
 		cmd->sdb.length = nr_bytes;
@@ -1210,14 +1227,28 @@ static int sd_setup_zoned_cmnd(struct scsi_cmnd *cmd)
 	cmd->cmd_len = 16;
 	memset(cmd->cmnd, 0, cmd->cmd_len);
 	memset(&cmd->sdb, 0, sizeof(cmd->sdb));
-	cmd->cmnd[0] = ZBC_OUT;
-	cmd->cmnd[1] = ZO_OPEN_ZONE;
-	if (rq->cmd_flags & REQ_CLOSE_ZONE)
-		cmd->cmnd[1] = ZO_CLOSE_ZONE;
-	if (rq->cmd_flags & REQ_RESET_ZONE)
-		cmd->cmnd[1] = ZO_RESET_WRITE_POINTER;
-	cmd->cmnd[14] = allbit;
-	put_unaligned_be64(sector, &cmd->cmnd[2]);
+	if (rq->cmd_flags & REQ_META) {
+		cmd->cmnd[0] = ATA_16;
+		cmd->cmnd[1] = (3 << 1) | 1;
+		cmd->cmnd[3] = allbit;
+		cmd->cmnd[4] = ATA_SUBCMD_ZAC_MGMT_OUT_RESET_WRITE_POINTER;
+		if (rq->cmd_flags & REQ_OPEN_ZONE)
+			cmd->cmnd[4] = ATA_SUBCMD_ZAC_MGMT_OUT_OPEN_ZONE;
+		if (rq->cmd_flags & REQ_CLOSE_ZONE)
+			cmd->cmnd[4] = ATA_SUBCMD_ZAC_MGMT_OUT_CLOSE_ZONE;
+		_lba_to_cmd_ata(&cmd->cmnd[7], sector);
+		cmd->cmnd[13] = 1 << 6;
+		cmd->cmnd[14] = ATA_CMD_ZAC_MGMT_OUT;
+	} else {
+		cmd->cmnd[0] = ZBC_OUT;
+		cmd->cmnd[1] = ZO_OPEN_ZONE;
+		if (rq->cmd_flags & REQ_CLOSE_ZONE)
+			cmd->cmnd[1] = ZO_CLOSE_ZONE;
+		if (rq->cmd_flags & REQ_RESET_ZONE)
+			cmd->cmnd[1] = ZO_RESET_WRITE_POINTER;
+		cmd->cmnd[14] = allbit;
+		put_unaligned_be64(sector, &cmd->cmnd[2]);
+	}
 
 	cmd->transfersize = 0;
 	cmd->underflow = 0;
@@ -2819,7 +2850,7 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
 {
 	unsigned char *buffer;
 	u16 rot;
-	const int vpd_len = 64;
+	const int vpd_len = 512;
 
 	buffer = kmalloc(vpd_len, GFP_KERNEL);
 
@@ -2836,6 +2867,13 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
 	}
 
 	sdkp->zoned = (buffer[8] >> 4) & 3;
+	if (sdkp->zoned != 1) {
+		struct scsi_device *sdev = sdkp->device;
+
+		/* buf size is 512, page is 60 + 512, we need page 206 */
+		if (!scsi_get_vpd_page(sdev, 0x89, buffer, SD_BUF_SIZE))
+			sdkp->zoned = ata_id_zoned_cap((u16 *)&buffer[60]);
+	}
 
  out:
 	kfree(buffer);
diff --git a/include/linux/ata.h b/include/linux/ata.h
index 99346be..5cc1a85 100644
--- a/include/linux/ata.h
+++ b/include/linux/ata.h
@@ -1060,6 +1060,21 @@ static inline void ata_id_to_hd_driveid(u16 *id)
 #endif
 }
 
+/**
+ * _lba_to_cmd_ata() - Copy lba48 to ATA command
+ * @cmd: ATA command as an array of bytes
+ * @_lba: lba48 in the low 48 bits
+ */
+static inline void _lba_to_cmd_ata(u8 *cmd, u64 _lba)
+{
+	cmd[1] =  _lba	      & 0xff;
+	cmd[3] = (_lba >>  8) & 0xff;
+	cmd[5] = (_lba >> 16) & 0xff;
+	cmd[0] = (_lba >> 24) & 0xff;
+	cmd[2] = (_lba >> 32) & 0xff;
+	cmd[4] = (_lba >> 40) & 0xff;
+}
+
 /*
  * Write LBA Range Entries to the buffer that will cover the extent from
  * sector to sector + count.  This is used for TRIM and for ADD LBA(S)
diff --git a/include/uapi/linux/blkzoned_api.h b/include/uapi/linux/blkzoned_api.h
index 3566de0..bebfc4d 100644
--- a/include/uapi/linux/blkzoned_api.h
+++ b/include/uapi/linux/blkzoned_api.h
@@ -32,6 +32,8 @@
  * @ZOPT_NON_WP_ZONES: Zones that do not have Write Pointers (conventional)
  * @ZOPT_PARTIAL_FLAG: Modifies the definition of the Zone List Length field.
  *
+ * @ZOPT_USE_ATA_PASS: Flag used in kernel to service command I/O
+ *
  * Used by Report Zones in bdev_zone_get_report: report_option
  */
 enum zone_report_option {
@@ -47,6 +49,8 @@ enum zone_report_option {
 	ZOPT_NON_SEQ             = 0x11,
 	ZOPT_NON_WP_ZONES        = 0x3f,
 	ZOPT_PARTIAL_FLAG        = 0x80,
+
+	ZOPT_USE_ATA_PASS        = 0x40, /* reserved by spec */
 };
 
 /**
-- 
2.8.1
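
For readers following the ATA_16 path in the patch above: the LBA bytes of
an ATA PASS-THROUGH (16) CDB are interleaved (each "previous" byte sits next
to its "current" byte) rather than laid out in order. The sketch below
restates the byte placement of _lba_to_cmd_ata() as a standalone function
and adds an inverse for checking a CDB seen in a trace. The ex_ names are
invented for this example; the inverse helper does not exist in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Same placement as the patch's helper; 'cmd' points at CDB byte 7. */
static void ex_lba_to_cmd_ata(uint8_t *cmd, uint64_t lba)
{
	assert(cmd);
	cmd[0] = (lba >> 24) & 0xff;	/* CDB byte 7:  LBA bits 31:24 */
	cmd[1] =  lba        & 0xff;	/* CDB byte 8:  LBA bits  7:0  */
	cmd[2] = (lba >> 32) & 0xff;	/* CDB byte 9:  LBA bits 39:32 */
	cmd[3] = (lba >>  8) & 0xff;	/* CDB byte 10: LBA bits 15:8  */
	cmd[4] = (lba >> 40) & 0xff;	/* CDB byte 11: LBA bits 47:40 */
	cmd[5] = (lba >> 16) & 0xff;	/* CDB byte 12: LBA bits 23:16 */
}

/* Hypothetical inverse: rebuild the 48-bit LBA from CDB bytes 7..12. */
static uint64_t ex_cmd_ata_to_lba(const uint8_t *cmd)
{
	assert(cmd);
	return ((uint64_t)cmd[4] << 40) | ((uint64_t)cmd[2] << 32) |
	       ((uint64_t)cmd[0] << 24) | ((uint64_t)cmd[5] << 16) |
	       ((uint64_t)cmd[3] <<  8) |  (uint64_t)cmd[1];
}
```

Only the low 48 bits of the LBA survive the round trip; anything above bit
47 is silently dropped, as it is in the patch's helper.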

* [PATCH v3 1/3] Add bio/request flags for using ZBC/ZAC commands
  2016-06-10  7:10 [PATCH v3 0/3] Block layer support ZAC/ZBC commands Shaun Tancheff
                   ` (2 preceding siblings ...)
  2016-06-10  7:10 ` [PATCH v3 3/3] Add ata pass-through path for ZAC commands Shaun Tancheff
@ 2016-06-10  7:13 ` Shaun Tancheff
  2016-06-13  8:01   ` Christoph Hellwig
  3 siblings, 1 reply; 8+ messages in thread
From: Shaun Tancheff @ 2016-06-10  7:13 UTC (permalink / raw)
  To: linux-ide, linux-block, linux-scsi
  Cc: Shaun Tancheff, Jens Axboe, James E . J . Bottomley,
	Martin K . Petersen, Jeff Layton, J . Bruce Fields,
	Shaun Tancheff

T10 ZBC and T13 ZAC specify operations for Zoned devices.

To be able to access the zone information and to open and close zones,
flags for the report zones command (REQ_REPORT_ZONES) and for Open and
Close zone (REQ_OPEN_ZONE and REQ_CLOSE_ZONE) are added for use by
struct bio's bi_rw and by struct request's cmd_flags.

To reduce the number of additional flags needed REQ_RESET_ZONE shares
the same flag as REQ_REPORT_ZONES and is differentiated by direction.
Report zones is a device read that requires a buffer. Reset is a device
command (WRITE) that has no associated data transfer.
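
The shared-flag scheme described above can be sketched in isolation as
follows. This is only an illustration under assumptions: the EX_ names and
bit values are invented for the example (the real bits come from
enum rq_flag_bits), and the classification order is one plausible reading,
not a copy of sd_setup_zoned_cmnd().

```c
#include <assert.h>

/* Illustrative bit values only, not the kernel's. */
#define EX_REQ_REPORT_ZONES	(1u << 0)
#define EX_REQ_OPEN_ZONE	(1u << 1)
#define EX_REQ_CLOSE_ZONE	(1u << 2)
#define EX_REQ_RESET_ZONE	EX_REQ_REPORT_ZONES	/* same bit, by design */

enum ex_dir { EX_READ = 0, EX_WRITE = 1 };

enum ex_zone_cmd {
	EX_ZONE_NONE,
	EX_ZONE_REPORT,
	EX_ZONE_RESET,
	EX_ZONE_OPEN,
	EX_ZONE_CLOSE,
};

/* The shared bit means "report" on a read and "reset" on a write. */
static enum ex_zone_cmd ex_classify(unsigned int flags, enum ex_dir dir)
{
	assert(dir == EX_READ || dir == EX_WRITE);

	if (flags & EX_REQ_REPORT_ZONES)
		return dir == EX_READ ? EX_ZONE_REPORT : EX_ZONE_RESET;
	if (flags & EX_REQ_OPEN_ZONE)
		return EX_ZONE_OPEN;
	if (flags & EX_REQ_CLOSE_ZONE)
		return EX_ZONE_CLOSE;
	return EX_ZONE_NONE;
}
```

The cost of saving a bit this way is that the flag alone is ambiguous; every
consumer has to check the data direction as well.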

The Finish zone command is intentionally not implemented as there is no
current use case for that operation.

Report zones currently defaults to reporting on all zones. It is expected
that support for the zone option flag will piggy back on streamid
support. The report option is useful as it can reduce the number of
zones in each report, but not critical.

Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
---
V3:
 - Rebase on Mike Christie's separate bio operations
 - Update blkzoned_api.h to include report zones PARTIAL bit.

V2:
 - Changed bi_rw to op_flags to clarify separation of bio op from flags.
 - Fixed memory leak in blkdev_issue_zone_report failing to put_bio().
 - Documented opt in blkdev_issue_zone_report.
 - Removed include/uapi/linux/fs.h from this patch.
---
 MAINTAINERS                       |   9 ++
 block/blk-lib.c                   |  98 +++++++++++++++++
 drivers/scsi/sd.c                 |  99 ++++++++++++++++++
 drivers/scsi/sd.h                 |   1 +
 include/linux/bio.h               |   4 +-
 include/linux/blk_types.h         |  16 ++-
 include/linux/blkzoned_api.h      |  25 +++++
 include/uapi/linux/Kbuild         |   1 +
 include/uapi/linux/blkzoned_api.h | 214 ++++++++++++++++++++++++++++++++++++++
 9 files changed, 464 insertions(+), 3 deletions(-)
 create mode 100644 include/linux/blkzoned_api.h
 create mode 100644 include/uapi/linux/blkzoned_api.h

diff --git a/MAINTAINERS b/MAINTAINERS
index ed42cb6..d9fafa2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -12662,6 +12662,15 @@ F:	Documentation/networking/z8530drv.txt
 F:	drivers/net/hamradio/*scc.c
 F:	drivers/net/hamradio/z8530.h
 
+ZBC AND ZAC BLOCK DEVICES
+M:	Shaun Tancheff <shaun.tancheff@seagate.com>
+W:	http://seagate.com
+W:	https://github.com/Seagate/ZDM-Device-Mapper
+L:	linux-block@vger.kernel.org
+S:	Maintained
+F:	include/linux/blkzoned_api.h
+F:	include/uapi/linux/blkzoned_api.h
+
 ZBUD COMPRESSED PAGE ALLOCATOR
 M:	Seth Jennings <sjenning@redhat.com>
 L:	linux-mm@kvack.org
diff --git a/block/blk-lib.c b/block/blk-lib.c
index ff2a7f0..eda0071 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -6,6 +6,7 @@
 #include <linux/bio.h>
 #include <linux/blkdev.h>
 #include <linux/scatterlist.h>
+#include <linux/blkzoned_api.h>
 
 #include "blk.h"
 
@@ -252,3 +253,100 @@ int blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
 	return __blkdev_issue_zeroout(bdev, sector, nr_sects, gfp_mask);
 }
 EXPORT_SYMBOL(blkdev_issue_zeroout);
+
+/**
+ * blkdev_issue_zone_report - queue a report zones operation
+ * @bdev:	target blockdev
+ * @op_flags:	extra bio rw flags. If unsure, use 0.
+ * @sector:	starting sector (report will include this sector).
+ * @opt:	See: zone_report_option, default is 0 (all zones).
+ * @page:	one or more contiguous pages.
+ * @pgsz:	up to size of page in bytes, size of report.
+ * @gfp_mask:	memory allocation flags (for bio_alloc)
+ *
+ * Description:
+ *    Issue a zone report request for the sectors in question.
+ */
+int blkdev_issue_zone_report(struct block_device *bdev, unsigned int op_flags,
+			     sector_t sector, u8 opt, struct page *page,
+			     size_t pgsz, gfp_t gfp_mask)
+{
+	struct bdev_zone_report *conv = page_address(page);
+	struct bio *bio;
+	unsigned int nr_iovecs = 1;
+	int ret = 0;
+
+	if (pgsz < (sizeof(struct bdev_zone_report) +
+		    sizeof(struct bdev_zone_descriptor)))
+		return -EINVAL;
+
+	bio = bio_alloc(gfp_mask, nr_iovecs);
+	if (!bio)
+		return -ENOMEM;
+
+	conv->descriptor_count = 0;
+	bio->bi_iter.bi_sector = sector;
+	bio->bi_bdev = bdev;
+	bio->bi_vcnt = 0;
+	bio->bi_iter.bi_size = 0;
+
+	op_flags |= REQ_REPORT_ZONES;
+
+	/* FUTURE ... when streamid is available: */
+	/* bio_set_streamid(bio, opt); */
+
+	bio_add_page(bio, page, pgsz, 0);
+	bio_set_op_attrs(bio, REQ_OP_READ, op_flags);
+	ret = submit_bio_wait(bio);
+
+	/*
+	 * When our request is nak'd the underlying device may be conventional
+	 * so ... report a single conventional zone the size of the device.
+	 */
+	if (ret == -EIO && conv->descriptor_count) {
+		/* Adjust the conventional to the size of the partition ... */
+		__be64 blksz = cpu_to_be64(bdev->bd_part->nr_sects);
+
+		conv->maximum_lba = blksz;
+		conv->descriptors[0].type = ZTYP_CONVENTIONAL;
+		conv->descriptors[0].flags = ZCOND_CONVENTIONAL << 4;
+		conv->descriptors[0].length = blksz;
+		conv->descriptors[0].lba_start = 0;
+		conv->descriptors[0].lba_wptr = blksz;
+		ret = 0;
+	}
+	bio_put(bio);
+	return ret;
+}
+EXPORT_SYMBOL(blkdev_issue_zone_report);
+
+/**
+ * blkdev_issue_zone_action - queue a zone action operation
+ * @bdev:	target blockdev
+ * @op_flags:	REQ_OPEN_ZONE, REQ_CLOSE_ZONE, or REQ_RESET_ZONE.
+ * @sector:	starting lba of sector
+ * @gfp_mask:	memory allocation flags (for bio_alloc)
+ *
+ * Description:
+ *    Issue a zone action request for the sectors in question.
+ */
+int blkdev_issue_zone_action(struct block_device *bdev, unsigned int op_flags,
+			     sector_t sector, gfp_t gfp_mask)
+{
+	int ret;
+	struct bio *bio;
+
+	bio = bio_alloc(gfp_mask, 1);
+	if (!bio)
+		return -ENOMEM;
+
+	bio->bi_iter.bi_sector = sector;
+	bio->bi_bdev = bdev;
+	bio->bi_vcnt = 0;
+	bio->bi_iter.bi_size = 0;
+	bio_set_op_attrs(bio, REQ_OP_WRITE, op_flags);
+	ret = submit_bio_wait(bio);
+	bio_put(bio);
+	return ret;
+}
+EXPORT_SYMBOL(blkdev_issue_zone_action);
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 5a9db0f..241faf5 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -52,6 +52,7 @@
 #include <linux/slab.h>
 #include <linux/pm_runtime.h>
 #include <linux/pr.h>
+#include <linux/blkzoned_api.h>
 #include <asm/uaccess.h>
 #include <asm/unaligned.h>
 
@@ -1134,6 +1135,100 @@ static int sd_setup_read_write_cmnd(struct scsi_cmnd *SCpnt)
 	return ret;
 }
 
+static int sd_setup_zoned_cmnd(struct scsi_cmnd *cmd)
+{
+	struct request *rq = cmd->request;
+	struct scsi_device *sdp = cmd->device;
+	struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
+	struct bio *bio = rq->bio;
+	sector_t sector = blk_rq_pos(rq);
+	struct gendisk *disk = rq->rq_disk;
+	unsigned int nr_bytes = blk_rq_bytes(rq);
+	int ret = BLKPREP_KILL;
+	u8 allbit = 0;
+
+	if (rq->cmd_flags & REQ_REPORT_ZONES && rq_data_dir(rq) == READ) {
+		WARN_ON(nr_bytes == 0);
+
+		/*
+		 * For conventional drives generate a report that shows a
+		 * large single conventional zone the size of the block device
+		 */
+		if (sdkp->zoned != 1 && sdkp->device->type != TYPE_ZBC) {
+			void *src;
+			struct bdev_zone_report *conv;
+
+			if (nr_bytes < sizeof(struct bdev_zone_report))
+				goto out;
+
+			src = kmap_atomic(bio->bi_io_vec->bv_page);
+			conv = src + bio->bi_io_vec->bv_offset;
+			conv->descriptor_count = cpu_to_be32(1);
+			conv->same_field = ZS_ALL_SAME;
+			conv->maximum_lba = cpu_to_be64(disk->part0.nr_sects);
+			kunmap_atomic(src);
+			goto out;
+		}
+
+		ret = scsi_init_io(cmd);
+		if (ret != BLKPREP_OK)
+			goto out;
+
+		cmd = rq->special;
+		if (sdp->changed) {
+			pr_err("SCSI disk has been changed or is not present.");
+			ret = BLKPREP_KILL;
+			goto out;
+		}
+
+		cmd->cmd_len = 16;
+		memset(cmd->cmnd, 0, cmd->cmd_len);
+		cmd->cmnd[0] = ZBC_IN;
+		cmd->cmnd[1] = ZI_REPORT_ZONES;
+		put_unaligned_be64(sector, &cmd->cmnd[2]);
+		put_unaligned_be32(nr_bytes, &cmd->cmnd[10]);
+		/* FUTURE ... when streamid is available */
+		/* cmd->cmnd[14] = bio_get_streamid(bio); */
+
+		cmd->sc_data_direction = DMA_FROM_DEVICE;
+		cmd->sdb.length = nr_bytes;
+		cmd->transfersize = sdp->sector_size;
+		cmd->underflow = 0;
+		cmd->allowed = SD_MAX_RETRIES;
+		ret = BLKPREP_OK;
+		goto out;
+	}
+
+	if (sdkp->zoned != 1 && sdkp->device->type != TYPE_ZBC)
+		goto out;
+
+	if (sector == ~0ul) {
+		allbit = 1;
+		sector = 0;
+	}
+
+	cmd->cmd_len = 16;
+	memset(cmd->cmnd, 0, cmd->cmd_len);
+	memset(&cmd->sdb, 0, sizeof(cmd->sdb));
+	cmd->cmnd[0] = ZBC_OUT;
+	cmd->cmnd[1] = ZO_OPEN_ZONE;
+	if (rq->cmd_flags & REQ_CLOSE_ZONE)
+		cmd->cmnd[1] = ZO_CLOSE_ZONE;
+	if (rq->cmd_flags & REQ_RESET_ZONE)
+		cmd->cmnd[1] = ZO_RESET_WRITE_POINTER;
+	cmd->cmnd[14] = allbit;
+	put_unaligned_be64(sector, &cmd->cmnd[2]);
+
+	cmd->transfersize = 0;
+	cmd->underflow = 0;
+	cmd->allowed = SD_MAX_RETRIES;
+	cmd->sc_data_direction = DMA_NONE;
+
+	ret = BLKPREP_OK;
+out:
+	return ret;
+}
+
 static int sd_init_command(struct scsi_cmnd *cmd)
 {
 	struct request *rq = cmd->request;
@@ -1147,6 +1242,8 @@ static int sd_init_command(struct scsi_cmnd *cmd)
 		return sd_setup_flush_cmnd(cmd);
 	case REQ_OP_READ:
 	case REQ_OP_WRITE:
+		if (rq->cmd_flags & REQ_ZONED_CMDS)
+			return sd_setup_zoned_cmnd(cmd);
 		return sd_setup_read_write_cmnd(cmd);
 	default:
 		BUG();
@@ -2738,6 +2835,8 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
 		queue_flag_clear_unlocked(QUEUE_FLAG_ADD_RANDOM, sdkp->disk->queue);
 	}
 
+	sdkp->zoned = (buffer[8] >> 4) & 3;
+
  out:
 	kfree(buffer);
 }
diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
index 654630b..e012175 100644
--- a/drivers/scsi/sd.h
+++ b/drivers/scsi/sd.h
@@ -94,6 +94,7 @@ struct scsi_disk {
 	unsigned	lbpvpd : 1;
 	unsigned	ws10 : 1;
 	unsigned	ws16 : 1;
+	unsigned	zoned: 2;
 };
 #define to_scsi_disk(obj) container_of(obj,struct scsi_disk,dev)
 
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 0bbb2e3..d428218 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -104,7 +104,9 @@ static inline bool bio_has_data(struct bio *bio)
 
 static inline bool bio_no_advance_iter(struct bio *bio)
 {
-	return bio_op(bio) == REQ_OP_DISCARD || bio_op(bio) == REQ_OP_WRITE_SAME;
+	return bio_op(bio) == REQ_OP_DISCARD ||
+	       bio_op(bio) == REQ_OP_WRITE_SAME ||
+	       bio->bi_rw & REQ_ZONED_CMDS;
 }
 
 static inline bool bio_is_rw(struct bio *bio)
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 562ab83..f4c84ec 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -170,6 +170,10 @@ enum rq_flag_bits {
 	__REQ_FUA,		/* forced unit access */
 	__REQ_PREFLUSH,		/* request for cache flush */
 
+	__REQ_REPORT_ZONES,	/* Zoned device: Report Zones */
+	__REQ_OPEN_ZONE,	/* Zoned device: Open Zone */
+	__REQ_CLOSE_ZONE,	/* Zoned device: Close Zone */
+
 	/* bio only flags */
 	__REQ_RAHEAD,		/* read ahead, can fail anytime */
 	__REQ_THROTTLED,	/* This bio has already been subjected to
@@ -207,17 +211,25 @@ enum rq_flag_bits {
 #define REQ_PRIO		(1ULL << __REQ_PRIO)
 #define REQ_NOIDLE		(1ULL << __REQ_NOIDLE)
 #define REQ_INTEGRITY		(1ULL << __REQ_INTEGRITY)
+#define REQ_REPORT_ZONES	(1ULL << __REQ_REPORT_ZONES)
+#define REQ_OPEN_ZONE		(1ULL << __REQ_OPEN_ZONE)
+#define REQ_CLOSE_ZONE		(1ULL << __REQ_CLOSE_ZONE)
+#define REQ_RESET_ZONE		(REQ_REPORT_ZONES)
+#define REQ_ZONED_CMDS \
+	(REQ_OPEN_ZONE | REQ_CLOSE_ZONE | REQ_RESET_ZONE | REQ_REPORT_ZONES)
 
 #define REQ_FAILFAST_MASK \
 	(REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT | REQ_FAILFAST_DRIVER)
 #define REQ_COMMON_MASK \
 	(REQ_FAILFAST_MASK | REQ_SYNC | REQ_META | REQ_PRIO | REQ_NOIDLE | \
-	 REQ_PREFLUSH | REQ_FUA | REQ_SECURE | REQ_INTEGRITY | REQ_NOMERGE)
+	 REQ_PREFLUSH | REQ_FUA | REQ_SECURE | REQ_INTEGRITY | REQ_NOMERGE | \
+	 REQ_ZONED_CMDS)
 #define REQ_CLONE_MASK		REQ_COMMON_MASK
 
 /* This mask is used for both bio and request merge checking */
 #define REQ_NOMERGE_FLAGS \
-	(REQ_NOMERGE | REQ_STARTED | REQ_SOFTBARRIER | REQ_PREFLUSH | REQ_FUA | REQ_FLUSH_SEQ)
+	(REQ_NOMERGE | REQ_STARTED | REQ_SOFTBARRIER | REQ_PREFLUSH | \
+	 REQ_FUA | REQ_FLUSH_SEQ | REQ_ZONED_CMDS)
 
 #define REQ_RAHEAD		(1ULL << __REQ_RAHEAD)
 #define REQ_THROTTLED		(1ULL << __REQ_THROTTLED)
diff --git a/include/linux/blkzoned_api.h b/include/linux/blkzoned_api.h
new file mode 100644
index 0000000..9fc2373
--- /dev/null
+++ b/include/linux/blkzoned_api.h
@@ -0,0 +1,25 @@
+/*
+ * Functions for zone based SMR devices.
+ *
+ * Copyright (C) 2015 Seagate Technology PLC
+ *
+ * Written by:
+ * Shaun Tancheff <shaun.tancheff@seagate.com>
+ *
+ * This file is licensed under  the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+
+#ifndef _BLKZONED_API_H
+#define _BLKZONED_API_H
+
+#include <uapi/linux/blkzoned_api.h>
+
+extern int blkdev_issue_zone_action(struct block_device *, unsigned int,
+				    sector_t, gfp_t);
+extern int blkdev_issue_zone_report(struct block_device *, unsigned int,
+				    sector_t, u8 opt, struct page *, size_t,
+				    gfp_t);
+
+#endif /* _BLKZONED_API_H */
diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
index 8bdae34..5152fa4 100644
--- a/include/uapi/linux/Kbuild
+++ b/include/uapi/linux/Kbuild
@@ -70,6 +70,7 @@ header-y += bfs_fs.h
 header-y += binfmts.h
 header-y += blkpg.h
 header-y += blktrace_api.h
+header-y += blkzoned_api.h
 header-y += bpf_common.h
 header-y += bpf.h
 header-y += bpqether.h
diff --git a/include/uapi/linux/blkzoned_api.h b/include/uapi/linux/blkzoned_api.h
new file mode 100644
index 0000000..48c17ad
--- /dev/null
+++ b/include/uapi/linux/blkzoned_api.h
@@ -0,0 +1,214 @@
+/*
+ * Functions for zone based SMR devices.
+ *
+ * Copyright (C) 2015 Seagate Technology PLC
+ *
+ * Written by:
+ * Shaun Tancheff <shaun.tancheff@seagate.com>
+ *
+ * This file is licensed under  the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+
+#ifndef _UAPI_BLKZONED_API_H
+#define _UAPI_BLKZONED_API_H
+
+#include <linux/types.h>
+
+/**
+ * enum zone_report_option - Report Zones types to be included.
+ *
+ * @ZOPT_NON_SEQ_AND_RESET: Default (all zones).
+ * @ZOPT_ZC1_EMPTY: Zones which are empty.
+ * @ZOPT_ZC2_OPEN_IMPLICIT: Zones open but not explicitly opened
+ * @ZOPT_ZC3_OPEN_EXPLICIT: Zones opened explicitly
+ * @ZOPT_ZC4_CLOSED: Zones closed for writing.
+ * @ZOPT_ZC5_FULL: Zones that are full.
+ * @ZOPT_ZC6_READ_ONLY: Zones that are read-only
+ * @ZOPT_ZC7_OFFLINE: Zones that are offline
+ * @ZOPT_RESET: Zones with Reset Write Pointer Recommended set
+ * @ZOPT_NON_SEQ: Zones that have HA media-cache writes pending
+ * @ZOPT_NON_WP_ZONES: Zones that do not have Write Pointers (conventional)
+ * @ZOPT_PARTIAL_FLAG: Modifies the definition of the Zone List Length field.
+ *
+ * Used by Report Zones in bdev_zone_get_report: report_option
+ */
+enum zone_report_option {
+	ZOPT_NON_SEQ_AND_RESET   = 0x00,
+	ZOPT_ZC1_EMPTY,
+	ZOPT_ZC2_OPEN_IMPLICIT,
+	ZOPT_ZC3_OPEN_EXPLICIT,
+	ZOPT_ZC4_CLOSED,
+	ZOPT_ZC5_FULL,
+	ZOPT_ZC6_READ_ONLY,
+	ZOPT_ZC7_OFFLINE,
+	ZOPT_RESET               = 0x10,
+	ZOPT_NON_SEQ             = 0x11,
+	ZOPT_NON_WP_ZONES        = 0x3f,
+	ZOPT_PARTIAL_FLAG        = 0x80,
+};
+
+/**
+ * enum bdev_zone_type - Type of zone in descriptor
+ *
+ * @ZTYP_RESERVED: Reserved
+ * @ZTYP_CONVENTIONAL: Conventional random write zone (No Write Pointer)
+ * @ZTYP_SEQ_WRITE_REQUIRED: Non-sequential writes are rejected.
+ * @ZTYP_SEQ_WRITE_PREFERRED: Non-sequential writes allowed but discouraged.
+ *
+ * Returned from Report Zones. See bdev_zone_descriptor* type.
+ */
+enum bdev_zone_type {
+	ZTYP_RESERVED            = 0,
+	ZTYP_CONVENTIONAL        = 1,
+	ZTYP_SEQ_WRITE_REQUIRED  = 2,
+	ZTYP_SEQ_WRITE_PREFERRED = 3,
+};
+
+
+/**
+ * enum bdev_zone_condition - Condition of zone in descriptor
+ *
+ * @ZCOND_CONVENTIONAL: N/A
+ * @ZCOND_ZC1_EMPTY: Empty
+ * @ZCOND_ZC2_OPEN_IMPLICIT: Opened via write to zone.
+ * @ZCOND_ZC3_OPEN_EXPLICIT: Opened via open zone command.
+ * @ZCOND_ZC4_CLOSED: Closed
+ * @ZCOND_ZC6_READ_ONLY: Read only
+ * @ZCOND_ZC5_FULL: No remaining space in zone.
+ * @ZCOND_ZC7_OFFLINE: Offline
+ *
+ * Returned from Report Zones. See bdev_zone_descriptor* flags.
+ */
+enum bdev_zone_condition {
+	ZCOND_CONVENTIONAL       = 0,
+	ZCOND_ZC1_EMPTY          = 1,
+	ZCOND_ZC2_OPEN_IMPLICIT  = 2,
+	ZCOND_ZC3_OPEN_EXPLICIT  = 3,
+	ZCOND_ZC4_CLOSED         = 4,
+	/* 0x5 to 0xC are reserved */
+	ZCOND_ZC6_READ_ONLY      = 0xd,
+	ZCOND_ZC5_FULL           = 0xe,
+	ZCOND_ZC7_OFFLINE        = 0xf,
+};
+
+
+/**
+ * enum bdev_zone_same - Report Zones same code.
+ *
+ * @ZS_ALL_DIFFERENT: All zones differ in type and size.
+ * @ZS_ALL_SAME: All zones are the same size and type.
+ * @ZS_LAST_DIFFERS: All zones are the same size and type except the last zone.
+ * @ZS_SAME_LEN_DIFF_TYPES: All zones are the same length but types differ.
+ *
+ * Returned from Report Zones. See bdev_zone_report* same_field.
+ */
+enum bdev_zone_same {
+	ZS_ALL_DIFFERENT        = 0,
+	ZS_ALL_SAME             = 1,
+	ZS_LAST_DIFFERS         = 2,
+	ZS_SAME_LEN_DIFF_TYPES  = 3,
+};
+
+
+/**
+ * struct bdev_zone_get_report - ioctl: Report Zones request
+ *
+ * @zone_locator_lba: starting lba for first [reported] zone
+ * @return_page_count: number of *bytes* allocated for result
+ * @report_option: see: zone_report_option enum
+ *
+ * Used to issue report zones command to connected device
+ */
+struct bdev_zone_get_report {
+	__u64 zone_locator_lba;
+	__u32 return_page_count;
+	__u8  report_option;
+} __packed;
+
+/**
+ * struct bdev_zone_descriptor_le - See: bdev_zone_descriptor
+ */
+struct bdev_zone_descriptor_le {
+	__u8 type;
+	__u8 flags;
+	__u8 reserved1[6];
+	__le64 length;
+	__le64 lba_start;
+	__le64 lba_wptr;
+	__u8 reserved[32];
+} __packed;
+
+
+/**
+ * struct bdev_zone_report_le - See: bdev_zone_report
+ */
+struct bdev_zone_report_le {
+	__le32 descriptor_count;
+	__u8 same_field;
+	__u8 reserved1[3];
+	__le64 maximum_lba;
+	__u8 reserved2[48];
+	struct bdev_zone_descriptor_le descriptors[0];
+} __packed;
+
+
+/**
+ * struct bdev_zone_descriptor - A Zone descriptor entry from report zones
+ *
+ * @type: see zone_type enum
+ * @flags: Bits 0:reset, 1:non-seq, 2-3: resv, 4-7: see zone_condition enum
+ * @reserved1: padding
+ * @length: length of zone in sectors
+ * @lba_start: lba where the zone starts.
+ * @lba_wptr: lba of the current write pointer.
+ * @reserved: padding
+ *
+ */
+struct bdev_zone_descriptor {
+	__u8 type;
+	__u8 flags;
+	__u8  reserved1[6];
+	__be64 length;
+	__be64 lba_start;
+	__be64 lba_wptr;
+	__u8 reserved[32];
+} __packed;
+
+
+/**
+ * struct bdev_zone_report - Report Zones result
+ *
+ * @descriptor_count: Number of descriptor entries that follow
+ * @same_field: bits 0-3: enum zone_same (MASK: 0x0F)
+ * @reserved1: padding
+ * @maximum_lba: LBA of the last logical sector on the device, inclusive
+ *               of all logical sectors in all zones.
+ * @reserved2: padding
+ * @descriptors: array of descriptors follows.
+ */
+struct bdev_zone_report {
+	__be32 descriptor_count;
+	__u8 same_field;
+	__u8 reserved1[3];
+	__be64 maximum_lba;
+	__u8 reserved2[48];
+	struct bdev_zone_descriptor descriptors[0];
+} __packed;
+
+
+/**
+ * struct bdev_zone_report_io - Report Zones ioctl argument.
+ *
+ * @in: Report Zones inputs
+ * @out: Report Zones output
+ */
+struct bdev_zone_report_io {
+	union {
+		struct bdev_zone_get_report in;
+		struct bdev_zone_report out;
+	} data;
+} __packed;
+
+#endif /* _UAPI_BLKZONED_API_H */
-- 
2.8.1
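
As a small aid for reading the descriptor layout in the patch above, the
flags byte of a zone descriptor is documented there as "Bits 0:reset,
1:non-seq, 2-3: resv, 4-7: see zone_condition enum". A minimal sketch of
unpacking it follows. The struct and function names are invented for this
example; only the bit positions are taken from the patch's comment.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical unpacked view of bdev_zone_descriptor.flags. */
struct ex_zone_flags {
	unsigned int reset;	/* bit 0 */
	unsigned int non_seq;	/* bit 1 */
	unsigned int condition;	/* bits 4-7, a bdev_zone_condition value */
};

static struct ex_zone_flags ex_decode_flags(uint8_t flags)
{
	struct ex_zone_flags f;

	f.reset     =  flags       & 0x01;
	f.non_seq   = (flags >> 1) & 0x01;
	f.condition = (flags >> 4) & 0x0f;	/* bits 2-3 are reserved */
	assert(f.condition <= 0x0f);
	return f;
}
```

For example, a flags byte of 0xe1 would unpack to condition 0xe
(ZCOND_ZC5_FULL in the patch's enum) with the reset bit set.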

* Re: [PATCH v3 3/3] Add ata pass-through path for ZAC commands.
  2016-06-10  7:10 ` [PATCH v3 3/3] Add ata pass-through path for ZAC commands Shaun Tancheff
@ 2016-06-10  7:19   ` Hannes Reinecke
  2016-06-10  7:28     ` Shaun Tancheff
  0 siblings, 1 reply; 8+ messages in thread
From: Hannes Reinecke @ 2016-06-10  7:19 UTC (permalink / raw)
  To: Shaun Tancheff, linux-ide, linux-block, linux-scsi
  Cc: Jens Axboe, James E . J . Bottomley, Martin K . Petersen,
	Jeff Layton, J . Bruce Fields, Shaun Tancheff

On 06/10/2016 09:10 AM, Shaun Tancheff wrote:
> The current generation of HBA SAS adapters support connecting SATA
> drives and perform SCSI<->ATA translations in hardware.
> Unfortunately the ZBC commands are not being translated (yet).
> 
> Currently users of SAS controllers can only send ZAC commands via
> ata pass-through.
> 
> This method overloads the meaning of REQ_META to direct ZBC commands
> to construct ZAC equivalent ATA pass through commands.
> Note also that this approach expects the initiator to deal with the
> little endian result due to bypassing the normal translation layers.
> 
> Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
> ---
> This patch isn't the right way to work around hardware that is
> missing features (it mixes ATA commands into SCSI interface code), but
> it may be useful in the near term for end users who have HBA SAS
> controllers that don't support ZBC <-> ZAC translations.
> 
And indeed, this patch isn't right.
It is just for a very specific SAS HBA (mpt2sas/mpt3sas).
Other SAS HBAs like isci and hisi_sas work just nicely here.
So a translation into an ATA_16 command is _wrong_.
If you need to do this you'll have to move it into the LLDD itself.
Or use blacklisting to invoke this behaviour.
But _not_ in the general code path.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

* Re: [PATCH v3 3/3] Add ata pass-through path for ZAC commands.
  2016-06-10  7:19   ` Hannes Reinecke
@ 2016-06-10  7:28     ` Shaun Tancheff
  0 siblings, 0 replies; 8+ messages in thread
From: Shaun Tancheff @ 2016-06-10  7:28 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Shaun Tancheff, linux-ide, linux-block, linux-scsi, Jens Axboe,
	James E . J . Bottomley, Martin K . Petersen, Jeff Layton,
	J . Bruce Fields

On Fri, Jun 10, 2016 at 2:19 AM, Hannes Reinecke <hare@suse.de> wrote:
> On 06/10/2016 09:10 AM, Shaun Tancheff wrote:
>> The current generation of HBA SAS adapters support connecting SATA
>> drives and perform SCSI<->ATA translations in hardware.
>> Unfortunately the ZBC commands are not being translated (yet).
>>
>> Currently users of SAS controllers can only send ZAC commands via
>> ata pass-through.
>>
>> This method overloads the meaning of REQ_META to direct ZBC commands
>> to construct ZAC equivalent ATA pass through commands.
>> Note also that this approach expects the initiator to deal with the
>> little endian result due to bypassing the normal translation layers.
>>
>> Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
>> ---
>> This patch isn't the right way to work around hardware that is
>> missing features (it mixes ATA commands into SCSI interface code), but
>> it may be useful in the near term for end users who have HBA SAS
>> controllers that don't support ZBC <-> ZAC translations.
>>
> And indeed, this patch isn't right.
> It is just for a very specific SAS HBA (mpt2sas/mpt3sas).
> Other SAS HBAs like isci and hisi_sas work just nicely here.

It is good to know that some vendors are on the ball.

> So a translation into an ATA_16 command is _wrong_.
> If you need to do this you'll have to move it into the LLDD itself.
> Or use blacklisting to invoke this behaviour.
> But _not_ in the general code path.

Agreed. Thanks!

> Cheers,
>
> Hannes
> --
> Dr. Hannes Reinecke                Teamlead Storage & Networking
> hare@suse.de                                   +49 911 74053 688
> SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
> HRB 21284 (AG Nürnberg)

-- 
Shaun Tancheff

* Re: [PATCH v3 1/3] Add bio/request flags for using ZBC/ZAC commands
  2016-06-10  7:13 ` [PATCH v3 1/3] Add bio/request flags for using ZBC/ZAC commands Shaun Tancheff
@ 2016-06-13  8:01   ` Christoph Hellwig
  0 siblings, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2016-06-13  8:01 UTC (permalink / raw)
  To: Shaun Tancheff
  Cc: linux-ide, linux-block, linux-scsi, Jens Axboe,
	James E . J . Bottomley, Martin K . Petersen, Jeff Layton,
	J . Bruce Fields, Shaun Tancheff

On Fri, Jun 10, 2016 at 02:13:53AM -0500, Shaun Tancheff wrote:
> T10 ZBC and T13 ZAC specify operations for Zoned devices.
> 
> To be able to access the zone information and to open and close zones,
> flags for the report zones command (REQ_REPORT_ZONES) and for Open and
> Close zone (REQ_OPEN_ZONE and REQ_CLOSE_ZONE) are added for use by
> struct bio's bi_rw and by struct request's cmd_flags.

These need to be new operations, e.g. they should be in the REQ_OP_*
enum.  And please use a separate opcode for each actual operation.
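
To make that suggestion concrete, here is one possible shape, purely as a
sketch: each zone command gets its own opcode, so no flag has to be shared
and disambiguated by direction. The EX_ names, their values, and the two
helper predicates are hypothetical and are not taken from any posted patch.

```c
#include <assert.h>

/* Hypothetical opcodes; only the idea of one opcode per operation matters. */
enum ex_req_op {
	EX_REQ_OP_READ,
	EX_REQ_OP_WRITE,
	EX_REQ_OP_ZONE_REPORT,
	EX_REQ_OP_ZONE_RESET,
	EX_REQ_OP_ZONE_OPEN,
	EX_REQ_OP_ZONE_CLOSE,
	EX_REQ_OP_LAST,
};

/* True for any of the zone commands. */
static int ex_op_is_zoned(enum ex_req_op op)
{
	assert(op < EX_REQ_OP_LAST);

	switch (op) {
	case EX_REQ_OP_ZONE_REPORT:
	case EX_REQ_OP_ZONE_RESET:
	case EX_REQ_OP_ZONE_OPEN:
	case EX_REQ_OP_ZONE_CLOSE:
		return 1;
	default:
		return 0;
	}
}

/* Of the zone commands, only the report carries a data buffer. */
static int ex_op_has_data(enum ex_req_op op)
{
	return op == EX_REQ_OP_READ || op == EX_REQ_OP_WRITE ||
	       op == EX_REQ_OP_ZONE_REPORT;
}
```

With distinct opcodes, a driver can switch on the operation directly
instead of testing a flag mask inside the read/write case.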

Thread overview: 8+ messages
2016-06-10  7:10 [PATCH v3 0/3] Block layer support ZAC/ZBC commands Shaun Tancheff
2016-06-10  7:10 ` [PATCH v3 1/3] Add bio/request flags for using ZBC/ZAC commands Shaun Tancheff
2016-06-10  7:10 ` [PATCH v3 2/3] Add ioctl to issue ZBC/ZAC commands via block layer Shaun Tancheff
2016-06-10  7:10 ` [PATCH v3 3/3] Add ata pass-through path for ZAC commands Shaun Tancheff
2016-06-10  7:19   ` Hannes Reinecke
2016-06-10  7:28     ` Shaun Tancheff
2016-06-10  7:13 ` [PATCH v3 1/3] Add bio/request flags for using ZBC/ZAC commands Shaun Tancheff
2016-06-13  8:01   ` Christoph Hellwig
