public inbox for linux-block@vger.kernel.org
* [PATCH v3 0/6] Enable testing small DMA segment sizes
@ 2026-03-27 21:13 Bart Van Assche
  2026-03-27 21:13 ` [PATCH v3 1/6] block: Fix a source code comment Bart Van Assche
                   ` (5 more replies)
  0 siblings, 6 replies; 10+ messages in thread
From: Bart Van Assche @ 2026-03-27 21:13 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Damien Le Moal, Ming Lei,
	Bart Van Assche

Hi Jens,

About one year ago, support was merged for DMA segment sizes smaller than
the virtual memory page size. No blktests exist yet for the new code paths
related to small segment size support. This patch series makes it possible
to test these code paths on a system (e.g. a VM) with 4 KiB pages.

The corresponding blktest patch is available here:
https://lore.kernel.org/linux-block/20260323200751.1238583-1-bvanassche@acm.org/

Please consider this patch series for the next merge window.

Thanks,

Bart.

Changes compared to v2:
 - Wrapped an overly long line and fixed the "Fixes:" tag in patch 2.
 - Restored the seg_boundary_mask check in patch 3.
 - Moved the BLK_MIN_SEGMENT_SIZE constant to a public header file in patch 4.
 - Checked the max_segment_size value before passing it to the block layer in
   patch 5.
 - Added a sixth patch for scsi_debug to this series. Because the scsi_debug
   patch uses the BLK_MIN_SEGMENT_SIZE constant, it depends on patch 4.

Changes compared to v1:
 - Addressed Damien's comment and made the null_blk kernel module parameter
   clearer.
 - Added three patches to this series: one patch that fixes a source code
   comment and two patches that reduce the number of users of the
   BLK_MIN_SEGMENT_SIZE constant.

Bart Van Assche (6):
  block: Fix a source code comment
  block: Fix the max_user_sectors lower bound
  block: Fix the DMA segment boundary mask check
  block: Reduce the minimum value for the maximum DMA segment size
  null_blk: Support configuring the maximum DMA segment size
  scsi_debug: Support configuring the maximum segment size

 block/blk-settings.c              | 15 ++++++-----
 block/blk.h                       |  1 -
 drivers/block/null_blk/main.c     | 43 +++++++++++++++++++++++++++++++
 drivers/block/null_blk/null_blk.h |  1 +
 drivers/scsi/scsi_debug.c         | 26 ++++++++++++++++++-
 include/linux/blkdev.h            |  1 +
 6 files changed, 78 insertions(+), 9 deletions(-)


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v3 1/6] block: Fix a source code comment
  2026-03-27 21:13 [PATCH v3 0/6] Enable testing small DMA segment sizes Bart Van Assche
@ 2026-03-27 21:13 ` Bart Van Assche
  2026-03-27 21:13 ` [PATCH v3 2/6] block: Fix the max_user_sectors lower bound Bart Van Assche
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Bart Van Assche @ 2026-03-27 21:13 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Damien Le Moal, Ming Lei,
	Bart Van Assche

Fix a source code comment that is no longer correct since commit
889c57066cee ("block: make segment size limit workable for > 4K
PAGE_SIZE").

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/blk-settings.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index 78c83817b9d3..87724d30be4f 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -470,8 +470,8 @@ int blk_validate_limits(struct queue_limits *lim)
 	} else {
 		/*
 		 * The maximum segment size has an odd historic 64k default that
-		 * drivers probably should override.  Just like the I/O size we
-		 * require drivers to at least handle a full page per segment.
+		 * drivers probably should override. The maximum DMA segment
+		 * size may be less than the virtual memory page size.
 		 */
 		if (!lim->max_segment_size)
 			lim->max_segment_size = BLK_MAX_SEGMENT_SIZE;


* [PATCH v3 2/6] block: Fix the max_user_sectors lower bound
  2026-03-27 21:13 [PATCH v3 0/6] Enable testing small DMA segment sizes Bart Van Assche
  2026-03-27 21:13 ` [PATCH v3 1/6] block: Fix a source code comment Bart Van Assche
@ 2026-03-27 21:13 ` Bart Van Assche
  2026-03-27 21:13 ` [PATCH v3 3/6] block: Fix the DMA segment boundary mask check Bart Van Assche
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Bart Van Assche @ 2026-03-27 21:13 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Damien Le Moal, Ming Lei,
	Bart Van Assche, Hannes Reinecke

The lowest value that can be supported for lim->max_user_sectors is one
logical block, i.e. the logical block size divided by the sector size. This
patch prepares for reducing BLK_MIN_SEGMENT_SIZE to a value that may be
less than the logical block size.
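As a rough illustration, the updated lower-bound check can be modeled in
user space as follows (a sketch only; validate_max_user_sectors() is a
hypothetical name, and the kernel code in blk_validate_limits() returns
-EINVAL instead of -1):

```c
#include <assert.h>

#define SECTOR_SIZE 512u

/* User-space model of the updated check: a non-zero max_user_sectors
 * must cover at least one logical block. Hypothetical helper name. */
static int validate_max_user_sectors(unsigned int max_user_sectors,
				     unsigned int logical_block_size)
{
	if (max_user_sectors &&
	    max_user_sectors < logical_block_size / SECTOR_SIZE)
		return -1;	/* the kernel returns -EINVAL here */
	return 0;
}
```

With a 4096-byte logical block size, values below 8 sectors are rejected
while 0 (meaning "no user limit") is still accepted.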

Reviewed-by: Ming Lei <ming.lei@redhat.com>
Fixes: d690cb8ae14b ("block: add an API to atomically update queue limits")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/blk-settings.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index 87724d30be4f..56017098d2c7 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -403,7 +403,8 @@ int blk_validate_limits(struct queue_limits *lim)
 	max_hw_sectors = min_not_zero(lim->max_hw_sectors,
 				lim->max_dev_sectors);
 	if (lim->max_user_sectors) {
-		if (lim->max_user_sectors < BLK_MIN_SEGMENT_SIZE / SECTOR_SIZE)
+		if (lim->max_user_sectors <
+		    lim->logical_block_size / SECTOR_SIZE)
 			return -EINVAL;
 		lim->max_sectors = min(max_hw_sectors, lim->max_user_sectors);
 	} else if (lim->io_opt > (BLK_DEF_MAX_SECTORS_CAP << SECTOR_SHIFT)) {


* [PATCH v3 3/6] block: Fix the DMA segment boundary mask check
  2026-03-27 21:13 [PATCH v3 0/6] Enable testing small DMA segment sizes Bart Van Assche
  2026-03-27 21:13 ` [PATCH v3 1/6] block: Fix a source code comment Bart Van Assche
  2026-03-27 21:13 ` [PATCH v3 2/6] block: Fix the max_user_sectors lower bound Bart Van Assche
@ 2026-03-27 21:13 ` Bart Van Assche
  2026-03-27 21:13 ` [PATCH v3 4/6] block: Reduce the minimum value for the maximum DMA segment size Bart Van Assche
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Bart Van Assche @ 2026-03-27 21:13 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Damien Le Moal, Ming Lei,
	Bart Van Assche

Commit d690cb8ae14b ("block: add an API to atomically update queue limits")
introduced the following code:

       /*
        * By default there is no limit on the segment boundary alignment,
        * but if there is one it can't be smaller than the page size as
        * that would break all the normal I/O patterns.
        */
       if (!lim->seg_boundary_mask)
               lim->seg_boundary_mask = BLK_SEG_BOUNDARY_MASK;
       if (WARN_ON_ONCE(lim->seg_boundary_mask < PAGE_SIZE - 1))
               return -EINVAL;

The comment about "breaking normal I/O patterns" is no longer correct
since the block layer now supports DMA segments smaller than the page size.
Modify the check such that it still passes for all current block drivers.
The qedi iSCSI driver is an example of a driver that sets
.seg_boundary_mask to 0xfff.
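A user-space sketch of the relaxed check (validate_seg_boundary() is a
hypothetical name; the kernel code WARNs and returns -EINVAL):

```c
#include <assert.h>

#define SZ_4K 4096ul

/* User-space model of the relaxed check: the segment boundary mask only
 * has to permit 4 KiB segments rather than PAGE_SIZE segments, so a
 * 0xfff mask (4 KiB boundary, as set by qedi) passes even when
 * PAGE_SIZE > 4 KiB. */
static int validate_seg_boundary(unsigned long seg_boundary_mask)
{
	if (seg_boundary_mask < SZ_4K - 1)
		return -1;	/* the kernel returns -EINVAL here */
	return 0;
}
```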

This patch prepares for reducing the value of BLK_MIN_SEGMENT_SIZE.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/blk-settings.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index 56017098d2c7..e900405d0cc3 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -450,14 +450,14 @@ int blk_validate_limits(struct queue_limits *lim)
 
 	/*
 	 * By default there is no limit on the segment boundary alignment,
-	 * but if there is one it can't be smaller than the page size as
-	 * that would break all the normal I/O patterns.
+	 * but if there is one, check that it allows at least 4 KiB of data to
+	 * be submitted at once. All known block device DMA controllers support
+	 * 4 KiB DMA segments that do not cross 4 KiB boundaries.
 	 */
 	if (!lim->seg_boundary_mask)
 		lim->seg_boundary_mask = BLK_SEG_BOUNDARY_MASK;
-	if (WARN_ON_ONCE(lim->seg_boundary_mask < BLK_MIN_SEGMENT_SIZE - 1))
+	if (WARN_ON_ONCE(lim->seg_boundary_mask < SZ_4K - 1))
 		return -EINVAL;
-
 	/*
 	 * Stacking device may have both virtual boundary and max segment
 	 * size limit, so allow this setting now, and long-term the two


* [PATCH v3 4/6] block: Reduce the minimum value for the maximum DMA segment size
  2026-03-27 21:13 [PATCH v3 0/6] Enable testing small DMA segment sizes Bart Van Assche
                   ` (2 preceding siblings ...)
  2026-03-27 21:13 ` [PATCH v3 3/6] block: Fix the DMA segment boundary mask check Bart Van Assche
@ 2026-03-27 21:13 ` Bart Van Assche
  2026-03-29 14:38   ` Ming Lei
  2026-03-27 21:13 ` [PATCH v3 5/6] null_blk: Support configuring " Bart Van Assche
  2026-03-27 21:13 ` [PATCH v3 6/6] scsi_debug: Support configuring the maximum " Bart Van Assche
  5 siblings, 1 reply; 10+ messages in thread
From: Bart Van Assche @ 2026-03-27 21:13 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Damien Le Moal, Ming Lei,
	Bart Van Assche

All block devices that are supported by the Linux kernel have a DMA engine
that supports DMA segments of 4 KiB or larger. Allow smaller DMA segment
sizes because these are useful for block layer testing. Reject values below
512 because such values would result in an excessive number of DMA
segments. Move the BLK_MIN_SEGMENT_SIZE definition into <linux/blkdev.h>.
This will allow the BLK_MIN_SEGMENT_SIZE constant to be used in the
null_blk and scsi_debug drivers.
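The "excessive number of DMA segments" argument can be made concrete with a
quick back-of-the-envelope calculation (segments_needed() is an illustrative
helper, not kernel code):

```c
#include <assert.h>

/* Worst-case number of DMA segments for a request when every segment is
 * at most max_segment_size bytes: round up the division. */
static unsigned int segments_needed(unsigned int request_bytes,
				    unsigned int max_segment_size)
{
	return (request_bytes + max_segment_size - 1) / max_segment_size;
}
```

A 4 MiB request needs 8192 segments at 512 bytes per segment versus 1024 at
4 KiB, and each further halving of the segment size doubles the count again.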

The only code affected by this change is the following code:

	if (WARN_ON_ONCE(lim->max_segment_size < BLK_MIN_SEGMENT_SIZE))
		return -EINVAL;

Cc: Ming Lei <ming.lei@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/blk.h            | 1 -
 include/linux/blkdev.h | 1 +
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/blk.h b/block/blk.h
index 103cb1d0b9cb..b30ff8db3cac 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -23,7 +23,6 @@ struct elv_change_ctx;
 #define BLK_DEF_MAX_SECTORS_CAP	(SZ_4M >> SECTOR_SHIFT)
 
 #define	BLK_DEV_MAX_SECTORS	(LLONG_MAX >> 9)
-#define	BLK_MIN_SEGMENT_SIZE	4096
 
 /* Max future timer expiry for timeouts */
 #define BLK_MAX_TIMEOUT		(5 * HZ)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index b8e7f42aee71..109d5fa5e190 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1311,6 +1311,7 @@ static inline bool bdev_is_partition(struct block_device *bdev)
 enum blk_default_limits {
 	BLK_MAX_SEGMENTS	= 128,
 	BLK_SAFE_MAX_SECTORS	= 255,
+	BLK_MIN_SEGMENT_SIZE	= 512,
 	BLK_MAX_SEGMENT_SIZE	= 65536,
 	BLK_SEG_BOUNDARY_MASK	= 0xFFFFFFFFUL,
 };


* [PATCH v3 5/6] null_blk: Support configuring the maximum DMA segment size
  2026-03-27 21:13 [PATCH v3 0/6] Enable testing small DMA segment sizes Bart Van Assche
                   ` (3 preceding siblings ...)
  2026-03-27 21:13 ` [PATCH v3 4/6] block: Reduce the minimum value for the maximum DMA segment size Bart Van Assche
@ 2026-03-27 21:13 ` Bart Van Assche
  2026-03-29 12:30   ` Nilay Shroff
  2026-03-27 21:13 ` [PATCH v3 6/6] scsi_debug: Support configuring the maximum " Bart Van Assche
  5 siblings, 1 reply; 10+ messages in thread
From: Bart Van Assche @ 2026-03-27 21:13 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Damien Le Moal, Ming Lei,
	Bart Van Assche, Damien Le Moal, Chaitanya Kulkarni, Keith Busch,
	Johannes Thumshirn, Christophe JAILLET, Thorsten Blum,
	Matthew Wilcox (Oracle), Hans Holmberg, Nilay Shroff, Kees Cook,
	Hannes Reinecke, Martin K. Petersen

Add support for configuring the maximum DMA segment size. The maximum DMA
segment size may be set to a value smaller than the virtual memory page
size. Reject invalid max_segment_size values.

Since rq_for_each_segment() may yield bvecs larger than the maximum DMA
segment size, add code in the rq_for_each_segment() loop that restricts
the bvec length to the maximum DMA segment size.
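The parameter validation in this patch can be modeled in plain user-space C.
This is a sketch under stated assumptions: set_max_segment_size() is a
hypothetical stand-in for nullb_set_max_segment_size(), strtoul() replaces
kstrtouint(), and validation here happens before the variable is updated:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define BLK_MIN_SEGMENT_SIZE 512u	/* value proposed in patch 4/6 */

/* User-space model of the module parameter setter: parse the string and
 * reject anything below BLK_MIN_SEGMENT_SIZE. */
static int set_max_segment_size(const char *val, unsigned int *out)
{
	char *end;
	unsigned long v = strtoul(val, &end, 0);

	if (*end != '\0' || v != (unsigned int)v)
		return -EINVAL;
	if (v < BLK_MIN_SEGMENT_SIZE)
		return -EINVAL;
	*out = (unsigned int)v;
	return 0;
}
```

Like kstrtouint(), strtoul() with base 0 accepts decimal, octal, and
hexadecimal input.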

Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Cc: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/block/null_blk/main.c     | 43 +++++++++++++++++++++++++++++++
 drivers/block/null_blk/null_blk.h |  1 +
 2 files changed, 44 insertions(+)

diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c
index f8c0fd57e041..d5fbbc5d63ed 100644
--- a/drivers/block/null_blk/main.c
+++ b/drivers/block/null_blk/main.c
@@ -169,6 +169,32 @@ static int g_max_sectors;
 module_param_named(max_sectors, g_max_sectors, int, 0444);
 MODULE_PARM_DESC(max_sectors, "Maximum size of a command (in 512B sectors)");
 
+static unsigned int g_max_segment_size = BLK_MAX_SEGMENT_SIZE;
+
+static int nullb_set_max_segment_size(const char *val,
+				      const struct kernel_param *kp)
+{
+	int res;
+
+	res = kstrtouint(val, 0, &g_max_segment_size);
+	if (res < 0)
+		return res;
+
+	if (g_max_segment_size < BLK_MIN_SEGMENT_SIZE)
+		return -EINVAL;
+
+	return 0;
+}
+
+static const struct kernel_param_ops max_segment_size_ops = {
+	.set = nullb_set_max_segment_size,
+	.get = param_get_uint,
+};
+
+module_param_cb(max_segment_size, &max_segment_size_ops, &g_max_segment_size,
+		0444);
+MODULE_PARM_DESC(max_segment_size, "Maximum size of a DMA segment in bytes");
+
 static unsigned int nr_devices = 1;
 module_param(nr_devices, uint, 0444);
 MODULE_PARM_DESC(nr_devices, "Number of devices to register");
@@ -442,6 +468,14 @@ static int nullb_apply_poll_queues(struct nullb_device *dev,
 	return ret;
 }
 
+static int nullb_apply_max_segment_size(struct nullb_device *dev,
+					unsigned int max_segment_size)
+{
+	if (max_segment_size < BLK_MIN_SEGMENT_SIZE)
+		return -EINVAL;
+	return 0;
+}
+
 NULLB_DEVICE_ATTR(size, ulong, NULL);
 NULLB_DEVICE_ATTR(completion_nsec, ulong, NULL);
 NULLB_DEVICE_ATTR(submit_queues, uint, nullb_apply_submit_queues);
@@ -450,6 +484,7 @@ NULLB_DEVICE_ATTR(home_node, uint, NULL);
 NULLB_DEVICE_ATTR(queue_mode, uint, NULL);
 NULLB_DEVICE_ATTR(blocksize, uint, NULL);
 NULLB_DEVICE_ATTR(max_sectors, uint, NULL);
+NULLB_DEVICE_ATTR(max_segment_size, uint, nullb_apply_max_segment_size);
 NULLB_DEVICE_ATTR(irqmode, uint, NULL);
 NULLB_DEVICE_ATTR(hw_queue_depth, uint, NULL);
 NULLB_DEVICE_ATTR(index, uint, NULL);
@@ -608,6 +643,7 @@ static struct configfs_attribute *nullb_device_attrs[] = {
 	&nullb_device_attr_index,
 	&nullb_device_attr_irqmode,
 	&nullb_device_attr_max_sectors,
+	&nullb_device_attr_max_segment_size,
 	&nullb_device_attr_mbps,
 	&nullb_device_attr_memory_backed,
 	&nullb_device_attr_no_sched,
@@ -805,6 +841,7 @@ static struct nullb_device *null_alloc_dev(void)
 	dev->queue_mode = g_queue_mode;
 	dev->blocksize = g_bs;
 	dev->max_sectors = g_max_sectors;
+	dev->max_segment_size = g_max_segment_size;
 	dev->irqmode = g_irqmode;
 	dev->hw_queue_depth = g_hw_queue_depth;
 	dev->blocking = g_blocking;
@@ -1248,6 +1285,9 @@ static blk_status_t null_transfer(struct nullb *nullb, struct page *page,
 	unsigned int valid_len = len;
 	void *p;
 
+	WARN_ONCE(len > dev->max_segment_size, "%u > %u\n", len,
+		  dev->max_segment_size);
+
 	p = kmap_local_page(page) + off;
 	if (!is_write) {
 		if (dev->zoned) {
@@ -1295,6 +1335,8 @@ static blk_status_t null_handle_data_transfer(struct nullb_cmd *cmd,
 	spin_lock_irq(&nullb->lock);
 	rq_for_each_segment(bvec, rq, iter) {
 		len = bvec.bv_len;
+		len = min(bvec.bv_len, nullb->dev->max_segment_size);
+		bvec.bv_len = len;
 		if (transferred_bytes + len > max_bytes)
 			len = max_bytes - transferred_bytes;
 		err = null_transfer(nullb, bvec.bv_page, len, bvec.bv_offset,
@@ -1958,6 +2000,7 @@ static int null_add_dev(struct nullb_device *dev)
 		.logical_block_size	= dev->blocksize,
 		.physical_block_size	= dev->blocksize,
 		.max_hw_sectors		= dev->max_sectors,
+		.max_segment_size	= dev->max_segment_size,
 		.dma_alignment		= 1,
 	};
 
diff --git a/drivers/block/null_blk/null_blk.h b/drivers/block/null_blk/null_blk.h
index 6c4c4bbe7dad..43dc47789718 100644
--- a/drivers/block/null_blk/null_blk.h
+++ b/drivers/block/null_blk/null_blk.h
@@ -93,6 +93,7 @@ struct nullb_device {
 	unsigned int queue_mode; /* block interface */
 	unsigned int blocksize; /* block size */
 	unsigned int max_sectors; /* Max sectors per command */
+	unsigned int max_segment_size; /* Max size of a single DMA segment. */
 	unsigned int irqmode; /* IRQ completion handler */
 	unsigned int hw_queue_depth; /* queue depth */
 	unsigned int index; /* index of the disk, only valid with a disk */


* [PATCH v3 6/6] scsi_debug: Support configuring the maximum segment size
  2026-03-27 21:13 [PATCH v3 0/6] Enable testing small DMA segment sizes Bart Van Assche
                   ` (4 preceding siblings ...)
  2026-03-27 21:13 ` [PATCH v3 5/6] null_blk: Support configuring " Bart Van Assche
@ 2026-03-27 21:13 ` Bart Van Assche
  5 siblings, 0 replies; 10+ messages in thread
From: Bart Van Assche @ 2026-03-27 21:13 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Damien Le Moal, Ming Lei,
	Bart Van Assche, John Garry, Doug Gilbert, James E.J. Bottomley,
	Martin K. Petersen

Add a kernel module parameter for configuring the maximum segment size.
Reject invalid max_segment_size values.

This patch enables testing SCSI support for segments smaller than the
page size.

Cc: John Garry <john.g.garry@oracle.com>
Cc: Doug Gilbert <dgilbert@interlog.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/scsi_debug.c | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 1515495fd9ea..7e5e171bfcda 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -915,6 +915,7 @@ static int sdebug_host_max_queue;	/* per host */
 static int sdebug_lowest_aligned = DEF_LOWEST_ALIGNED;
 static int sdebug_max_luns = DEF_MAX_LUNS;
 static int sdebug_max_queue = SDEBUG_CANQUEUE;	/* per submit queue */
+static unsigned int sdebug_max_segment_size = UINT_MAX;
 static unsigned int sdebug_medium_error_start = OPT_MEDIUM_ERR_ADDR;
 static int sdebug_medium_error_count = OPT_MEDIUM_ERR_NUM;
 static int sdebug_ndelay = DEF_NDELAY;	/* if > 0 then unit is nanoseconds */
@@ -1041,6 +1042,26 @@ static const int condition_met_result = SAM_STAT_CONDITION_MET;
 static struct dentry *sdebug_debugfs_root;
 static ASYNC_DOMAIN_EXCLUSIVE(sdebug_async_domain);
 
+static int sdebug_set_max_segment_size(const char *val,
+				       const struct kernel_param *kp)
+{
+	int res;
+
+	res = kstrtouint(val, 0, &sdebug_max_segment_size);
+	if (res < 0)
+		return res;
+
+	if (sdebug_max_segment_size < BLK_MIN_SEGMENT_SIZE)
+		return -EINVAL;
+
+	return 0;
+}
+
+static const struct kernel_param_ops max_segment_size_ops = {
+	.set = sdebug_set_max_segment_size,
+	.get = param_get_uint,
+};
+
 static u32 sdebug_get_devsel(struct scsi_device *sdp)
 {
 	unsigned char devtype = sdp->type;
@@ -7366,6 +7387,8 @@ module_param_named(lowest_aligned, sdebug_lowest_aligned, int, S_IRUGO);
 module_param_named(lun_format, sdebug_lun_am_i, int, S_IRUGO | S_IWUSR);
 module_param_named(max_luns, sdebug_max_luns, int, S_IRUGO | S_IWUSR);
 module_param_named(max_queue, sdebug_max_queue, int, S_IRUGO | S_IWUSR);
+module_param_cb(max_segment_size, &max_segment_size_ops,
+		&sdebug_max_segment_size, S_IRUGO);
 module_param_named(medium_error_count, sdebug_medium_error_count, int,
 		   S_IRUGO | S_IWUSR);
 module_param_named(medium_error_start, sdebug_medium_error_start, int,
@@ -7449,6 +7472,7 @@ MODULE_PARM_DESC(lowest_aligned, "lowest aligned lba (def=0)");
 MODULE_PARM_DESC(lun_format, "LUN format: 0->peripheral (def); 1 --> flat address method");
 MODULE_PARM_DESC(max_luns, "number of LUNs per target to simulate(def=1)");
 MODULE_PARM_DESC(max_queue, "max number of queued commands (1 to max(def))");
+MODULE_PARM_DESC(max_segment_size, "max bytes in a single DMA segment");
 MODULE_PARM_DESC(medium_error_count, "count of sectors to return follow on MEDIUM error");
 MODULE_PARM_DESC(medium_error_start, "starting sector number to return MEDIUM error");
 MODULE_PARM_DESC(ndelay, "response delay in nanoseconds (def=0 -> ignore)");
@@ -9539,7 +9563,6 @@ static const struct scsi_host_template sdebug_driver_template = {
 	.sg_tablesize =		SG_MAX_SEGMENTS,
 	.cmd_per_lun =		DEF_CMD_PER_LUN,
 	.max_sectors =		-1U,
-	.max_segment_size =	-1U,
 	.module =		THIS_MODULE,
 	.skip_settle_delay =	1,
 	.track_queue_depth =	1,
@@ -9566,6 +9589,7 @@ static int sdebug_driver_probe(struct device *dev)
 	}
 	hpnt->can_queue = sdebug_max_queue;
 	hpnt->cmd_per_lun = sdebug_max_queue;
+	hpnt->max_segment_size = sdebug_max_segment_size;
 	if (!sdebug_clustering)
 		hpnt->dma_boundary = PAGE_SIZE - 1;
 


* Re: [PATCH v3 5/6] null_blk: Support configuring the maximum DMA segment size
  2026-03-27 21:13 ` [PATCH v3 5/6] null_blk: Support configuring " Bart Van Assche
@ 2026-03-29 12:30   ` Nilay Shroff
  2026-03-30  2:23     ` Ming Lei
  0 siblings, 1 reply; 10+ messages in thread
From: Nilay Shroff @ 2026-03-29 12:30 UTC (permalink / raw)
  To: Bart Van Assche, Jens Axboe
  Cc: linux-block, Christoph Hellwig, Damien Le Moal, Ming Lei,
	Damien Le Moal, Chaitanya Kulkarni, Keith Busch,
	Johannes Thumshirn, Christophe JAILLET, Thorsten Blum,
	Matthew Wilcox (Oracle), Hans Holmberg, Kees Cook,
	Hannes Reinecke, Martin K. Petersen

On 3/28/26 2:43 AM, Bart Van Assche wrote:
> Add support for configuring the maximum DMA segment size. The maximum DMA
> segment size may be set to a value smaller than the virtual memory page
> size. Reject invalid max_segment_size values.
> 
> Since rq_for_each_segment() may yield bvecs larger than the maximum DMA
> segment size, add code in the rq_for_each_segment() loop that restricts
> the bvec length to the maximum DMA segment size.
> 
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Ming Lei <ming.lei@redhat.com>
> Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
> Cc: Chaitanya Kulkarni <kch@nvidia.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>   drivers/block/null_blk/main.c     | 43 +++++++++++++++++++++++++++++++
>   drivers/block/null_blk/null_blk.h |  1 +
>   2 files changed, 44 insertions(+)
> 
> diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c
> index f8c0fd57e041..d5fbbc5d63ed 100644
> --- a/drivers/block/null_blk/main.c
> +++ b/drivers/block/null_blk/main.c
> @@ -169,6 +169,32 @@ static int g_max_sectors;
>   module_param_named(max_sectors, g_max_sectors, int, 0444);
>   MODULE_PARM_DESC(max_sectors, "Maximum size of a command (in 512B sectors)");
>   
> +static unsigned int g_max_segment_size = BLK_MAX_SEGMENT_SIZE;
> +
> +static int nullb_set_max_segment_size(const char *val,
> +				      const struct kernel_param *kp)
> +{
> +	int res;
> +
> +	res = kstrtouint(val, 0, &g_max_segment_size);
> +	if (res < 0)
> +		return res;
> +
> +	if (g_max_segment_size < BLK_MIN_SEGMENT_SIZE)
> +		return -EINVAL;
> +
> +	return 0;
> +}
> +
> +static const struct kernel_param_ops max_segment_size_ops = {
> +	.set = nullb_set_max_segment_size,
> +	.get = param_get_uint,
> +};
> +
> +module_param_cb(max_segment_size, &max_segment_size_ops, &g_max_segment_size,
> +		0444);
> +MODULE_PARM_DESC(max_segment_size, "Maximum size of a DMA segment in bytes");
> +
>   static unsigned int nr_devices = 1;
>   module_param(nr_devices, uint, 0444);
>   MODULE_PARM_DESC(nr_devices, "Number of devices to register");
> @@ -442,6 +468,14 @@ static int nullb_apply_poll_queues(struct nullb_device *dev,
>   	return ret;
>   }
>   
> +static int nullb_apply_max_segment_size(struct nullb_device *dev,
> +					unsigned int max_segment_size)
> +{
> +	if (max_segment_size < BLK_MIN_SEGMENT_SIZE)
> +		return -EINVAL;
> +	return 0;
> +}
> +
>   NULLB_DEVICE_ATTR(size, ulong, NULL);
>   NULLB_DEVICE_ATTR(completion_nsec, ulong, NULL);
>   NULLB_DEVICE_ATTR(submit_queues, uint, nullb_apply_submit_queues);
> @@ -450,6 +484,7 @@ NULLB_DEVICE_ATTR(home_node, uint, NULL);
>   NULLB_DEVICE_ATTR(queue_mode, uint, NULL);
>   NULLB_DEVICE_ATTR(blocksize, uint, NULL);
>   NULLB_DEVICE_ATTR(max_sectors, uint, NULL);
> +NULLB_DEVICE_ATTR(max_segment_size, uint, nullb_apply_max_segment_size);
>   NULLB_DEVICE_ATTR(irqmode, uint, NULL);
>   NULLB_DEVICE_ATTR(hw_queue_depth, uint, NULL);
>   NULLB_DEVICE_ATTR(index, uint, NULL);
> @@ -608,6 +643,7 @@ static struct configfs_attribute *nullb_device_attrs[] = {
>   	&nullb_device_attr_index,
>   	&nullb_device_attr_irqmode,
>   	&nullb_device_attr_max_sectors,
> +	&nullb_device_attr_max_segment_size,
>   	&nullb_device_attr_mbps,
>   	&nullb_device_attr_memory_backed,
>   	&nullb_device_attr_no_sched,
> @@ -805,6 +841,7 @@ static struct nullb_device *null_alloc_dev(void)
>   	dev->queue_mode = g_queue_mode;
>   	dev->blocksize = g_bs;
>   	dev->max_sectors = g_max_sectors;
> +	dev->max_segment_size = g_max_segment_size;
>   	dev->irqmode = g_irqmode;
>   	dev->hw_queue_depth = g_hw_queue_depth;
>   	dev->blocking = g_blocking;
> @@ -1248,6 +1285,9 @@ static blk_status_t null_transfer(struct nullb *nullb, struct page *page,
>   	unsigned int valid_len = len;
>   	void *p;
>   
> +	WARN_ONCE(len > dev->max_segment_size, "%u > %u\n", len,
> +		  dev->max_segment_size);
> +
>   	p = kmap_local_page(page) + off;
>   	if (!is_write) {
>   		if (dev->zoned) {
> @@ -1295,6 +1335,8 @@ static blk_status_t null_handle_data_transfer(struct nullb_cmd *cmd,
>   	spin_lock_irq(&nullb->lock);
>   	rq_for_each_segment(bvec, rq, iter) {
>   		len = bvec.bv_len;
> +		len = min(bvec.bv_len, nullb->dev->max_segment_size);
> +		bvec.bv_len = len;
>   		if (transferred_bytes + len > max_bytes)
>   			len = max_bytes - transferred_bytes;
>   		err = null_transfer(nullb, bvec.bv_page, len, bvec.bv_offset,


IMO, since max_segment_size is now configurable, should we consider using
blk_rq_map_sg() instead of rq_for_each_segment()?

rq_for_each_segment() iterates over bio_vecs, and these bvecs are not
guaranteed to comply with max_segment_size and seg_boundary_mask. Simply
clamping bv_len inside the loop does not correctly model how DMA segments
are formed. In particular, this approach does not account for merging or
splitting behavior based on physical contiguity or segment boundaries.

blk_rq_map_sg(), on the other hand, constructs a scatter-gather list that
already respects max_segment_size, seg_boundary_mask, and max_segments, and
may merge physically contiguous bvecs.

Given that, it may be cleaner to:
1. Call blk_rq_map_sg() to build the SG list
2. Iterate over the resulting SG segments
3. Perform data transfer using null_transfer() per SG entry

This would avoid modifying bvecs directly and ensure that segment
constraints are handled consistently with the block layer.
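The difference described above can be sketched in user space. The model
below coalesces physically contiguous ranges up to max_segment_size, which
is roughly what blk_rq_map_sg() does; the seg_boundary_mask and
max_segments limits are deliberately omitted, and map_segments() and
struct range are illustrative, not kernel API:

```c
#include <assert.h>
#include <stddef.h>

struct range {
	unsigned long start;	/* physical address */
	unsigned int len;	/* length in bytes */
};

/* Coalesce contiguous input ranges into segment-sized output ranges,
 * splitting any range longer than max_seg. Returns the output count. */
static size_t map_segments(const struct range *in, size_t n,
			   unsigned int max_seg, struct range *out)
{
	size_t m = 0;

	for (size_t i = 0; i < n; i++) {
		unsigned long s = in[i].start;
		unsigned int l = in[i].len;

		while (l) {
			unsigned int chunk = l < max_seg ? l : max_seg;

			/* Merge with the previous segment when contiguous
			 * and the size limit is not exceeded. */
			if (m && out[m - 1].start + out[m - 1].len == s &&
			    out[m - 1].len + chunk <= max_seg) {
				out[m - 1].len += chunk;
			} else {
				out[m].start = s;
				out[m].len = chunk;
				m++;
			}
			s += chunk;
			l -= chunk;
		}
	}
	return m;
}
```

Unlike a plain clamp of bv_len, this both merges contiguous ranges and
splits oversized ones.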

Thanks,
--Nilay



* Re: [PATCH v3 4/6] block: Reduce the minimum value for the maximum DMA segment size
  2026-03-27 21:13 ` [PATCH v3 4/6] block: Reduce the minimum value for the maximum DMA segment size Bart Van Assche
@ 2026-03-29 14:38   ` Ming Lei
  0 siblings, 0 replies; 10+ messages in thread
From: Ming Lei @ 2026-03-29 14:38 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, linux-block, Christoph Hellwig, Damien Le Moal

On Fri, Mar 27, 2026 at 02:13:44PM -0700, Bart Van Assche wrote:
> All block devices that are supported by the Linux kernel have a DMA engine
> that supports DMA segments of 4 KiB or larger. Allow smaller DMA segment
> sizes because these are useful for block layer testing. Reject values below

Can you explain why and how this is useful for testing purposes?

If there is no real device with a segment size as small as 512 bytes, why
do we want this change to cover that case?

> 512 because such values would result in an excessive number of DMA
> segments. Move the BLK_MIN_SEGMENT_SIZE definition into <linux/blkdev.h>.
> This will allow the BLK_MIN_SEGMENT_SIZE constant to be used in the
> null_blk and scsi_debug drivers.
> 
> The only code affected by this change is the following code:
> 
> 	if (WARN_ON_ONCE(lim->max_segment_size < BLK_MIN_SEGMENT_SIZE))
> 		return -EINVAL;
> 
> Cc: Ming Lei <ming.lei@redhat.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  block/blk.h            | 1 -
>  include/linux/blkdev.h | 1 +
>  2 files changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/block/blk.h b/block/blk.h
> index 103cb1d0b9cb..b30ff8db3cac 100644
> --- a/block/blk.h
> +++ b/block/blk.h
> @@ -23,7 +23,6 @@ struct elv_change_ctx;
>  #define BLK_DEF_MAX_SECTORS_CAP	(SZ_4M >> SECTOR_SHIFT)
>  
>  #define	BLK_DEV_MAX_SECTORS	(LLONG_MAX >> 9)
> -#define	BLK_MIN_SEGMENT_SIZE	4096
>  
>  /* Max future timer expiry for timeouts */
>  #define BLK_MAX_TIMEOUT		(5 * HZ)
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index b8e7f42aee71..109d5fa5e190 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -1311,6 +1311,7 @@ static inline bool bdev_is_partition(struct block_device *bdev)
>  enum blk_default_limits {
>  	BLK_MAX_SEGMENTS	= 128,
>  	BLK_SAFE_MAX_SECTORS	= 255,
> +	BLK_MIN_SEGMENT_SIZE	= 512,
>  	BLK_MAX_SEGMENT_SIZE	= 65536,
>  	BLK_SEG_BOUNDARY_MASK	= 0xFFFFFFFFUL,

This change is not consistent with the previous patch, which allows
->seg_boundary_mask to be as small as 4095.


Thanks,
Ming



* Re: [PATCH v3 5/6] null_blk: Support configuring the maximum DMA segment size
  2026-03-29 12:30   ` Nilay Shroff
@ 2026-03-30  2:23     ` Ming Lei
  0 siblings, 0 replies; 10+ messages in thread
From: Ming Lei @ 2026-03-30  2:23 UTC (permalink / raw)
  To: Nilay Shroff
  Cc: Bart Van Assche, Jens Axboe, linux-block, Christoph Hellwig,
	Damien Le Moal, Damien Le Moal, Chaitanya Kulkarni, Keith Busch,
	Johannes Thumshirn, Christophe JAILLET, Thorsten Blum,
	Matthew Wilcox (Oracle), Hans Holmberg, Kees Cook,
	Hannes Reinecke, Martin K. Petersen

On Sun, Mar 29, 2026 at 06:00:31PM +0530, Nilay Shroff wrote:
> On 3/28/26 2:43 AM, Bart Van Assche wrote:
> > Add support for configuring the maximum DMA segment size. The maximum DMA
> > segment size may be set to a value smaller than the virtual memory page
> > size. Reject invalid max_segment_size values.
> > 
> > Since rq_for_each_segment() may yield bvecs larger than the maximum DMA
> > segment size, add code in the rq_for_each_segment() loop that restricts
> > the bvec length to the maximum DMA segment size.
> > 
> > Cc: Christoph Hellwig <hch@lst.de>
> > Cc: Ming Lei <ming.lei@redhat.com>
> > Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
> > Cc: Chaitanya Kulkarni <kch@nvidia.com>
> > Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> > ---
> >   drivers/block/null_blk/main.c     | 43 +++++++++++++++++++++++++++++++
> >   drivers/block/null_blk/null_blk.h |  1 +
> >   2 files changed, 44 insertions(+)
> > 
> > diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c
> > index f8c0fd57e041..d5fbbc5d63ed 100644
> > --- a/drivers/block/null_blk/main.c
> > +++ b/drivers/block/null_blk/main.c
> > @@ -169,6 +169,32 @@ static int g_max_sectors;
> >   module_param_named(max_sectors, g_max_sectors, int, 0444);
> >   MODULE_PARM_DESC(max_sectors, "Maximum size of a command (in 512B sectors)");
> > +static unsigned int g_max_segment_size = BLK_MAX_SEGMENT_SIZE;
> > +
> > +static int nullb_set_max_segment_size(const char *val,
> > +				      const struct kernel_param *kp)
> > +{
> > +	int res;
> > +
> > +	res = kstrtouint(val, 0, &g_max_segment_size);
> > +	if (res < 0)
> > +		return res;
> > +
> > +	if (g_max_segment_size < BLK_MIN_SEGMENT_SIZE)
> > +		return -EINVAL;
> > +
> > +	return 0;
> > +}
> > +
> > +static const struct kernel_param_ops max_segment_size_ops = {
> > +	.set = nullb_set_max_segment_size,
> > +	.get = param_get_uint,
> > +};
> > +
> > +module_param_cb(max_segment_size, &max_segment_size_ops, &g_max_segment_size,
> > +		0444);
> > +MODULE_PARM_DESC(max_segment_size, "Maximum size of a DMA segment in bytes");
> > +
> >   static unsigned int nr_devices = 1;
> >   module_param(nr_devices, uint, 0444);
> >   MODULE_PARM_DESC(nr_devices, "Number of devices to register");
> > @@ -442,6 +468,14 @@ static int nullb_apply_poll_queues(struct nullb_device *dev,
> >   	return ret;
> >   }
> > +static int nullb_apply_max_segment_size(struct nullb_device *dev,
> > +					unsigned int max_segment_size)
> > +{
> > +	if (max_segment_size < BLK_MIN_SEGMENT_SIZE)
> > +		return -EINVAL;
> > +	return 0;
> > +}
> > +
> >   NULLB_DEVICE_ATTR(size, ulong, NULL);
> >   NULLB_DEVICE_ATTR(completion_nsec, ulong, NULL);
> >   NULLB_DEVICE_ATTR(submit_queues, uint, nullb_apply_submit_queues);
> > @@ -450,6 +484,7 @@ NULLB_DEVICE_ATTR(home_node, uint, NULL);
> >   NULLB_DEVICE_ATTR(queue_mode, uint, NULL);
> >   NULLB_DEVICE_ATTR(blocksize, uint, NULL);
> >   NULLB_DEVICE_ATTR(max_sectors, uint, NULL);
> > +NULLB_DEVICE_ATTR(max_segment_size, uint, nullb_apply_max_segment_size);
> >   NULLB_DEVICE_ATTR(irqmode, uint, NULL);
> >   NULLB_DEVICE_ATTR(hw_queue_depth, uint, NULL);
> >   NULLB_DEVICE_ATTR(index, uint, NULL);
> > @@ -608,6 +643,7 @@ static struct configfs_attribute *nullb_device_attrs[] = {
> >   	&nullb_device_attr_index,
> >   	&nullb_device_attr_irqmode,
> >   	&nullb_device_attr_max_sectors,
> > +	&nullb_device_attr_max_segment_size,
> >   	&nullb_device_attr_mbps,
> >   	&nullb_device_attr_memory_backed,
> >   	&nullb_device_attr_no_sched,
> > @@ -805,6 +841,7 @@ static struct nullb_device *null_alloc_dev(void)
> >   	dev->queue_mode = g_queue_mode;
> >   	dev->blocksize = g_bs;
> >   	dev->max_sectors = g_max_sectors;
> > +	dev->max_segment_size = g_max_segment_size;
> >   	dev->irqmode = g_irqmode;
> >   	dev->hw_queue_depth = g_hw_queue_depth;
> >   	dev->blocking = g_blocking;
> > @@ -1248,6 +1285,9 @@ static blk_status_t null_transfer(struct nullb *nullb, struct page *page,
> >   	unsigned int valid_len = len;
> >   	void *p;
> > +	WARN_ONCE(len > dev->max_segment_size, "%u > %u\n", len,
> > +		  dev->max_segment_size);
> > +
> >   	p = kmap_local_page(page) + off;
> >   	if (!is_write) {
> >   		if (dev->zoned) {
> > @@ -1295,6 +1335,8 @@ static blk_status_t null_handle_data_transfer(struct nullb_cmd *cmd,
> >   	spin_lock_irq(&nullb->lock);
> >   	rq_for_each_segment(bvec, rq, iter) {
> >   		len = bvec.bv_len;
> > +		len = min(bvec.bv_len, nullb->dev->max_segment_size);
> > +		bvec.bv_len = len;
> >   		if (transferred_bytes + len > max_bytes)
> >   			len = max_bytes - transferred_bytes;
> >   		err = null_transfer(nullb, bvec.bv_page, len, bvec.bv_offset,
> 
> 
> IMO, since max_segment_size is now configurable, should we consider using
> blk_rq_map_sg() instead of rq_for_each_segment()?

blk_rq_map_sg() requires allocating an sgl; actually this can be done with the
request sg iterator:

	blk_rq_map_iter_init();
	while (blk_map_iter_next(rq, &iter, &vec)) {
		consume each segment;
	}

Just the two helpers need to be exported.


thanks,
Ming



end of thread, other threads:[~2026-03-30  2:24 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-03-27 21:13 [PATCH v3 0/6] Enable testing small DMA segment sizes Bart Van Assche
2026-03-27 21:13 ` [PATCH v3 1/6] block: Fix a source code comment Bart Van Assche
2026-03-27 21:13 ` [PATCH v3 2/6] block: Fix the max_user_sectors lower bound Bart Van Assche
2026-03-27 21:13 ` [PATCH v3 3/6] block: Fix the DMA segment boundary mask check Bart Van Assche
2026-03-27 21:13 ` [PATCH v3 4/6] block: Reduce the minimum value for the maximum DMA segment size Bart Van Assche
2026-03-29 14:38   ` Ming Lei
2026-03-27 21:13 ` [PATCH v3 5/6] null_blk: Support configuring " Bart Van Assche
2026-03-29 12:30   ` Nilay Shroff
2026-03-30  2:23     ` Ming Lei
2026-03-27 21:13 ` [PATCH v3 6/6] scsi_debug: Support configuring the maximum " Bart Van Assche
