From: Damien Le Moal <dlemoal@kernel.org>
To: Nitesh Shetty <nj.shetty@samsung.com>,
Jens Axboe <axboe@kernel.dk>, Jonathan Corbet <corbet@lwn.net>,
Alasdair Kergon <agk@redhat.com>,
Mike Snitzer <snitzer@kernel.org>,
Mikulas Patocka <mpatocka@redhat.com>,
Keith Busch <kbusch@kernel.org>, Christoph Hellwig <hch@lst.de>,
Sagi Grimberg <sagi@grimberg.me>,
Chaitanya Kulkarni <kch@nvidia.com>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>
Cc: martin.petersen@oracle.com, bvanassche@acm.org,
david@fromorbit.com, hare@suse.de,
damien.lemoal@opensource.wdc.com, anuj20.g@samsung.com,
joshi.k@samsung.com, nitheshshetty@gmail.com,
gost.dev@samsung.com, linux-block@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
dm-devel@lists.linux.dev, linux-nvme@lists.infradead.org,
linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v20 01/12] block: Introduce queue limits and sysfs for copy-offload support
Date: Mon, 20 May 2024 16:33:15 +0200 [thread overview]
Message-ID: <c31f663f-36c0-4db2-8bf6-8e3c699073ca@kernel.org> (raw)
In-Reply-To: <20240520102033.9361-2-nj.shetty@samsung.com>
On 2024/05/20 12:20, Nitesh Shetty wrote:
> Add device limits as sysfs entries,
> - copy_max_bytes (RW)
> - copy_max_hw_bytes (RO)
>
> Above limits help to split the copy payload in block layer.
> copy_max_bytes: maximum total length of copy in single payload.
> copy_max_hw_bytes: Reflects the device supported maximum limit.
>
> Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
> Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
> ---
> Documentation/ABI/stable/sysfs-block | 23 +++++++++++++++
> block/blk-settings.c | 34 ++++++++++++++++++++--
> block/blk-sysfs.c | 43 ++++++++++++++++++++++++++++
> include/linux/blkdev.h | 14 +++++++++
> 4 files changed, 112 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/ABI/stable/sysfs-block b/Documentation/ABI/stable/sysfs-block
> index 831f19a32e08..52d8a253bf8e 100644
> --- a/Documentation/ABI/stable/sysfs-block
> +++ b/Documentation/ABI/stable/sysfs-block
> @@ -165,6 +165,29 @@ Description:
> last zone of the device which may be smaller.
>
>
> +What: /sys/block/<disk>/queue/copy_max_bytes
> +Date: May 2024
> +Contact: linux-block@vger.kernel.org
> +Description:
> + [RW] This is the maximum number of bytes that the block layer
> + will allow for a copy request. This is always smaller or
> + equal to the maximum size allowed by the hardware, indicated by
> + 'copy_max_hw_bytes'. An attempt to set a value higher than
> + 'copy_max_hw_bytes' will truncate this to 'copy_max_hw_bytes'.
> + Writing '0' to this file will disable offloading copies for this
> + device, instead copy is done via emulation.
> +
> +
> +What: /sys/block/<disk>/queue/copy_max_hw_bytes
> +Date: May 2024
> +Contact: linux-block@vger.kernel.org
> +Description:
> + [RO] This is the maximum number of bytes that the hardware
> + will allow for single data copy request.
> + A value of 0 means that the device does not support
> + copy offload.
> +
> +
> What: /sys/block/<disk>/queue/crypto/
> Date: February 2022
> Contact: linux-block@vger.kernel.org
> diff --git a/block/blk-settings.c b/block/blk-settings.c
> index a7fe8e90240a..67010ed82422 100644
> --- a/block/blk-settings.c
> +++ b/block/blk-settings.c
> @@ -52,6 +52,9 @@ void blk_set_stacking_limits(struct queue_limits *lim)
> lim->max_write_zeroes_sectors = UINT_MAX;
> lim->max_zone_append_sectors = UINT_MAX;
> lim->max_user_discard_sectors = UINT_MAX;
> + lim->max_copy_hw_sectors = UINT_MAX;
> + lim->max_copy_sectors = UINT_MAX;
> + lim->max_user_copy_sectors = UINT_MAX;
> }
> EXPORT_SYMBOL(blk_set_stacking_limits);
>
> @@ -219,6 +222,9 @@ static int blk_validate_limits(struct queue_limits *lim)
> lim->misaligned = 0;
> }
>
> + lim->max_copy_sectors =
> + min(lim->max_copy_hw_sectors, lim->max_user_copy_sectors);
> +
> return blk_validate_zoned_limits(lim);
> }
>
> @@ -231,10 +237,11 @@ int blk_set_default_limits(struct queue_limits *lim)
> {
> /*
> * Most defaults are set by capping the bounds in blk_validate_limits,
> - * but max_user_discard_sectors is special and needs an explicit
> - * initialization to the max value here.
> + * but max_user_discard_sectors and max_user_copy_sectors are special
> + * and needs an explicit initialization to the max value here.
s/needs/need
> */
> lim->max_user_discard_sectors = UINT_MAX;
> + lim->max_user_copy_sectors = UINT_MAX;
> return blk_validate_limits(lim);
> }
>
> @@ -316,6 +323,25 @@ void blk_queue_max_discard_sectors(struct request_queue *q,
> }
> EXPORT_SYMBOL(blk_queue_max_discard_sectors);
>
> +/*
> + * blk_queue_max_copy_hw_sectors - set max sectors for a single copy payload
> + * @q: the request queue for the device
> + * @max_copy_sectors: maximum number of sectors to copy
> + */
> +void blk_queue_max_copy_hw_sectors(struct request_queue *q,
> + unsigned int max_copy_sectors)
> +{
> + struct queue_limits *lim = &q->limits;
> +
> + if (max_copy_sectors > (BLK_COPY_MAX_BYTES >> SECTOR_SHIFT))
> + max_copy_sectors = BLK_COPY_MAX_BYTES >> SECTOR_SHIFT;
> +
> + lim->max_copy_hw_sectors = max_copy_sectors;
> + lim->max_copy_sectors =
> + min(max_copy_sectors, lim->max_user_copy_sectors);
> +}
> +EXPORT_SYMBOL_GPL(blk_queue_max_copy_hw_sectors);
Hmm... Such helper seems to not fit with Christoph's changes of the limits
initialization as that is not necessarily done using &q->limits but depending on
the driver, a different limit structure. So shouldn't this function be passed a
queue_limits struct pointer instead of the request queue pointer ?
> +
> /**
> * blk_queue_max_secure_erase_sectors - set max sectors for a secure erase
> * @q: the request queue for the device
> @@ -633,6 +659,10 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
> t->max_segment_size = min_not_zero(t->max_segment_size,
> b->max_segment_size);
>
> + t->max_copy_sectors = min(t->max_copy_sectors, b->max_copy_sectors);
> + t->max_copy_hw_sectors = min(t->max_copy_hw_sectors,
> + b->max_copy_hw_sectors);
> +
> t->misaligned |= b->misaligned;
>
> alignment = queue_limit_alignment_offset(b, start);
> diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
> index f0f9314ab65c..805c2b6b0393 100644
> --- a/block/blk-sysfs.c
> +++ b/block/blk-sysfs.c
> @@ -205,6 +205,44 @@ static ssize_t queue_discard_zeroes_data_show(struct request_queue *q, char *pag
> return queue_var_show(0, page);
> }
>
> +static ssize_t queue_copy_hw_max_show(struct request_queue *q, char *page)
> +{
> + return sprintf(page, "%llu\n", (unsigned long long)
> + q->limits.max_copy_hw_sectors << SECTOR_SHIFT);
> +}
> +
> +static ssize_t queue_copy_max_show(struct request_queue *q, char *page)
> +{
> + return sprintf(page, "%llu\n", (unsigned long long)
> + q->limits.max_copy_sectors << SECTOR_SHIFT);
> +}
Given that you repeat the same pattern twice, may be add a queue_var64_show()
helper ? (naming can be changed).
> +
> +static ssize_t queue_copy_max_store(struct request_queue *q, const char *page,
> + size_t count)
> +{
> + unsigned long max_copy_bytes;
> + struct queue_limits lim;
> + ssize_t ret;
> + int err;
> +
> + ret = queue_var_store(&max_copy_bytes, page, count);
> + if (ret < 0)
> + return ret;
> +
> + if (max_copy_bytes & (queue_logical_block_size(q) - 1))
> + return -EINVAL;
> +
> + blk_mq_freeze_queue(q);
> + lim = queue_limits_start_update(q);
> + lim.max_user_copy_sectors = max_copy_bytes >> SECTOR_SHIFT;
max_copy_bytes is an unsigned long, so 64 bits on 64-bit arch and
max_user_copy_sectors is an unsigned int, so 32-bits. There are thus no
guarantees that this will not overflow. A check is needed.
> + err = queue_limits_commit_update(q, &lim);
> + blk_mq_unfreeze_queue(q);
> +
> + if (err)
You can reuse ret here. No need for adding the err variable.
> + return err;
> + return count;
> +}
> +
> static ssize_t queue_write_same_max_show(struct request_queue *q, char *page)
> {
> return queue_var_show(0, page);
> @@ -505,6 +543,9 @@ QUEUE_RO_ENTRY(queue_nr_zones, "nr_zones");
> QUEUE_RO_ENTRY(queue_max_open_zones, "max_open_zones");
> QUEUE_RO_ENTRY(queue_max_active_zones, "max_active_zones");
>
> +QUEUE_RO_ENTRY(queue_copy_hw_max, "copy_max_hw_bytes");
> +QUEUE_RW_ENTRY(queue_copy_max, "copy_max_bytes");
> +
> QUEUE_RW_ENTRY(queue_nomerges, "nomerges");
> QUEUE_RW_ENTRY(queue_rq_affinity, "rq_affinity");
> QUEUE_RW_ENTRY(queue_poll, "io_poll");
> @@ -618,6 +659,8 @@ static struct attribute *queue_attrs[] = {
> &queue_discard_max_entry.attr,
> &queue_discard_max_hw_entry.attr,
> &queue_discard_zeroes_data_entry.attr,
> + &queue_copy_hw_max_entry.attr,
> + &queue_copy_max_entry.attr,
> &queue_write_same_max_entry.attr,
> &queue_write_zeroes_max_entry.attr,
> &queue_zone_append_max_entry.attr,
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index aefdda9f4ec7..109d9f905c3c 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -309,6 +309,10 @@ struct queue_limits {
> unsigned int discard_alignment;
> unsigned int zone_write_granularity;
>
> + unsigned int max_copy_hw_sectors;
> + unsigned int max_copy_sectors;
> + unsigned int max_user_copy_sectors;
> +
> unsigned short max_segments;
> unsigned short max_integrity_segments;
> unsigned short max_discard_segments;
> @@ -933,6 +937,8 @@ void blk_queue_max_secure_erase_sectors(struct request_queue *q,
> unsigned int max_sectors);
> extern void blk_queue_max_discard_sectors(struct request_queue *q,
> unsigned int max_discard_sectors);
> +extern void blk_queue_max_copy_hw_sectors(struct request_queue *q,
> + unsigned int max_copy_sectors);
> extern void blk_queue_max_write_zeroes_sectors(struct request_queue *q,
> unsigned int max_write_same_sectors);
> extern void blk_queue_logical_block_size(struct request_queue *, unsigned int);
> @@ -1271,6 +1277,14 @@ static inline unsigned int bdev_discard_granularity(struct block_device *bdev)
> return bdev_get_queue(bdev)->limits.discard_granularity;
> }
>
> +/* maximum copy offload length, this is set to 128MB based on current testing */
Current testing will not be current in a while... So may be simply say
"arbitrary" or something. Also please capitalize the first letter of the
comment. So something like:
/* Arbitrary absolute limit of 128 MB for copy offload. */
> +#define BLK_COPY_MAX_BYTES (1 << 27)
Also, it is not clear from the name if this is a soft limit or a cap on the
hardware limit... So at least please adjust the comment to say which one it is.
> +
> +static inline unsigned int bdev_max_copy_sectors(struct block_device *bdev)
> +{
> + return bdev_get_queue(bdev)->limits.max_copy_sectors;
> +}
> +
> static inline unsigned int
> bdev_max_secure_erase_sectors(struct block_device *bdev)
> {
--
Damien Le Moal
Western Digital Research
next prev parent reply other threads:[~2024-05-20 14:33 UTC|newest]
Thread overview: 89+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <CGME20240520102747epcas5p33497a911ca70c991e5da8e22c5d1336b@epcas5p3.samsung.com>
2024-05-20 10:20 ` [PATCH v20 00/12] Implement copy offload support Nitesh Shetty
[not found] ` <CGME20240520102830epcas5p27274901f3d0c2738c515709890b1dec4@epcas5p2.samsung.com>
2024-05-20 10:20 ` [PATCH v20 01/12] block: Introduce queue limits and sysfs for copy-offload support Nitesh Shetty
2024-05-20 14:33 ` Damien Le Moal [this message]
2024-05-21 8:15 ` Nitesh Shetty
2024-05-20 22:42 ` Bart Van Assche
2024-05-21 14:25 ` Nitesh Shetty
2024-05-22 17:49 ` Bart Van Assche
2024-05-23 7:05 ` Nitesh Shetty
2024-05-21 6:57 ` Hannes Reinecke
2024-06-01 5:53 ` Christoph Hellwig
2024-06-03 6:43 ` Nitesh Shetty
[not found] ` <CGME20240520102842epcas5p4949334c2587a15b8adab2c913daa622f@epcas5p4.samsung.com>
2024-05-20 10:20 ` [PATCH v20 02/12] Add infrastructure for copy offload in block and request layer Nitesh Shetty
2024-05-20 15:00 ` Damien Le Moal
2024-05-21 10:50 ` Nitesh Shetty
2024-05-20 23:00 ` Bart Van Assche
2024-05-21 11:17 ` Nitesh Shetty
2024-05-22 17:58 ` Bart Van Assche
2024-05-21 7:01 ` Hannes Reinecke
2024-05-24 6:54 ` Nitesh Shetty
[not found] ` <66503bc7.630a0220.56c85.8b9dSMTPIN_ADDED_BROKEN@mx.google.com>
2024-05-24 13:52 ` Bart Van Assche
2024-05-27 8:27 ` Nitesh Shetty
2024-05-28 14:07 ` Bart Van Assche
2024-05-22 18:05 ` Bart Van Assche
2024-05-23 11:34 ` Nitesh Shetty
2024-05-24 20:33 ` Bart Van Assche
2024-05-29 6:17 ` Nitesh Shetty
2024-05-29 7:48 ` Damien Le Moal
2024-05-29 22:41 ` Bart Van Assche
2024-05-30 7:16 ` Nitesh Shetty
[not found] ` <665850bd.050a0220.a5e6b.5b72SMTPIN_ADDED_BROKEN@mx.google.com>
2024-05-30 17:11 ` Bart Van Assche
2024-05-31 10:17 ` Nitesh Shetty
[not found] ` <6659b691.630a0220.90195.d0ebSMTPIN_ADDED_BROKEN@mx.google.com>
2024-05-31 23:45 ` Bart Van Assche
2024-06-01 5:59 ` Christoph Hellwig
2024-06-03 17:12 ` Bart Van Assche
2024-06-04 4:40 ` Christoph Hellwig
2024-06-04 11:44 ` Bart Van Assche
2024-06-05 8:20 ` Christoph Hellwig
[not found] ` <CGME20240624105121epcas5p3a5a8c73bd5ef19c02e922e5829a4dff0@epcas5p3.samsung.com>
2024-06-24 10:44 ` Nitesh Shetty
[not found] ` <6679526f.170a0220.9ffd.aefaSMTPIN_ADDED_BROKEN@mx.google.com>
2024-06-24 16:25 ` Bart Van Assche
2024-06-24 21:55 ` Damien Le Moal
2024-06-25 18:18 ` Bart Van Assche
2024-06-25 21:18 ` Damien Le Moal
2024-06-26 5:22 ` Christoph Hellwig
2024-06-28 13:53 ` Bart Van Assche
[not found] ` <66795280.630a0220.f3ccd.b80cSMTPIN_ADDED_BROKEN@mx.google.com>
2024-06-24 22:58 ` Keith Busch
[not found] ` <CGME20240606072827epcas5p285de8d4f3b0f6d3a87f8341414336b42@epcas5p2.samsung.com>
2024-06-06 7:28 ` Nitesh Shetty
[not found] ` <66618886.630a0220.4d4fc.1c9cSMTPIN_ADDED_BROKEN@mx.google.com>
2024-06-06 16:44 ` Bart Van Assche
2024-06-01 5:57 ` Christoph Hellwig
[not found] ` <CGME20240520102853epcas5p42d635d6712b8876ea22a45d730cb1378@epcas5p4.samsung.com>
2024-05-20 10:20 ` [PATCH v20 03/12] block: add copy offload support Nitesh Shetty
2024-06-01 6:16 ` Christoph Hellwig
2024-06-04 10:50 ` Nitesh Shetty
[not found] ` <CGME20240520102906epcas5p15b5a0b3c8edd0bf3073030a792a328bb@epcas5p1.samsung.com>
2024-05-20 10:20 ` [PATCH v20 04/12] block: add emulation for copy Nitesh Shetty
2024-05-21 7:06 ` Hannes Reinecke
2024-05-21 11:29 ` Nitesh Shetty
2024-06-01 6:18 ` Christoph Hellwig
[not found] ` <CGME20240520102917epcas5p1bda532309b9174bf2702081f6f58daf7@epcas5p1.samsung.com>
2024-05-20 10:20 ` [PATCH v20 05/12] fs/read_write: Enable copy_file_range for block device Nitesh Shetty
2024-05-21 7:07 ` Hannes Reinecke
2024-05-25 23:02 ` Dave Chinner
2024-05-28 5:57 ` Nitesh Shetty
[not found] ` <CGME20240520102929epcas5p2f4456f6fa0005d90769615eb2c2bf273@epcas5p2.samsung.com>
2024-05-20 10:20 ` [PATCH v20 06/12] fs, block: copy_file_range for def_blk_ops for direct " Nitesh Shetty
2024-05-25 23:09 ` Dave Chinner
2024-05-27 8:43 ` Nitesh Shetty
[not found] ` <CGME20240520102940epcas5p2b5f38ceabe94bed3905fb386a0d65ec7@epcas5p2.samsung.com>
2024-05-20 10:20 ` [PATCH v20 07/12] nvme: add copy offload support Nitesh Shetty
2024-06-01 6:22 ` Christoph Hellwig
2024-06-03 11:43 ` Nitesh Shetty
[not found] ` <CGME20240520102952epcas5p18716d203d1810c38397e7fcc9a26922a@epcas5p1.samsung.com>
2024-05-20 10:20 ` [PATCH v20 08/12] nvmet: add copy command support for bdev and file ns Nitesh Shetty
[not found] ` <CGME20240520103004epcas5p4a18f3f6ba0f218d57b0ab4bb84c6ff18@epcas5p4.samsung.com>
2024-05-20 10:20 ` [PATCH v20 09/12] dm: Add support for copy offload Nitesh Shetty
2024-05-21 7:11 ` Hannes Reinecke
2024-05-21 14:08 ` Nitesh Shetty
2024-05-22 6:22 ` Hannes Reinecke
2024-05-22 7:10 ` Nitesh Shetty
[not found] ` <CGME20240520103016epcas5p31b9a0f3637959626d49763609ebda6ef@epcas5p3.samsung.com>
2024-05-20 10:20 ` [PATCH v20 10/12] dm: Enable copy offload for dm-linear target Nitesh Shetty
2024-05-20 23:25 ` Bart Van Assche
2024-05-21 14:48 ` Nitesh Shetty
[not found] ` <CGME20240520103027epcas5p4789defe8ab3bff23bd2abcf019689fa2@epcas5p4.samsung.com>
2024-05-20 10:20 ` [PATCH v20 11/12] null: Enable trace capability for null block Nitesh Shetty
2024-05-20 23:27 ` Bart Van Assche
2024-06-01 6:23 ` Christoph Hellwig
2024-06-03 11:04 ` Nitesh Shetty
[not found] ` <CGME20240520103039epcas5p4373f7234162a32222ac225b976ae30ce@epcas5p4.samsung.com>
2024-05-20 10:20 ` [PATCH v20 12/12] null_blk: add support for copy offload Nitesh Shetty
2024-05-20 23:42 ` Bart Van Assche
2024-05-21 14:46 ` Nitesh Shetty
2024-05-22 17:52 ` Bart Van Assche
2024-05-23 6:55 ` Nitesh Shetty
2024-05-20 22:54 ` [PATCH v20 00/12] Implement copy offload support Bart Van Assche
2024-06-01 5:47 ` Christoph Hellwig
2024-06-03 10:53 ` Nitesh Shetty
2024-06-04 4:31 [PATCH v20 01/12] block: Introduce queue limits and sysfs for copy-offload support Christoph Hellwig
2024-06-04 7:05 ` Hannes Reinecke
2024-06-05 8:17 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=c31f663f-36c0-4db2-8bf6-8e3c699073ca@kernel.org \
--to=dlemoal@kernel.org \
--cc=agk@redhat.com \
--cc=anuj20.g@samsung.com \
--cc=axboe@kernel.dk \
--cc=brauner@kernel.org \
--cc=bvanassche@acm.org \
--cc=corbet@lwn.net \
--cc=damien.lemoal@opensource.wdc.com \
--cc=david@fromorbit.com \
--cc=dm-devel@lists.linux.dev \
--cc=gost.dev@samsung.com \
--cc=hare@suse.de \
--cc=hch@lst.de \
--cc=jack@suse.cz \
--cc=joshi.k@samsung.com \
--cc=kbusch@kernel.org \
--cc=kch@nvidia.com \
--cc=linux-block@vger.kernel.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-nvme@lists.infradead.org \
--cc=martin.petersen@oracle.com \
--cc=mpatocka@redhat.com \
--cc=nitheshshetty@gmail.com \
--cc=nj.shetty@samsung.com \
--cc=sagi@grimberg.me \
--cc=snitzer@kernel.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).