public inbox for linux-block@vger.kernel.org
From: Bart Van Assche <bvanassche@acm.org>
To: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org, linux-scsi@vger.kernel.org,
	linux-nvme@lists.infradead.org, Christoph Hellwig <hch@lst.de>,
	Nitesh Shetty <nj.shetty@samsung.com>,
	Bart Van Assche <bvanassche@acm.org>,
	Anuj Gupta <anuj20.g@samsung.com>
Subject: [PATCH 02/12] block: Add the REQ_OP_COPY_{SRC,DST} operations
Date: Fri, 24 Apr 2026 15:41:51 -0700
Message-ID: <20260424224201.1949243-3-bvanassche@acm.org>
In-Reply-To: <20260424224201.1949243-1-bvanassche@acm.org>

From: Nitesh Shetty <nj.shetty@samsung.com>

Introduce the REQ_OP_COPY_SRC and REQ_OP_COPY_DST operations. The source
and destination LBA ranges are carried in separate bios because any
other approach would require a rewrite of the device mapper. These bios
are associated with each other via the new bi_copy_ctx pointer. A new
pointer is needed because the copy offloading context must be preserved
when a bio is cloned, whereas the bi_private member must not be copied
to the clone.
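
The cloning rule can be illustrated with a small userspace model. This is
only a sketch: struct demo_bio and demo_bio_clone() are hypothetical
stand-ins and not kernel code. The model assumes that a fresh clone starts
with an empty bi_private and only inherits bi_copy_ctx, which mirrors the
one-line __bio_clone() change in this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct bio; not the kernel type. */
struct demo_bio {
	void *bi_private;	/* owned by the submitter, not inherited */
	void *bi_copy_ctx;	/* copy offload context, inherited by clones */
};

/* Model of cloning: the clone shares the copy context but not bi_private. */
static void demo_bio_clone(struct demo_bio *dst, const struct demo_bio *src)
{
	dst->bi_private = NULL;
	dst->bi_copy_ctx = src->bi_copy_ctx;
}
```

A stacking driver that clones a REQ_OP_COPY_* bio is then free to use
bi_private of the clone for its own purposes without losing the link to the
copy context.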

This patch supports the following approach for copy offloading:
1. Allocate a struct bio_copy_offload_ctx instance and set phase to
   BLKDEV_TRANSLATE_LBAS.
2. Allocate REQ_OP_COPY_SRC and REQ_OP_COPY_DST bios. Set the
   bi_copy_ctx member of these bios.
3. Set the bio_count member of struct bio_copy_offload_ctx.
4. Submit all REQ_OP_COPY_* bios.
5. In submit_bio_noacct(), do the following for REQ_OP_COPY_* bios:
   - If bio->bi_bdev is a stacking device, submit the bio. This will
     send the bio to the device mapper. The device mapper will clone the
     bio, translate the LBAs and will submit the cloned bio. That will
     result in a recursive submit_bio() call.
   - If bio->bi_bdev is not a stacking device, add the bio to the
     copy_ctx->bios list and decrement copy_ctx->bio_count.
6. Once copy_ctx->bio_count == 0, call copy_ctx->translation_complete().
7. In the implementation of copy_ctx->translation_complete(), change
   copy_ctx->phase from BLKDEV_TRANSLATE_LBAS to BLKDEV_COPY.
8. Submit the first REQ_OP_COPY_* bio of the copy_ctx->bios list.
9. Once this bio reaches the block driver associated with the bio,
   retrieve the other bios involved in the copy operation from the copy
   context data structure and convert all these bios into a copy offload
   operation.
10. Once this bio completes, also complete all the other bios involved
    in the copy offload operation.
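
Steps 5 through 7 can be sketched as a userspace model. All names below
(model_ctx, model_bio, model_submit(), the 'stacking' flag and the
'translated' counter) are hypothetical and exist only for illustration.
Locking and the failure path for devices without copy offload support are
left out on purpose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_phase { MODEL_TRANSLATE_LBAS, MODEL_COPY };

/* Hypothetical stand-in for a REQ_OP_COPY_* bio. */
struct model_bio {
	struct model_bio *bi_next;
	bool stacking;		/* true if the target is a stacking device */
};

/* Hypothetical stand-in for struct bio_copy_offload_ctx. */
struct model_ctx {
	enum model_phase phase;
	struct model_bio *bios, *biotail;
	unsigned int bio_count;	/* bios whose LBAs are not yet translated */
	unsigned int translated;	/* number of callback invocations */
};

/* Step 7: switch from LBA translation to the copy phase. */
static void model_translation_complete(struct model_ctx *ctx)
{
	ctx->phase = MODEL_COPY;
	ctx->translated++;
}

/*
 * Steps 5 and 6. Returns true if the bio was parked on ctx->bios and false
 * if the caller should pass it on (stacking device or copy phase).
 */
static bool model_submit(struct model_ctx *ctx, struct model_bio *bio)
{
	if (ctx->phase == MODEL_COPY || bio->stacking)
		return false;
	/* Append the bio at the end of the list. */
	if (ctx->biotail)
		ctx->biotail->bi_next = bio;
	else
		ctx->bios = bio;
	ctx->biotail = bio;
	if (--ctx->bio_count == 0)
		model_translation_complete(ctx);
	return true;
}
```

With bio_count set to 2, parking one source bio and one destination bio
triggers the callback exactly once. Bios that target a stacking device
leave the counter untouched because only their clones reach the bottom
device.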

This patch increases the size of struct bio from 104 to 112 bytes on 64-bit
systems.

To be discussed further: whether adding a new member to struct bio is
acceptable, or whether the new pointer should instead be stored in front
of the bio. bioset_init() supports front padding.
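
For reference, the front padding alternative could look roughly like the
userspace model below. struct pad_bio, struct copy_bio and the helper
functions are hypothetical names chosen for this sketch. In the kernel the
padding size would be offsetof(struct copy_bio, bio), passed as the
front_pad argument of bioset_init(). Since front padding is a property of
the bio_set a bio is allocated from, a clone allocated from another bio_set
would presumably not carry the pointer automatically.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct bio; the real structure is much larger. */
struct pad_bio {
	int bi_opf;
};

/* Everything in front of 'bio' plays the role of front padding. */
struct copy_bio {
	void *copy_ctx;
	struct pad_bio bio;
};

/* Allocate the wrapper and hand out only the embedded bio. */
static struct pad_bio *copy_bio_alloc(void *copy_ctx)
{
	struct copy_bio *cb = calloc(1, sizeof(*cb));

	if (!cb)
		return NULL;
	cb->copy_ctx = copy_ctx;
	return &cb->bio;
}

/* Recover the wrapper from the bio pointer, in the style of container_of(). */
static struct copy_bio *to_copy_bio(struct pad_bio *bio)
{
	return (struct copy_bio *)((char *)bio - offsetof(struct copy_bio, bio));
}

/* Free the whole allocation, including the padding in front of the bio. */
static void copy_bio_free(struct pad_bio *bio)
{
	free(to_copy_bio(bio));
}
```

In this model the stand-in for struct bio itself does not grow; the cost
moves to the allocations that actually need the copy context.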

Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
[ bvanassche: changed the approach of this patch from combining the
  COPY_SRC and COPY_DST operations immediately to translating the LBA
  information first. ]
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/bio.c               |  1 +
 block/blk-core.c          | 38 ++++++++++++++++++++++++++++++++
 block/blk-merge.c         | 13 +++++++++++
 block/blk.h               |  5 +++++
 include/linux/blk-copy.h  | 46 +++++++++++++++++++++++++++++++++++++++
 include/linux/blk_types.h | 17 +++++++++++++++
 6 files changed, 120 insertions(+)
 create mode 100644 include/linux/blk-copy.h

diff --git a/block/bio.c b/block/bio.c
index b8972dba68a0..51480c9be27b 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -852,6 +852,7 @@ static int __bio_clone(struct bio *bio, struct bio *bio_src, gfp_t gfp)
 	bio->bi_write_hint = bio_src->bi_write_hint;
 	bio->bi_write_stream = bio_src->bi_write_stream;
 	bio->bi_iter = bio_src->bi_iter;
+	bio->bi_copy_ctx = bio_src->bi_copy_ctx;
 
 	if (bio->bi_bdev) {
 		if (bio->bi_bdev == bio_src->bi_bdev &&
diff --git a/block/blk-core.c b/block/blk-core.c
index 17450058ea6d..37c01e717202 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -16,6 +16,7 @@
 #include <linux/module.h>
 #include <linux/bio.h>
 #include <linux/blkdev.h>
+#include <linux/blk-copy.h>
 #include <linux/blk-pm.h>
 #include <linux/blk-integrity.h>
 #include <linux/highmem.h>
@@ -108,6 +109,8 @@ static const char *const blk_op_name[] = {
 	REQ_OP_NAME(ZONE_FINISH),
 	REQ_OP_NAME(ZONE_APPEND),
 	REQ_OP_NAME(WRITE_ZEROES),
+	REQ_OP_NAME(COPY_SRC),
+	REQ_OP_NAME(COPY_DST),
 	REQ_OP_NAME(DRV_IN),
 	REQ_OP_NAME(DRV_OUT),
 };
@@ -782,6 +785,8 @@ void submit_bio_noacct(struct bio *bio)
 	struct block_device *bdev = bio->bi_bdev;
 	struct request_queue *q = bdev_get_queue(bdev);
 	blk_status_t status = BLK_STS_IOERR;
+	struct bio_copy_offload_ctx *copy_ctx;
+	s32 bio_count;
 
 	might_sleep();
 
@@ -875,6 +880,39 @@ void submit_bio_noacct(struct bio *bio)
 		 * requests.
 		 */
 		fallthrough;
 	default:
 		goto not_supported;
+	case REQ_OP_COPY_SRC:
+	case REQ_OP_COPY_DST:
+		copy_ctx = bio->bi_copy_ctx;
+		WARN_ON_ONCE(copy_ctx->phase == BLKDEV_COPY_DONE);
+		if (copy_ctx->phase == BLKDEV_COPY)
+			break;
+		/* If copy offloading is not supported, fail the bio. */
+		if (!q->limits.max_copy_sectors) {
+			scoped_guard(spinlock_irqsave, &copy_ctx->lock)
+				copy_ctx->bio_count--;
+			goto not_supported;
+		}
+		/*
+		 * If the block driver is a stacking driver that supports copy
+		 * offloading, submit the bio.
+		 */
+		if (q->limits.features & BLK_FEAT_STACKING_COPY_OFFL)
+			break;
+		/*
+		 * Append the bio at the end of the bio->bi_copy_ctx->bios list.
+		 */
+		scoped_guard(spinlock_irqsave, &copy_ctx->lock) {
+			if (copy_ctx->biotail)
+				copy_ctx->biotail->bi_next = bio;
+			else
+				copy_ctx->bios = bio;
+			copy_ctx->biotail = bio;
+			bio_count = --copy_ctx->bio_count;
+		}
+		WARN_ON_ONCE(bio_count < 0);
+		if (bio_count == 0)
+			copy_ctx->translation_complete(copy_ctx);
+		return;
 	}
diff --git a/block/blk-merge.c b/block/blk-merge.c
index fcf09325b22e..4678131650d2 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -207,6 +207,19 @@ struct bio *bio_split_discard(struct bio *bio, const struct queue_limits *lim,
 	return __bio_split_discard(bio, lim, nsegs, max_sectors);
 }
 
+struct bio *bio_split_copy(struct bio *bio, const struct queue_limits *lim,
+			   unsigned int *nsegs)
+{
+	*nsegs = 1;
+	if (bio_sectors(bio) <= lim->max_copy_sectors)
+		return bio;
+
+	/* Splitting a REQ_OP_COPY_* bio is not supported. */
+	bio->bi_status = BLK_STS_NOTSUPP;
+	bio_endio(bio);
+	return NULL;
+}
+
 static inline unsigned int blk_boundary_sectors(const struct queue_limits *lim,
 						bool is_atomic)
 {
diff --git a/block/blk.h b/block/blk.h
index b998a7761faf..274c226e87ee 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -379,6 +379,8 @@ struct bio *bio_split_rw(struct bio *bio, const struct queue_limits *lim,
 		unsigned *nr_segs);
 struct bio *bio_split_zone_append(struct bio *bio,
 		const struct queue_limits *lim, unsigned *nr_segs);
+struct bio *bio_split_copy(struct bio *bio, const struct queue_limits *lim,
+			   unsigned int *nsegs);
 
 /*
  * All drivers must accept single-segments bios that are smaller than PAGE_SIZE.
@@ -435,6 +437,9 @@ static inline struct bio *__bio_split_to_limits(struct bio *bio,
 		return bio_split_discard(bio, lim, nr_segs);
 	case REQ_OP_WRITE_ZEROES:
 		return bio_split_write_zeroes(bio, lim, nr_segs);
+	case REQ_OP_COPY_SRC:
+	case REQ_OP_COPY_DST:
+		return bio_split_copy(bio, lim, nr_segs);
 	default:
 		/* other operations can't be split */
 		*nr_segs = 0;
diff --git a/include/linux/blk-copy.h b/include/linux/blk-copy.h
new file mode 100644
index 000000000000..5e38cfc14a71
--- /dev/null
+++ b/include/linux/blk-copy.h
@@ -0,0 +1,46 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __LINUX_BLK_COPY_H
+#define __LINUX_BLK_COPY_H
+
+#include <linux/blk_types.h>
+#include <linux/completion.h>
+#include <linux/list.h>
+#include <linux/spinlock_types.h>
+#include <linux/workqueue_types.h>
+
+struct blk_copy_params;
+struct request;
+
+enum blkdev_copy_phase {
+	BLKDEV_TRANSLATE_LBAS,
+	BLKDEV_COPY,
+	BLKDEV_COPY_DONE,
+};
+
+/*
+ * struct bio_copy_offload_ctx - context information for blkdev_copy_offload()
+ * @params: Input parameters passed to blkdev_copy_offload().
+ * @len: Number of bytes associated with this copy context.
+ * @phase: Copy offload phase: either translating LBAs or copying data.
+ * @lock: Protects @bios, @biotail, @bio_count and @status.
+ * @bios: List with REQ_OP_COPY_* bios for which LBA translation completed.
+ * @biotail: Last element in the @bios list.
+ * @bio_count: Number of bios for which LBA translation has not yet completed.
+ * @status: bio completion status.
+ * @translation_complete: Called after LBA translation has completed.
+ *	LBA translation has completed once bio_count drops to zero.
+ */
+struct bio_copy_offload_ctx {
+	struct blk_copy_params *params;
+	loff_t len;
+	enum blkdev_copy_phase phase;
+	spinlock_t lock;
+	struct bio *bios __guarded_by(&lock);
+	struct bio *biotail __guarded_by(&lock);
+	u32 bio_count __guarded_by(&lock);
+	blk_status_t status __guarded_by(&lock);
+	void (*translation_complete)(struct bio_copy_offload_ctx *ctx);
+};
+
+#endif /* __LINUX_BLK_COPY_H */
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 8808ee76e73c..4e448e810b87 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -284,6 +284,8 @@ struct bio {
 	atomic_t		__bi_cnt;	/* pin count */
 
 	struct bio_set		*bi_pool;
+
+	void			*bi_copy_ctx;
 };
 
 #define BIO_RESET_BYTES		offsetof(struct bio, bi_max_vecs)
@@ -370,6 +372,10 @@ enum req_op {
 	/** @REQ_OP_ZONE_RESET_ALL: reset all the zone present on the device */
 	REQ_OP_ZONE_RESET_ALL	= (__force blk_opf_t)19,
 
+	/* copy offload source and destination operations */
+	REQ_OP_COPY_SRC		= (__force blk_opf_t)20,
+	REQ_OP_COPY_DST		= (__force blk_opf_t)21,
+
 	/* Driver private requests */
 	/* private: */
 	REQ_OP_DRV_IN		= (__force blk_opf_t)34,
@@ -461,6 +467,17 @@ static inline bool op_is_write(blk_opf_t op)
 	return !!(op & (__force blk_opf_t)1);
 }
 
+static inline bool op_is_copy(blk_opf_t op)
+{
+	switch (op & REQ_OP_MASK) {
+	case REQ_OP_COPY_DST:
+	case REQ_OP_COPY_SRC:
+		return true;
+	default:
+		return false;
+	}
+}
+
 /*
  * Check if the bio or request is one that needs special treatment in the
  * flush state machine.

