From: Nitesh Shetty <nj.shetty@samsung.com>
To: Ming Lei <ming.lei@redhat.com>
Cc: linux-nvme@lists.infradead.org, dm-devel@redhat.com, hch@lst.de,
	agk@redhat.com, naohiro.aota@wdc.com, sagi@grimberg.me,
	gost.dev@samsung.com, damien.lemoal@opensource.wdc.com,
	james.smart@broadcom.com, p.raghav@samsung.com, kch@nvidia.com,
	anuj20.g@samsung.com, snitzer@kernel.org,
	linux-block@vger.kernel.org, viro@zeniv.linux.org.uk,
	kbusch@kernel.org, axboe@kernel.dk, joshi.k@samsung.com,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	jth@kernel.org, nitheshshetty@gmail.com
Subject: Re: [dm-devel] [PATCH v5 02/10] block: Add copy offload support infrastructure
Date: Wed, 7 Dec 2022 11:24:00 +0530
Message-ID: <20221207055400.GA6497@test-zns>
In-Reply-To: <20221129114428.GA16802@test-zns>


On Tue, Nov 29, 2022 at 05:14:28PM +0530, Nitesh Shetty wrote:
> On Thu, Nov 24, 2022 at 08:03:56AM +0800, Ming Lei wrote:
> > On Wed, Nov 23, 2022 at 03:37:12PM +0530, Nitesh Shetty wrote:
> > > On Wed, Nov 23, 2022 at 04:04:18PM +0800, Ming Lei wrote:
> > > > On Wed, Nov 23, 2022 at 11:28:19AM +0530, Nitesh Shetty wrote:
> > > > > Introduce blkdev_issue_copy which supports source and destination bdevs,
> > > > > and an array of (source, destination and copy length) tuples.
> > > > > Introduce REQ_COPY copy offload operation flag. Create a read-write
> > > > > bio pair with a token as payload and submit it to the device in order.
> > > > > The read request populates the token with source-specific information,
> > > > > which is then passed with the write request.
> > > > > This design is courtesy of Mikulas Patocka's token-based copy
> > > > 
> > > > I thought this patchset is just for enabling copy command which is
> > > > supported by hardware. But turns out it isn't, because blk_copy_offload()
> > > > still submits read/write bios for doing the copy.
> > > > 
> > > > I am just wondering why not let copy_file_range() cover this kind of copy,
> > > > and the framework has been there.
> > > > 
> > > 
> > > The main goal was to enable the copy command, but the community suggested
> > > adding copy emulation as well.
> > > 
> > > blk_copy_offload - actually issues the copy command in the driver layer.
> > > The way read/write BIOs are perceived is different for copy offload.
> > > In copy offload we check the REQ_COPY flag in the NVMe driver layer to issue
> > > the copy command. But we missed adding that check in other drivers, where
> > > these BIOs might be treated as normal READ/WRITE.
> > > 
> > > blk_copy_emulate - is used if we fail or if the device doesn't support a
> > > native copy offload command. Here we do READ/WRITE. Using copy_file_range for
> > > emulation might be possible, but we see 2 issues here.
> > > 1. We explored the possibility of pulling dm-kcopyd into the block layer so
> > > that we can readily use it. But we found it had many dependencies on the
> > > dm layer, so we later dropped that idea.
> > 
> > Is it just because dm-kcopyd supports async copy? If yes, I believe we
> > can rely on io_uring for implementing async copy_file_range, which will
> > be a generic interface for async copy, and could get better perf.
> >
> 
> It supports both sync and async, but it is used only inside the dm layer.
> An async version of copy_file_range can help. Using io_uring can be helpful
> for user space, but in-kernel users can't use io_uring.
> 
> > > 2. copy_file_range: for block devices, at least, we saw a few checks which
> > > fail it for a raw block device. At this point I don't know much about the
> > > history of why such a check is present.
> > 
> > Got it, but IMO the check in generic_copy_file_checks() can be
> > relaxed to cover blkdev, because splice does support blkdev.
> > 
> > Then your bdev offload copy work can be simplified into:
> > 
> > 1) implement .copy_file_range for def_blk_fops, suppose it is
> > blkdev_copy_file_range()
> > 
> > 2) inside blkdev_copy_file_range()
> > 
> > - if the bdev supports offload copy, just submit one bio to the device,
> > and this will be converted to one pt req to device
> > 
> > - otherwise, fallback to generic_copy_file_range()
> >
> 

Actually, we sent the initial version with a single bio, but the community
later suggested that two bios are a must for offload, the main reasons being
compatibility with the dm layer, XCOPY, and copy across namespaces.
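
For readers following the thread, the two-step (read, then write) scheme can be
modeled in user space roughly as below. This is only an illustrative sketch under
stated assumptions: struct copy_token, model_read_phase() and model_write_phase()
are hypothetical names, not functions from the patch set, and flat byte arrays
stand in for the source and destination devices.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token: filled by the "read" phase, consumed by the "write" phase. */
struct copy_token {
	const unsigned char *src_dev;	/* which source "device" holds the range */
	size_t src_off;			/* start of the source range */
	size_t len;			/* length of the range */
	int valid;			/* set once the read phase has run */
};

/* Phase 1: no data moves; the token only records where the data lives. */
static int model_read_phase(struct copy_token *tok, const unsigned char *src_dev,
			    size_t dev_size, size_t off, size_t len)
{
	if (off > dev_size || len > dev_size - off)
		return -1;	/* range falls outside the source */
	tok->src_dev = src_dev;
	tok->src_off = off;
	tok->len = len;
	tok->valid = 1;
	return 0;
}

/* Phase 2: the "device" performs the copy described by the token. */
static int model_write_phase(const struct copy_token *tok, unsigned char *dst_dev,
			     size_t dev_size, size_t off)
{
	if (!tok->valid || off > dev_size || tok->len > dev_size - off)
		return -1;	/* no token yet, or range falls outside the destination */
	/* memmove tolerates overlap when source and destination are the same array */
	memmove(dst_dev + off, tok->src_dev + tok->src_off, tok->len);
	return 0;
}
```

In this model the write phase is rejected until a read phase has populated the
token, which mirrors the "submitted to the device in order" requirement. Because
the token carries the source identity, source and destination may be different
arrays, which is the property a single request aimed at one device does not
naturally give.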

> We will check the feasibility and try to implement the scheme in the next
> versions. It would be helpful if someone in the community knows why such
> checks are present. We see copy_file_range accepts only regular files. Was it
> designed only for regular files, or can we extend it to regular block
> devices?
>

As you suggested, we were able to integrate this into def_blk_fops and
run it with a user application, but we see one main issue with this approach.
Using blkdev_copy_file_range requires two file descriptors, which
is not possible for in-kernel users such as fabrics/dm-kcopyd, which have
only bdev descriptors.
Do you have any plumbing suggestions here?
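
To make the two-descriptor requirement concrete, here is a minimal user-space
sketch of that path, assuming Linux with glibc 2.27 or later. copy_file_range(2)
is the real system call; copy_emulate(), copy_range() and copy_between_paths()
are hypothetical helper names used only for illustration, and the errno values
that trigger the read/write fallback are an assumption of this sketch.

```c
#define _GNU_SOURCE
#define _FILE_OFFSET_BITS 64
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical fallback: plain pread/pwrite loop when offload is unavailable. */
static ssize_t copy_emulate(int fd_in, off_t off_in, int fd_out, off_t off_out, size_t len)
{
	char buf[4096];
	ssize_t total = 0;

	while (len > 0) {
		size_t want = len < sizeof(buf) ? len : sizeof(buf);
		ssize_t rd = pread(fd_in, buf, want, off_in);
		ssize_t done = 0;

		if (rd < 0)
			return -1;
		if (rd == 0)
			break;	/* end of source */
		while (done < rd) {
			ssize_t wr = pwrite(fd_out, buf + done, rd - done, off_out + done);

			if (wr <= 0)
				return -1;
			done += wr;
		}
		off_in += rd;
		off_out += rd;
		len -= rd;
		total += rd;
	}
	return total;
}

/* Try copy_file_range() first; emulate if this fd pair is not supported. */
static ssize_t copy_range(int fd_in, off_t off_in, int fd_out, off_t off_out, size_t len)
{
	off64_t in = off_in, out = off_out;
	ssize_t total = 0;

	while (len > 0) {
		ssize_t n = copy_file_range(fd_in, &in, fd_out, &out, len, 0);

		if (n < 0) {
			ssize_t e;

			if (errno != EINVAL && errno != EXDEV &&
			    errno != ENOSYS && errno != EOPNOTSUPP)
				return -1;
			e = copy_emulate(fd_in, in, fd_out, out, len);
			return e < 0 ? -1 : total + e;
		}
		if (n == 0)
			break;	/* nothing more to copy */
		len -= n;
		total += n;
	}
	return total;
}

/* Both ends must be opened as files: this is the two-descriptor requirement. */
static ssize_t copy_between_paths(const char *src, off_t off_in,
				  const char *dst, off_t off_out, size_t len)
{
	int fd_in = open(src, O_RDONLY);
	int fd_out = open(dst, O_WRONLY | O_CREAT, 0644);
	ssize_t ret = -1;

	if (fd_in >= 0 && fd_out >= 0)
		ret = copy_range(fd_in, off_in, fd_out, off_out, len);
	if (fd_in >= 0)
		close(fd_in);
	if (fd_out >= 0)
		close(fd_out);
	return ret;
}
```

A user application can open both ends this way, for example
copy_between_paths("/path/to/src", 0, "/path/to/dst", 0, 4096) with placeholder
paths. An in-kernel caller that holds only a struct block_device has no
descriptor to pass, which is the gap described above.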

> > > 
> > > > When I was researching pipe/splice code for supporting ublk zero copy[1], I
> > > > got some ideas for async copy_file_range(), such as: io_uring based
> > > > direct splice, user backed intermediate buffer, still zero copy. If these
> > > > ideas are finally implemented, we could get super-fast generic offload copy,
> > > > and bdev copy would be covered too.
> > > > 
> > > > [1] https://lore.kernel.org/linux-block/20221103085004.1029763-1-ming.lei@redhat.com/
> > > > 
> > > 
> > > Seems interesting. We will take a look at this.
> > 
> > BTW, that is probably one direction of ublk's async zero copy IO too.
> > 
> > 
> > Thanks, 
> > Ming
> > 
> > 
> 
> 
> Thanks,
> Nitesh

Thanks,
Nitesh Shetty

