From: Nitesh Shetty <nj.shetty@samsung.com>
To: Ming Lei <ming.lei@redhat.com>
Cc: axboe@kernel.dk, agk@redhat.com, snitzer@kernel.org,
dm-devel@redhat.com, kbusch@kernel.org, hch@lst.de,
sagi@grimberg.me, james.smart@broadcom.com, kch@nvidia.com,
damien.lemoal@opensource.wdc.com, naohiro.aota@wdc.com,
jth@kernel.org, viro@zeniv.linux.org.uk,
linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-nvme@lists.infradead.org, linux-fsdevel@vger.kernel.org,
anuj20.g@samsung.com, joshi.k@samsung.com, p.raghav@samsung.com,
nitheshshetty@gmail.com, gost.dev@samsung.com
Subject: Re: [PATCH v5 02/10] block: Add copy offload support infrastructure
Date: Wed, 7 Dec 2022 11:24:00 +0530
Message-ID: <20221207055400.GA6497@test-zns>
In-Reply-To: <20221129114428.GA16802@test-zns>
On Tue, Nov 29, 2022 at 05:14:28PM +0530, Nitesh Shetty wrote:
> On Thu, Nov 24, 2022 at 08:03:56AM +0800, Ming Lei wrote:
> > On Wed, Nov 23, 2022 at 03:37:12PM +0530, Nitesh Shetty wrote:
> > > On Wed, Nov 23, 2022 at 04:04:18PM +0800, Ming Lei wrote:
> > > > On Wed, Nov 23, 2022 at 11:28:19AM +0530, Nitesh Shetty wrote:
> > > > > Introduce blkdev_issue_copy which supports source and destination bdevs,
> > > > > and an array of (source, destination and copy length) tuples.
> > > > > Introduce REQ_COPY copy offload operation flag. Create a read-write
> > > > > bio pair with a token as payload and submitted to the device in order.
> > > > > Read request populates token with source specific information which
> > > > > is then passed with write request.
> > > > > This design is courtesy Mikulas Patocka's token based copy
> > > >
> > > > I thought this patchset is just for enabling copy command which is
> > > > supported by hardware. But turns out it isn't, because blk_copy_offload()
> > > > still submits read/write bios for doing the copy.
> > > >
> > > > I am just wondering why not let copy_file_range() cover this kind of copy,
> > > > and the framework has been there.
> > > >
> > >
> > > Main goal was to enable copy command, but community suggested to add
> > > copy emulation as well.
> > >
> > > blk_copy_offload - actually issues copy command in driver layer.
> > > The way read/write BIOs are perceived is different for copy offload.
> > > In copy offload we check REQ_COPY flag in NVMe driver layer to issue
> > > copy command. But we missed adding it to other drivers, where they
> > > might be treated as normal READ/WRITE.
> > >
> > > blk_copy_emulate - is used if we fail or if device doesn't support native
> > > copy offload command. Here we do READ/WRITE. Using copy_file_range for
> > > emulation might be possible, but we see 2 issues here.
> > > 1. We explored possibility of pulling dm-kcopyd to block layer so that we
> > > can readily use it. But we found it had many dependencies on the dm-layer,
> > > so we later dropped that idea.
> >
> > Is it just because dm-kcopyd supports async copy? If yes, I believe we
> > can rely on io_uring for implementing async copy_file_range, which will
> > be generic interface for async copy, and could get better perf.
> >
>
> It supports both sync and async, but it is used only inside the dm-layer.
> An async version of copy_file_range can help, and io_uring can be helpful
> for user space, but in-kernel users can't use io_uring.
>
> > > 2. copy_file_range: for block devices, at least, we saw a few checks which
> > > fail it for a raw block device. At this point I don't know much about the
> > > history of why such checks are present.
> >
> > Got it, but IMO the check in generic_copy_file_checks() can be
> > relaxed to cover blkdev cause splice does support blkdev.
> >
> > Then your bdev offload copy work can be simplified into:
> >
> > 1) implement .copy_file_range for def_blk_fops, suppose it is
> > blkdev_copy_file_range()
> >
> > 2) inside blkdev_copy_file_range()
> >
> > - if the bdev supports offload copy, just submit one bio to the device,
> > and this will be converted to one pt req to device
> >
> > - otherwise, fallback to generic_copy_file_range()
> >
>
Actually we sent the initial version with a single bio, but later the community
suggested that two bios are a must for offload, the main reasoning being
compatibility with the dm-layer, Xcopy, and copy across namespaces.
> We will check the feasibility and try to implement the scheme in next versions.
> It would be helpful if someone in the community knows why such checks are
> present. We see copy_file_range accepts only regular files. Was it
> designed only for regular files, or can we extend it to regular block
> devices?
>
As you suggested, we were able to integrate def_blk_fops and
run with a user application, but we see one main issue with this approach.
Using blkdev_copy_file_range requires having 2 file descriptors, which
is not possible for in-kernel users such as fabrics/dm-kcopyd, which have
only bdev descriptors.
Do you have any plumbing suggestions here?
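
To make the constraint concrete, here is a minimal userspace model of one
possible layering, not kernel code and not what the patchset does. Every name
in it (model_bdev, model_file, model_bdev_copy, model_file_copy) is
hypothetical. It assumes the copy logic is keyed on the block device, so that
callers holding only a bdev can reach it, and that a file-based entry point
such as a .copy_file_range hook is a thin wrapper on top. Both the "offload"
and the "emulated" paths just move bytes here; only the counters differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct block_device; not the kernel type. */
struct model_bdev {
	unsigned char *data;     /* backing storage */
	size_t size;             /* capacity in bytes */
	bool has_copy_offload;   /* device advertises a native copy command */
	unsigned int offloaded;  /* copies that took the "native" path */
	unsigned int emulated;   /* copies that took the read/write fallback */
};

/* Hypothetical stand-in for a struct file opened on a block device. */
struct model_file {
	struct model_bdev *bdev;
};

/*
 * Core helper keyed on bdevs. A caller that has no file descriptor
 * (the dm-kcopyd or fabrics case in the mail) can call this directly.
 * Returns the number of bytes copied, or -1 if a range is out of bounds.
 */
static long model_bdev_copy(struct model_bdev *src, size_t src_off,
			    struct model_bdev *dst, size_t dst_off,
			    size_t len)
{
	assert(src && dst);

	if (len > src->size || src_off > src->size - len)
		return -1;
	if (len > dst->size || dst_off > dst->size - len)
		return -1;

	/* Pick the path: native copy only if both ends support it. */
	if (src->has_copy_offload && dst->has_copy_offload)
		dst->offloaded++;
	else
		dst->emulated++;

	/* In this model both paths move the bytes the same way. */
	memmove(dst->data + dst_off, src->data + src_off, len);
	return (long)len;
}

/* Thin file-based wrapper: resolves each file to its bdev and delegates. */
static long model_file_copy(struct model_file *in, size_t pos_in,
			    struct model_file *out, size_t pos_out,
			    size_t len)
{
	return model_bdev_copy(in->bdev, pos_in, out->bdev, pos_out, len);
}
```

With this shape a user application would go through model_file_copy, while an
in-kernel style caller would call model_bdev_copy with the two bdevs it
already holds. Whether the real interfaces can be split this way is exactly
the open question above.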
> > >
> > > > When I was researching pipe/splice code for supporting ublk zero copy[1], I
> > > > have got idea for async copy_file_range(), such as: io uring based
> > > > direct splice, user backed intermediate buffer, still zero copy, if these
> > > > ideas are finally implemented, we could get super-fast generic offload copy,
> > > > and bdev copy is really covered too.
> > > >
> > > > [1] https://lore.kernel.org/linux-block/20221103085004.1029763-1-ming.lei@redhat.com/
> > > >
> > >
> > > Seems interesting, We will take a look into this.
> >
> > BTW, that is probably one direction of ublk's async zero copy IO too.
> >
> >
> > Thanks,
> > Ming
> >
> >
>
>
> Thanks,
> Nitesh
Thanks,
Nitesh Shetty