From: Ric Wheeler <rwheeler-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Miklos Szeredi <miklos-sUDqSbJrdHQHWmgEVkV9KA@public.gmane.org>
Cc: "J. Bruce Fields"
<bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>,
"Myklebust,
Trond" <Trond.Myklebust-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org>,
Zach Brown <zab-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
Anna Schumaker
<schumaker.anna-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
Kernel Mailing List
<linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
Linux-Fsdevel
<linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
"linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
<linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
"Schumaker,
Bryan" <Bryan.Schumaker-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org>,
"Martin K. Petersen" <mkp-30zCAauEzIw@public.gmane.org>,
Jens Axboe <axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>,
Mark Fasheh <mfasheh-IBi9RG/b67k@public.gmane.org>,
Joel Becker <jlbec-aKy9MeLSZ9dg9hUCZPvPmw@public.gmane.org>,
Eric Wong <normalperson-rMlxZR9MS24@public.gmane.org>
Subject: Re: [RFC] extending splice for copy offloading
Date: Mon, 30 Sep 2013 09:41:58 -0500
Message-ID: <52498DB6.7060901@redhat.com>
In-Reply-To: <CAJfpegv_C6cLOuA-mNtgtf2QbmmmcHwjQVo8mAnhf_wbJ8iRhg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
On 09/30/2013 10:38 AM, Miklos Szeredi wrote:
> On Mon, Sep 30, 2013 at 4:28 PM, Ric Wheeler <rwheeler-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
>> On 09/30/2013 10:24 AM, Miklos Szeredi wrote:
>>> On Mon, Sep 30, 2013 at 4:52 PM, Ric Wheeler <rwheeler-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
>>>> On 09/30/2013 10:51 AM, Miklos Szeredi wrote:
>>>>> On Mon, Sep 30, 2013 at 4:34 PM, J. Bruce Fields <bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> wrote:
>>>>>>> My other worry is about interruptibility/restartability. Ideas?
>>>>>>>
>>>>>>> What happens on splice(from, to, 4G) and it's a non-reflink copy?
>>>>>>> Can the page cache copy be made restartable? Or should splice() be
>>>>>>> allowed to return a short count? What happens on (non-reflink) remote
>>>>>>> copies and huge request sizes?
>>>>>> If I were writing an application that required copies to be
>>>>>> restartable, I'd probably use the largest possible range in the
>>>>>> reflink case but break the copy into smaller chunks in the splice
>>>>>> case.
>>>>>>
>>>>> The app really doesn't want to care about that. And it doesn't want
>>>>> to care about restartability, etc. It's something the *kernel* has
>>>>> to care about. You just can't have uninterruptible syscalls that
>>>>> sleep for a "long" time, otherwise first you'll just have annoyed
>>>>> users pressing ^C in vain; then, if the sleep is even longer, warnings
>>>>> about task sleeping too long.
>>>>>
>>>>> One idea is letting splice() return a short count, and so the app can
>>>>> safely issue SIZE_MAX requests and the kernel can decide if it can
>>>>> copy the whole file in one go or if it wants to do it in smaller
>>>>> chunks.
>>>>>
>>>> You cannot rely on a short count. That implies that an offloaded copy
>>>> starts at byte 0 and the short count first bytes are all valid.
>>> Huh?
>>>
>>> - app calls splice(from, 0, to, 0, SIZE_MAX)
>>>   1) VFS calls ->direct_splice(from, 0, to, 0, SIZE_MAX)
>>>     1.a) fs reflinks the whole file in a jiffy and returns the size of
>>>          the file
>>>     1.b) fs does copy offload of, say, 64MB and returns 64MB
>>>   2) VFS does page copy of, say, 1MB and returns 1MB
>>> - app calls splice(from, X, to, X, SIZE_MAX) where X is the new offset
>>> ...
>>>
>>> The point is: the app is always doing the same (incrementing offset
>>> with the return value from splice) and the kernel can decide what is
>>> the best size it can service within a single uninterruptible syscall.
>>>
>>> Wouldn't that work?
>>>
>> No.
>>
>> Keep in mind that the offload operation in (1) might fail partially. The
>> target file (the copy) is allocated, the question is what ranges have valid
>> data.
> You are talking about case 1.a, right? So if the offload copy 0-64MB
> fails partially, we return failure from splice, yet some of the copy
> did succeed. Is that the problem? Why?
>
> Thanks,
> Miklos
The way array-based offload (and some software-side reflink) works is not a
byte-by-byte copy. We cannot assume that a valid count can be returned, or that
such a count would indicate a sequential segment of good data. The whole
operation would normally have to be reissued.
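As a rough sketch of what that means for a caller: a failed offload yields no usable byte count, so the only safe retry is the same range from its original offset. None of this is a real interface; `offload_range`, `struct fake_array` and `copy_range_all_or_nothing` are made-up names, and the "array" here simply fails a configurable number of times.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result of an offloaded copy: success or failure only.
 * There is deliberately no "bytes copied" field to resume from. */
enum offload_status { OFFLOAD_OK, OFFLOAD_FAILED };

/* Hypothetical backend that fails the first `failures_left` attempts. */
struct fake_array {
    unsigned failures_left;
    unsigned attempts;
};

static enum offload_status offload_range(struct fake_array *a,
                                         size_t off, size_t len)
{
    (void)off; /* a real backend would copy [off, off + len) */
    (void)len;
    a->attempts++;
    if (a->failures_left > 0) {
        a->failures_left--;
        return OFFLOAD_FAILED;
    }
    return OFFLOAD_OK;
}

/* Returns true only when the whole range is known to be valid.
 * After a failure nothing in the target range is assumed valid,
 * so the identical range is reissued rather than a remainder. */
static bool copy_range_all_or_nothing(struct fake_array *a, size_t off,
                                      size_t len, unsigned max_tries)
{
    assert(a != NULL);
    for (unsigned i = 0; i < max_tries; i++) {
        if (offload_range(a, off, len) == OFFLOAD_OK)
            return true;
        /* No count to advance by: retry from the original offset. */
    }
    return false;
}
```

A caller that runs out of tries would have to fall back to some other copy method for the entire range, not just for a tail of it.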
To make that assumption hold, you would have to mandate it in each of the
specifications (and software targets)...
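For comparison, here is roughly what the application loop from the quoted proposal would look like if such a guarantee were mandated. This is only a sketch under that assumption: `copy_some` is a hypothetical stand-in for the proposed splice() call, and `struct fake_file` simulates a backend that serves at most `chunk` bytes per call and always returns a valid, contiguous count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical backend: a source of `size` bytes that copies at most
 * `chunk` bytes per call. Not a real kernel interface. */
struct fake_file {
    size_t size;
    size_t chunk;
    unsigned calls;
};

/* Stand-in for splice(from, off, to, off, len): returns the number of
 * bytes copied starting at `off`, or 0 once `off` reaches end of file. */
static ssize_t copy_some(struct fake_file *f, size_t off, size_t len)
{
    f->calls++;
    if (off >= f->size)
        return 0;
    size_t left = f->size - off;
    size_t n = len < left ? len : left;
    if (n > f->chunk)
        n = f->chunk;
    return (ssize_t)n;
}

/* Application side: always ask for SIZE_MAX and advance the offset by
 * whatever came back. Returns total bytes copied, or -1 on error. */
static ssize_t copy_all(struct fake_file *f)
{
    assert(f != NULL);
    size_t off = 0;
    for (;;) {
        ssize_t n = copy_some(f, off, SIZE_MAX);
        if (n < 0)
            return -1;
        if (n == 0)
            return (ssize_t)off;
        off += (size_t)n;
    }
}
```

With `chunk` set to SIZE_MAX this models the reflink case, where a single call covers the whole file; with a 1MB `chunk` it models the page cache copy. The loop is only correct because every returned count is assumed to describe good data starting at `off`, which is exactly the property that would have to be mandated.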
ric