From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ric Wheeler
Subject: Re: [RFC] extending splice for copy offloading
Date: Thu, 26 Sep 2013 17:26:39 -0400
Message-ID: <5244A68F.906@redhat.com>
References: <1378919210-10372-1-git-send-email-zab@redhat.com>
 <20130925183828.GA30372@lenny.home.zabbo.net>
 <20130925190620.GB30372@lenny.home.zabbo.net>
 <20130925195526.GA18971@fieldses.org>
 <20130925210742.GG30372@lenny.home.zabbo.net>
 <20130926185508.GO30372@lenny.home.zabbo.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Miklos Szeredi, "J. Bruce Fields", Anna Schumaker,
 Kernel Mailing List, Linux-Fsdevel,
 "linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org",
 Trond Myklebust, Bryan Schumaker, "Martin K. Petersen", Jens Axboe,
 Mark Fasheh, Joel Becker, Eric Wong
To: Zach Brown
Return-path:
In-Reply-To: <20130926185508.GO30372-fypN+1c5dIyjpB87vu3CluTW4wlIGRCZ@public.gmane.org>
Sender: linux-nfs-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
List-Id: linux-fsdevel.vger.kernel.org

On 09/26/2013 02:55 PM, Zach Brown wrote:
> On Thu, Sep 26, 2013 at 10:58:05AM +0200, Miklos Szeredi wrote:
>> On Wed, Sep 25, 2013 at 11:07 PM, Zach Brown wrote:
>>>> A client-side copy will be slower, but I guess it does have the
>>>> advantage that the application can track progress to some degree, and
>>>> abort it fairly quickly without leaving the file in a totally undefined
>>>> state--and both might be useful if the copy's not a simple constant-time
>>>> operation.
>>> I suppose, but can't the app achieve a nice middle ground by copying the
>>> file in smaller syscalls? Avoid bulk data motion back to the client,
>>> but still get notification every, I dunno, few hundred meg?
>> Yes. And if "cp" could just be switched from a read+write syscall
>> pair to a single splice syscall using the same buffer size. And then
>> the user would only notice that things got faster in case of server
>> side copy. No problems with long blocking times (at least not much
>> worse than it was).
> Hmm, yes, that would be a nice outcome.
>
>> However "cp" doesn't do reflinking by default, it has a switch for
>> that. If we just want "cp" and the like to use splice without fearing
>> side effects then by default we should try to be as close to
>> read+write behavior as possible. No?
> I guess? I don't find requiring --reflink hugely compelling. But there
> it is.
>
>> That's what I'm really
>> worrying about when you want to wire up splice to reflink by default.
>> I do think there should be a flag for that. And if on the block level
>> some magic happens, so be it. It's not the fs developer's worry any
>> more ;)
> Sure. So we'd have:
>
> - no flag default that forbids knowingly copying with shared references
>   so that it will be used by default by people who feel strongly about
>   their assumptions about independent write durability.
>
> - a flag that allows shared references for people who would otherwise
>   use the file system shared reference ioctls (ocfs2 reflink, btrfs
>   clone) but would like it to also do server-side read/write copies
>   over nfs without additional intervention.
>
> - a flag that requires shared references for callers who don't want
>   giant copies to take forever if they aren't instant. (The qemu guys
>   asked for this at Plumbers.)
>
> I think I can live with that.
>
> - z

This last flag should not prevent a copy offloaded to a remote target
device (NFS server or SCSI array) from working, though, since those
targets often do reflink-like operations internally.

ric
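
To make the three cases above concrete, here is a rough sketch of how
the decision might fall out. It is only one reading of the thread, not
a proposed interface: ShareMode, choose_strategy and the two capability
booleans are hypothetical names, and no such flags exist in splice
today. The REQUIRE branch follows the point about remote targets: an
offloaded copy is accepted even when the local file system cannot share
extents itself.

```python
import errno
from enum import Enum


class ShareMode(Enum):
    """Hypothetical names for the three behaviours discussed in the thread."""
    FORBID = "forbid"    # no flag: never knowingly create shared references
    ALLOW = "allow"      # shared references are fine, fall back to copying
    REQUIRE = "require"  # fail rather than fall back to a slow local copy


def choose_strategy(mode, fs_can_share, target_can_offload):
    """Pick "share", "offload" or "copy" for one copy request.

    fs_can_share:       the local fs has a reflink/clone style operation
    target_can_offload: a remote target (NFS server, SCSI array) can do
                        the copy itself without moving data to the client
    Both inputs are assumptions made for this sketch.
    """
    if mode is ShareMode.FORBID:
        # The caller never asks for sharing. What the target does
        # internally at the block level is not visible here.
        return "offload" if target_can_offload else "copy"

    if fs_can_share:
        return "share"
    if target_can_offload:
        # Applies to ALLOW and REQUIRE alike: the target may well do a
        # reflink-like operation internally, so this should not be refused.
        return "offload"
    if mode is ShareMode.ALLOW:
        return "copy"

    # REQUIRE with neither capability: report failure instead of starting
    # a copy that could take forever.
    raise OSError(errno.EOPNOTSUPP, "no shared-reference or offload path")
```

A caller in the qemu position would pass ShareMode.REQUIRE and treat the
OSError as "do something else", while a cp-like caller would stay on
ShareMode.FORBID unless the user asked for reflinks.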