From mboxrd@z Thu Jan 1 00:00:00 1970
From: Al Viro
Subject: Re: [PATCH 0/5] splice: locking changes and code refactoring
Date: Fri, 7 Feb 2014 17:10:23 +0000
Message-ID: <20140207171023.GU10323@ZenIV.linux.org.uk>
References: <20140118074649.GF10323@ZenIV.linux.org.uk> <20140118082730.GH10323@ZenIV.linux.org.uk> <20140118.004453.1800341321580114709.davem@davemloft.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: torvalds@linux-foundation.org, hch@infradead.org, axboe@kernel.dk, mfasheh@suse.com, jlbec@evilplan.org, linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com, sage@inktank.com, sfrench@samba.org
To: David Miller
Return-path:
Received: from zeniv.linux.org.uk ([195.92.253.2]:56181 "EHLO ZenIV.linux.org.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751316AbaBGRKg (ORCPT ); Fri, 7 Feb 2014 12:10:36 -0500
Content-Disposition: inline
In-Reply-To: <20140118.004453.1800341321580114709.davem@davemloft.net>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Sat, Jan 18, 2014 at 12:44:53AM -0800, David Miller wrote:
> From: Al Viro
> Date: Sat, 18 Jan 2014 08:27:30 +0000
>
> > BTW, would sockets benefit from having ->sendpages() that would take an
> > array of (page, offset, len) triples? It would be trivial to do and
> > some of the helpers that are falling out of writing that writev-based
> > default_file_splice_write() look like they could be reused for
> > calling that one... Dave?
>
> That's originally how the sendpage method was implemented, but back then
> Linus asked us to only pass one page at a time.
>
> I don't remember the details beyond that.

FWIW, I wonder if what we are doing with ->msg_iov is the right thing.
We modify the iovecs in the array as we drain it. And that's inconvenient
for at least some callers (see e.g. complaints in fs/ncpfs about the need
to copy the array, etc.). What if we embed iov_iter into the sucker and
replace memcpy_{to,from}iovec* with variants taking iov_iter *?
If nothing else, it'll be marginally more efficient (no more skipping the
already-emptied iovecs) and it seems to be more convenient for callers.
If we are lucky, that might even eliminate the need for ->sendpage() - just
set the iov_iter over a page array instead of an iovec one and let
->sendmsg() do the smart thing if it knows how. I haven't compared
{tcp,udp}_send{page,msg}, though - there might be dragons...

Even if that turns out to be infeasible, it will at least drive the
kmap/kunmap done by sock_no_sendpage() down into memcpy_from_iter(),
turning them into kmap_atomic/kunmap_atomic.

The obvious price is that the kernel-side msghdr diverges from the userland
one, so copy_msghdr_from_user() needs to deal with that, but I really doubt
that you'll find a load where the price of copying it in two chunks instead
of one would be measurable.

What else am I missing?
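
For illustration only, here is a rough userspace sketch of the cursor idea
above: the iovec array stays const and only a small iterator advances, so
callers need not copy the array and already-emptied segments are never
rescanned. struct iter_sketch, iter_sketch_init() and
copy_from_iter_sketch() are made-up names for this sketch; they are not the
kernel's iov_iter API and make no claim about how memcpy_from_iter() would
actually be written.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>	/* struct iovec */

/* Hypothetical stand-in for an iterator embedded in msghdr. */
struct iter_sketch {
	const struct iovec *iov;	/* current segment; array is never modified */
	size_t nr_segs;			/* segments left, counting from iov */
	size_t iov_offset;		/* bytes already consumed in *iov */
	size_t count;			/* total bytes left */
};

static void iter_sketch_init(struct iter_sketch *it,
			     const struct iovec *iov, size_t nr_segs)
{
	size_t i;

	assert(iov != NULL || nr_segs == 0);
	it->iov = iov;
	it->nr_segs = nr_segs;
	it->iov_offset = 0;
	it->count = 0;
	for (i = 0; i < nr_segs; i++)
		it->count += iov[i].iov_len;
}

/*
 * Copy up to len bytes out of the iterator into dst.  Only the cursor
 * moves; iov_base/iov_len in the caller's array are left untouched.
 * Returns the number of bytes copied.
 */
static size_t copy_from_iter_sketch(void *dst, struct iter_sketch *it,
				    size_t len)
{
	unsigned char *out = dst;
	size_t copied = 0;

	if (len > it->count)
		len = it->count;

	while (copied < len) {
		size_t avail = it->iov->iov_len - it->iov_offset;
		size_t n = len - copied < avail ? len - copied : avail;

		if (n)
			memcpy(out + copied,
			       (const unsigned char *)it->iov->iov_base +
			       it->iov_offset, n);
		copied += n;
		it->iov_offset += n;

		/* Segment exhausted: step to the next one, never look back. */
		if (it->iov_offset == it->iov->iov_len) {
			it->iov++;
			it->nr_segs--;
			it->iov_offset = 0;
		}
	}
	it->count -= copied;
	return copied;
}
```

A caller would initialise the iterator once over its iovec array and then
call copy_from_iter_sketch() repeatedly; a partial copy leaves the cursor in
the middle of a segment, and the next call resumes from there without
touching the array.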