From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 3/2] splice: increase pipe size in splice_direct_to_actor()
Date: Wed, 14 Nov 2018 08:41:00 +1100
Message-ID: <20181113214100.GS19305@dastard>
In-Reply-To: <20181113162237.GF4235@magnolia>

On Tue, Nov 13, 2018 at 08:22:37AM -0800, Darrick J. Wong wrote:
> On Fri, Nov 09, 2018 at 11:54:10AM +1100, Dave Chinner wrote:
> > 
> > From: Dave Chinner <dchinner@redhat.com>
> > 
> > When copy_file_range() is called on files that have been opened with
> > O_DIRECT, do_splice_direct() does a manual copy of the range one
> > pipe buffer at a time. The default is 16 pages, which means on
> > x86_64 it is limited to 64kB IO. This is extremely slow - 64k
> > synchronous read/write will run at maybe 5-10MB/s on a spinning disk
> > and be seek bound. It will be faster on SSDs, but still very
> > inefficient.
> > 
> > Increase the pipe size to the maximum allowed user size so that we
> > can get decent throughput for this highly sub-optimal copy loop. Add
> > a new function to the pipe code that lets us set the pipe size to
> > the maximum allowed without root permissions to keep things really
> > simple. We also don't care if changing the pipe size fails - that
> > will just result in a slower copy.
> > 
> > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> > ---
> >  fs/pipe.c                 | 10 ++++++++++
> >  fs/splice.c               |  7 +++++++
> >  include/linux/pipe_fs_i.h |  1 +
> >  3 files changed, 18 insertions(+)
> > 
> > diff --git a/fs/pipe.c b/fs/pipe.c
> > index bdc5d3c0977d..436bc0464569 100644
> > --- a/fs/pipe.c
> > +++ b/fs/pipe.c
> > @@ -1109,6 +1109,16 @@ static long pipe_set_size(struct pipe_inode_info *pipe, unsigned long arg)
> >  	return ret;
> >  }
> >  
> > +/*
> > + * Set the pipe to the maximum allowable user size. Advisory only, will
> > + * swallow any errors and return the resultant pipe size.
> > + */
> > +long pipe_set_max_safe_size(struct pipe_inode_info *pipe)
> > +{
> > +	pipe_set_size(pipe, pipe_max_size);
> > +	return pipe->buffers * PAGE_SIZE;
> > +}
> > +
> >  /*
> >   * After the inode slimming patch, i_pipe/i_bdev/i_cdev share the same
> >   * location, so checking ->i_pipe is not enough to verify that this is a
> > diff --git a/fs/splice.c b/fs/splice.c
> > index 3553f1956508..9749139da731 100644
> > --- a/fs/splice.c
> > +++ b/fs/splice.c
> > @@ -931,6 +931,13 @@ ssize_t splice_direct_to_actor(struct file *in, struct splice_desc *sd,
> >  		current->splice_pipe = pipe;
> >  	}
> >  
> > +	/*
> > +	 * Try to increase the data holding capacity of the pipe so we can do
> > +	 * larger IOs. This may not increase the size at all because maximum
> > +	 * user pipe size is administrator controlled, but we still should try.
> > +	 */
> > +	pipe_set_max_safe_size(pipe);
> 
> I get where you're going with this, but I have two questions:
> 
> - Is it safe to be enlarging the pipe buffer size unconditionally?

Don't see why it would be unsafe.

> - Especially if we didn't just create the splice pipe?  Suppose someone
>   comes along later trying to splice things and doesn't realize the pipe
>   is now 1MB...

The splice code is supposed to handle arbitrary pipe sizes
correctly. If something breaks because it has assumptions about how
much data a pipe can hold, it's already broken.

> Then I started wondering about the splice_pipe lifetime and couldn't
> figure out if it ever gets detached from current prior to do_exit.
> I don't think it does, which means that we're stuck with the 1MB
> kernel memory allocation until the process dies.

If you are using do_splice_direct(), you either have a short term
process (i.e. a cp type utility) or you are moving bulk data around,
in which case 1MB of extra memory isn't a big deal.

And given that the default pipe size is dependent on PAGE_SIZE (i.e.
the default is 16 pages, not 64kB) then on 64k page architectures we
are already using pipes of 1MB capacity by default.
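
To make that arithmetic concrete, here is a minimal sketch. The helper
default_pipe_bytes() and the constant EXAMPLE_DEF_PIPE_BUFFERS are made
up for illustration and are not taken from the kernel source; the code
only multiplies the 16-buffer default by a given page size.

```c
#include <stddef.h>

/* Illustrative constant: the default of 16 pipe buffers mentioned above. */
#define EXAMPLE_DEF_PIPE_BUFFERS 16

/* Default pipe capacity in bytes for a given page size (hypothetical helper). */
static size_t default_pipe_bytes(size_t page_size)
{
	return EXAMPLE_DEF_PIPE_BUFFERS * page_size;
}
```

With 4kB pages this gives 65536 bytes (64kB); with 64kB pages it gives
1048576 bytes (1MB).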

I could make this contingent on O_DIRECT, but then we have the
problem that 64k pipes aren't big enough for efficient buffered IO
with 64k block size filesystems, either. IOWs, the pipe size in
do_splice_direct needs to be increased whichever way we look at it.
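
For illustration only, the same best-effort idea can be sketched from
userspace with fcntl(2). This is not the kernel patch: pipe_try_grow()
is a hypothetical helper, the 1MB fallback is an assumption for when
/proc/sys/fs/pipe-max-size cannot be read, and it assumes Linux with
glibc. As in the patch, a failure to resize is ignored and the
resulting capacity is returned.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>

/*
 * Best effort: try to grow the pipe behind fd to the unprivileged
 * maximum, ignore any failure, and report the capacity we ended up
 * with. Returns the capacity in bytes, or -1 if fd is not a pipe.
 */
static int pipe_try_grow(int fd)
{
	int max = 1048576;	/* assumed fallback if the sysctl is unreadable */
	int cur = fcntl(fd, F_GETPIPE_SZ);
	FILE *f;

	if (cur < 0)
		return -1;

	f = fopen("/proc/sys/fs/pipe-max-size", "r");
	if (f) {
		int v;

		if (fscanf(f, "%d", &v) == 1 && v > 0)
			max = v;
		fclose(f);
	}

	/* Advisory only: a failure here just means smaller transfers. */
	if (max > cur)
		(void)fcntl(fd, F_SETPIPE_SZ, max);

	return fcntl(fd, F_GETPIPE_SZ);
}
```

A caller would create a pipe with pipe(2), call pipe_try_grow() on
either end, and size its transfers from the returned value rather than
assuming the resize succeeded.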

This all said, I really think we need to implement our own
->copy_file_range() code that uses iomap to iterate over data-only
extents (i.e. hole-preserving copying) and does well-formed IO.
IOWs, only fall back to do_splice_direct() when doing copies to/from
non-XFS filesystems.
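
As a rough userspace analogue of that data-only iteration (not the
iomap-based kernel implementation proposed above), the sketch below
walks a file with SEEK_DATA/SEEK_HOLE and copies only the data ranges.
copy_data_extents() is a hypothetical helper; it assumes Linux with
glibc 2.27 or later and a kernel and filesystem on which
copy_file_range(2) works between the two files.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Copy only the data extents of in_fd (of length size) to the same
 * offsets in out_fd, so holes in the source remain holes in the copy.
 * Returns 0 on success, -1 on error with errno set.
 */
static int copy_data_extents(int in_fd, int out_fd, off64_t size)
{
	off64_t pos = 0;

	/* Size the destination up front so a trailing hole is preserved. */
	if (ftruncate64(out_fd, size) < 0)
		return -1;

	while (pos < size) {
		off64_t data = lseek64(in_fd, pos, SEEK_DATA);
		off64_t hole, off_in, off_out;

		if (data < 0)
			return errno == ENXIO ? 0 : -1;	/* ENXIO: only a hole remains */

		hole = lseek64(in_fd, data, SEEK_HOLE);
		if (hole < 0)
			return -1;
		if (hole > size)
			hole = size;

		/* Copy this data extent; the call may transfer less than asked. */
		off_in = off_out = data;
		while (off_in < hole) {
			ssize_t n = copy_file_range(in_fd, &off_in, out_fd, &off_out,
						    (size_t)(hole - off_in), 0);
			if (n < 0)
				return -1;
			if (n == 0) {
				errno = EIO;	/* source ended before the extent did */
				return -1;
			}
		}
		pos = hole;
	}
	return 0;
}
```

On filesystems that do not report holes, SEEK_DATA/SEEK_HOLE treat the
whole file as one data extent, so the loop degrades to a plain copy.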

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
