From: Al Viro <viro@ZenIV.linux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jens Axboe <axboe@kernel.dk>, Steve French <sfrench@samba.org>,
	Sage Weil <sage@inktank.com>, Mark Fasheh <mfasheh@suse.com>,
	xfs@oss.sgi.com, Christoph Hellwig <hch@infradead.org>,
	Kent Overstreet <kmo@daterainc.com>,
	Dave Kleikamp <dave.kleikamp@oracle.com>,
	Joel Becker <jlbec@evilplan.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Zach Brown <zab@zabbo.net>, Anton Altaparmakov <anton@tuxera.com>
Subject: Re: [RFC] unifying write variants for filesystems
Date: Tue, 4 Feb 2014 12:44:09 +0000
Message-ID: <20140204124409.GG10323@ZenIV.linux.org.uk>
In-Reply-To: <52EFC271.3090205@oracle.com>

On Mon, Feb 03, 2014 at 10:23:13AM -0600, Dave Kleikamp wrote:

> Thanks for the feedback. I'd been asking for feedback on this patchset
> for some time now, and have not received very much.
> 
> This is all based on some years-old work by Zach Brown that he probably
> wishes would have disappeared by now. I pretty much left what I could
> alone since 1) it was working, and 2) I didn't hear any objections
> (until now).
> 
> It's clear now that the patchset isn't close to mergeable, so treat it
> like a proof-of-concept and we can come up with a better container and
> read/write interface. I won't respond individually to your comments, but
> will take them all into consideration going forward.

FWIW, I suspect that the right way to deal with the dio side of things would
be a primitive along the lines of "get first N <page,start,len> for the
iov_iter".  With get_user_pages_fast() for iovec-backed ones and "just
grab references" for array-of-page-subranges ones.
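
For illustration only, the shape of such a primitive can be modelled in
userspace.  This is a minimal sketch, not kernel code and not anything from
vfs.git#iov_iter: the names seg_iov, page_seg and first_page_segs() are
hypothetical, a plain page number stands in for a struct page reference,
and nothing is pinned the way get_user_pages_fast() would pin user pages.
It only shows how a list of address ranges breaks down into the first N
<page,start,len> triples.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one iovec entry: a starting address and a length. */
struct seg_iov {
	uintptr_t base;
	size_t len;
};

/* One <page,start,len> triple.  'page' is just address / page_size here;
 * a real implementation would hold a reference to a struct page instead. */
struct page_seg {
	uintptr_t page;
	size_t start;	/* byte offset within the page */
	size_t len;	/* number of bytes taken from this page */
};

/*
 * Fill out[] with at most max_segs triples describing the beginning of
 * the data covered by iov[0..nr_iov).  Returns the number of triples
 * produced.  No triple ever crosses a page boundary.
 */
static size_t first_page_segs(const struct seg_iov *iov, size_t nr_iov,
			      size_t page_size,
			      struct page_seg *out, size_t max_segs)
{
	size_t n = 0;

	assert(page_size > 0);

	for (size_t i = 0; i < nr_iov && n < max_segs; i++) {
		uintptr_t addr = iov[i].base;
		size_t left = iov[i].len;

		while (left > 0 && n < max_segs) {
			size_t off = addr % page_size;
			size_t chunk = page_size - off;

			/* Clamp to what remains in this iovec entry. */
			if (chunk > left)
				chunk = left;

			out[n].page = addr / page_size;
			out[n].start = off;
			out[n].len = chunk;
			n++;

			addr += chunk;
			left -= chunk;
		}
	}
	return n;
}
```

A caller would invoke this repeatedly, advancing past the bytes already
described, until the whole range has been consumed.  For the
array-of-page-subranges case the triples already exist, so the equivalent
step would amount to copying entries and taking references.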

_IF_ direct-io.c can be massaged to use that (and it looks like it should
be able to - AFAICS, we don't really care if pages are mapped in userland or
kernel space there), we get something really neat out of that: not only can
we get rid of generic_file_splice_write(), but we get full zero-copy
sendfile() - just have the target opened with O_DIRECT and everything will
work; ->splice_read() will trigger reads to source pagecache and with that
massage done, ->splice_write() will issue writes directly from those
pages, with no memory-to-memory copying in sight...  We can also get rid of
that kmap() in __swap_writepage(), while we are at it.

I'm going through direct-io.c guts right now and so far that looks feasible,
but I'd really appreciate comments from the folks more familiar with the
damn thing.

The queue so far is in vfs.git#iov_iter; I've gone after the low-hanging
fruit from the review I posted upthread, and I more or less like the
results so far...

Comments?
