From: Al Viro <viro@ZenIV.linux.org.uk>
To: Miklos Szeredi <miklos@szeredi.hu>
Cc: Jan Kara <jack@suse.cz>, David Howells <dhowells@redhat.com>,
	torvalds@linux-foundation.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, hch@infradead.org,
	akpm@linux-foundation.org, apw@canonical.com, nbd@openwrt.org,
	neilb@suse.de, jordipujolp@gmail.com, ezk@fsl.cs.sunysb.edu,
	sedat.dilek@googlemail.com, hooanon05@yahoo.co.jp,
	mszeredi@suse.cz
Subject: Re: [PATCH 2/9] vfs: export do_splice_direct() to modules
Date: Tue, 19 Mar 2013 17:03:24 +0000
Message-ID: <20130319170324.GI21522@ZenIV.linux.org.uk>
In-Reply-To: <CAJfpegt8Zi3f5VxWc0kGRFcra3x_rze8j6dKCCTb4t+ViMCVYA@mail.gmail.com>
On Tue, Mar 19, 2013 at 11:29:41AM +0100, Miklos Szeredi wrote:
> Copy up is a once-in-a-lifetime event for an object. Optimizing it is
> way down in the list of things to do. I'd drop splice in a jiffy if
> it's in the way.
What makes you think that write is any better? Same deadlock there - check
generic_file_aio_write(), it calls the same sb_start_write()... IOW,
switching from splice to write won't help at all.
> Much more interesting question: what happens if we crash during a
> rename? Whiteout implemented in the filesystem won't save us. And
> the results are interesting: old versions of files become visible and
> similar fun. Far from likely to happen, but ...
>
> Add a rename-with-whiteout primitive on filesystems? That one is not
> going to be as simple as plain whiteout. Or?
Umm... If/when we start caring about that kind of atomicity (and I agree
that we ought to), the overlayfs approach to whiteouts will actually have a much
harder time - it doesn't take much to teach a journalling fs how to do that
kind of ->rename() in a single transaction; the only question is how to tell
it that we want to leave a whiteout behind us. Hell knows; one variant is
to add a flag, of course. Another might be more interesting - we want some
kind of "directory is opaque" flag, so if we start reshuffling the methods,
we might try to merge unlink/rmdir/whiteout. Rules:
* victim is negative => create a whiteout
* victim is a directory, parent opaque => rmdir
* victim is a non-directory, parent opaque => unlink
* victim is positive, parent _not_ opaque => replace with whiteout
* old_dir in case of ->rename() is opaque => normal rename
* old_dir in case of ->rename() is not opaque => leave whiteout behind
Non-unioned => opaque, of course (nothing showing through it).
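The rules above can be read as a small decision table. What follows is only a userspace sketch of that table, not kernel code: every name in it (victim_state, remove_action, rename_action, pick_remove_action(), pick_rename_action()) is hypothetical and exists nowhere in the VFS. It also assumes that the "victim is negative" rule applies to a non-opaque parent; the case of a negative victim in an opaque parent is not covered by the list, so the sketch reports it as unspecified rather than guessing.

```c
#include <stdbool.h>

/* Hypothetical types for illustration only; none of these are VFS names. */
enum victim_state {
	VICTIM_NEGATIVE,	/* no such entry in this layer */
	VICTIM_DIRECTORY,	/* positive, a directory */
	VICTIM_NON_DIRECTORY	/* positive, anything else */
};

enum remove_action {
	ACT_CREATE_WHITEOUT,
	ACT_RMDIR,
	ACT_UNLINK,
	ACT_REPLACE_WITH_WHITEOUT,
	ACT_UNSPECIFIED		/* case the rule list does not cover */
};

enum rename_action {
	RENAME_PLAIN,		/* normal rename */
	RENAME_LEAVE_WHITEOUT	/* rename, leaving a whiteout at the old name */
};

/*
 * Merged unlink/rmdir/whiteout: choose what to do with the victim.
 * A non-unioned directory counts as opaque, since nothing shows through it.
 */
static enum remove_action pick_remove_action(enum victim_state victim,
					     bool parent_opaque)
{
	if (victim == VICTIM_NEGATIVE) {
		/* Assumption: the "negative" rule is meant for a parent
		 * that is not opaque; the opaque case is left open. */
		return parent_opaque ? ACT_UNSPECIFIED : ACT_CREATE_WHITEOUT;
	}

	if (!parent_opaque)
		return ACT_REPLACE_WITH_WHITEOUT;	/* positive victim */

	return victim == VICTIM_DIRECTORY ? ACT_RMDIR : ACT_UNLINK;
}

/* ->rename(): whether the old name needs a whiteout left behind. */
static enum rename_action pick_rename_action(bool old_dir_opaque)
{
	return old_dir_opaque ? RENAME_PLAIN : RENAME_LEAVE_WHITEOUT;
}
```

With this reading, pick_remove_action(VICTIM_DIRECTORY, true) yields ACT_RMDIR, i.e. the non-unioned case behaves exactly as it does today, and only a parent that is not opaque ever produces a whiteout.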
Getting good behaviour on rename interrupted by crash is going to be _very_
tricky with any strategy other than whiteouts-in-fs, AFAICS.
Again, I have no problem whatsoever with changing the set of directory
methods, as long as the replacement is sane. We've done that kind of thing
before and it's not a problem.