From: Al Viro <viro@ZenIV.linux.org.uk>
To: Jan Kara <jack@suse.cz>
Cc: David Howells <dhowells@redhat.com>,
Miklos Szeredi <miklos@szeredi.hu>,
torvalds@linux-foundation.org, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org, hch@infradead.org,
akpm@linux-foundation.org, apw@canonical.com, nbd@openwrt.org,
neilb@suse.de, jordipujolp@gmail.com, ezk@fsl.cs.sunysb.edu,
sedat.dilek@googlemail.com, hooanon05@yahoo.co.jp,
mszeredi@suse.cz
Subject: Re: [PATCH 2/9] vfs: export do_splice_direct() to modules
Date: Wed, 20 Mar 2013 02:33:08 +0000
Message-ID: <20130320023308.GM21522@ZenIV.linux.org.uk>
In-Reply-To: <20130319221032.GL21522@ZenIV.linux.org.uk>

On Tue, Mar 19, 2013 at 10:10:32PM +0000, Al Viro wrote:
> OK, it's going to be an interesting series - aforementioned tentative patch
> was badly incomplete ;-/

The interesting question is how far do we want to lift that. ->aio_write()
part is trivial - see vfs.git#experimental; the trouble begins with
->splice_write(). For *everything* except default_file_splice_write(),
lifting into the caller (do_splice_from()) is the right thing to do.
default_file_splice_write(), however, is trickier; there we end up calling
vfs_write() (via an ugly callchain). And _that_ is a real bitch. Granted,
vfs_write() is somewhat of an overkill there (we'd already done rw_verify_area()
and access_ok() is pointless due to set_fs() we do around vfs_write()
there) and we'd already lifted it up to do_sync_write(). But if we lift
it any further, we'll need to deal with ->write() callers in the tree.

Current situation:
fs/coredump.c:662: return access_ok(VERIFY_READ, addr, nr) && file->f_op->write(file, addr, nr, &file->f_pos) == nr;
arch/powerpc/platforms/cell/spufs/coredump.c:63: written = file->f_op->write(file, addr, nr, &file->f_pos);
	for these guys we might actually want to lift all the way up to do_coredump()

drivers/staging/comedi/drivers/serial2002.c:91: result = f->f_op->write(f, buf, count, &f->f_pos);
fs/autofs4/waitq.c:73: (wr = file->f_op->write(file,data,bytes,&file->f_pos)) > 0) {
	not regular files, unless I'm seriously misreading the code.

kernel/acct.c:553: file->f_op->write(file, (char *)&ac,
	BTW, this is probably where we want to deal with your acct deadlock.

fs/compat.c:1103: fn = (io_fn_t)file->f_op->write;
fs/read_write.c:435: ret = file->f_op->write(file, buf, count, pos);
fs/read_write.c:732: fn = (io_fn_t)file->f_op->write;
	syscalls - the question here is whether we lift it up to vfs_write/vfs_writev/
	compat_writev, or actually take it further.

fs/cachefiles/rdwr.c:967: ret = file->f_op->write(
	cachefiles_write_page(); no fucking idea what locks might be held by caller
	and potentially that's a rather nasty source of PITA

fs/coda/file.c:84: ret = host_file->f_op->write(host_file, buf, count, ppos);
	coda writing to file in cache on local fs. Potentially a nasty bugger, since
	it's hard to lift any further - the caller has no idea that the thing is
	on CODA, let alone what happens to hold the local cache.

drivers/block/loop.c:234: bw = file->f_op->write(file, buf, len, &pos);
	do_bio_filebacked(), with some ugliness between that and callsite. Note,
	BTW, that we have a pair of possible vfs_fsync() calls in there; how do those
	interact with freeze?

This does *not* touch the current callers of vfs_write()/vfs_writev(); any of
those called while holding ->i_mutex on a directory (or mnt_want_write(), for
that matter) is a deadlock right now.

And we'd better start thinking about how we'll backport that crap - deadlock
in e.g. xfs ->splice_write() had been there since last summer ;-/
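
For readers following along, here is a minimal user-space sketch of what "lifting into the caller" means for the ->splice_write() case above. It assumes that the thing being lifted is the write/freeze protection taken around the method, which this message does not spell out. Every toy_* name is hypothetical and stands in for a kernel counterpart only by analogy (toy_splice_from() plays the role of do_splice_from()); none of it is real kernel API.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: all toy_* names are hypothetical, not kernel API. */
struct toy_sb {
	int writers;		/* stands in for write/freeze protection depth */
};

struct toy_file;
typedef long (*toy_splice_write_t)(struct toy_file *, const char *, size_t);

struct toy_file {
	struct toy_sb *sb;
	toy_splice_write_t splice_write;	/* per-filesystem method */
	int max_writers_seen;	/* highest depth observed inside the method */
};

static void toy_start_write(struct toy_sb *sb) { sb->writers++; }
static void toy_end_write(struct toy_sb *sb)   { sb->writers--; }

/*
 * After lifting, the method no longer takes protection itself;
 * it only records the depth it ran under and pretends to write.
 */
static long toy_method(struct toy_file *f, const char *buf, size_t len)
{
	(void)buf;
	if (f->sb->writers > f->max_writers_seen)
		f->max_writers_seen = f->sb->writers;
	return (long)len;
}

/*
 * The caller takes protection exactly once, outside of whatever
 * locks the method may grab internally, and drops it afterwards.
 */
static long toy_splice_from(struct toy_file *f, const char *buf, size_t len)
{
	long ret;

	toy_start_write(f->sb);
	ret = f->splice_write(f, buf, len);
	toy_end_write(f->sb);
	return ret;
}
```

The point of the arrangement is that protection is acquired in one place, before any method-internal locking, instead of each method taking it at a different depth. The hard cases listed above are the ones where there is no single caller like this to lift into.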