From: Jan Kara <jack@suse.cz>
To: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Jan Kara <jack@suse.cz>, David Howells <dhowells@redhat.com>,
Miklos Szeredi <miklos@szeredi.hu>,
torvalds@linux-foundation.org, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org, hch@infradead.org,
akpm@linux-foundation.org, apw@canonical.com, nbd@openwrt.org,
neilb@suse.de, jordipujolp@gmail.com, ezk@fsl.cs.sunysb.edu,
sedat.dilek@googlemail.com, hooanon05@yahoo.co.jp,
mszeredi@suse.cz
Subject: Re: [PATCH 2/9] vfs: export do_splice_direct() to modules
Date: Wed, 20 Mar 2013 20:52:22 +0100
Message-ID: <20130320195222.GG13294@quack.suse.cz>
In-Reply-To: <20130320023308.GM21522@ZenIV.linux.org.uk>

On Wed 20-03-13 02:33:08, Al Viro wrote:
> On Tue, Mar 19, 2013 at 10:10:32PM +0000, Al Viro wrote:
>
> > OK, it's going to be an interesting series - aforementioned tentative patch
> > was badly incomplete ;-/
>
> The interesting question is how far do we want to lift that. ->aio_write()
> part is trivial - see vfs.git#experimental; the trouble begins with
> ->splice_write(). For *everything* except default_file_splice_write(),
> lifting into the caller (do_splice_from()) is the right thing to do.
>
> default_file_splice_write(), however, is trickier; there we end up calling
> vfs_write() (via an ugly callchain). And _that_ is a real bitch. Granted,
> vfs_write() is somewhat an overkill there (we'd already done rw_verify_area()
> and access_ok() is pointless due to set_fs() we do around vfs_write()
> there) and we'd already lifted it up to do_sync_write(). But if we lift
> it any further, we'll need to deal with ->write() callers in the tree.
> Current situation:
>
> fs/coredump.c:662: return access_ok(VERIFY_READ, addr, nr) && file->f_op->write(file, addr, nr, &file->f_pos) == nr;
> arch/powerpc/platforms/cell/spufs/coredump.c:63: written = file->f_op->write(file, addr, nr, &file->f_pos);
>
> for these guys we might actually want to lift all the way up to do_coredump()
>
> drivers/staging/comedi/drivers/serial2002.c:91: result = f->f_op->write(f, buf, count, &f->f_pos);
> fs/autofs4/waitq.c:73: (wr = file->f_op->write(file,data,bytes,&file->f_pos)) > 0) {
>
> not regular files, unless I'm seriously misreading the code.
>
> kernel/acct.c:553: file->f_op->write(file, (char *)&ac,
>
> BTW, this is probably where we want to deal with your acct deadlock.
>
> fs/compat.c:1103: fn = (io_fn_t)file->f_op->write;
> fs/read_write.c:435: ret = file->f_op->write(file, buf, count, pos);
> fs/read_write.c:732: fn = (io_fn_t)file->f_op->write;
>
> syscalls - the question here is whether we lift it up to vfs_write/vfs_writev/
> compat_writev, or actually take it further.
>
> fs/cachefiles/rdwr.c:967: ret = file->f_op->write(
>
> cachefiles_write_page(); no fucking idea what locks might be held by caller
> and potentially that's a rather nasty source of PITA
>
> fs/coda/file.c:84: ret = host_file->f_op->write(host_file, buf, count, ppos);
>
> coda writing to file in cache on local fs. Potentially a nasty bugger, since
> it's hard to lift any further - the caller has no idea that the thing is
> on CODA, let alone what happens to hold the local cache.
>
> drivers/block/loop.c:234: bw = file->f_op->write(file, buf, len, &pos);
>
> do_bio_filebacked(), with some ugliness between that and callsite. Note,
> BTW, that we have a pair of possible vfs_fsync() calls in there; how do those
> interact with freeze?

The freezing code takes care that all dirty data is synced before the fs is
frozen and that no new dirty data can be created before the fs is thawed. So
vfs_fsync() should just return without doing anything on a frozen filesystem.

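To illustrate the ordering argument, here is a minimal user-space sketch.
It is a toy model only: struct toy_sb and the toy_*() helpers are
hypothetical names made up for this example, not kernel APIs, and the real
freeze code blocks writers rather than failing them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a superblock; all names here are hypothetical. */
struct toy_sb {
	bool frozen;        /* set by toy_freeze(), cleared by toy_thaw() */
	size_t dirty_pages; /* data written but not yet synced */
};

/* Write path: refused while frozen (the kernel would block instead). */
static bool toy_write(struct toy_sb *sb, size_t pages)
{
	if (sb->frozen)
		return false;
	sb->dirty_pages += pages;
	return true;
}

/* Sync path: flushes everything dirty, returns how many pages it wrote. */
static size_t toy_fsync(struct toy_sb *sb)
{
	size_t flushed = sb->dirty_pages;

	sb->dirty_pages = 0;
	return flushed;
}

/* Freeze: stop new writers first, then flush what is already dirty. */
static void toy_freeze(struct toy_sb *sb)
{
	sb->frozen = true;
	toy_fsync(sb);
	assert(sb->dirty_pages == 0); /* nothing can be dirty from here on */
}

static void toy_thaw(struct toy_sb *sb)
{
	sb->frozen = false;
}
```

In this model, toy_fsync() called between toy_freeze() and toy_thaw()
always finds nothing to write, which is the property relied on above.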
> This does *not* touch the current callers of vfs_write()/vfs_writev(); any of
> those called while holding ->i_mutex on a directory (or mnt_want_write(), for
> that matter) is a deadlock right now.
>
> And we'd better start thinking about how we'll backport that crap - deadlock
> in e.g. xfs ->splice_write() had been there since last summer ;-/

Yeah, but no one really noticed because in practice the code isn't
stressed much. Much bigger problems had been there for years before they
were fixed last summer without anybody complaining... So I'm not sure how
hard we want to try to backport this.

Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR