linux-fsdevel.vger.kernel.org archive mirror
From: Al Viro <viro@ZenIV.linux.org.uk>
To: Jan Kara <jack@suse.cz>
Cc: David Howells <dhowells@redhat.com>,
	Miklos Szeredi <miklos@szeredi.hu>,
	torvalds@linux-foundation.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, hch@infradead.org,
	akpm@linux-foundation.org, apw@canonical.com, nbd@openwrt.org,
	neilb@suse.de, jordipujolp@gmail.com, ezk@fsl.cs.sunysb.edu,
	sedat.dilek@googlemail.com, hooanon05@yahoo.co.jp,
	mszeredi@suse.cz
Subject: Re: [PATCH 2/9] vfs: export do_splice_direct() to modules
Date: Wed, 20 Mar 2013 02:33:08 +0000
Message-ID: <20130320023308.GM21522@ZenIV.linux.org.uk>
In-Reply-To: <20130319221032.GL21522@ZenIV.linux.org.uk>

On Tue, Mar 19, 2013 at 10:10:32PM +0000, Al Viro wrote:

> OK, it's going to be an interesting series - aforementioned tentative patch
> was badly incomplete ;-/

The interesting question is how far do we want to lift that.  ->aio_write()
part is trivial - see vfs.git#experimental; the trouble begins with
->splice_write().  For *everything* except default_file_splice_write(),
lifting into the caller (do_splice_from()) is the right thing to do.
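
To make "lifting into the caller" concrete, here is a minimal userspace
sketch, not kernel code.  It assumes that what gets lifted is a start/end
protection bracket around the write; toy_file, toy_start_write(),
toy_end_write() and toy_splice_from() are hypothetical stand-ins, with
toy_splice_from() playing the role of do_splice_from() after the lift.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct toy_file;
typedef long (*toy_splice_fn)(struct toy_file *f, const char *buf, size_t len);

struct toy_file {
	toy_splice_fn splice_write;	/* per-filesystem method */
	int protect_depth;		/* brackets currently held */
	int max_depth;			/* deepest nesting observed so far */
	size_t written;			/* bytes "written" by the model */
};

static void toy_start_write(struct toy_file *f)
{
	f->protect_depth++;
	if (f->protect_depth > f->max_depth)
		f->max_depth = f->protect_depth;
}

static void toy_end_write(struct toy_file *f)
{
	assert(f->protect_depth > 0);
	f->protect_depth--;
}

/* "Before" shape: the method takes the bracket itself. */
static long toy_method_unlifted(struct toy_file *f, const char *buf, size_t len)
{
	(void)buf;
	toy_start_write(f);
	f->written += len;
	toy_end_write(f);
	return (long)len;
}

/* "After" shape: the method relies on the caller holding the bracket. */
static long toy_method_lifted(struct toy_file *f, const char *buf, size_t len)
{
	(void)buf;
	assert(f->protect_depth == 1);
	f->written += len;
	return (long)len;
}

/* Common caller: takes the bracket once, around whichever method is set. */
static long toy_splice_from(struct toy_file *f, const char *buf, size_t len)
{
	long ret;

	toy_start_write(f);
	ret = f->splice_write(f, buf, len);
	toy_end_write(f);
	return ret;
}
```

In this model, a method that still takes the bracket itself ends up nesting
it under the lifted caller (max_depth reaches 2), which is why the caller
and every instance would have to be converted together.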

default_file_splice_write(), however, is trickier; there we end up calling
vfs_write() (via an ugly callchain).  And _that_ is a real bitch.  Granted,
vfs_write() is somewhat an overkill there (we'd already done rw_verify_area()
and access_ok() is pointless due to set_fs() we do around vfs_write()
there) and we'd already lifted it up to do_sync_write().  But if we lift
it any further, we'll need to deal with ->write() callers in the tree.
Current situation:

fs/coredump.c:662:      return access_ok(VERIFY_READ, addr, nr) && file->f_op->write(file, addr, nr, &file->f_pos) == nr;
arch/powerpc/platforms/cell/spufs/coredump.c:63:        written = file->f_op->write(file, addr, nr, &file->f_pos);

for these guys we might actually want to lift all the way up to do_coredump()

drivers/staging/comedi/drivers/serial2002.c:91: result = f->f_op->write(f, buf, count, &f->f_pos);
fs/autofs4/waitq.c:73:         (wr = file->f_op->write(file,data,bytes,&file->f_pos)) > 0) {

not regular files, unless I'm seriously misreading the code.

kernel/acct.c:553:      file->f_op->write(file, (char *)&ac,

BTW, this is probably where we want to deal with your acct deadlock.

fs/compat.c:1103:               fn = (io_fn_t)file->f_op->write;
fs/read_write.c:435:                    ret = file->f_op->write(file, buf, count, pos);
fs/read_write.c:732:            fn = (io_fn_t)file->f_op->write;

syscalls - the question here is whether we lift it up to vfs_write/vfs_writev/
compat_writev, or actually take it further.

fs/cachefiles/rdwr.c:967:                       ret = file->f_op->write(

cachefiles_write_page(); no fucking idea what locks might be held by caller
and potentially that's a rather nasty source of PITA

fs/coda/file.c:84:      ret = host_file->f_op->write(host_file, buf, count, ppos);

coda writing to file in cache on local fs.  Potentially a nasty bugger, since
it's hard to lift any further - the caller has no idea that the thing is
on CODA, let alone what happens to hold the local cache.

drivers/block/loop.c:234:       bw = file->f_op->write(file, buf, len, &pos);

do_bio_filebacked(), with some ugliness between that and callsite.  Note,
BTW, that we have a pair of possible vfs_fsync() calls in there; how do those
interact with freeze?

This does *not* touch the current callers of vfs_write()/vfs_writev(); any of
those called while holding ->i_mutex on a directory (or mnt_want_write(), for
that matter) is a deadlock right now.
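
The ordering problem above can be modelled with a tiny userspace
lock-order checker.  This is only a sketch under a stated assumption: that
the write path takes write protection internally, and that this protection
ranks outside a directory's ->i_mutex.  The ranking and all toy_* names are
hypothetical and are not taken from this message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock classes; lower value = must be taken first (assumed). */
enum toy_class { TOY_WRITE_PROT = 1, TOY_DIR_MUTEX = 2 };

#define TOY_MAX_HELD 8

struct toy_ctx {
	enum toy_class held[TOY_MAX_HELD];
	int nheld;
};

/* Returns true if taking 'c' now respects the assumed ordering. */
static bool toy_acquire(struct toy_ctx *ctx, enum toy_class c)
{
	for (int i = 0; i < ctx->nheld; i++) {
		/* Same class again, or an outer class under an inner one. */
		if (c <= ctx->held[i])
			return false;
	}
	if (ctx->nheld == TOY_MAX_HELD)
		return false;
	ctx->held[ctx->nheld++] = c;
	return true;
}

/* Drops the most recently acquired class. */
static void toy_release(struct toy_ctx *ctx)
{
	assert(ctx->nheld > 0);
	ctx->nheld--;
}

/* Models a write path that takes write protection internally. */
static bool toy_write(struct toy_ctx *ctx)
{
	if (!toy_acquire(ctx, TOY_WRITE_PROT))
		return false;	/* ordering violated: potential deadlock */
	toy_release(ctx);
	return true;
}
```

In this model toy_write() is fine with nothing held, but is flagged both
when TOY_DIR_MUTEX is already held (inversion) and when TOY_WRITE_PROT is
already held (recursion), mirroring the two cases named above.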

And we'd better start thinking about how we'll backport that crap - deadlock
in e.g. xfs ->splice_write() had been there since last summer ;-/

Thread overview: 79+ messages
2013-03-13 14:16 [PATCH 0/9] overlay filesystem: request for inclusion (v17) Miklos Szeredi
2013-03-13 14:16 ` [PATCH 1/9] vfs: add i_op->dentry_open() Miklos Szeredi
2013-03-13 22:44   ` Andrew Morton
2013-03-14 11:15     ` Miklos Szeredi
2013-03-13 14:16 ` [PATCH 2/9] vfs: export do_splice_direct() to modules Miklos Szeredi
2013-03-13 22:45   ` Andrew Morton
2013-03-13 14:16 ` [PATCH 3/9] vfs: export __inode_permission() " Miklos Szeredi
2013-03-13 14:16 ` [PATCH 4/9] vfs: introduce clone_private_mount() Miklos Szeredi
2013-03-13 22:48   ` Andrew Morton
2013-03-14 13:28     ` Miklos Szeredi
2013-03-13 14:16 ` [PATCH 5/9] overlay filesystem Miklos Szeredi
2013-03-13 22:53   ` Andrew Morton
2013-03-13 14:16 ` [PATCH 6/9] overlayfs: add statfs support Miklos Szeredi
2013-03-13 14:16 ` [PATCH 7/9] overlayfs: implement show_options Miklos Szeredi
2013-03-13 14:16 ` [PATCH 8/9] overlay: overlay filesystem documentation Miklos Szeredi
2013-03-13 23:06   ` Andrew Morton
2013-03-14 13:35     ` Miklos Szeredi
2013-03-13 14:16 ` [PATCH 9/9] fs: limit filesystem stacking depth Miklos Szeredi
2013-03-13 14:31 ` [PATCH 0/9] overlay filesystem: request for inclusion (v17) Sedat Dilek
2013-03-13 15:13   ` Sedat Dilek
2013-03-13 15:18     ` Miklos Szeredi
2013-03-13 15:26       ` Sedat Dilek
2013-03-13 15:53         ` Sedat Dilek
2013-03-13 16:10           ` Sedat Dilek
2013-03-13 16:21             ` Miklos Szeredi
2013-03-13 16:35               ` Sedat Dilek
2013-03-13 16:51               ` Sedat Dilek
2013-03-13 18:12                 ` Robin Holt
2013-03-13 18:37                   ` Felix Fietkau
2013-03-13 19:10                     ` Sedat Dilek
2013-03-13 19:54                       ` Eric W. Biederman
2013-03-13 19:58                         ` Linus Torvalds
2013-03-13 20:27                           ` Sedat Dilek
     [not found]             ` <CAB3woddVfZ9PdYPpzidJLBMmUeRx0Rxgb5Pc8bTM9U-tkcS_uA@mail.gmail.com>
2013-03-13 20:32               ` Sedat Dilek
2013-03-13 20:36             ` Phillip Lougher
2013-03-13 23:08 ` Andrew Morton
2013-03-14 13:43   ` Miklos Szeredi
2013-03-15  1:25     ` Al Viro
2013-03-15  4:15       ` J. R. Okajima
2013-03-15  4:44         ` Al Viro
2013-03-15  5:09           ` J. R. Okajima
2013-03-15  5:13             ` Al Viro
2013-03-15  8:15               ` James Bottomley
2013-03-15 12:12                 ` Al Viro
2013-03-15 18:57                   ` J. R. Okajima
2013-03-15 19:26                     ` Erez Zadok
2013-03-15 20:30                     ` Al Viro
2013-03-16 13:55                       ` J. R. Okajima
2013-03-15 19:11             ` Linus Torvalds
2013-03-16 13:57               ` J. R. Okajima
2013-03-17 13:06 ` [PATCH 2/9] vfs: export do_splice_direct() to modules David Howells
2013-03-18  2:31   ` Dave Chinner
2013-03-18 15:39   ` Jan Kara
2013-03-18 21:53     ` Al Viro
2013-03-18 23:01       ` Al Viro
2013-03-19  1:38         ` Al Viro
2013-03-19  9:00           ` J. R. Okajima
2013-03-19 10:29           ` Miklos Szeredi
2013-03-19 17:03             ` Al Viro
2013-03-19 18:32               ` Miklos Szeredi
2013-03-19 21:24                 ` Al Viro
2013-03-20  9:15                   ` Miklos Szeredi
2013-03-19 11:04           ` David Howells
2013-03-19 11:40             ` Miklos Szeredi
2013-03-19 20:54         ` Jan Kara
2013-03-19 20:25       ` Jan Kara
2013-03-19 21:38         ` Al Viro
2013-03-19 22:10           ` Al Viro
2013-03-20  2:33             ` Al Viro [this message]
2013-03-20 19:52               ` Jan Kara
2013-03-20 21:48                 ` Al Viro
2013-03-20 22:19                   ` Jan Kara
2013-03-20 12:30             ` David Howells
2013-03-22 17:37   ` J. R. Okajima
2013-03-22 18:11     ` Al Viro
2013-03-22 18:21       ` Al Viro
2013-03-23  2:49         ` J. R. Okajima
2013-03-23  4:41           ` Al Viro
2013-03-23  5:37             ` J. R. Okajima
