From: Jan Kara <jack@suse.cz>
To: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Jan Kara <jack@suse.cz>, David Howells <dhowells@redhat.com>,
Miklos Szeredi <miklos@szeredi.hu>,
torvalds@linux-foundation.org, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org, hch@infradead.org,
akpm@linux-foundation.org, apw@canonical.com, nbd@openwrt.org,
neilb@suse.de, jordipujolp@gmail.com, ezk@fsl.cs.sunysb.edu,
sedat.dilek@googlemail.com, hooanon05@yahoo.co.jp,
mszeredi@suse.cz
Subject: Re: [PATCH 2/9] vfs: export do_splice_direct() to modules
Date: Tue, 19 Mar 2013 21:25:43 +0100
Message-ID: <20130319202543.GF5222@quack.suse.cz>
In-Reply-To: <20130318215333.GE21522@ZenIV.linux.org.uk>
On Mon 18-03-13 21:53:34, Al Viro wrote:
> On Mon, Mar 18, 2013 at 04:39:36PM +0100, Jan Kara wrote:
> > IMO the deadlock is real. In freeze_super() we wait for all writers to
> > the filesystem to finish while blocking beginning of any further writes. So
> > we have a deadlock scenario like:
> >
> > THREAD1                               THREAD2                       THREAD3
> > mnt_want_write()                      mutex_lock(&inode->i_mutex);
> > ...                                                                 freeze_super()
> > block on mutex_lock(&inode->i_mutex)
> >                                                                     sb_wait_write(sb, SB_FREEZE_WRITE);
> >                                       block in sb_start_write()
>
> The bug is on fsfreeze side and this is not the only problem related to it.
> I've missed the implications when I applied "fs: Add freezing handling
> to mnt_want_write() / mnt_drop_write()" last June ;-/
>
> The thing is, until then mnt_want_write() used to be a counter; it could be
> nested. Now any such nesting is a deadlock you've just described. This
> is seriously wrong, IMO.
Well, but sb_start_write() has to be blocking (it blocks when the fs is
frozen) and you have to get it somewhere. It seems only natural to get the
counter from the original mnt_want_write() at the same place and use one
function for that. Whether I should have changed the name from
mnt_want_write() to something else is questionable...
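
The three-thread scenario quoted above can be read as a cycle in a wait-for
graph. The following is a minimal userspace sketch of that reasoning, not
kernel code: has_deadlock(), waits_for[] and the two scenario helpers are
hypothetical names introduced only for illustration, and the model assumes
each thread blocks on at most one other thread. The second helper shows the
contrasting case in which the nested acquisition does not block, which is an
assumption about how a pure counter would behave.

```c
#include <stdbool.h>
#include <stddef.h>

/* Participants in the quoted scenario (illustrative model only). */
enum { THREAD1, THREAD2, THREAD3, NR_THREADS };

/*
 * waits_for[t] is the thread that t is blocked on, or -1 if t can run.
 * Returns true if following the chain from any thread leads back to it.
 */
static bool has_deadlock(const int waits_for[NR_THREADS])
{
	for (int start = 0; start < NR_THREADS; start++) {
		int t = start;

		/* A chain longer than NR_THREADS steps cannot be new. */
		for (int step = 0; step < NR_THREADS && t != -1; step++) {
			t = waits_for[t];
			if (t == start)
				return true;
		}
	}
	return false;
}

/* Blocking sb_start_write(): the scenario from the quoted diagram. */
static bool frozen_scenario_deadlocks(void)
{
	const int waits_for[NR_THREADS] = {
		[THREAD1] = THREAD2, /* has write access, wants i_mutex */
		[THREAD2] = THREAD3, /* holds i_mutex, blocked in sb_start_write() */
		[THREAD3] = THREAD1, /* freeze_super() waits for writers to finish */
	};

	return has_deadlock(waits_for);
}

/* Assumed contrast: the nested acquisition never blocks, so THREAD2 runs. */
static bool counter_scenario_deadlocks(void)
{
	const int waits_for[NR_THREADS] = {
		[THREAD1] = THREAD2,
		[THREAD2] = -1,
		[THREAD3] = THREAD1,
	};

	return has_deadlock(waits_for);
}
```

In this model the first helper reports a cycle (THREAD1 -> THREAD2 ->
THREAD3 -> THREAD1), while the second does not, because THREAD2 can finish
and release i_mutex.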
> BTW, having sb_start_write() buried in individual ->splice_write() is
> asking for trouble; could you describe the rules for that? E.g. where
> does it nest wrt filesystem-private locks? XFS iolock, for example...
Generally, the freeze protection should be the outermost lock taken (so
that we mitigate the possibility of blocking readers when waiting for the fs
to unfreeze). So it ranks above i_mutex, or XFS' ilock and iolock.

It seems that I screwed this up for ->splice_write() :-| If we are going to
move sb_start_write() out of filesystems' hands into do_splice_from(),
then we should likely do the same with ->aio_write(). Hmm?
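
The ordering rule stated above (freeze protection is taken before i_mutex and
before filesystem-private locks) can be expressed as a small checker. This is
a userspace sketch under stated assumptions, not an existing kernel facility:
enum lock_class and freeze_is_outermost() are hypothetical names, and locks
are assumed to stay held until the end of the recorded sequence.

```c
#include <stdbool.h>
#include <stddef.h>

/* Lock classes named in the discussion (illustrative model only). */
enum lock_class {
	LOCK_FREEZE,     /* freeze protection from sb_start_write() */
	LOCK_I_MUTEX,    /* inode->i_mutex */
	LOCK_FS_PRIVATE, /* filesystem-private lock, e.g. XFS ilock or iolock */
};

/*
 * Return true if freeze protection is never acquired while some other
 * lock is already held, i.e. it is always the outermost lock.
 */
static bool freeze_is_outermost(const enum lock_class *seq, size_t n)
{
	bool other_held = false;

	for (size_t i = 0; i < n; i++) {
		if (seq[i] == LOCK_FREEZE) {
			if (other_held)
				return false;
		} else {
			other_held = true;
		}
	}
	return true;
}
```

A sequence such as { LOCK_FREEZE, LOCK_I_MUTEX, LOCK_FS_PRIVATE } passes the
check. A hypothetical path that takes a filesystem-private lock and only then
calls sb_start_write(), i.e. { LOCK_FS_PRIVATE, LOCK_FREEZE }, fails it.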
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR