linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Christoph Hellwig <hch@infradead.org>
Cc: Kamal Mostafa <kamal@canonical.com>, Jan Kara <jack@suse.cz>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	Matthew Wilcox <matthew@wil.cx>,
	Randy Dunlap <rdunlap@xenotime.net>, Theodore Tso <tytso@mit.edu>,
	linux-doc@vger.kernel.org, linux-ext4@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	Surbhi Palande <csurbhi@gmail.com>,
	Valerie Aurora <val@vaaconsulting.com>,
	Christopher Chaltain <christopher.chaltain@canonical.com>,
	"Peter M. Petrakis" <peter.petrakis@canonical.com>,
	Mikulas Patocka <mpatocka@redhat.com>,
	Miao Xie <miaox@cn.fujitsu.com>
Subject: Re: [PATCH 3/5 resend] VFS: Fix s_umount thaw/write deadlock
Date: Thu, 8 Dec 2011 00:16:58 +0100	[thread overview]
Message-ID: <20111207231658.GQ4622@quack.suse.cz> (raw)
In-Reply-To: <20111206113544.GA21589@infradead.org>

On Tue 06-12-11 06:35:44, Christoph Hellwig wrote:
> On Mon, Dec 05, 2011 at 12:54:47PM -0800, Kamal Mostafa wrote:
> > diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> > index 73c3992..dbeaede 100644
> > --- a/fs/fs-writeback.c
> > +++ b/fs/fs-writeback.c
> > @@ -554,6 +554,14 @@ static long writeback_sb_inodes(struct super_block *sb,
> >  	while (!list_empty(&wb->b_io)) {
> >  		struct inode *inode = wb_inode(wb->b_io.prev);
> >  
> > +		if (inode->i_sb == sb && vfs_is_frozen(sb)) {
> > +			/*
> > +			 * Inode is on a frozen superblock; skip it for now.
> > +			 */
> > +			redirty_tail(inode, wb);
> > +			continue;
> > +		}
> > +
> 
> We make sure to not dirty any new inodes after the first phase of the
> freeze, so this should be a BUG_ON/WARN_ON.
  This is not really true in presence of mmaped writes. To block mmaped
writes on a frozen filesystem, we need some synchronization between
page_mkwrite() and freezing code. Currently, to avoid any additional
locking overhead, we set page dirty and *then* check for filesystem being
frozen. Only this order can make sure either the page is written (and
write-protected) or the frozen check triggers and we wait... (see the
comment in block_page_mkwrite()). The nasty sideeffect of this is that
there can be dirty pages & inodes on a frozen filesystem. We are blocked in
the page fault of these pages so user cannot write any data to these pages
but still they are marked dirty.

Alternatively we could have a different mechanism (rw semaphore?) to
synchronize page faults and freezing but I'd hate the overhead for the case
almost noone cares about...

> While we're at it:  can anyway suggest a less cumbersome name than
> writeback_inodes_sb_if_idle?  I'd go for
> try_to_writeback_inodes_sb(_nr).
  Sounds good.

> > index 35f4b0e..47983d9 100644
> > --- a/fs/quota/quota.c
> > +++ b/fs/quota/quota.c
> > @@ -47,7 +47,7 @@ static int check_quotactl_permission(struct super_block *sb, int type, int cmd,
> >  
> >  static void quota_sync_one(struct super_block *sb, void *arg)
> >  {
> > -	if (sb->s_qcop && sb->s_qcop->quota_sync)
> > +	if (sb->s_qcop && sb->s_qcop->quota_sync && !vfs_is_frozen(sb))
> >  		sb->s_qcop->quota_sync(sb, *(int *)arg, 1);
> >  }
> 
> Again, this should be a BUG_ON/WARN_ON.  We shouldn't redirty quotas
> after the freeze.
  You cannot really turn the check into WARN_ON because we iterate over all
superblocks and only in ->quota_sync() check whether something is dirty.
But the check seems to be unnecessary, that is correct.

> > @@ -368,8 +368,18 @@ SYSCALL_DEFINE4(quotactl, unsigned int, cmd, const char __user *, special,
> >  		goto out;
> >  	}
> >  
> > +	/*
> > +	 * It's not clear which quota functions may write to the file
> > +	 * system (all?).  Check for a frozen fs and bail out now.
> > +	 */
> > +	if (vfs_is_frozen(sb)) {
> > +		ret = -EBUSY;
> > +		goto out_drop_super;
> > +	}
> 
> How about spending the three minutes to figure it out?
> Q_GETFMT/Q_GETINFO/Q_XGETQSTAT and Q_GETQUOTA are the obvious read-only
> candidates.
  Q_GETQUOTA can actually cause filesystem modification (reservation of
space in quota file) but the others are read-only. Also after some thought
I'd prefer that quotactl(8) just blocks to be consistent with how other
syscalls behave...

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

  reply	other threads:[~2011-12-07 23:17 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-12-05 20:54 [PATCH 0/5 resend] fix s_umount thaw/write and journal deadlock Kamal Mostafa
2011-12-05 20:54 ` [PATCH 1/5 resend] Adding support to freeze and unfreeze a journal Kamal Mostafa
2011-12-06 14:25   ` Matthew Wilcox
2011-12-05 20:54 ` [PATCH 2/5 resend] Thaw the journal when you unfreeze the fs Kamal Mostafa
2011-12-06 11:07   ` Matt Fleming
2011-12-06 16:22     ` Eric Sandeen
2011-12-05 20:54 ` [PATCH 3/5 resend] VFS: Fix s_umount thaw/write deadlock Kamal Mostafa
2011-12-06 11:35   ` Christoph Hellwig
2011-12-07 23:16     ` Jan Kara [this message]
2011-12-07 23:49       ` Matthew Wilcox
2011-12-08  0:05         ` Jan Kara
2011-12-09 11:47       ` Christoph Hellwig
2011-12-05 20:54 ` [PATCH 4/5 resend] VFS: Rename vfs_check_frozen() to vfs_block_until_thawed() Kamal Mostafa
2011-12-06 11:40   ` Christoph Hellwig
2011-12-07 22:46     ` Jan Kara
2011-12-05 20:54 ` [PATCH 5/5 resend] Documentation: Correct s_umount state for freeze_fs/unfreeze_fs Kamal Mostafa
2011-12-06 11:40   ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20111207231658.GQ4622@quack.suse.cz \
    --to=jack@suse.cz \
    --cc=adilger.kernel@dilger.ca \
    --cc=christopher.chaltain@canonical.com \
    --cc=csurbhi@gmail.com \
    --cc=hch@infradead.org \
    --cc=kamal@canonical.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=matthew@wil.cx \
    --cc=miaox@cn.fujitsu.com \
    --cc=mpatocka@redhat.com \
    --cc=peter.petrakis@canonical.com \
    --cc=rdunlap@xenotime.net \
    --cc=tytso@mit.edu \
    --cc=val@vaaconsulting.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).