From: Ilya Dryomov <idryomov@gmail.com>
To: Miao Xie <miaox@cn.fujitsu.com>
Cc: Chris Mason <chris.mason@oracle.com>,
	Christoph Hellwig <hch@infradead.org>,
	viro <viro@zeniv.linux.org.uk>,
	Linux Btrfs <linux-btrfs@vger.kernel.org>,
	Linux Fsdevel <linux-fsdevel@vger.kernel.org>,
	Ito <t-itoh@jp.fujitsu.com>
Subject: Re: [PATCH 2/2] Btrfs: fix deadlock on sb->s_umount when doing umount
Date: Wed, 7 Dec 2011 13:11:58 +0200
Message-ID: <20111207111158.GA4929@zambezi.lan>
In-Reply-To: <4EDED007.2070904@cn.fujitsu.com>

On Wed, Dec 07, 2011 at 10:31:35AM +0800, Miao Xie wrote:
> On Tue, 6 Dec 2011 16:36:11 -0500, Chris Mason wrote:
> > On Tue, Dec 06, 2011 at 06:23:23AM -0500, Christoph Hellwig wrote:
> >> On Tue, Dec 06, 2011 at 07:06:40PM +0800, Miao Xie wrote:
> >>>> I can't see why you need the writeout when the trylock fails.  Umount
> >>>> needs to take care of writing out all pending file data anyway, so doing
> >>>> it from the cleaner thread in addition doesn't sound like it would help.
> >>>
> >>> umount invokes sync_fs() and writes out all the dirty file data. For the
> >>> other file systems, that's OK because the file system does not introduce
> >>> dirty pages by itself. But btrfs is different. Its automatic defragmentation
> >>> will create lots of dirty pages after sync_fs() and reserve lots of metadata
> >>> space for those pages. The cleaner thread may then find there is not enough
> >>> space to reserve, so it must sync the dirty file data and release the space
> >>> reserved for that data.
> >>
> >> I think the safest way to fix this is to write out all dirty data again
> >> once the cleaner thread has been safely stopped.
> >>
> > 
> > Said another way we want to stop the autodefrag code before the unmount
> > is ready to continue.  We also want to stop balancing, scrub etc.
> 
> But there is no good interface to do it before umount gets the s_umount lock.
> I think trylock (in writeback_inodes_sb_nr_if_idle()) + dirty data flush
> can help us to fix the bug perfectly.

But it won't fix the umount-while-balancing family of deadlocks (they
are really of the same nature: VFS takes the s_umount semaphore and we
need it to proceed).  (The balance cancelling code is part of the
restriper patches; it's just a hook in close_ctree() that waits until we
are done relocating a chunk, very similar to the cleaner wait.)

One example: while dirtying pages, the balancing code calls
balance_dirty_pages_ratelimited() for each dirtied page, as it should.
If balance_dirty_pages() then decides to initiate writeback, we are
stuck schedule()ing forever: writeback can't proceed without taking
s_umount for read, and since s_umount is held for write by VFS, it
just skips the relocation inode.

Thanks,

		Ilya
