linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Nick Piggin <npiggin@kernel.dk>
Cc: Jan Kara <jack@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	Ted Ts'o <tytso@mit.edu>, Eric Sandeen <sandeen@redhat.com>,
	linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
	linux-btrfs@vger.kernel.org
Subject: Re: [patch] fix up lock order reversal in writeback
Date: Mon, 22 Nov 2010 19:16:55 +0100	[thread overview]
Message-ID: <20101122181655.GF5012@quack.suse.cz> (raw)
In-Reply-To: <20101119051619.GE3284@amd>

On Fri 19-11-10 16:16:19, Nick Piggin wrote:
> On Fri, Nov 19, 2010 at 01:45:52AM +0100, Jan Kara wrote:
> > On Wed 17-11-10 22:28:34, Andrew Morton wrote:
> > > The fact that a call to ->write_begin can randomly return with s_umount
> > > held, to be randomly released at some random time in the future is a
> > > bit ugly, isn't it?  write_begin is a pretty low-level, per-inode
> > > thing.
> >   I guess you missed that writeback_inodes_sb_nr() (called from _if_idle
> > variants) does:
> >         bdi_queue_work(sb->s_bdi, &work);
> >         wait_for_completion(&done);
> >   So we return only after all the IO has been submitted and unlock s_umount
> > in writeback_inodes_sb_if_idle(). And we cannot really submit the IO ourselves
> > because we are holding i_mutex and we need to get and put references
> > to other inodes while doing writeback (those would be really horrible lock
> > dependencies - writeback thread can put the last reference to an unlinked
> > inode...).
> 
> But if we're waiting for it, with the lock held, then surely it can
> deadlock just the same as if we submit it ourself?
  Yes, that's what I realized as well and what needs fixing. It was an
unintentional consequence of locking changes Christoph did to the writeback
code to fix other races.

> BTW. are you taking i_mutex inside writeback? I mutex can be held
> while entering page reclaim, and thus writepage... so it could be a
> bug too.
  No, writeback does not need i_mutex.

> > In fact, as I'm speaking about it, pushing things to writeback thread and
> > waiting on the work does not help a bit with the locking issues (we didn't
> > wait for the work previously but that had other issues). Bug, sigh.
> > 
> > What might be better interface for usecases like above is to allow
> > filesystem to kick flusher thread to start doing background writeback
> > (regardless of dirty limits). Then the caller can wait for some delayed
> > allocation reservations to get freed (easy enough to check in
> > ->writepage() and wake the waiters) - possibly with a reasonable timeout
> > so that we don't stall forever.
> 
> We really need to throttle the producer without any locks held, no?
> So the filesystem would like to to hook into dirtying path somewhere
> without i_mutex held (and without implementing your own .write). Eg.
> around the dirty throttling calls somewhere I suppose.
  Well, the particular case where we need to do delayed allocation but the
filesystem has already all blocks reserved is hard to detect without any
locks. At least we need some locks to find out how many blocks do we need
to reserve for that write(). So for filesystems it would solve the locking
issues if we could bail out of write() path up to the point where we hold
no locks, wait there / do necessary writeback and then restart the write...

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

  reply	other threads:[~2010-11-22 18:16 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-11-16 11:00 [patch] fix up lock order reversal in writeback Nick Piggin
2010-11-16 13:01 ` Jan Kara
2010-11-17  4:30   ` Eric Sandeen
2010-11-17  4:38     ` Nick Piggin
2010-11-17  5:05       ` Eric Sandeen
2010-11-17  6:10         ` Nick Piggin
2010-11-18  3:06           ` Ted Ts'o
2010-11-18  3:29             ` Andrew Morton
2010-11-18  6:00               ` Nick Piggin
2010-11-18  6:28                 ` Andrew Morton
2010-11-18  8:18                   ` Nick Piggin
2010-11-18 10:51                     ` Theodore Tso
2010-11-18 17:58                     ` Andrew Morton
2010-11-19  5:10                       ` Nick Piggin
2010-11-19 12:07                         ` Theodore Tso
2010-11-18 14:55                   ` Eric Sandeen
2010-11-18 17:10                     ` Andrew Morton
2010-11-18 18:04                       ` Eric Sandeen
2010-11-18 18:24                         ` Eric Sandeen
2010-11-18 18:39                           ` Chris Mason
2010-11-18 18:36                         ` Andrew Morton
2010-11-18 18:51                           ` Chris Mason
2010-11-18 20:22                             ` Andrew Morton
2010-11-18 20:36                               ` Chris Mason
2010-11-18 19:02                           ` Eric Sandeen
2010-11-18 20:17                             ` Andrew Morton
2010-11-18 18:33                   ` Chris Mason
2010-11-18 23:58                     ` Jan Kara
2010-11-19  0:45                   ` Jan Kara
2010-11-19  5:16                     ` Nick Piggin
2010-11-22 18:16                       ` Jan Kara [this message]
2010-11-23  8:07                         ` Nick Piggin
2010-11-23 13:32                           ` Jan Kara
2010-11-23  8:15                         ` Nick Piggin
2010-11-18 18:53             ` Al Viro
2010-11-18  3:18           ` Eric Sandeen
2010-11-22 23:43             ` Andrew Morton
2010-11-16 20:32 ` Andrew Morton
2010-11-17  3:56   ` Nick Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20101122181655.GF5012@quack.suse.cz \
    --to=jack@suse.cz \
    --cc=akpm@linux-foundation.org \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=npiggin@kernel.dk \
    --cc=sandeen@redhat.com \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).