From: Ben Myers <bpm@sgi.com>
To: Dave Chinner <david@fromorbit.com>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH] xfs: use s_umount sema in xfs_sync_worker
Date: Wed, 16 May 2012 12:04:03 -0500 [thread overview]
Message-ID: <20120516170402.GD3963@sgi.com> (raw)
In-Reply-To: <20120516015626.GN25351@dastard>
Hey Dave,
On Wed, May 16, 2012 at 11:56:26AM +1000, Dave Chinner wrote:
> On Mon, May 14, 2012 at 03:34:49PM -0500, Ben Myers wrote:
> > I'm still hitting this on a regular basis. Here is some analysis from a recent
> > crash dump which you may want to skip. The fix is at the end.
> ....
> > ===================================================================
> >
> > xfs: use s_umount sema in xfs_sync_worker
> >
> > xfs_sync_worker checks the MS_ACTIVE flag in sb->s_flags to avoid doing work
> > during mount and unmount. This flag can be cleared by unmount after the
> > xfs_sync_worker checks it but before the work is completed.
>
> Then there are problems all over the place in different filesystems
> if the straight MS_ACTIVE check is not sufficient.
Eh, I won't speak to the problems in other filesystems. ;)
MS_ACTIVE certainly isn't adequate in the case before us.
> > Protect xfs_sync_worker by using the s_umount semaphore at the read level to
> > provide exclusion with unmount while work is progressing.
>
> I don't think that is the right fix for the given problem.
>
> The problem is, as you've stated:
>
> "Looks like the problem is that the sync worker is still running
> after the log has been torn down, and it calls xfs_fs_log_dummy
> which generates log traffic."
That's one problem, but we also want to protect against running this code at
mount time. s_umount sema is the tool that can do both. Maybe there are some
other options.
> Why did we allow a new transaction to start while/after the log was
> torn down?
> Isn't that the problem we need to fix because it leads to
> invalid entries in the physical log that might cause recovery
> failures?
> Further, any asynchronous worker thread that does
> transactions could have this same problem regardless of whether we
> are umounting or cleaning up after a failed mount, so it is not
> unique to the xfs_sync_worker....
> That is, if we've already started to tear down or torn down the log,
> we must not allow new transactions to start. Likewise, we can't
> finalise tear down the log until transactions in progress have
> completed. Using the s_umount lock here avoids then race, but it
> really is a VFS level lock not an filesystem level lock) and is,
> IMO, just papering over the real problem....
I think you have a good point here, but this isn't limited to transactions. We
shouldn't even call xfs_log_need_covered without some protection from
xfs_fs_put_super; xfs_fs_writable doesn't cut the mustard. I'd better have a
look at the other workqueues too. Thanks for pointing this out.
Regards,
Ben
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
next prev parent reply other threads:[~2012-05-16 16:59 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-03-23 17:43 BUG in xlog_get_lowest_lsn Ben Myers
2012-05-14 20:34 ` [PATCH] xfs: use s_umount sema in xfs_sync_worker Ben Myers
2012-05-15 18:30 ` Mark Tinguely
2012-05-15 19:06 ` Ben Myers
2012-05-16 1:56 ` Dave Chinner
2012-05-16 17:04 ` Ben Myers [this message]
2012-05-17 7:16 ` Dave Chinner
2012-05-23 9:02 ` Dave Chinner
2012-05-23 16:45 ` Ben Myers
2012-05-24 22:39 ` Ben Myers
2012-05-25 20:45 ` [PATCH] xfs: shutdown xfs_sync_worker before the log Ben Myers
2012-05-29 15:07 ` Ben Myers
2012-05-29 15:36 ` Brian Foster
2012-05-29 17:04 ` Ben Myers
2012-05-29 17:54 ` Brian Foster
2012-05-31 16:23 ` Mark Tinguely
2012-06-06 4:26 ` Dave Chinner
2012-06-11 20:45 ` Ben Myers
2012-06-11 21:11 ` Mark Tinguely
2012-06-11 23:36 ` Dave Chinner
2012-06-14 17:13 ` Mark Tinguely
2012-06-14 23:56 ` Dave Chinner
2012-06-20 7:44 ` Christoph Hellwig
2012-06-20 7:36 ` Christoph Hellwig
2012-06-20 17:18 ` Ben Myers
2012-06-20 22:59 ` Dave Chinner
2012-06-21 7:12 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120516170402.GD3963@sgi.com \
--to=bpm@sgi.com \
--cc=david@fromorbit.com \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox