From: Mark Tinguely <tinguely@sgi.com>
To: Dave Chinner <david@fromorbit.com>
Cc: bob.mastors@solidfire.com, snitzer@redhat.com, xfs@oss.sgi.com
Subject: Re: [PATCH] xfs: unmount does not wait for shutdown during unmount
Date: Thu, 10 Apr 2014 08:29:00 -0500 [thread overview]
Message-ID: <53469C9C.8010604@sgi.com> (raw)
In-Reply-To: <1397104955-7247-1-git-send-email-david@fromorbit.com>
On 04/09/14 23:42, Dave Chinner wrote:
> From: Dave Chinner<dchinner@redhat.com>
>
> And interesting situation can occur if a log IO error occurs during
> the unmount of a filesystem. The cases reported have the same
> signature - the update of the superblock counters fails due to a log
> write IO error:
>
> XFS (dm-16): xfs_do_force_shutdown(0x2) called from line 1170 of file fs/xfs/xfs_log.c. Return address = 0xffffffffa08a44a1
> XFS (dm-16): Log I/O Error Detected. Shutting down filesystem
> XFS (dm-16): Unable to update superblock counters. Freespace may not be correct on next mount.
> XFS (dm-16): xfs_log_force: error 5 returned.
> XFS (¿-¿¿¿): Please umount the filesystem and rectify the problem(s)
>
> It can be seen that the last line of output contains a corrupt
> device name - this is because the log and xfs_mount structures have
> already been freed by the time this message is printed. A kernel
> oops closely follows.
>
> The issue is that the shutdown is occurring in a separate IO
> completion thread to the unmount. Once the shutdown processing has
> started and all the iclogs are marked with XLOG_STATE_IOERROR, the
> log shutdown code wakes anyone waiting on a log force so they can
> process the shutdown error. This wakes up the unmount code that
> is doing a synchronous transaction to update the superblock
> counters.
>
> The unmount path now sees all the iclogs are marked with
> XLOG_STATE_IOERROR and so never waits on them again, knowing that if
> it does, there will not be a wakeup trigger for it and we will hang
> the unmount if we do. Hence the unmount runs through all the
> remaining code and frees all the filesystem structures while the
> xlog_iodone() is still processing the shutdown. When the log
> shutdown processing completes, xfs_do_force_shutdown() emits the
> "Please umount the filesystem and rectify the problem(s)" message,
> and xlog_iodone() then aborts all the objects attached to the iclog.
> An iclog that has already been freed....
>
> The real issue here is that there is no serialisation point between
> the log IO and the unmount. We have serialisations points for log
> writes, log forces, reservations, etc, but we don't actually have
> any code that wakes for log IO to fully complete. We do that for all
> other types of object, so why not iclogbufs?
>
> Well, it turns out that we can easily do this. We've got xfs_buf
> handles, and that's what everyone else uses for IO serialisation.
> i.e. bp->b_sema. So, lets hold iclogbufs locked over IO, and only
> release the lock in xlog_iodone() when we are finished with the
> buffer. That way before we tear down the iclog, we can lock and
> unlock the buffer to ensure IO completion has finished completely
> before we tear it down.
>
> Signed-off-by: Dave Chinner<dchinner@redhat.com>
The wait queue "xc_commit_wait" is used for two purposes, first to start
the next ic_log buffer completion and also to wake those waiting for a
syncronous event. Shutdown does syncronous cil pushes but it does not
wait for the IO to happen.
Why not wait for the IO to happen or fail before waking out the sync
waiters? If you want to keep the speedier completion of the next cil
push add another wait queue. There only a few (typically 8) per filesystem.
--Mark.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
next prev parent reply other threads:[~2014-04-10 13:29 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-04-10 4:42 [PATCH] xfs: unmount does not wait for shutdown during unmount Dave Chinner
2014-04-10 13:29 ` Mark Tinguely [this message]
2014-04-10 21:52 ` Dave Chinner
2014-04-10 16:25 ` Mike Snitzer
2014-04-11 19:21 ` [PATCH] " Bob Mastors
2014-04-14 8:21 ` Dave Chinner
2014-04-14 19:28 ` Brian Foster
2014-04-15 2:15 ` Dave Chinner
2014-04-15 14:59 ` Brian Foster
2014-04-16 14:27 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=53469C9C.8010604@sgi.com \
--to=tinguely@sgi.com \
--cc=bob.mastors@solidfire.com \
--cc=david@fromorbit.com \
--cc=snitzer@redhat.com \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).