From: Dave Chinner <david@fromorbit.com>
To: Bob Mastors <bob.mastors@solidfire.com>
Cc: xfs@oss.sgi.com
Subject: Re: xfs umount with i/o error hang/memory corruption
Date: Sat, 5 Apr 2014 08:20:16 +1100
Message-ID: <20140404212016.GY17603@dastard>
In-Reply-To: <CALjwKZAJ-R8dS13Rsj3+K3hM9p0z08qvi4ZVTYbDWKT1Biu=-Q@mail.gmail.com>
On Fri, Apr 04, 2014 at 12:15:23PM -0600, Bob Mastors wrote:
> Greetings,
>
> I am new to xfs and am running into a problem
> and would appreciate any guidance on how to proceed.
>
> After an i/o error from the block device that xfs is using,
> an umount results in a message like:
> [ 370.636473] XFS (sdx): Log I/O Error Detected. Shutting down filesystem
> [ 370.644073] XFS (h ���h"h ���H#h ���bsg):
> Please umount the filesystem and rectify the problem(s)
> Note the garbage on the previous line which suggests memory corruption.
> About half the time I get the garbled log message. About half the time
> umount hangs.
I got an email about this last night with a different trigger - thin
provisioning failing log IO in the unmount path. I know what the
problem is, I just don't have a fix for it yet.
To confirm it's the same problem, can you post the entirety of the
dmesg output from when the error occurs?
In essence, the log IO failure is triggering a shutdown, and as part
of the shutdown process it wakes anyone waiting on a log force.
The log quiesce code that waits for log completion during unmount
uses a log force to ensure the log is idle before tearing down all
the log structures and finishing the unmount. Unfortunately, the log
force the unmount blocks on is woken prematurely by the shutdown,
and hence the unmount teardown runs before the log IO processing has
completed. Hence the use after free.
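
To make the ordering concrete, here is a toy Python model of the race
described above. It is only a sketch under stated assumptions, not XFS
code: `ToyLog`, `log_io_error`, `unmount_teardown`,
`finish_log_io_processing` and `run` are hypothetical names invented for
illustration, and the kernel's concurrency is collapsed into an explicit
list of steps so the interleaving is deterministic.

```python
class ToyLog:
    """Hypothetical stand-in for the in-memory log structures (not a real XFS type)."""

    def __init__(self):
        self.freed = False
        self.force_waiters_woken = False
        self.io_processing_done = False


def log_io_error(log):
    # The log IO failure triggers a shutdown, which wakes log force waiters
    # even though processing of the failed IO has not finished yet.
    log.force_waiters_woken = True


def unmount_teardown(log):
    # Unmount was blocked on a log force; once woken it assumes the log is
    # idle and frees the log structures.
    if not log.force_waiters_woken:
        raise RuntimeError("unmount is still blocked on the log force")
    log.freed = True


def finish_log_io_processing(log):
    # The remainder of log IO processing still touches the log structures.
    if log.freed:
        raise RuntimeError("use after free: log touched after teardown")
    log.io_processing_done = True


def run(steps):
    """Apply the steps in order; return "ok" or the first error message."""
    log = ToyLog()
    try:
        for step in steps:
            step(log)
    except RuntimeError as err:
        return str(err)
    return "ok"


# Premature wakeup: teardown slips in before log IO processing completes.
buggy_order = run([log_io_error, unmount_teardown, finish_log_io_processing])
# Interleaving in which teardown happens only after processing completes.
safe_order = run([log_io_error, finish_log_io_processing, unmount_teardown])
print(buggy_order)
print(safe_order)
```

The second ordering only shows the interleaving that avoids the problem
in this model; it says nothing about how an actual fix would be
implemented.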
> And then I get this kind of error and the system is unresponsive:
> Message from syslogd@debian at Apr 4 09:27:40 ...
> kernel:[ 680.080022] BUG: soft lockup - CPU#2 stuck for 22s! [umount:2849]
>
> The problem appears to be similar to this issue:
> http://www.spinics.net/lists/linux-xfs/msg00061.html
Similar symptoms, but not the same bug.
> The problem is triggered by doing an iscsi logout which causes
> the block device to return i/o errors to xfs.
> Steps to reproduce the problem are below.
Seems similar to the thinp ENOSPC issue I mentioned above - data IO
errors occur, then you do an unmount, which causes a log IO error
writing the superblock, and then this happens....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com