public inbox for linux-xfs@vger.kernel.org
From: Dave Chinner <david@fromorbit.com>
To: Chris Holcombe <xfactor973@gmail.com>
Cc: xfs@oss.sgi.com
Subject: Re: XFS umount with IO errors seems to lead to memory corruption
Date: Tue, 10 Feb 2015 09:18:29 +1100
Message-ID: <20150209221829.GX12722@dastard>
In-Reply-To: <CAGdUFVbs3kTTAtXjHux5sW-e9j82ouaRN3aEm+-AdVs6NLuJ+g@mail.gmail.com>

On Mon, Feb 09, 2015 at 01:24:15PM -0800, Chris Holcombe wrote:
> Hi Dave,
> 
> http://www.spinics.net/lists/linux-xfs/msg00061.html
> Back in Dec 2013 you responded to this message saying that you would
> take a look at it.  Was a fix for this ever issued? 

Yes, it's been fixed, but that's not your problem.

> I'm seeing very
> similar stacktraces:
> 
>  INFO: task umount:29224 blocked for more than 120 seconds.
>        Tainted: G        W     3.13.0-39-generic #66-Ubuntu
>  "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>  umount D ffff880c4fc34480     0 29224  29221 0x00000082
>  ffff880201211db0 0000000000000086 ffff880c39cb1800 ffff880201211fd8
>  0000000000014480 0000000000014480 ffff880c39cb1800 ffff880c33386480
>  ffff880c395e4bc8 ffff880c333864c0 ffff880c333864e8 ffff880c33386490
>  Call Trace:
> 
> [<ffffffff81723109>] schedule+0x29/0x70
> [<ffffffffa023b0c9>] xfs_ail_push_all_sync+0xa9/0xe0 [xfs]
> [<ffffffff810aafd0>] ? prepare_to_wait_event+0x100/0x100
> [<ffffffffa0236f13>] xfs_log_quiesce+0x33/0x70 [xfs]
> [<ffffffffa0236f62>] xfs_log_unmount+0x12/0x30 [xfs]
> [<ffffffffa01ed846>] xfs_unmountfs+0xc6/0x150 [xfs]
> [<ffffffffa01ef211>] xfs_fs_put_super+0x21/0x60 [xfs]
> [<ffffffff811bf452>] generic_shutdown_super+0x72/0xf0
> [<ffffffff811bf707>] kill_block_super+0x27/0x70
> [<ffffffff811bf9ed>] deactivate_locked_super+0x3d/0x60
> [<ffffffff811bffa6>] deactivate_super+0x46/0x60
> [<ffffffff811dcd96>] mntput_no_expire+0xd6/0x170
> [<ffffffff811de31e>] SyS_umount+0x8e/0x100
> [<ffffffff8172f7ed>] system_call_fastpath+0x1a/0x1f

That's XFS hung waiting for IO to complete during unmount.

> These type of errors are showing up in the logs:
> 
> XFS (dm-8): metadata I/O error: block 0x0 ("xfs_buf_iodone_callbacks") error 19 numblks 1

Error 19 = ENODEV.
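
For anyone decoding these by hand: the number in the log line is a
plain errno value. A minimal sketch of looking it up on a Linux box
with Python's standard errno module (decode_errno is just an
illustrative helper name, not anything from XFS):

```python
import errno
import os


def decode_errno(code):
    """Return (symbolic name, description) for a numeric errno value."""
    # errno.errorcode maps numeric values to names on the running platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror gives the C library's human-readable message.
    return name, os.strerror(code)


if __name__ == "__main__":
    name, text = decode_errno(19)
    print(f"error 19 = {name} ({text})")
```

The mapping is platform specific, so run it on the same kind of
system that produced the log.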

You pulled the drive out before you tried to unmount?

> XFS (dm-8): Detected failing async write on buffer block 0x0. Retrying async write.

Which means it's detecting that the write is failing, but the higher
level has been told to keep trying until all metadata has been
flushed. We probably need to tweak this slightly....

Eric - this is another case where transient vs permanent error is
somewhat squishy, and treating ENODEV as a permanent error would
solve this issue (i.e. trigger a shutdown). Did you start doing
anything in this area?

AFAICT an ENODEV error on Linux is a permanent error because if you
replug the device it will come back as a different device and the
ENODEV on the removed device will still persist. However, I'm not
sure what dm-multipath ends up doing in this case - it's supposed to
hide the same devices coming and going, so maybe it won't trigger
this error at all...
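
To make the idea concrete, here is a rough sketch of the policy being
discussed, in Python rather than kernel C. It is not the XFS code:
the names PERMANENT_ERRORS and classify_write_error are hypothetical,
and which errors count as permanent is an assumption (only ENODEV,
per the reasoning above). Everything else keeps the current
"retry the async write" behaviour.

```python
import errno

# Assumption: only ENODEV is treated as permanent in this sketch.
PERMANENT_ERRORS = frozenset({errno.ENODEV})


def classify_write_error(err):
    """Decide what to do with a failed async metadata write.

    Returns "shutdown" for errors assumed to be permanent and
    "retry" for everything else.
    """
    if err in PERMANENT_ERRORS:
        # The device is gone and will not come back under this name,
        # so retrying can never succeed.
        return "shutdown"
    # Possibly transient: keep retrying the write.
    return "retry"


if __name__ == "__main__":
    for err in (errno.ENODEV, errno.EIO):
        print(errno.errorcode[err], "->", classify_write_error(err))
```

With a policy like this the unmount would see a shutdown instead of
waiting forever on a write that can't complete.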

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
