From: Dave Chinner <david@fromorbit.com>
To: xfs@oss.sgi.com
Subject: [regression] stack overflow in xfs_buf_iodone_callbacks
Date: Thu, 21 Jun 2012 19:18:03 +1000
Message-ID: <20120621091803.GB10673@dastard>
Folks,
I just had a stack overflow in the delayed write buffer error
handling with a shut down filesystem:
.....
[ 20.712744] [<ffffffff81448023>] xfs_buf_iodone_work+0x23/0x50
[ 20.712744] [<ffffffff814481a0>] xfs_buf_ioend+0x70/0x180
[ 20.712744] [<ffffffff814484c5>] _xfs_buf_ioend+0x25/0x30
[ 20.712744] [<ffffffff81448788>] __xfs_buf_iorequest+0x98/0x130
[ 20.712744] [<ffffffff81448836>] xfs_buf_iorequest+0x16/0x20
[ 20.712744] [<ffffffff81448945>] xfs_bdstrat_cb+0x65/0x110
[ 20.712744] [<ffffffff814b9d7c>] xfs_buf_iodone_callbacks+0x11c/0x290
[ 20.712744] [<ffffffff81448023>] xfs_buf_iodone_work+0x23/0x50
[ 20.712744] [<ffffffff814481a0>] xfs_buf_ioend+0x70/0x180
[ 20.712744] [<ffffffff814484c5>] _xfs_buf_ioend+0x25/0x30
[ 20.712744] [<ffffffff81448788>] __xfs_buf_iorequest+0x98/0x130
[ 20.712744] [<ffffffff81448836>] xfs_buf_iorequest+0x16/0x20
[ 20.712744] [<ffffffff81448945>] xfs_bdstrat_cb+0x65/0x110
[ 20.712744] [<ffffffff814b9d7c>] xfs_buf_iodone_callbacks+0x11c/0x290
[ 20.712744] [<ffffffff81448023>] xfs_buf_iodone_work+0x23/0x50
[ 20.712744] [<ffffffff814481a0>] xfs_buf_ioend+0x70/0x180
[ 20.712744] [<ffffffff814484c5>] _xfs_buf_ioend+0x25/0x30
[ 20.712744] [<ffffffff81448788>] __xfs_buf_iorequest+0x98/0x130
[ 20.712744] [<ffffffff81448836>] xfs_buf_iorequest+0x16/0x20
[ 20.712744] [<ffffffff81448945>] xfs_bdstrat_cb+0x65/0x110
[ 20.712744] [<ffffffff814b9d7c>] xfs_buf_iodone_callbacks+0x11c/0x290
[ 20.712744] [<ffffffff81448023>] xfs_buf_iodone_work+0x23/0x50
[ 20.712744] [<ffffffff814481a0>] xfs_buf_ioend+0x70/0x180
[ 20.712744] [<ffffffff814484c5>] _xfs_buf_ioend+0x25/0x30
[ 20.712744] [<ffffffff81448788>] __xfs_buf_iorequest+0x98/0x130
[ 20.712744] [<ffffffff81448836>] xfs_buf_iorequest+0x16/0x20
[ 20.712744] [<ffffffff81448945>] xfs_bdstrat_cb+0x65/0x110
[ 20.712744] [<ffffffff81448c39>] __xfs_buf_delwri_submit+0x249/0x280
[ 20.712744] [<ffffffff81449920>] xfs_buf_delwri_submit_nowait+0x20/0x30
[ 20.712744] [<ffffffff814bc43e>] xfsaild+0x21e/0x750
[ 20.712744] [<ffffffff810a0472>] kthread+0xa2/0xb0
[ 20.712744] [<ffffffff81b83c64>] kernel_thread_helper+0x4/0x10
Basically, the commit:
43ff212 xfs: on-stack delayed write buffer lists
took away the delay in resubmitting metadata buffers that have
had a write error. The xfsbdstrat() resubmission now errors out
immediately on the shutdown flag and calls the I/O completion for the
buffer, which runs xfs_buf_iodone_callbacks(), which calls
xfs_bdstrat_cb(), which errors out on the shutdown flag and calls
I/O completion again, and around it goes in a spiral of death.
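As a rough illustration of the cycle described above, here is a toy model in Python. It is not kernel code: the two function names are borrowed from the stack trace, but their bodies, the Buf class, the depth counter and MAX_DEPTH are simplifying assumptions made only to show the shape of the recursion.

```python
import errno

MAX_DEPTH = 64  # hypothetical stand-in for the fixed-size kernel stack


class StackOverflow(Exception):
    """Raised when the modelled call chain gets deeper than MAX_DEPTH."""


class Buf:
    """Minimal stand-in for a metadata buffer; only tracks the last error."""

    def __init__(self):
        self.error = 0


def xfs_bdstrat_cb(buf, shutdown, depth=0):
    """Model of buffer submission (assumed behaviour, heavily simplified)."""
    if depth > MAX_DEPTH:
        raise StackOverflow(f"call depth exceeded {MAX_DEPTH}")
    # On a shut down filesystem the submission fails straight away.
    buf.error = errno.EIO if shutdown else 0
    # In this model, completion always runs synchronously, inside the
    # submitter's call chain.
    return xfs_buf_iodone_callbacks(buf, shutdown, depth + 1)


def xfs_buf_iodone_callbacks(buf, shutdown, depth):
    """Model of I/O completion with immediate resubmission on error."""
    if buf.error:
        # No delay before the retry: this call closes the loop.
        return xfs_bdstrat_cb(buf, shutdown, depth + 1)
    return depth
```

In this model, xfs_bdstrat_cb(Buf(), shutdown=True) never unwinds and ends in StackOverflow, while xfs_bdstrat_cb(Buf(), shutdown=False) returns after a single completion call.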
I did flag the change to an immediate xfsbdstrat() call as a problem
in review, and mentioned a possible solution, but it looks like it
fell through the cracks:
http://oss.sgi.com/archives/xfs/2012-04/msg00760.html
"This will just resubmit the IO immediately after it is
failed, while previously it will only be pushed again after
it ages out (15s later). Perhaps it can just be left to be
pushed by the aild next time it passes over it?"
That would definitely prevent the Spiral of Stack Doom that I've
just seen....
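To make the quoted suggestion concrete, here is a toy Python model of leaving the failed buffer for the next aild pass. It is only a sketch under stated assumptions, not a proposed patch: Buf, io_completion, aild_pass and the needs_push flag are all hypothetical names that do not exist in XFS.

```python
class Buf:
    """Hypothetical stand-in for a metadata buffer sitting in the AIL."""

    def __init__(self, name):
        self.name = name
        self.attempts = 0       # how many times the write was submitted
        self.needs_push = True  # still waiting to be written back


def io_completion(buf, error):
    # On error, only record that the buffer still needs writing.
    # Nothing is resubmitted from here, so completion can never
    # re-enter submission.
    buf.needs_push = bool(error)


def aild_pass(bufs, shutdown):
    """One pass of the pusher: submit every buffer that still needs it."""
    for buf in bufs:
        if buf.needs_push:
            buf.attempts += 1
            # Assumption: on a shut down filesystem every write fails.
            io_completion(buf, error=shutdown)
```

Each pass makes at most one attempt per buffer and then returns, so the call depth stays constant however many times the write fails; a failed buffer is simply picked up again on the next pass.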
I'm going to be AFK next week, so I don't have time to come up with
a fix for this right now, but it needs to be fixed before 3.5
releases. I'd appreciate it if someone could look at fixing this in
the meantime.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs