public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
* [regression] stack overflow in xfs_buf_iodone_callbacks
@ 2012-06-21  9:18 Dave Chinner
  2012-06-21  9:21 ` Christoph Hellwig
  2012-06-21 16:34 ` Christoph Hellwig
  0 siblings, 2 replies; 11+ messages in thread
From: Dave Chinner @ 2012-06-21  9:18 UTC (permalink / raw)
  To: xfs

Folks,

I just had a stack overflow in the delayed write buffer error
handling with a shut down filesystem:

.....
[   20.712744]  [<ffffffff81448023>] xfs_buf_iodone_work+0x23/0x50
[   20.712744]  [<ffffffff814481a0>] xfs_buf_ioend+0x70/0x180
[   20.712744]  [<ffffffff814484c5>] _xfs_buf_ioend+0x25/0x30
[   20.712744]  [<ffffffff81448788>] __xfs_buf_iorequest+0x98/0x130
[   20.712744]  [<ffffffff81448836>] xfs_buf_iorequest+0x16/0x20
[   20.712744]  [<ffffffff81448945>] xfs_bdstrat_cb+0x65/0x110
[   20.712744]  [<ffffffff814b9d7c>] xfs_buf_iodone_callbacks+0x11c/0x290
[   20.712744]  [<ffffffff81448023>] xfs_buf_iodone_work+0x23/0x50
[   20.712744]  [<ffffffff814481a0>] xfs_buf_ioend+0x70/0x180
[   20.712744]  [<ffffffff814484c5>] _xfs_buf_ioend+0x25/0x30
[   20.712744]  [<ffffffff81448788>] __xfs_buf_iorequest+0x98/0x130
[   20.712744]  [<ffffffff81448836>] xfs_buf_iorequest+0x16/0x20
[   20.712744]  [<ffffffff81448945>] xfs_bdstrat_cb+0x65/0x110
[   20.712744]  [<ffffffff814b9d7c>] xfs_buf_iodone_callbacks+0x11c/0x290
[   20.712744]  [<ffffffff81448023>] xfs_buf_iodone_work+0x23/0x50
[   20.712744]  [<ffffffff814481a0>] xfs_buf_ioend+0x70/0x180
[   20.712744]  [<ffffffff814484c5>] _xfs_buf_ioend+0x25/0x30
[   20.712744]  [<ffffffff81448788>] __xfs_buf_iorequest+0x98/0x130
[   20.712744]  [<ffffffff81448836>] xfs_buf_iorequest+0x16/0x20
[   20.712744]  [<ffffffff81448945>] xfs_bdstrat_cb+0x65/0x110
[   20.712744]  [<ffffffff814b9d7c>] xfs_buf_iodone_callbacks+0x11c/0x290
[   20.712744]  [<ffffffff81448023>] xfs_buf_iodone_work+0x23/0x50
[   20.712744]  [<ffffffff814481a0>] xfs_buf_ioend+0x70/0x180
[   20.712744]  [<ffffffff814484c5>] _xfs_buf_ioend+0x25/0x30
[   20.712744]  [<ffffffff81448788>] __xfs_buf_iorequest+0x98/0x130
[   20.712744]  [<ffffffff81448836>] xfs_buf_iorequest+0x16/0x20
[   20.712744]  [<ffffffff81448945>] xfs_bdstrat_cb+0x65/0x110
[   20.712744]  [<ffffffff81448c39>] __xfs_buf_delwri_submit+0x249/0x280
[   20.712744]  [<ffffffff81449920>] xfs_buf_delwri_submit_nowait+0x20/0x30
[   20.712744]  [<ffffffff814bc43e>] xfsaild+0x21e/0x750
[   20.712744]  [<ffffffff810a0472>] kthread+0xa2/0xb0
[   20.712744]  [<ffffffff81b83c64>] kernel_thread_helper+0x4/0x10

Basically, the commit:

43ff212 xfs: on-stack delayed write buffer lists

took away the delay in resubmitting metadata buffers that have
had a write error, and so the xfsbdstrat() resubmission immediately
errors out on the shutdown flag, calling the io completion for teh
buffer that then runs xfs_buf_iodone_callbacks(), that then calls
xfs_bdstrat_cb(), that then errors out on the shutdown flag, calls
io completion, and around it goes in a spiral of death.

I did flag the change to an immediate xfsbdstrat() call as a problem
in review, and mentioned a possible solution to the problem, but it
looks like it fell through the cracks

http://oss.sgi.com/archives/xfs/2012-04/msg00760.html

	"This will just resubmit the IO immediately after it is
	failed, while previously it will only be pushed again after
	it ages out (15s later). Perhaps it can just be left to be
	pushed by the aild next time it passes over it?"

That would definitely prevent the Spiral of Stack Doom that I've
just seen....

I don't have time to come up with a fix for this right now, but it
needs to be fixed before 3.5 releases. I don't have time because I'm
going to be AFK next week, so I'd appreciate it if someone could
look at fixing this in the mean time?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [regression] stack overflow in xfs_buf_iodone_callbacks
  2012-06-21  9:18 [regression] stack overflow in xfs_buf_iodone_callbacks Dave Chinner
@ 2012-06-21  9:21 ` Christoph Hellwig
  2012-06-21  9:29   ` Dave Chinner
  2012-06-21 10:06   ` Stan Hoeppner
  2012-06-21 16:34 ` Christoph Hellwig
  1 sibling, 2 replies; 11+ messages in thread
From: Christoph Hellwig @ 2012-06-21  9:21 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On Thu, Jun 21, 2012 at 07:18:03PM +1000, Dave Chinner wrote:
> I don't have time to come up with a fix for this right now, but it
> needs to be fixed before 3.5 releases. I don't have time because I'm
> going to be AFK next week, so I'd appreciate it if someone could
> look at fixing this in the mean time?

I'll handle it.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [regression] stack overflow in xfs_buf_iodone_callbacks
  2012-06-21  9:21 ` Christoph Hellwig
@ 2012-06-21  9:29   ` Dave Chinner
  2012-06-21 10:06   ` Stan Hoeppner
  1 sibling, 0 replies; 11+ messages in thread
From: Dave Chinner @ 2012-06-21  9:29 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

On Thu, Jun 21, 2012 at 05:21:04AM -0400, Christoph Hellwig wrote:
> On Thu, Jun 21, 2012 at 07:18:03PM +1000, Dave Chinner wrote:
> > I don't have time to come up with a fix for this right now, but it
> > needs to be fixed before 3.5 releases. I don't have time because I'm
> > going to be AFK next week, so I'd appreciate it if someone could
> > look at fixing this in the mean time?
> 
> I'll handle it.

Thanks, Christoph.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [regression] stack overflow in xfs_buf_iodone_callbacks
  2012-06-21  9:21 ` Christoph Hellwig
  2012-06-21  9:29   ` Dave Chinner
@ 2012-06-21 10:06   ` Stan Hoeppner
  1 sibling, 0 replies; 11+ messages in thread
From: Stan Hoeppner @ 2012-06-21 10:06 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

On 6/21/2012 4:21 AM, Christoph Hellwig wrote:
> On Thu, Jun 21, 2012 at 07:18:03PM +1000, Dave Chinner wrote:
>> I don't have time to come up with a fix for this right now, but it
>> needs to be fixed before 3.5 releases. I don't have time because I'm
>> going to be AFK next week, so I'd appreciate it if someone could
>> look at fixing this in the mean time?
> 
> I'll handle it.

Reading this exchange, seeing the teamwork, is really cool, and struck a
chord with me for some reason.  Especially considering that "I'll handle
it" is probably a many hour (or day) commitment, made with 3 simple
words, at the drop of a hat.

Being part of a list where both deep technical development and 'simple'
end user issues are both handled, in a team fashion, is even cooler.

We probably don't say it enough, so I'll do so now.

Thank you guys, all the XFS developers, for the work you do that
benefits us all.  And thanks to all my fellow XFS users, for helping
make this list what I think is a great community, and an example for
other lists to follow.

-- 
Stan

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [regression] stack overflow in xfs_buf_iodone_callbacks
  2012-06-21  9:18 [regression] stack overflow in xfs_buf_iodone_callbacks Dave Chinner
  2012-06-21  9:21 ` Christoph Hellwig
@ 2012-06-21 16:34 ` Christoph Hellwig
  2012-06-21 23:24   ` Dave Chinner
  1 sibling, 1 reply; 11+ messages in thread
From: Christoph Hellwig @ 2012-06-21 16:34 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On Thu, Jun 21, 2012 at 07:18:03PM +1000, Dave Chinner wrote:
> Folks,
> 
> I just had a stack overflow in the delayed write buffer error
> handling with a shut down filesystem:

I've looked at this a bit more, and it seems the effect really can't
be an XFS shutdown.  We never do the shutdown check inside
xfs_buf_iorequest.  The issue obviously is real, but could it be that
you had an actual persistent I/O error on the underlying device?

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [regression] stack overflow in xfs_buf_iodone_callbacks
  2012-06-21 16:34 ` Christoph Hellwig
@ 2012-06-21 23:24   ` Dave Chinner
  2012-06-22 16:41     ` Christoph Hellwig
  0 siblings, 1 reply; 11+ messages in thread
From: Dave Chinner @ 2012-06-21 23:24 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

On Thu, Jun 21, 2012 at 12:34:09PM -0400, Christoph Hellwig wrote:
> On Thu, Jun 21, 2012 at 07:18:03PM +1000, Dave Chinner wrote:
> > Folks,
> > 
> > I just had a stack overflow in the delayed write buffer error
> > handling with a shut down filesystem:
> 
> I've looked at this a bit more, and it seems the effect really can't
> be an XFS shutdown.  We never do the shutdown check inside
> xfs_buf_iorequest.  The issue obviously is real, but could it be that
> you had an actual persistent I/O error on the underlying device?

It may have been - I didn't catch the initial cause of the problem
in my log because it hard-hung the VM and it wasn't in the
scrollback buffer on the console. All I saw was a corruption error,
a shutdown and the stack blowing up.

Still, I think there is a real problem here - any persistent device
error on IO submission can cause this problem to occur....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [regression] stack overflow in xfs_buf_iodone_callbacks
  2012-06-21 23:24   ` Dave Chinner
@ 2012-06-22 16:41     ` Christoph Hellwig
  2012-06-22 23:39       ` Dave Chinner
  0 siblings, 1 reply; 11+ messages in thread
From: Christoph Hellwig @ 2012-06-22 16:41 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Christoph Hellwig, xfs

On Fri, Jun 22, 2012 at 09:24:14AM +1000, Dave Chinner wrote:
> It may have been - I didn't catch the initial cause of the problem
> in my log because it hard-hung the VM and it wasn't in the
> scrollback buffer on the console. All I saw was a corruption error,
> a shutdown and the stack blowing up.
> 
> Still, I think there is a real problem here - any persistent device
> error on IO submission can cause this problem to occur....

Yes, I was just trying to ask what actually happened as your original
explanation didn't seem to be possible.
I think the patch below should be enough as a minimal fix to avoid the

stack overflow for 3.5. We'll need a much bigger overhaul of the buffer
error handling after that, though.


Index: xfs/fs/xfs/xfs_buf.c
===================================================================
--- xfs.orig/fs/xfs/xfs_buf.c	2012-06-22 14:20:46.696568355 +0200
+++ xfs/fs/xfs/xfs_buf.c	2012-06-22 14:21:37.733234717 +0200
@@ -1255,7 +1255,7 @@ xfs_buf_iorequest(
 	 */
 	atomic_set(&bp->b_io_remaining, 1);
 	_xfs_buf_ioapply(bp);
-	_xfs_buf_ioend(bp, 0);
+	_xfs_buf_ioend(bp, 1);
 
 	xfs_buf_rele(bp);
 }

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [regression] stack overflow in xfs_buf_iodone_callbacks
  2012-06-22 16:41     ` Christoph Hellwig
@ 2012-06-22 23:39       ` Dave Chinner
  2012-06-25  9:06         ` Christoph Hellwig
  0 siblings, 1 reply; 11+ messages in thread
From: Dave Chinner @ 2012-06-22 23:39 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

On Fri, Jun 22, 2012 at 12:41:47PM -0400, Christoph Hellwig wrote:
> On Fri, Jun 22, 2012 at 09:24:14AM +1000, Dave Chinner wrote:
> > It may have been - I didn't catch the initial cause of the problem
> > in my log because it hard-hung the VM and it wasn't in the
> > scrollback buffer on the console. All I saw was a corruption error,
> > a shutdown and the stack blowing up.
> > 
> > Still, I think there is a real problem here - any persistent device
> > error on IO submission can cause this problem to occur....
> 
> Yes, I was just trying to ask what actually happened as your original
> explanation didn't seem to be possible.
> I think the patch below should be enough as a minimal fix to avoid the
> 
> stack overflow for 3.5. We'll need a much bigger overhaul of the buffer
> error handling after that, though.
> 
> 
> Index: xfs/fs/xfs/xfs_buf.c
> ===================================================================
> --- xfs.orig/fs/xfs/xfs_buf.c	2012-06-22 14:20:46.696568355 +0200
> +++ xfs/fs/xfs/xfs_buf.c	2012-06-22 14:21:37.733234717 +0200
> @@ -1255,7 +1255,7 @@ xfs_buf_iorequest(
>  	 */
>  	atomic_set(&bp->b_io_remaining, 1);
>  	_xfs_buf_ioapply(bp);
> -	_xfs_buf_ioend(bp, 0);
> +	_xfs_buf_ioend(bp, 1);

Hmmmm. How often do we get real io completion occurring before we
call _xfs_buf_ioend() here? I can't see that it is common, so this
is probably fine, but perhaps a few numbers might help here? If it
is rare as we think it is, then yeah, that would work....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [regression] stack overflow in xfs_buf_iodone_callbacks
  2012-06-22 23:39       ` Dave Chinner
@ 2012-06-25  9:06         ` Christoph Hellwig
  2012-06-26  2:20           ` Dave Chinner
  0 siblings, 1 reply; 11+ messages in thread
From: Christoph Hellwig @ 2012-06-25  9:06 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Christoph Hellwig, xfs

On Sat, Jun 23, 2012 at 09:39:55AM +1000, Dave Chinner wrote:
> Hmmmm. How often do we get real io completion occurring before we
> call _xfs_buf_ioend() here? I can't see that it is common, so this
> is probably fine, but perhaps a few numbers might help here? If it
> is rare as we think it is, then yeah, that would work....

The only case where I can see it ever hapen is when sending tons
of separate I/Os in one go to a reall fast device, e.g. a very
fragmented large directory to superfast battery backed dram device.

And even then I don't think it matters very much - for reads we
generally do not have an b_iodone handler attached, so for these
the change does not make any different.  For delayed writes the
additional context switch also doesn't have a major impact on
performance, so the only thing where we could see a difference
is synchronous writes, of which we don't have a lot left, and
essentially none unless the shrinkers kick in and need to do
synchronous reclaims.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [regression] stack overflow in xfs_buf_iodone_callbacks
  2012-06-25  9:06         ` Christoph Hellwig
@ 2012-06-26  2:20           ` Dave Chinner
  2012-06-26  7:51             ` Christoph Hellwig
  0 siblings, 1 reply; 11+ messages in thread
From: Dave Chinner @ 2012-06-26  2:20 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

On Mon, Jun 25, 2012 at 05:06:58AM -0400, Christoph Hellwig wrote:
> On Sat, Jun 23, 2012 at 09:39:55AM +1000, Dave Chinner wrote:
> > Hmmmm. How often do we get real io completion occurring before we
> > call _xfs_buf_ioend() here? I can't see that it is common, so this
> > is probably fine, but perhaps a few numbers might help here? If it
> > is rare as we think it is, then yeah, that would work....
> 
> The only case where I can see it ever hapen is when sending tons
> of separate I/Os in one go to a reall fast device, e.g. a very
> fragmented large directory to superfast battery backed dram device.
> 
> And even then I don't think it matters very much - for reads we
> generally do not have an b_iodone handler attached, so for these
> the change does not make any different.

We will very soon - CRC checks after reading for disk will be done
after reads. The patch series I'm working on at the moment
introduces sanity checks of buffers on read completion - it doesn't
do CRC checks yet, but it moves all the checks we do on read
completion into iodone callbacks, and when CRCs are introduced they
will simply be slotted into those functions....

> For delayed writes the
> additional context switch also doesn't have a major impact on
> performance, so the only thing where we could see a difference
> is synchronous writes, of which we don't have a lot left, and
> essentially none unless the shrinkers kick in and need to do
> synchronous reclaims.

*nod*

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [regression] stack overflow in xfs_buf_iodone_callbacks
  2012-06-26  2:20           ` Dave Chinner
@ 2012-06-26  7:51             ` Christoph Hellwig
  0 siblings, 0 replies; 11+ messages in thread
From: Christoph Hellwig @ 2012-06-26  7:51 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On Tue, Jun 26, 2012 at 12:20:11PM +1000, Dave Chinner wrote:
> We will very soon - CRC checks after reading for disk will be done
> after reads. The patch series I'm working on at the moment
> introduces sanity checks of buffers on read completion - it doesn't
> do CRC checks yet, but it moves all the checks we do on read
> completion into iodone callbacks, and when CRCs are introduced they
> will simply be slotted into those functions....

I'm tempted to move calling of b_iodone into the caller that gets
waken by the completion, as that would help to clean a lot of nasty
bits up, but that's still a bit out.

The whole area aound xfs_bdstrat_cb, xfsbdstrat and buffer error handling
is something that needs a fair amount of attention, and I plan to tackle
a lot of it for 3.6.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2012-06-26  7:51 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-06-21  9:18 [regression] stack overflow in xfs_buf_iodone_callbacks Dave Chinner
2012-06-21  9:21 ` Christoph Hellwig
2012-06-21  9:29   ` Dave Chinner
2012-06-21 10:06   ` Stan Hoeppner
2012-06-21 16:34 ` Christoph Hellwig
2012-06-21 23:24   ` Dave Chinner
2012-06-22 16:41     ` Christoph Hellwig
2012-06-22 23:39       ` Dave Chinner
2012-06-25  9:06         ` Christoph Hellwig
2012-06-26  2:20           ` Dave Chinner
2012-06-26  7:51             ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox