* Re: [PATCH] xfs: fix broken error handling in xfs_vm_writepage
[not found] <1353625410-1413-1-git-send-email-peterhuewe@gmx.de>
@ 2012-11-23 1:01 ` Dave Chinner
2012-11-23 7:44 ` Peter Hüwe
0 siblings, 1 reply; 4+ messages in thread
From: Dave Chinner @ 2012-11-23 1:01 UTC (permalink / raw)
To: Peter Huewe; +Cc: Ben Myers, stable, xfs
[add xfs@oss.sgi.com cc]
On Fri, Nov 23, 2012 at 12:03:30AM +0100, Peter Huewe wrote:
> From: Dave Chinner <dchinner@redhat.com>
>
> When we shut down the filesystem, it might first be detected in
> writeback when we are allocating an inode size transaction. This
> happens after we have moved all the pages into the writeback state
> and unlocked them. Unfortunately, if we fail to set up the
> transaction we then abort writeback and try to invalidate the
> current page. This then triggers a BUG() in block_invalidatepage()
> because we are trying to invalidate an unlocked page.
>
> Fixing this is a bit of a chicken and egg problem - we can't
> allocate the transaction until we've clustered all the pages into
> the IO and we know the size of it (i.e. whether the last block of
> the IO is beyond the current EOF or not). However, we don't want to
> hold pages locked for long periods of time, especially while we lock
> other pages to cluster them into the write.
>
> To fix this, we need to make a clear delineation in writeback where
> errors can only be handled by IO completion processing. That is,
> once we have marked a page for writeback and unlocked it, we have to
> report errors via IO completion because we've already started the
> IO. We may not have submitted any IO, but we've changed the page
> state to indicate that it is under IO so we must now use the IO
> completion path to report errors.
>
> To do this, add an error field to xfs_submit_ioend() to pass it the
> error that occurred during the building of the ioend chain. When
> this is non-zero, mark each ioend with the error and call
> xfs_finish_ioend() directly rather than building bios. This will
> immediately push the ioends through completion processing with the
> error that has occurred.
>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> Reviewed-by: Mark Tinguely <tinguely@sgi.com>
> Signed-off-by: Ben Myers <bpm@sgi.com>
Any particular reason you picked this patch for a backport and not
many of the other fixes that went into the 3.7 series?
As it is, this problem is not that easy to hit, and I'm wary of
backporting changes to the IO completion/IO submission error
handling paths to stable kernels without wider testing of the fix
(i.e. release of 3.7 and then a couple of weeks of people using it).
That's the reason why I didn't put a cc to the stable kernel on the
commit in the first place.
Sometimes there's good reason for being cautious about
backporting fixes to stable kernels - if the problem is not being
reported by users then letting the fixes get out into the real world
for a while before backporting them to the stable kernels is the
right approach. Stable kernels are supposed to be stable, and as
such we want to be certain that changes are not going to have
unintended consequences and then have to rush more fixes back to the
stable kernels because we broke them....
Cheers,
Dave.
--
Dave Chinner
dchinner@redhat.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: [PATCH] xfs: fix broken error handling in xfs_vm_writepage
2012-11-23 1:01 ` [PATCH] xfs: fix broken error handling in xfs_vm_writepage Dave Chinner
@ 2012-11-23 7:44 ` Peter Hüwe
0 siblings, 0 replies; 4+ messages in thread
From: Peter Hüwe @ 2012-11-23 7:44 UTC (permalink / raw)
To: Dave Chinner; +Cc: Ben Myers, stable, xfs
Hi Dave,
On Friday, 23 November 2012, 02:01:23, Dave Chinner wrote:
> Any particular reason you picked this patch for a backport and not
> many of the other fixes that went into the 3.7 series?
Mainly two reasons: limited time, and before spending many hours trying to
'backport' all of this, I first wanted to see what the response would be like
in general.
I'm still new to the stable kernel business, so I already expected that there
will be some learning curve ;)
Maybe I should add a "Learner's Sticker" to my first xx stable-related messages
:P
So I really appreciate your feedback.
>
> As it is, this problem is not that easy to hit, and I'm wary of
> backporting changes to the IO completion/IO submission error
> handling paths to stable kernels without wider testing of the fix
> (i.e. release of 3.7 and then a couple of weeks of people using it).
> That's the reason why I didn't put a cc to the stable kernel on the
> commit in the first place.
>
> Sometimes there's good reason for being cautious about
> backporting fixes to stable kernels - if the problem is not being
> reported by users then letting the fixes get out into the real world
> for a while before backporting them to the stable kernels is the
> right approach. Stable kernels are supposed to be stable, and as
> such we want to be certain that changes are not going to have
> unintended consequences and then have to rush more fixes back to the
> stable kernels because we broke them....
As stated in the other mail, I was a bit too eager here as well ;)
We should probably hold off on the inclusion - sorry for the noise.
Thanks,
PeterH
* [PATCH] xfs: fix broken error handling in xfs_vm_writepage
@ 2012-11-12 0:49 Dave Chinner
2012-11-12 0:58 ` Dave Chinner
0 siblings, 1 reply; 4+ messages in thread
From: Dave Chinner @ 2012-11-12 0:49 UTC (permalink / raw)
To: xfs
From: Dave Chinner <dchinner@redhat.com>
When we shut down the filesystem, it might first be detected in
writeback when we are allocating an inode size transaction. This
happens after we have moved all the pages into the writeback state
and unlocked them. Unfortunately, if we fail to set up the
transaction we then abort writeback and try to invalidate the
current page. This then triggers a BUG() in block_invalidatepage()
because we are trying to invalidate an unlocked page.
Fixing this is a bit of a chicken and egg problem - we can't
allocate the transaction until we've clustered all the pages into
the IO and we know the size of it (i.e. whether the last block of
the IO is beyond the current EOF or not). However, we don't want to
hold pages locked for long periods of time, especially while we lock
other pages to cluster them into the write.
To fix this, we need to make a clear delineation in writeback where
errors can only be handled by IO completion processing. That is,
once we have marked a page for writeback and unlocked it, we have to
report errors via IO completion because we've already started the
IO. We may not have submitted any IO, but we've changed the page
state to indicate that it is under IO so we must now use the IO
completion path to report errors.
To do this, add an error field to xfs_submit_ioend() to pass it the
error that occurred during the building of the ioend chain. When
this is non-zero, mark each ioend with the error and call
xfs_finish_ioend() directly rather than building bios. This will
immediately push the ioends through completion processing with the
error that has occurred.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
fs/xfs/xfs_aops.c | 54 ++++++++++++++++++++++++++++++++++++++---------------
1 file changed, 39 insertions(+), 15 deletions(-)
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 42ef842..71361da 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -481,11 +481,17 @@ static inline int bio_add_buffer(struct bio *bio, struct buffer_head *bh)
*
* The fix is two passes across the ioend list - one to start writeback on the
* buffer_heads, and then submit them for I/O on the second pass.
+ *
+ * If @fail is non-zero, it means that we have a situation where some part of
+ * the submission process has failed after we have marked pages for writeback
+ * and unlocked them. In this situation, we need to fail the ioend chain rather
+ * than submit it to IO. This typically only happens on a filesystem shutdown.
*/
STATIC void
xfs_submit_ioend(
struct writeback_control *wbc,
- xfs_ioend_t *ioend)
+ xfs_ioend_t *ioend,
+ int fail)
{
xfs_ioend_t *head = ioend;
xfs_ioend_t *next;
@@ -506,6 +512,18 @@ xfs_submit_ioend(
next = ioend->io_list;
bio = NULL;
+ /*
+ * If we are failing the IO now, just mark the ioend with an
+ * error and finish it. This will run IO completion immediately
+ * as there is only one reference to the ioend at this point in
+ * time.
+ */
+ if (fail) {
+ ioend->io_error = -fail;
+ xfs_finish_ioend(ioend);
+ continue;
+ }
+
for (bh = ioend->io_buffer_head; bh; bh = bh->b_private) {
if (!bio) {
@@ -1060,7 +1078,18 @@ xfs_vm_writepage(
xfs_start_page_writeback(page, 1, count);
- if (ioend && imap_valid) {
+ /* if there is no IO to be submitted for this page, we are done */
+ if (!ioend)
+ return 0;
+
+ ASSERT(iohead);
+
+ /*
+ * Any errors from this point onwards need to be reported through the IO
+ * completion path as we have marked the initial page as under writeback
+ * and unlocked it.
+ */
+ if (imap_valid) {
xfs_off_t end_index;
end_index = imap.br_startoff + imap.br_blockcount;
@@ -1079,20 +1108,15 @@ xfs_vm_writepage(
wbc, end_index);
}
- if (iohead) {
- /*
- * Reserve log space if we might write beyond the on-disk
- * inode size.
- */
- if (ioend->io_type != XFS_IO_UNWRITTEN &&
- xfs_ioend_is_append(ioend)) {
- err = xfs_setfilesize_trans_alloc(ioend);
- if (err)
- goto error;
- }
- xfs_submit_ioend(wbc, iohead);
- }
+ /*
+ * Reserve log space if we might write beyond the on-disk inode size.
+ */
+ err = 0;
+ if (ioend->io_type != XFS_IO_UNWRITTEN && xfs_ioend_is_append(ioend))
+ err = xfs_setfilesize_trans_alloc(ioend);
+
+ xfs_submit_ioend(wbc, iohead, err);
return 0;
--
1.7.10
* Re: [PATCH] xfs: fix broken error handling in xfs_vm_writepage
2012-11-12 0:49 Dave Chinner
@ 2012-11-12 0:58 ` Dave Chinner
0 siblings, 0 replies; 4+ messages in thread
From: Dave Chinner @ 2012-11-12 0:58 UTC (permalink / raw)
To: xfs
On Mon, Nov 12, 2012 at 11:49:10AM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
>
> When we shut down the filesystem, it might first be detected in
> writeback when we are allocating an inode size transaction. This
> happens after we have moved all the pages into the writeback state
> and unlocked them. Unfortunately, if we fail to set up the
> transaction we then abort writeback and try to invalidate the
> current page. This then triggers a BUG() in block_invalidatepage()
> because we are trying to invalidate an unlocked page.
FWIW, I found this problem when testing recovery of wrapped log
buffers. The test:
$ cat t.sh
#!/bin/bash
while [ 1 ]; do
	mkfs.xfs -f /dev/vdb > /dev/null 2>&1
	mount /dev/vdb /mnt/scratch
	./compilebench -D /mnt/scratch > /dev/null 2>&1 &
	sleep 36
	/home/dave/src/xfstests-dev/src/godown /mnt/scratch
	sleep 5
	umount /mnt/scratch
	xfs_logprint -d /dev/vdb | grep -B 1 "^\["
	mount /dev/vdb /mnt/scratch
	umount /mnt/scratch
done
would fail after 3-4 iterations due to the BUG() in
block_invalidatepage(). With this fix, the loop has been running for 2
hours, so it's gone through over a hundred iterations without
failing - it takes about 45s an iteration to run. Note that this
is also exercising the wrapped log buffer recovery fix on every
iteration, too.... :)
And FWIW, this probably should have a cc: <stable@vger.kernel.org>
on it as well, as it is a recent regression that turns a shutdown
into a hard failure....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com