public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
* page discard on page error
@ 2010-08-18 14:27 Mike Gao
  2010-08-18 15:50 ` Dave Chinner
  0 siblings, 1 reply; 4+ messages in thread
From: Mike Gao @ 2010-08-18 14:27 UTC (permalink / raw)
  To: xfs


[-- Attachment #1.1: Type: text/plain, Size: 980 bytes --]

Hi, all,

I sync latest XFS last month and build into our embedded system. When I
test, I found below issue.
It seems that the xfs_iomap failed when it did allocat with
xfs_ilock_nowait().
I debug this and found that ip->i_lock is locked and there are sometime no
unlock coming between two lock trying.
This issue will give wrong data in file on disk exactly on that offset.

Filesystem "ubdb1": page discard on page 09160ac0, inode 0x43, offset
229494784.
Filesystem "ubdb1": page discard on page 09160ae0, inode 0x43, offset
229498880.
Filesystem "ubdb1": page discard on page 0915fac0, inode 0x43, offset
229502976.
Filesystem "ubdb1": page discard on page 0915fae0, inode 0x43, offset
229507072.
Filesystem "ubdb1": page discard on page 0914d440, inode 0x43, offset
229511168.


Should we just discard the page? I don't think it is good because the file
will be corrupted.
Should we fix it? I am new to XFS and more than like to hear your opinion.

Thanks very much,
Mike Gao

[-- Attachment #1.2: Type: text/html, Size: 1261 bytes --]

[-- Attachment #2: Type: text/plain, Size: 121 bytes --]

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: page discard on page error
  2010-08-18 14:27 page discard on page error Mike Gao
@ 2010-08-18 15:50 ` Dave Chinner
  2010-08-18 16:17   ` Christoph Hellwig
  0 siblings, 1 reply; 4+ messages in thread
From: Dave Chinner @ 2010-08-18 15:50 UTC (permalink / raw)
  To: Mike Gao; +Cc: hch, xfs

On Wed, Aug 18, 2010 at 09:27:55AM -0500, Mike Gao wrote:
> Hi, all,
> 
> I sync latest XFS last month and build into our embedded system. When I
> test, I found below issue.
> It seems that the xfs_iomap failed when it did allocat with
> xfs_ilock_nowait().
> I debug this and found that ip->i_lock is locked and there are sometime no
> unlock coming between two lock trying.
> This issue will give wrong data in file on disk exactly on that offset.

Ah, it looks like we dropped the EAGAIN error from xfs_map_blocks()
on the floor.

Christoph, the 2.6.34 code did this:

error:
......
        /*
         * If it's delalloc and we have nowhere to put it,
         * throw it away, unless the lower layers told
         * us to try again.
         */
        if (err != -EAGAIN) {
                if (!unmapped)
                        xfs_aops_discard_page(page);
                ClearPageUptodate(page);
        }

and the caller redirtied the page on EAGAIN. The EAGAIN comes from
the BMAPI_TRYLOCK case, so it is a valid return indicating that we
would have blocked on a non-blocking IO.

The current code does not handle EAGAIN at all, just discards the
page. This looks incorrect - I think that EAGAIN should still cause
the page to be redirtied, not discarded. I'm not sure if I missed
something else, though - can you remember why you removed the EAGAIN
case?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: page discard on page error
  2010-08-18 15:50 ` Dave Chinner
@ 2010-08-18 16:17   ` Christoph Hellwig
       [not found]     ` <AANLkTimsDB=Za=fF3Rk-oMFf7=pKSGR7Pg6uSKSC-4Q1@mail.gmail.com>
  0 siblings, 1 reply; 4+ messages in thread
From: Christoph Hellwig @ 2010-08-18 16:17 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Mike Gao, xfs

On Thu, Aug 19, 2010 at 01:50:43AM +1000, Dave Chinner wrote:
> The current code does not handle EAGAIN at all, just discards the
> page. This looks incorrect - I think that EAGAIN should still cause
> the page to be redirtied, not discarded. I'm not sure if I missed
> something else, though - can you remember why you removed the EAGAIN
> case?

I don't think that was intentional, in fact I'm pretty sure I handled
it in an earlier version.  Mike, can you test the patch below?


Index: xfs/fs/xfs/linux-2.6/xfs_aops.c
===================================================================
--- xfs.orig/fs/xfs/linux-2.6/xfs_aops.c	2010-08-18 18:10:48.140261091 +0200
+++ xfs/fs/xfs/linux-2.6/xfs_aops.c	2010-08-18 18:12:41.167005888 +0200
@@ -1068,7 +1068,7 @@ xfs_vm_writepage(
 	 * by themselves.
 	 */
 	if ((current->flags & (PF_MEMALLOC|PF_KSWAPD)) == PF_MEMALLOC)
-		goto out_fail;
+		goto redirty;
 
 	/*
 	 * We need a transaction if there are delalloc or unwritten buffers
@@ -1080,7 +1080,7 @@ xfs_vm_writepage(
 	 */
 	xfs_count_page_state(page, &delalloc, &unwritten);
 	if ((current->flags & PF_FSTRANS) && (delalloc || unwritten))
-		goto out_fail;
+		goto redirty;
 
 	/* Is this page beyond the end of the file? */
 	offset = i_size_read(inode);
@@ -1245,12 +1245,15 @@ error:
 	if (iohead)
 		xfs_cancel_ioend(iohead);
 
+	if (err == -EAGAIN)
+		goto redirty;
+
 	xfs_aops_discard_page(page);
 	ClearPageUptodate(page);
 	unlock_page(page);
 	return err;
 
-out_fail:
+redirty:
 	redirty_page_for_writepage(wbc, page);
 	unlock_page(page);
 	return 0;

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: page discard on page error
       [not found]     ` <AANLkTimsDB=Za=fF3Rk-oMFf7=pKSGR7Pg6uSKSC-4Q1@mail.gmail.com>
@ 2010-08-19  9:42       ` Christoph Hellwig
  0 siblings, 0 replies; 4+ messages in thread
From: Christoph Hellwig @ 2010-08-19  9:42 UTC (permalink / raw)
  To: Mike Gao; +Cc: xfs

Hi Mike,

Adding Dave and the list back to the cc list, given that mail didn't 
contain private information and was addressed to Dave and me.

On Wed, Aug 18, 2010 at 01:28:12PM -0500, Mike Gao wrote:
> I copied two video files (260M and 400M). The 260M one is done and cmp the
> same. But the 400M cmp doesn't match.
> cmp /mnt/USB0disk1/data/iCarly-4.ts ./iCarly-4.ts
> /mnt/USB0disk1/data/iCarly-4.ts ./iCarly-4.ts differ: char 112919977, line
> 516965
> try again
> /mnt/USB0disk1/data/iCarly-4.ts ./iCarly-4.ts differ: char 78496169, line
> 368800
> 
> Do you remember if there is still somewhere discard the page silence? I will
> try the latest sync to see if this is still the same.

There shouldn't be silent discards. Did you just do a straight copy
from a usb device to a local filesystem?

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2010-08-19  9:42 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-08-18 14:27 page discard on page error Mike Gao
2010-08-18 15:50 ` Dave Chinner
2010-08-18 16:17   ` Christoph Hellwig
     [not found]     ` <AANLkTimsDB=Za=fF3Rk-oMFf7=pKSGR7Pg6uSKSC-4Q1@mail.gmail.com>
2010-08-19  9:42       ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox