public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
* Review: Clear unwritten flag on during partial page truncation
@ 2006-12-20  6:28 David Chinner
  2006-12-20  8:21 ` Lachlan McIlroy
  2006-12-21  6:16 ` Nathan Scott
  0 siblings, 2 replies; 8+ messages in thread
From: David Chinner @ 2006-12-20  6:28 UTC (permalink / raw)
  To: xfs-dev; +Cc: xfs

For a long time now we've been getting assert failures in XFSQA test
083 which is a fsstress test near ENOSPC.

The failure mode is:

<4>Assertion failed: !(iomapp->iomap_flags & IOMAP_DELAY), file: fs/xfs/linux-2.6/xfs_aops.c, line: 510

A inode read-write trace uncovered what the problem was.  We had a
page with an unwritten extent lying partially over it (last block on
the page), and then we truncated the file to the first block on the
page. Because it is a partial page truncation, we walk the buffers
on the page and clear the flags on the buffers past end of file so
that if we write to them again the correct flags are set on the
buffers.

block_invalidatepage() call discard_buffers() to do this, which
clears all the common state from the buffers past EOF. The problem
stems from the fact that the buffer_unwritten() flag is an XFS
private flag, and hence does not get cleared from the buffer.

If we then extend the file with a write to within the buffer that we
failed to clear the unwritten flag from, we then make the wrong
decision in xfs_convert_page_state() when trying to map that buffer
- we see buffer_unwritten() instead of buffer_delay(), but the
extent map we got for that range from xfs_bmapi() correctly says
IOMAP_DELAY.

Hence the solution is to clear the private buffer flags in
xfs_vm_invalidatepage() so that when we extend the file the buffers
on the page are all consistent.

Patch below. Comments?

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group


---
 fs/xfs/linux-2.6/xfs_aops.c |   38 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 37 insertions(+), 1 deletion(-)

Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_aops.c
===================================================================
--- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_aops.c	2006-11-30 19:24:46.000000000 +1100
+++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_aops.c	2006-11-30 23:17:14.123674539 +1100
@@ -41,6 +41,8 @@
 #include <linux/pagevec.h>
 #include <linux/writeback.h>
 
+STATIC void xfs_vm_invalidatepage(struct page *, unsigned long);
+
 STATIC void
 xfs_count_page_state(
 	struct page		*page,
@@ -1061,7 +1063,7 @@ error:
 	 */
 	if (err != -EAGAIN) {
 		if (!unmapped)
-			block_invalidatepage(page, 0);
+			xfs_vm_invalidatepage(page, 0);
 		ClearPageUptodate(page);
 	}
 	return err;
@@ -1451,6 +1453,14 @@ xfs_vm_readpages(
 	return mpage_readpages(mapping, pages, nr_pages, xfs_get_blocks);
 }
 
+/*
+ * block_invalidatepage() doesn't clear private buffer flags like the unwritten
+ * flag. Leaving the unwritten flag behind on buffers that have been
+ * invalidated but are still attached to a page (i.e.  buffer size < page size
+ * and partial page invalidation) will result in incorrect allocation occurring
+ * during writeback if the file is extended to within or past the invalidated
+ * block before writeback.
+ */
 STATIC void
 xfs_vm_invalidatepage(
 	struct page		*page,
@@ -1458,6 +1468,32 @@ xfs_vm_invalidatepage(
 {
 	xfs_page_trace(XFS_INVALIDPAGE_ENTER,
 			page->mapping->host, page, offset);
+
+	/*
+	 * Need to clear private flags from buffers on partial
+	 * page truncations ourselves. Same inner loop as
+	 * block_invalidatepage() is used.
+	 */
+	if (offset && page_has_buffers(page)) {
+		struct buffer_head *head, *bh, *next;
+		unsigned int curr_off = 0;
+
+		head = page_buffers(page);
+		bh = head;
+		do {
+			unsigned int next_off = curr_off + bh->b_size;
+			next = bh->b_this_page;
+
+			/*
+			 * is this block fully invalidated?
+			 */
+			if (offset <= curr_off)
+				clear_buffer_unwritten(bh);
+			curr_off = next_off;
+			bh = next;
+		} while (bh != head);
+	}
+
 	block_invalidatepage(page, offset);
 }
 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Review: Clear unwritten flag on during partial page truncation
  2006-12-20  6:28 Review: Clear unwritten flag on during partial page truncation David Chinner
@ 2006-12-20  8:21 ` Lachlan McIlroy
  2006-12-20  9:07   ` David Chinner
  2006-12-21  6:16 ` Nathan Scott
  1 sibling, 1 reply; 8+ messages in thread
From: Lachlan McIlroy @ 2006-12-20  8:21 UTC (permalink / raw)
  To: David Chinner; +Cc: xfs-dev, xfs

@@ -41,6 +41,8 @@
  #include <linux/pagevec.h>
  #include <linux/writeback.h>

+STATIC void xfs_vm_invalidatepage(struct page *, unsigned long);
+
  STATIC void
  xfs_count_page_state(
  	struct page		*page,
@@ -1061,7 +1063,7 @@ error:
  	 */
  	if (err != -EAGAIN) {
  		if (!unmapped)
-			block_invalidatepage(page, 0);
+			xfs_vm_invalidatepage(page, 0);

We pass in an offset of zero here...


@@ -1458,6 +1468,32 @@ xfs_vm_invalidatepage(
  {
  	xfs_page_trace(XFS_INVALIDPAGE_ENTER,
  			page->mapping->host, page, offset);
+
+	/*
+	 * Need to clear private flags from buffers on partial
+	 * page truncations ourselves. Same inner loop as
+	 * block_invalidatepage() is used.
+	 */
+	if (offset && page_has_buffers(page)) {

And only do this code for non-zero offsets.  Are you sure this is correct?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Review: Clear unwritten flag on during partial page truncation
  2006-12-20  8:21 ` Lachlan McIlroy
@ 2006-12-20  9:07   ` David Chinner
  0 siblings, 0 replies; 8+ messages in thread
From: David Chinner @ 2006-12-20  9:07 UTC (permalink / raw)
  To: Lachlan McIlroy; +Cc: David Chinner, xfs-dev, xfs

On Wed, Dec 20, 2006 at 08:21:54AM +0000, Lachlan McIlroy wrote:
> @@ -41,6 +41,8 @@
>  #include <linux/pagevec.h>
>  #include <linux/writeback.h>
> 
> +STATIC void xfs_vm_invalidatepage(struct page *, unsigned long);
> +
>  STATIC void
>  xfs_count_page_state(
>  	struct page		*page,
> @@ -1061,7 +1063,7 @@ error:
>  	 */
>  	if (err != -EAGAIN) {
>  		if (!unmapped)
> -			block_invalidatepage(page, 0);
> +			xfs_vm_invalidatepage(page, 0);
> 
> We pass in an offset of zero here...

Yup - to invalidate the whole page.

> @@ -1458,6 +1468,32 @@ xfs_vm_invalidatepage(
>  {
>  	xfs_page_trace(XFS_INVALIDPAGE_ENTER,
>  			page->mapping->host, page, offset);
> +
> +	/*
> +	 * Need to clear private flags from buffers on partial
> +	 * page truncations ourselves. Same inner loop as
> +	 * block_invalidatepage() is used.
> +	 */
> +	if (offset && page_has_buffers(page)) {
> 
> And only do this code for non-zero offsets.  Are you sure this is correct?

Yes. offset == 0 will currently just fall through to
block_invalidatepage() which removes and frees all the buffers.
Right now that first change is a no-op except on debug kernels we'll
get a trace entry to indicate that we invalidated the page.

In future, however, we may need to scan all the buffers before
freeing them, so now we'll only need to modify xfs_vm_invalidatepage()....

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Review: Clear unwritten flag on during partial page truncation
  2006-12-20  6:28 Review: Clear unwritten flag on during partial page truncation David Chinner
  2006-12-20  8:21 ` Lachlan McIlroy
@ 2006-12-21  6:16 ` Nathan Scott
  2006-12-21 11:37   ` David Chinner
  2006-12-22 13:21   ` Christoph Hellwig
  1 sibling, 2 replies; 8+ messages in thread
From: Nathan Scott @ 2006-12-21  6:16 UTC (permalink / raw)
  To: David Chinner; +Cc: xfs-dev, xfs

On Wed, 2006-12-20 at 17:28 +1100, David Chinner wrote:
> Hence the solution is to clear the private buffer flags in
> xfs_vm_invalidatepage() so that when we extend the file the buffers
> on the page are all consistent.
> 
> Patch below. Comments?

Looks good Dave, nice sleuthing.

In hindsight, it'd have been really good to have gone for the real
BH_Unwritten flag upfront, and then being able to clear that inside
discard_buffer (like was done for BH_Delay)... if we did that, then
all this new code we're adding here (to just clear_buffer_unwritten,
ultimately) and also the complete hack in xfs_count_page_state could
be removed.  It still might be worth considering doing that, in case
there's other hard-to-hit-but-not-yet-uncovered bugs lurking along
the same lines.  But alot of effort, with the possibility of it not
being merged at all, as it touches code outside XFS.  D'oh.

FWIW, GFS seems to have managed to do even worse here, and looks like
they have dup'd big chunks of buffer.c ... has a discard_buffer() copy
and invalidate_page and probably others which are closely derived from
the equivalent buffer.c code ... guess those guys (hi Russell) could
do some code rationalisation in this area too before they get bitten.

cheers.

-- 
Nathan

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Review: Clear unwritten flag on during partial page truncation
  2006-12-21  6:16 ` Nathan Scott
@ 2006-12-21 11:37   ` David Chinner
  2006-12-21 22:04     ` Nathan Scott
  2006-12-22 13:21   ` Christoph Hellwig
  1 sibling, 1 reply; 8+ messages in thread
From: David Chinner @ 2006-12-21 11:37 UTC (permalink / raw)
  To: Nathan Scott; +Cc: David Chinner, xfs-dev, xfs

On Thu, Dec 21, 2006 at 05:16:58PM +1100, Nathan Scott wrote:
> On Wed, 2006-12-20 at 17:28 +1100, David Chinner wrote:
> > Hence the solution is to clear the private buffer flags in
> > xfs_vm_invalidatepage() so that when we extend the file the buffers
> > on the page are all consistent.
> > 
> > Patch below. Comments?
> 
> Looks good Dave, nice sleuthing.
> 
> In hindsight, it'd have been really good to have gone for the real
> BH_Unwritten flag upfront, and then being able to clear that inside
> discard_buffer (like was done for BH_Delay)...  if we did that, then
> all this new code we're adding here (to just clear_buffer_unwritten,
> ultimately) and also the complete hack in xfs_count_page_state could
> be removed.  It still might be worth considering doing that, in case
> there's other hard-to-hit-but-not-yet-uncovered bugs lurking along
> the same lines.  But alot of effort, with the possibility of it not
> being merged at all, as it touches code outside XFS.  D'oh.

Yep, pretty much my thinking as well.

I forgot about that hack in xfs_count_page_state() - we should
be able to remove that with this change, right?

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Review: Clear unwritten flag on during partial page truncation
  2006-12-21 11:37   ` David Chinner
@ 2006-12-21 22:04     ` Nathan Scott
  0 siblings, 0 replies; 8+ messages in thread
From: Nathan Scott @ 2006-12-21 22:04 UTC (permalink / raw)
  To: David Chinner; +Cc: xfs-dev, xfs

On Thu, 2006-12-21 at 22:37 +1100, David Chinner wrote:
> > ...
> > the same lines.  But alot of effort, with the possibility of it not
> > being merged at all, as it touches code outside XFS.  D'oh.
> 
> Yep, pretty much my thinking as well.
> 
> I forgot about that hack in xfs_count_page_state() - we should
> be able to remove that with this change, right?
> 

Hmm, yes, quite possibly ... worth testing out anyway, but I think
you may be right there.

cheers.

-- 
Nathan

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Review: Clear unwritten flag on during partial page truncation
  2006-12-21  6:16 ` Nathan Scott
  2006-12-21 11:37   ` David Chinner
@ 2006-12-22 13:21   ` Christoph Hellwig
  2006-12-23  2:55     ` David Chinner
  1 sibling, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2006-12-22 13:21 UTC (permalink / raw)
  To: Nathan Scott; +Cc: David Chinner, xfs-dev, xfs

On Thu, Dec 21, 2006 at 05:16:58PM +1100, Nathan Scott wrote:
> On Wed, 2006-12-20 at 17:28 +1100, David Chinner wrote:
> > Hence the solution is to clear the private buffer flags in
> > xfs_vm_invalidatepage() so that when we extend the file the buffers
> > on the page are all consistent.
> > 
> > Patch below. Comments?
> 
> Looks good Dave, nice sleuthing.
> 
> In hindsight, it'd have been really good to have gone for the real
> BH_Unwritten flag upfront, and then being able to clear that inside
> discard_buffer (like was done for BH_Delay)... if we did that, then
> all this new code we're adding here (to just clear_buffer_unwritten,
> ultimately) and also the complete hack in xfs_count_page_state could
> be removed.  It still might be worth considering doing that, in case
> there's other hard-to-hit-but-not-yet-uncovered bugs lurking along
> the same lines.  But alot of effort, with the possibility of it not
> being merged at all, as it touches code outside XFS.  D'oh.

Getting code into buffer.c to fix this properly should be no problem
at all.  As usual we prefer to fix core code instead of working around
it.  This would also fix the data loss issue after we write through mmap
into an unwritten extent.

> 
> FWIW, GFS seems to have managed to do even worse here, and looks like
> they have dup'd big chunks of buffer.c ... has a discard_buffer() copy
> and invalidate_page and probably others which are closely derived from
> the equivalent buffer.c code ... guess those guys (hi Russell) could
> do some code rationalisation in this area too before they get bitten.

GFS has crap code, news at 11 ;-)

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Review: Clear unwritten flag on during partial page truncation
  2006-12-22 13:21   ` Christoph Hellwig
@ 2006-12-23  2:55     ` David Chinner
  0 siblings, 0 replies; 8+ messages in thread
From: David Chinner @ 2006-12-23  2:55 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Nathan Scott, David Chinner, xfs-dev, xfs

On Fri, Dec 22, 2006 at 01:21:42PM +0000, Christoph Hellwig wrote:
> On Thu, Dec 21, 2006 at 05:16:58PM +1100, Nathan Scott wrote:
> > On Wed, 2006-12-20 at 17:28 +1100, David Chinner wrote:
> > > Hence the solution is to clear the private buffer flags in
> > > xfs_vm_invalidatepage() so that when we extend the file the buffers
> > > on the page are all consistent.
> > > 
> > > Patch below. Comments?
> > 
> > Looks good Dave, nice sleuthing.
> > 
> > In hindsight, it'd have been really good to have gone for the real
> > BH_Unwritten flag upfront, and then being able to clear that inside
> > discard_buffer (like was done for BH_Delay)... if we did that, then
> > all this new code we're adding here (to just clear_buffer_unwritten,
> > ultimately) and also the complete hack in xfs_count_page_state could
> > be removed.  It still might be worth considering doing that, in case
> > there's other hard-to-hit-but-not-yet-uncovered bugs lurking along
> > the same lines.  But alot of effort, with the possibility of it not
> > being merged at all, as it touches code outside XFS.  D'oh.
> 
> Getting code into buffer.c to fix this properly should be no problem
> at all.  As usual we prefer to fix core code instead of working around
> it.  This would also fix the data loss issue after we write through mmap
> into an unwritten extent.

Ok, I'll work up a patch for it in the new year, Christoph.

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2006-12-23  2:57 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-12-20  6:28 Review: Clear unwritten flag on during partial page truncation David Chinner
2006-12-20  8:21 ` Lachlan McIlroy
2006-12-20  9:07   ` David Chinner
2006-12-21  6:16 ` Nathan Scott
2006-12-21 11:37   ` David Chinner
2006-12-21 22:04     ` Nathan Scott
2006-12-22 13:21   ` Christoph Hellwig
2006-12-23  2:55     ` David Chinner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox