All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alain Renaud <arenaud@sgi.com>
To: Dave Chinner <david@fromorbit.com>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH] xfs: using extsize cause corruption with multi buffer page.
Date: Tue, 05 Jun 2012 08:34:45 -0400	[thread overview]
Message-ID: <4FCDFCE5.5020702@sgi.com> (raw)
In-Reply-To: <20120605114516.GM4347@dastard>

Thanks Dave,
  I will reviewed the information you just sent me and use that to update 
both this fix and the test.


Again thanks a lot.


On 12-06-05 07:45 AM, Dave Chinner wrote:
> On Fri, Jun 01, 2012 at 03:18:26PM -0400, Alain Renaud wrote:
>>    This mod make sure that buffer in a xfs_ioend are all in the
>> same extent. This is actually similar to what is done in
>> xfs_convert_page() already.
>>
>> This solve the problem of having multiple extent in one page.
> You need to describe how the change fixes the problem, not state
> that it does.
>
> i.e. a fix is no good if you can't explain *why* it fixes the bug.
>
>> With the current kernel if we have a page that look like this:
>> buffer  content
>> 0       empty  b_state = 0
>> 1       DATA   b_state = 0x1023
>> 2       DATA   b_state = 0x1023
>> 3       empty  b_state = 0
>> 4       empty  b_state = 0
>> 5       DATA   b_state = 0x1023
>> 6       DATA   b_state = 0x1023
>> 7       empty  b_state = 0
>>
>>
>> We endup with buffer 1-4 been tag as real and 5-EOF tag as unwritten.
>> Instead of 1-2 real, 3-4 unwritten, 5-6 real, 7-EOF unwritten.
> Actaully, the buffer state is correct - it's the ioend construction
> that appears wrong and that leads to conversion being incorrect.
> Indeed, if you look at my previous email about the test case you'll
> see that the in-memory state is correct before the unmount....
>
> As it is, the patch changes the way the new_ioend variable works -
> it's supposed to be set only if we have to map a new extent. That
> is, if we currently don't have a valid extent we need a new ioend
> when we map the new extent. The correct way to create a new extent
> is to mark the current imap as invalid so we have to do a new
> lookup....
>
>> @@ -985,6 +986,7 @@ xfs_vm_writepage(
>>   				ASSERT(buffer_mapped(bh));
>>   				imap_valid = 0;
>>   			}
>> +			new_ioend = 1;
>>   			continue;
>>   		}
> This is the real change of logic, and indicates where the bug
> potentially lies. The same fix can be accomplished simply by
> replacing all your changes with this:
>
> @@ -981,10 +981,9 @@ xfs_vm_writepage(
>                                  imap_valid = 0;
>                          }
>                  } else {
> -                       if (PageUptodate(page)) {
> +                       if (PageUptodate(page))
>                                  ASSERT(buffer_mapped(bh));
> -                               imap_valid = 0;
> -                       }
> +                       imap_valid = 0;
>                          continue;
>                  }
>
> But this still doesn't explain what the problem actually is. All it
> indicates is that we have a buffer that is mapped but not up to date
> on a page that is also not up to date. What we need to do is
> understand why this is happening, and if changing this logic is the
> correct fix that won't have any unintended side effects.
>
> So let's try to understand this a bit better. First of all, are
> extent size hints necessary to trigger this? With this change:
>
> - xfs_io -f -c "extsize `expr $pgsize \* 10`" \
> + xfs_io -f -c "resvsp 0 `expr $pgsize \* 10`" \
>
> I find that the same corruption occurs, so this is an unwritten
> extent problem, not an extent size hint issue. So now we have a much
> wider scope than initially indicated.
>
> So, next question: what part does the truncate play, if any? So I
> comment out the truncate, and it appears that it has no effect on
> the result of the test. OK, lets drop that completely.
>
> Ok, that leaves us with a test case that is kind of similar to part
> 4 of test 194, but with the initial condition of a prealocated
> region rather than an allocated block. And test 194 doesn't check
> the state after a unmount/mount cycle so even if it did have a
> preallocated initial state, it wouldn't detect this problem.  IOWs,
> we have a whole new class of corner cases that we don't currently
> exercise. Alain, your test is going need some more work...
>
> Alright, so lets look at why we fail to handle this case in
> writeback properly.
>
>    xfs_io-2498  xfs_writepage:        dev 253:16 ino 0x23 pgoff 0x1000 size 0x1e00 offset 0 delalloc 0 unwritten 1
>    xfs_io-2498  xfs_map_blocks_found: dev 253:16 ino 0x23 size 0x0 offset 0x1200 count 512 type  startoff 0x0 startblock 48 blockcount 0x50
> kworker/0:15   xfs_unwritten_convert: dev 253:16 ino 0x23 isize 0x1e00 disize 0x0 offset 0x1200 count 2048
>
> Ah, there's why it fails to handle the case correctly - the
> unwritten extent is a single map and hence if we won't do another
> lookup because it spans the entire page already. Hence if imap_valid
> is not cleared, we'll never look it up again and set new_ioend.
>
> If the page is marked uptodate, then that means all the buffers have
> been correctly initialised and should all be marked up to date.
> Hence I'm not sure that the existing logic there is actually
> possible to hit. We know the buffer is not uptodate, and we use
> generic_write_end, which calls __block_commit_write(), and that only
> calls SetPageUptodate() when all the buffers are uptodate.
>
> Indeed, if the buffer is not uptodate, then it means we didn't
> complete the write on it and it does not have valid data in it, so
> regardless of the page state we should not write it. That means we
> must always skip it and that means we always need a new ioend in
> this case. hence we should always clear imap_valid in this case.
> Hence i think the above hunk is the right fix for the problem.
>
> Alain, can you check my logic, run it through your tests and if it
> works, respin the patch with this info in the commit message and a
> comment in the code indicating why we need to clear imap_valid
> unconditionally there?
>
> Cheers,
>
> Dave.
>
>
> That means, I think, that the logic in the current code is just
> plain wrong. xfs_convert_page() aborts on pages and buffers that are
> both not uptodate, so they get handled by xfs_vm_writepage(). This
> particular branch we are looking at there is the the
> !buffer_uptodate() branch.
>
>
>
> What your change does is this:
>> -- 
>> 1.7.4.1
>>
>> _______________________________________________
>> xfs mailing list
>> xfs@oss.sgi.com
>> http://oss.sgi.com/mailman/listinfo/xfs
>>

-- 
===============================================
Alain Renaud     - Cluster File System Engineer
arenaud@sgi.com  - SGI
===============================================

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

  reply	other threads:[~2012-06-05 12:34 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-06-01 19:18 [PATCH] xfs: using extsize cause corruption with multi buffer page Alain Renaud
2012-06-05 11:45 ` Dave Chinner
2012-06-05 12:34   ` Alain Renaud [this message]
2012-06-08 19:34     ` [PATCH] xfs: xfs_vm_writepage clear iomap_valid when !buffer_uptodate (REV2) Alain Renaud
2012-06-08 19:56       ` Ben Myers
2012-06-08 23:13       ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4FCDFCE5.5020702@sgi.com \
    --to=arenaud@sgi.com \
    --cc=david@fromorbit.com \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.