public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Lachlan McIlroy <lachlan@sgi.com>
To: Eric Sandeen <sandeen@sandeen.net>
Cc: Russell Cattelan <cattelan@thebarn.com>, xfs@oss.sgi.com
Subject: Re: REVIEW: Fix for incore extent corruption.
Date: Fri, 19 Sep 2008 10:59:11 +1000	[thread overview]
Message-ID: <48D2F95F.8040006@sgi.com> (raw)
In-Reply-To: <59751.131.252.241.230.1221766138.squirrel@sandeen.net>

Eric Sandeen wrote:
> Eric Sandeen wrote:
>> Lachlan McIlroy wrote:
>>> Russell, this fixes xfs_iext_irec_compact_full().  If we don't move
>>> all the records from the next page into the current page then we need
>>> to update the er_extoff of the modified page as we move the remaining
>>> extents up.  Would you mind giving it a go?
>>>
>>> --- a/fs/xfs/xfs_inode.c	2008-09-18 18:48:46.000000000 +1000
>>> +++ b/fs/xfs/xfs_inode.c	2008-09-18 18:57:18.000000000 +1000
>>> @@ -4623,6 +4623,7 @@ xfs_iext_irec_compact_full(
>>>  					(XFS_LINEAR_EXTS -
>>>  						erp_next->er_extcount) *
>>>  					sizeof(xfs_bmbt_rec_t));
>>> +				erp_next->er_extoff += ext_diff;
>>>  			}
>>>  		}
>> Lachlan, I concur.  I spent way too long last night looking at this and
>> arrived at the same conclusion about the root cause of the problem, but
>> didn't hae *quite* the right solution.  I blame it on 2am ;)  Your fix
>> looks right.
> 
> FWIW; some supporting information from debugging etc.
> 
> xfs_iext_irec_compact_full:
> 
> Move 1 item from BUF2 into BUF1, and compact BUF2
> 
>                          copy                memmove/zero
>          BUF1    BUF2    --->   BUF1     BUF2    --->    BUF1    BUF2
>         +-----+ +-----+         +-----+ +-----+         +-----+ +-----+
>         |  0  | |  3  |         |  0  | |     |         |  0  | |  4  |
>         +-----+ +-----+         +-----+ +-----+         +-----+ +-----+
>         |  1  | |  4  |         |  1  | |  4  |         |  1  | |     |
>         +-----+ +-----+         +-----+ +-----+         +-----+ +-----+
>         |  2  | |     |         |  2  | |     |         |  2  | |     |
>         +-----+ +-----+         +-----+ +-----+         +-----+ +-----+
>         |     | |     |         |  3  | |     |         |  3  | |     |
>         +-----+ +-----+         +-----+ +-----+         +-----+ +-----+
> er_count   3       2                                       4       1
> er_offset  0       3                                       0       4
> 
> If we don't update er_offset properly in BUF2, then a lookup for extent
> index 3 may find the first one in BUF2, not the last one in BUF1 (both
> claim to be "extent index 3"
> 
>>From some tracing when I hit this path:
> 
> ...
> 250: ffff810065c61fa0 startoff 251 startblock 263 blockcount 1 flag 1
> 251: ffff810065c61fb0 startoff 252 startblock NULLSTARTBLOCK(5) blockcount
> 1 flag 0
> 252: ffff810065c61fc0 startoff 253 startblock 265 blockcount 1 flag 1
> 253: ffff810065c61fd0 startoff 254 startblock NULLSTARTBLOCK(5) blockcount
> 1 flag 0
> 254: ffff810065c90000 startoff 255 startblock 267 blockcount 1 flag 1
> 255: ffff810065c90010 startoff 256 startblock NULLSTARTBLOCK(5) blockcount
> 1 flag 0
> 256: ffff810065c90020 startoff 257 startblock 269 blockcount 1 flag 1
> 257: ffff810065c90030 startoff 258 startblock NULLSTARTBLOCK(5) blockcount
> 1 flag 0
> 258: ffff810065c90040 startoff 259 startblock 271 blockcount 1 flag 1
> 259: ffff810065c90050 startoff 260 startblock NULLSTARTBLOCK(5) blockcount
> 1 flag 0
> ...
> 
> move enough to fill the previous page:
> copy 2 (32) from ffff810065c90000 to ffff810065c61fe0
> 
> next page is not empty, so shift up:
> 
> move 254 (4064) from ffff810065c90020 to ffff810065c90000
> 
> But then I ran through the entire extent list for all indexes in order, and:
> 
> 250: ffff810065c61fa0 startoff 251 startblock 263 blockcount 1 flag 1
> 251: ffff810065c61fb0 startoff 252 startblock NULLSTARTBLOCK(5) blockcount
> 1 flag 0
> 252: ffff810065c61fc0 startoff 253 startblock 265 blockcount 1 flag 1
> 253: ffff810065c61fd0 startoff 254 startblock NULLSTARTBLOCK(5) blockcount
> 1 flag 0
> --- XXX where are starting offsets 255, 256 XXX ---
> 254: ffff810065c90000 startoff 257 startblock 269 blockcount 1 flag 1
> 255: ffff810065c90010 startoff 258 startblock NULLSTARTBLOCK(5) blockcount
> 1 flag 0
> 256: ffff810065c90020 startoff 259 startblock 271 blockcount 1 flag 1
> 
> starting *offsets* 255, 256 are lost because the next buffer was still
> claiming to start at extent index 254 so it essentially jumped there,
> missing the 2 extents we added to the previous buffer.
> 
> in addition, since the er_startoff for this last buffer was wrong, so was
> the last extent record - off by one, and looked at uninit'd memory:
> 
> 507: ffff810065c90fd0 startoff 510 startblock NULLSTARTBLOCK(5) blockcount
> 1 flag 0
> 508: ffff810065c92fc0 startoff 483406127300608 startblock 2014118168
> blockcount 196608 flag 0
> wtf, ext 509 out of order (1888313573376 < 483406127300608)?
> 
> hope that's more useful than confusing :)
It's very useful Eric, especially the diagram which is much easier to
understand than the squiggles on my notepad.

> 
> Anyway I really looked closely at this and I think Lachlan is spot-on.
> 
> I might even suggest proposing this and the previous fix for -stable....
Good suggestion.

> 
> -Eric
> 
> 

  reply	other threads:[~2008-09-19  0:49 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-09-18  0:02 REVIEW: Fix for incore extent corruption Russell Cattelan
2008-09-18  3:38 ` Lachlan McIlroy
2008-09-18  4:45   ` Russell Cattelan
2008-09-18  7:02     ` Lachlan McIlroy
2008-09-18  9:00     ` Lachlan McIlroy
2008-09-18 18:30       ` Eric Sandeen
2008-09-18 19:28         ` Eric Sandeen
2008-09-19  0:59           ` Lachlan McIlroy [this message]
2008-09-19  0:55         ` Lachlan McIlroy
2008-09-19  7:01           ` Eric Sandeen
2008-09-22  2:08             ` Lachlan McIlroy
2008-09-18 21:34       ` Russell Cattelan
2008-09-18 22:20         ` Eric Sandeen
2008-09-19  0:51           ` Lachlan McIlroy
2008-09-19  3:25             ` Lachlan McIlroy
2008-09-19  6:23               ` Eric Sandeen
2008-09-19 15:15                 ` Russell Cattelan
2008-09-22  2:17                 ` Lachlan McIlroy
2008-09-19 15:03               ` Russell Cattelan
2008-09-22  2:33                 ` Lachlan McIlroy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=48D2F95F.8040006@sgi.com \
    --to=lachlan@sgi.com \
    --cc=cattelan@thebarn.com \
    --cc=sandeen@sandeen.net \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox