From: Lachlan McIlroy <lachlan@sgi.com>
To: Eric Sandeen <sandeen@sandeen.net>
Cc: Russell Cattelan <cattelan@thebarn.com>, xfs@oss.sgi.com
Subject: Re: REVIEW: Fix for incore extent corruption.
Date: Fri, 19 Sep 2008 10:59:11 +1000 [thread overview]
Message-ID: <48D2F95F.8040006@sgi.com> (raw)
In-Reply-To: <59751.131.252.241.230.1221766138.squirrel@sandeen.net>
Eric Sandeen wrote:
> Eric Sandeen wrote:
>> Lachlan McIlroy wrote:
>>> Russell, this fixes xfs_iext_irec_compact_full(). If we don't move
>>> all the records from the next page into the current page then we need
>>> to update the er_extoff of the modified page as we move the remaining
>>> extents up. Would you mind giving it a go?
>>>
>>> --- a/fs/xfs/xfs_inode.c 2008-09-18 18:48:46.000000000 +1000
>>> +++ b/fs/xfs/xfs_inode.c 2008-09-18 18:57:18.000000000 +1000
>>> @@ -4623,6 +4623,7 @@ xfs_iext_irec_compact_full(
>>> (XFS_LINEAR_EXTS -
>>> erp_next->er_extcount) *
>>> sizeof(xfs_bmbt_rec_t));
>>> + erp_next->er_extoff += ext_diff;
>>> }
>>> }
>> Lachlan, I concur. I spent way too long last night looking at this and
>> arrived at the same conclusion about the root cause of the problem, but
>> didn't hae *quite* the right solution. I blame it on 2am ;) Your fix
>> looks right.
>
> FWIW; some supporting information from debugging etc.
>
> xfs_iext_irec_compact_full:
>
> Move 1 item from BUF2 into BUF1, and compact BUF2
>
> copy memmove/zero
> BUF1 BUF2 ---> BUF1 BUF2 ---> BUF1 BUF2
> +-----+ +-----+ +-----+ +-----+ +-----+ +-----+
> | 0 | | 3 | | 0 | | | | 0 | | 4 |
> +-----+ +-----+ +-----+ +-----+ +-----+ +-----+
> | 1 | | 4 | | 1 | | 4 | | 1 | | |
> +-----+ +-----+ +-----+ +-----+ +-----+ +-----+
> | 2 | | | | 2 | | | | 2 | | |
> +-----+ +-----+ +-----+ +-----+ +-----+ +-----+
> | | | | | 3 | | | | 3 | | |
> +-----+ +-----+ +-----+ +-----+ +-----+ +-----+
> er_count 3 2 4 1
> er_offset 0 3 0 4
>
> If we don't update er_offset properly in BUF2, then a lookup for extent
> index 3 may find the first one in BUF2, not the last one in BUF1 (both
> claim to be "extent index 3"
>
>>From some tracing when I hit this path:
>
> ...
> 250: ffff810065c61fa0 startoff 251 startblock 263 blockcount 1 flag 1
> 251: ffff810065c61fb0 startoff 252 startblock NULLSTARTBLOCK(5) blockcount
> 1 flag 0
> 252: ffff810065c61fc0 startoff 253 startblock 265 blockcount 1 flag 1
> 253: ffff810065c61fd0 startoff 254 startblock NULLSTARTBLOCK(5) blockcount
> 1 flag 0
> 254: ffff810065c90000 startoff 255 startblock 267 blockcount 1 flag 1
> 255: ffff810065c90010 startoff 256 startblock NULLSTARTBLOCK(5) blockcount
> 1 flag 0
> 256: ffff810065c90020 startoff 257 startblock 269 blockcount 1 flag 1
> 257: ffff810065c90030 startoff 258 startblock NULLSTARTBLOCK(5) blockcount
> 1 flag 0
> 258: ffff810065c90040 startoff 259 startblock 271 blockcount 1 flag 1
> 259: ffff810065c90050 startoff 260 startblock NULLSTARTBLOCK(5) blockcount
> 1 flag 0
> ...
>
> move enough to fill the previous page:
> copy 2 (32) from ffff810065c90000 to ffff810065c61fe0
>
> next page is not empty, so shift up:
>
> move 254 (4064) from ffff810065c90020 to ffff810065c90000
>
> But then I ran through the entire extent list for all indexes in order, and:
>
> 250: ffff810065c61fa0 startoff 251 startblock 263 blockcount 1 flag 1
> 251: ffff810065c61fb0 startoff 252 startblock NULLSTARTBLOCK(5) blockcount
> 1 flag 0
> 252: ffff810065c61fc0 startoff 253 startblock 265 blockcount 1 flag 1
> 253: ffff810065c61fd0 startoff 254 startblock NULLSTARTBLOCK(5) blockcount
> 1 flag 0
> --- XXX where are starting offsets 255, 256 XXX ---
> 254: ffff810065c90000 startoff 257 startblock 269 blockcount 1 flag 1
> 255: ffff810065c90010 startoff 258 startblock NULLSTARTBLOCK(5) blockcount
> 1 flag 0
> 256: ffff810065c90020 startoff 259 startblock 271 blockcount 1 flag 1
>
> starting *offsets* 255, 256 are lost because the next buffer was still
> claiming to start at extent index 254 so it essentially jumped there,
> missing the 2 extents we added to the previous buffer.
>
> in addition, since the er_startoff for this last buffer was wrong, so was
> the last extent record - off by one, and looked at uninit'd memory:
>
> 507: ffff810065c90fd0 startoff 510 startblock NULLSTARTBLOCK(5) blockcount
> 1 flag 0
> 508: ffff810065c92fc0 startoff 483406127300608 startblock 2014118168
> blockcount 196608 flag 0
> wtf, ext 509 out of order (1888313573376 < 483406127300608)?
>
> hope that's more useful than confusing :)
It's very useful Eric, especially the diagram which is much easier to
understand than the squiggles on my notepad.
>
> Anyway I really looked closely at this and I think Lachlan is spot-on.
>
> I might even suggest proposing this and the previous fix for -stable....
Good suggestion.
>
> -Eric
>
>
next prev parent reply other threads:[~2008-09-19 0:49 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-09-18 0:02 REVIEW: Fix for incore extent corruption Russell Cattelan
2008-09-18 3:38 ` Lachlan McIlroy
2008-09-18 4:45 ` Russell Cattelan
2008-09-18 7:02 ` Lachlan McIlroy
2008-09-18 9:00 ` Lachlan McIlroy
2008-09-18 18:30 ` Eric Sandeen
2008-09-18 19:28 ` Eric Sandeen
2008-09-19 0:59 ` Lachlan McIlroy [this message]
2008-09-19 0:55 ` Lachlan McIlroy
2008-09-19 7:01 ` Eric Sandeen
2008-09-22 2:08 ` Lachlan McIlroy
2008-09-18 21:34 ` Russell Cattelan
2008-09-18 22:20 ` Eric Sandeen
2008-09-19 0:51 ` Lachlan McIlroy
2008-09-19 3:25 ` Lachlan McIlroy
2008-09-19 6:23 ` Eric Sandeen
2008-09-19 15:15 ` Russell Cattelan
2008-09-22 2:17 ` Lachlan McIlroy
2008-09-19 15:03 ` Russell Cattelan
2008-09-22 2:33 ` Lachlan McIlroy
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=48D2F95F.8040006@sgi.com \
--to=lachlan@sgi.com \
--cc=cattelan@thebarn.com \
--cc=sandeen@sandeen.net \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox