From: Eric Sandeen <sandeen@sandeen.net>
To: lachlan@sgi.com
Cc: xfs-oss <xfs@oss.sgi.com>
Subject: Re: [PATCH] fix corruption case for block size < page size
Date: Mon, 15 Dec 2008 23:40:44 -0600 [thread overview]
Message-ID: <49473F5C.3070308@sandeen.net> (raw)
In-Reply-To: <494735D9.8020809@sgi.com>
Lachlan McIlroy wrote:
> I'm still working through this Eric so I don't fully understand what's
> going on.
>
> It looks to me like the region was never zeroed at all. In
> xfs_zero_last_block() we only zero up to the end of the last block
> (hence the name) but if the last page extends beyond that last
> block we wont zero that extra space in the page. If that remaining
> space in the page sits over a hole then xfs_zero_eof() wont zero it
> either.
So the testcase steps are (8 blocks per page here).
1: |1111|1111|1111|1111|1111|1111|1111|1111| write 1's
2: |RRRR|1111|1111|1111|1111|1111|1111|1111| mapread first block
|<--------trunc-----------------------
3: |1100| trunc down to 1/2 block
trunc-->|
4: |1100| trunc up to block+1byte
5: | | | | |2222| write 2's (extending)
And we get:
# |1100|0000|1111|1111|2222|----|----|----|
But we should get:
# |1100|HHHH|HHHH|HHHH|2222|----|----|----|
(where "H" means hole)
So no, the data for blocks 1,2,3 is never zeroed, nor should it be, I
think - we never partially write those blocks. And the stale 1's coming
through in blocks 2 & 3 are not the result of non-zero buffer heads
getting written to disk in the last stage; they are simply stale disk
blocks from the first step mapped back into the file. (They were
originally written as 1's in the first step.)
> In your example above the last write extends the file size from 513
> bytes to 2048 bytes. In xfs_zero_last_block() we'll only zero from
> 513 up to 1024 bytes (ie up to the end of the last block) but leave
> the rest of the page untouched.
Actually; after the truncate down step (3) we should have:
|<--------trunc-----------------------
3: |11??| trunc down to 1/2 block
^
|
EOF
Hm, but does the end of this block get zeroed now or only when we
subsequently extend the size? The latter I think...?
So I think in the next step:
trunc-->|
4: |1100| trunc up to block+1byte
^^
now || this part of the block gets zeroed, right, by xfs_zero_eof?
> Because of the truncate to 256 bytes
> only the first block is allocated and everything beyond 512 bytes is
> a hole.
Yep, up until the last write anyway.
> More specifically there is a hole under the remainder of the
> page so xfs_zero_eof() will skip that region and not zero anything.
Well, the last write (step 5) is still completely within the page...
Right, that's what it *should* be doing; but in page_state_convert (and
I'll admit to not having this 100% nailed down) we write block 1 and map
blocks 2 & 3 back into the file, and get:
# |1100|0000|1111|1111|2222|----|----|----|
^^^^ ^^^^
where these |||| |||| blocks are stale data, and block 1 is written
(but at least zeroed). How block 1 got zeroed I guess I'm not quite
certain yet. But it does not appear that blocks 2 and 3 get *written*
any time other than step 1; blktrace seems to confirm this. block 1
does get written, and 0s are written. (But I don't think this block
ever should get written either; EOF landed there but only via truncate,
not a write).
Crap, now you've got me slightly confused again, and I'll need to look a
bit more to be sure I'm 100% clear on what's getting zeroed and when vs.
what's getting mapped and why. :)
-Eric
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
next prev parent reply other threads:[~2008-12-16 5:40 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-12-13 7:07 [PATCH] fix corruption case for block size < page size Eric Sandeen
2008-12-13 17:48 ` Eric Sandeen
2008-12-13 18:20 ` Eric Sandeen
2008-12-16 5:00 ` Lachlan McIlroy
2008-12-16 5:40 ` Eric Sandeen [this message]
2008-12-16 6:05 ` Lachlan McIlroy
2008-12-16 6:10 ` Eric Sandeen
2008-12-16 6:21 ` Eric Sandeen
2008-12-16 6:51 ` Eric Sandeen
2009-01-07 5:23 ` Lachlan McIlroy
2009-01-07 5:53 ` Eric Sandeen
2009-01-07 6:32 ` Lachlan McIlroy
2009-01-07 21:42 ` Dave Chinner
2009-01-09 0:18 ` Lachlan McIlroy
2008-12-16 7:54 ` Lachlan McIlroy
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=49473F5C.3070308@sandeen.net \
--to=sandeen@sandeen.net \
--cc=lachlan@sgi.com \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox