From: Zirong Lang <zlang@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Eric Sandeen <sandeen@sandeen.net>,
Eric Sandeen <sandeen@redhat.com>,
fstests@vger.kernel.org
Subject: Re: [PATCH] xfs/194: fix the exception when run on 4k sector drives
Date: Tue, 18 Aug 2015 23:46:59 -0400 (EDT) [thread overview]
Message-ID: <1801602262.10426533.1439956019811.JavaMail.zimbra@redhat.com> (raw)
In-Reply-To: <20150819024216.GH3902@dastard>
----- 原始邮件 -----
> 发件人: "Dave Chinner" <david@fromorbit.com>
> 收件人: "Eric Sandeen" <sandeen@sandeen.net>
> 抄送: "Eric Sandeen" <sandeen@redhat.com>, "Zorro Lang" <zlang@redhat.com>, fstests@vger.kernel.org
> 发送时间: 星期三, 2015年 8 月 19日 上午 10:42:16
> 主题: Re: [PATCH] xfs/194: fix the exception when run on 4k sector drives
>
> On Tue, Aug 18, 2015 at 06:03:45PM -0500, Eric Sandeen wrote:
> > On 8/18/15 5:43 PM, Dave Chinner wrote:
> > > On Tue, Aug 18, 2015 at 05:33:05PM -0500, Eric Sandeen wrote:
> > >> On 8/18/15 5:28 PM, Dave Chinner wrote:
> > >>> On Wed, Aug 19, 2015 at 01:21:51AM +0800, Zorro Lang wrote:
> > >>>> @@ -50,6 +50,16 @@ rm -f $seqres.full
> > >>>> # For this test we use block size = 1/8 page size
> > >>>> pgsize=`$here/src/feature -s`
> > >>>> blksize=`expr $pgsize / 8`
> > >>>> +secsize=`_min_dio_alignment $SCRATCH_DEV`
> > >>>> +
> > >>>> +# The minimal blksize can't less than sector size, So if
> > >>>> +# blksize < secsize, we should adjust blksize and pgsize number.
> > >>>> +# Of course, if we adjust pgsize, pgsize won't equal to the
> > >>>> +# real page size of system.
> > >>>> +if [ $blksize -lt $secsize ];then
> > >>>> + blksize=$secsize
> > >>>> + pgsize=`expr $blksize \* 8`
> > >>>> +fi
> > >>>
> > >>> No, this is wrong. the page size stays fixed at the machine page
> > >>> size. We are testing *sub-page block sizes* here and the sector size
> > >>> must be <= page size. Increasing the "page size" to larger than the
> > >>> machine page size does not make the kernel use larger page sizes.
> > >>>
> > >>> IOWs, if you've got sector size = page size (e.g. 4k sector device)
> > >>> then no matter what you say $pgsize is, the kernel will see a block
> > >>> size = page size test.
> > >>>
> > >>> This whole chunk of code can simply be replaced with:
> > >>>
> > >>> blksize=`_min_dio_alignment $SCRATCH_DEV`
> > >>>
> > >>> Because that's what we actually need to test...
> > >>
> > >> That won't work either, because we could easily get 512 from that.
> > >
> > > If 'blockdev --getss $dev' returns 512, then the device supports 512
> > > byte IOs and so it is fine to do 512 byte IOs in the test.
> > >
> > >> and then this test:
> > >>
> > >> # Now try the same thing but write a sector in the middle of that hole
> > >> # If things go badly stale data will be exposed either side.
> > >> # This is most interesting for block size > 512 (page size > 4096)
> > >>
> > >> # We *should* get:
> > >> # |1100|HHHH|33HH|HHHH|2222|----|----|----|
> > >>
> > >> echo "== Test 4 =="
> > >> xfs_io \
> > >> -c "pwrite -S 0x11 -b $pgsize 0 $pgsize" \
> > >> -c "mmap -r 0 $blksize" -c "mread 0 $blksize" -c "munmap" \
> > >> -c "truncate `expr $blksize / 2`" \
> > >> -c "truncate `expr $blksize + 1`" \
> > >> -c "pwrite -S 0x22 -b $blksize `expr $pgsize / 2` $blksize" \
> > >> -c "pwrite -S 0x33 -b 512 `expr $blksize \* 2` 512" \
> > >> -t -d -f $SCRATCH_MNT/testfile4 >> $seqres.full
> > >>
> > >> will be impossible.
> > >>
> > >> AFAICT everything works except for that explicit 512-byte IO.
> > >
> > > Right. That hard coded 512 needs to change to $blksize, because
> > > blksize is now equal to the sector size. I thought this would be
> > > obvious to the reader, so I didn't comment on it.
> >
> > if that last IO is $blksize, and blocksize == sector size, then the
> > test won't be testing what it's designed to test here, i.e. a
> > sub-block direct IO write.
>
> That's not what the test is exercising:
>
> # Test mapping around/over holes for sub-page blocks
>
> it's testing *sub-page block behaviour*, not sub-block direct IO.
>
> >
> > # We *should* get:
> > # |1100|HHHH|33HH|HHHH|2222|----|----|----|
> > ^^
> > this
>
> That implies a sub-block sized direct IO, on a single page that has
> 8 blocks. On a 4k page size machine, that is impossible and so most
> of the time we are not doing what the comment implies.
>
> With a 4k page, 512 byte block size:
>
> | xfs_io \
> | -c "pwrite -S 0x11 -b $pgsize 0 $pgsize" \
>
> Write an entire page (4k)
>
> # |1111|1111|1111|1111|1111|1111|1111|1111|
>
> | -c "mmap -r 0 $blksize" -c "mread 0 $blksize" -c "munmap" \
>
> map the first block (0-511 bytes - one sector)
>
> | -c "truncate `expr $blksize / 2`" \
> | -c "truncate `expr $blksize + 1`" \
>
> sub-block truncate down, sub-block truncate up, make sure page cache
> is correctly zeroed.
>
> # |1100|HHHH|----|----|----|----|----|----|
>
> | -c "pwrite -S 0x22 -b $blksize `expr $pgsize / 2` $blksize" \
>
> DIO write of a single block half way through the original page, make
> sure page cache is flushed correctly before DIO.
>
> # |1100|HHHH|HHHH|HHHH|2222|----|----|----|
>
> FWIW, this write will fail on a 4k sector device on a 4k page size
> platform, because the IO is not sector aligned, and is why the
> original patch needed to multiply pgsize out to 8 * sector size....
>
> | -c "pwrite -S 0x33 -b 512 `expr $blksize \* 2` 512" \
>
> do a -minimum sized write- to the *3rd* block in the page.
>
> # |1100|HHHH|3333|HHHH|2222|----|----|----|
Yes, that's true. if sector size and block size all 512b, we will get this.
In my test machine(64k page size, and 4k sector size), this case will
mkfs with blksize=8k, and we can get:
|1100|HHHH|33HH|HHHH|2222|----|----|----| (one |----| means 8k block size at here:)
>
> And that matches the expected output. We do not get this output with
> a block size that is anything other than pgsize / 8, regardless of
> whether the last write is a sub-block DIO or not.
>
> IOWs, this test assumes that there are at least 8 blocks to page
> because to exercise the appropriate paths it needs a hole between
> each region that is written. 4k sector/4k page means the kernel
> cannot do sub-page block size operations, and the test does not
> exercise the code paths we're expecting it to. Hence it may simply
> be best to do this:
>
> if [ $sector_size > $page_size / 8 ]; then
> _not_run "sector size too large for platform page size"
> fi
If so, that'll be simple. I will send V2 patch for review;)
Thanks,
Zorro
>
> and replace the hard coded 512 with $sector_size.
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
>
next prev parent reply other threads:[~2015-08-19 3:47 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-08-18 17:21 [PATCH] xfs/194: fix the exception when run on 4k sector drives Zorro Lang
2015-08-18 22:28 ` Dave Chinner
2015-08-18 22:33 ` Eric Sandeen
2015-08-18 22:43 ` Dave Chinner
2015-08-18 23:03 ` Eric Sandeen
2015-08-19 2:24 ` Zirong Lang
2015-08-19 2:42 ` Dave Chinner
2015-08-19 3:35 ` Eric Sandeen
2015-08-19 3:46 ` Zirong Lang [this message]
2015-08-19 2:24 ` Zirong Lang
2015-08-19 2:48 ` Dave Chinner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1801602262.10426533.1439956019811.JavaMail.zimbra@redhat.com \
--to=zlang@redhat.com \
--cc=david@fromorbit.com \
--cc=fstests@vger.kernel.org \
--cc=sandeen@redhat.com \
--cc=sandeen@sandeen.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox