* possible dev branch regression - xfstest 285/1k
@ 2013-03-15 22:28 Eric Whitney
2013-03-16 2:32 ` Zheng Liu
2013-03-16 15:09 ` Zheng Liu
0 siblings, 2 replies; 19+ messages in thread
From: Eric Whitney @ 2013-03-15 22:28 UTC (permalink / raw)
To: linux-ext4; +Cc: tytso
I'm seeing Xfstest 285 consistently fail for the 1k test case using the
latest dev branch while running on both x86 and ARM. Subtest 08 is
the problem. From the test output:
08. Test file with unwritten extents, only have unwritten pages
08.01 SEEK_HOLE expected 0 or 4194304, got 11264. FAIL
08.02 SEEK_HOLE expected 1 or 4194304, got 11264. FAIL
08.03 SEEK_DATA expected 10240 or 10240, got 0. FAIL
08.04 SEEK_DATA expected 10240 or 10240, got 1. FAIL
From previous discussions, we expect 285 to fail in the ext3 (nodelalloc,
no flex_bg, and no extents) test case, but in subtest 07. It still does
that.
In the dev branch, reverting 4f42f80a8f - "ext4: use s_extent_max_zeroout_kb
value as number of kb" - results in success for 285 in the 1k test case.
Regards,
Eric
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: possible dev branch regression - xfstest 285/1k
2013-03-15 22:28 possible dev branch regression - xfstest 285/1k Eric Whitney
@ 2013-03-16 2:32 ` Zheng Liu
2013-03-16 15:09 ` Zheng Liu
1 sibling, 0 replies; 19+ messages in thread
From: Zheng Liu @ 2013-03-16 2:32 UTC (permalink / raw)
To: Eric Whitney; +Cc: linux-ext4, tytso

Hi Eric,

Thanks for reporting it.

On 03/16/2013 06:28 AM, Eric Whitney wrote:
> I'm seeing Xfstest 285 consistently fail for the 1k test case using the
> latest dev branch while running on both x86 and ARM. Subtest 08 is
> the problem. From the test output:
>
> 08. Test file with unwritten extents, only have unwritten pages
> 08.01 SEEK_HOLE expected 0 or 4194304, got 11264. FAIL
> 08.02 SEEK_HOLE expected 1 or 4194304, got 11264. FAIL
> 08.03 SEEK_DATA expected 10240 or 10240, got 0. FAIL
> 08.04 SEEK_DATA expected 10240 or 10240, got 1. FAIL
>
> From previous discussions, we expect 285 to fail in the ext3 (nodelalloc,
> no flex_bg, and no extents) test case, but in subtest 07. It still does
> that.

Sorry, my latest patch doesn't finish yet.

>
> In the dev branch, reverting 4f42f80a8f - "ext4: use s_extent_max_zeroout_kb
> value as number of kb" - results in success for 285 in the 1k test case.

Presumably this patch isn't root cause. I suspect there are some bugs
in ext4_find_unwritten_pgoff(). I will check it.

Thanks,
- Zheng

^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: possible dev branch regression - xfstest 285/1k
2013-03-15 22:28 possible dev branch regression - xfstest 285/1k Eric Whitney
2013-03-16 2:32 ` Zheng Liu
@ 2013-03-16 15:09 ` Zheng Liu
2013-03-17 3:06 ` Theodore Ts'o
2013-03-17 3:36 ` Eric Whitney
1 sibling, 2 replies; 19+ messages in thread
From: Zheng Liu @ 2013-03-16 15:09 UTC (permalink / raw)
To: Eric Whitney; +Cc: linux-ext4, tytso

On Fri, Mar 15, 2013 at 06:28:18PM -0400, Eric Whitney wrote:
> I'm seeing Xfstest 285 consistently fail for the 1k test case using the
> latest dev branch while running on both x86 and ARM. Subtest 08 is
> the problem. From the test output:
>
> 08. Test file with unwritten extents, only have unwritten pages
> 08.01 SEEK_HOLE expected 0 or 4194304, got 11264. FAIL
> 08.02 SEEK_HOLE expected 1 or 4194304, got 11264. FAIL
> 08.03 SEEK_DATA expected 10240 or 10240, got 0. FAIL
> 08.04 SEEK_DATA expected 10240 or 10240, got 1. FAIL
>
> From previous discussions, we expect 285 to fail in the ext3 (nodelalloc,
> no flex_bg, and no extents) test case, but in subtest 07. It still does
> that.
>
> In the dev branch, reverting 4f42f80a8f - "ext4: use s_extent_max_zeroout_kb
> value as number of kb" - results in success for 285 in the 1k test case.

Hi Eric,

I see what's going on. First of all it isn't a bug. :-) Please let me
describe why it happens.

In this commit (4f42f80a8f), it tries to fix a bug that we never zero
out an unwritten extent. So after applied it, when an unwritten extent
is converted, it could be zeroed out. In xfstests #285 subtest 08 it
preallocates an unwritten extent which is 4MB. Then it writes some data
at offset 10 * blocksize, which the length is one blocksize, and calls
sync_file_range(2) to flush it.

So the call trace looks like:

ext4_fallocate()
  ->ext4_map_blocks() [one unwritten extent is allocated]

ext4_file_write()

ext4_da_writepages()
  ->ext4_map_blocks() with EXT4_GET_BLOCKS_CREATE flag
    ->ext4_ext_handle_uninitialized_extents()
      ->ext4_ext_convert_to_initialized()

In ext4_ext_convert_to_initialized() it tries to zero out unwritten
extent if condition is matched. Let's see what happens.

case a) 1k block size
max_zeroout: 32
ee_len: 4096
allocated: 4086
m_len: 1

In this case, the following condition is matched.

fs/ext4/extents.c:3310
else if (map->m_lblk - ee_block + map->m_len < max_zeroout)
         10 - 0 + 1 < 32

So unwritten extent [0,11] will be converted to written. That is why
11264 (11 * 1k) is returned when we seek a hole from offset 0 and 1, and
0 and 1 are returned when we seek a data from offset 0 and 1.

case b) 4k block size
max_zeroout: 8
ee_len: 1024
allocated: 1014
m_len: 1

In this case, the above condition won't be matched.

else if (map->m_lblk - ee_block + map->m_len < max_zeroout)
         10 - 0 + 1 < 8

So only one unwritten extent [10, 1] is converted, and the test can
pass.

Regards
- Zheng

^ permalink raw reply [flat|nested] 19+ messages in thread
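[Editor's sketch] The arithmetic in cases a) and b) above can be tried out with a small shell helper. This is only an illustration of the one comparison quoted from fs/ext4/extents.c, not kernel code: the function name zeroout_applies and its argument order are hypothetical, and it assumes max_zeroout is simply the extent_max_zeroout_kb value converted to filesystem blocks, which matches the 32 and 8 shown above.

```shell
#!/bin/sh
# Hypothetical helper (not kernel code): evaluates the comparison
#   map->m_lblk - ee_block + map->m_len < max_zeroout
# where max_zeroout is assumed to be extent_max_zeroout_kb expressed
# in filesystem blocks.
zeroout_applies() {
    blocksize=$1       # filesystem block size in bytes
    max_zeroout_kb=$2  # value of the extent_max_zeroout_kb tunable
    m_lblk=$3          # first logical block being written
    ee_block=$4        # first logical block of the unwritten extent
    m_len=$5           # number of blocks being written

    max_zeroout=$(( max_zeroout_kb * 1024 / blocksize ))
    if [ $(( m_lblk - ee_block + m_len )) -lt "$max_zeroout" ]; then
        echo yes
    else
        echo no
    fi
}

zeroout_applies 1024 32 10 0 1   # case a: 11 < 32, prints "yes"
zeroout_applies 4096 32 10 0 1   # case b: 11 < 8 is false, prints "no"
```

With the tunable set to 0 the comparison can never hold, so under these assumptions the helper prints "no" for any block size.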
* Re: possible dev branch regression - xfstest 285/1k
2013-03-16 15:09 ` Zheng Liu
@ 2013-03-17 3:06 ` Theodore Ts'o
2013-03-17 6:13 ` Zheng Liu
2013-03-18 16:10 ` Eric Sandeen
2013-03-17 3:36 ` Eric Whitney
1 sibling, 2 replies; 19+ messages in thread
From: Theodore Ts'o @ 2013-03-17 3:06 UTC (permalink / raw)
To: Eric Whitney, linux-ext4

On Sat, Mar 16, 2013 at 11:09:23PM +0800, Zheng Liu wrote:
>
> I see what's going on. First of all it isn't a bug. :-) Please let me
> describe why it happens.
>
> In this commit (4f42f80a8f), it tries to fix a bug that we never zero
> out an unwritten extent. So after applied it, when an unwritten extent
> is converted, it could be zeroed out. In xfstests #285 subtest 08 it
> preallocates an unwritten extent which is 4MB. Then it writes some data
> at offset 10 * blocksize, which the length is one blocksize, and calls
> sync_file_range(2) to flush it.

Specifically, we are now honoring the default setting which sets the
max_zeroout_kb value to be 32. With a 4k block file system, if we
were to zeroout the extent, we would have to zero out 40k, which is
greater than 32k, so resulting file after pwrite(fd, 4096, 40960)
looks like this:

% filefrag -v /u1/foo08
Filesystem type is: ef53
File size of /u1/foo08 is 4194304 (1024 blocks of 4096 bytes)
 ext:     logical_offset:        physical_offset: length:   expected: flags:
   0:        0..       9:    1852416..   1852425:     10:             unwritten
   1:       10..      10:    1852426..   1852426:      1:
   2:       11..    1023:    1852427..   1853439:   1013:             unwritten,eof
/u1/foo08: 1 extent found

With a 1k block file system, we only need to zero out 10k, which is
less than 32k, and so after pwrite(fd, 1024, 10240), the file looks
like this:

% filefrag -v /mnt/foo08
Filesystem type is: ef53
File size of /mnt/foo08 is 4194304 (4096 blocks of 1024 bytes)
 ext:     logical_offset:        physical_offset: length:   expected: flags:
   0:        0..      10:      81921..     81931:     11:
   1:       11..    4095:      81932..     86016:   4085:             unwritten,eof
/mnt/foo08: 1 extent found

If we run src/seek_sanity_test by hand, we can make it happy by
setting the following configuration option before we run it:

echo 0 > /sys/fs/ext4/<dev>/extent_max_zeroout_kb

I'm not sure what's the best way to make xfstest #285 happy, though.

One way might be to change the test so that instead of writing the
data at offset bufsize*10, we change it so it writes the data at
offset bufsize*40, and change the expected values accordingly. The
other would be to add some kind of ext4-specific hack to test #285
which manually sets the extent_max_zeroout_kb tuning parameter after
the file system is mounted.

I'm not sure which is more likely to be accepted by the xfstests
maintainers. I suspect the former, but they may not like either
solution, in which case we might have to disable 285 for ext4 and
create an ext4-specific test.

- Ted

^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: possible dev branch regression - xfstest 285/1k
2013-03-17 3:06 ` Theodore Ts'o
@ 2013-03-17 6:13 ` Zheng Liu
2013-03-18 16:10 ` Eric Sandeen
1 sibling, 0 replies; 19+ messages in thread
From: Zheng Liu @ 2013-03-17 6:13 UTC (permalink / raw)
To: Theodore Ts'o; +Cc: Eric Whitney, linux-ext4

Hi Ted,

Thanks for looking at this.

On Sat, Mar 16, 2013 at 11:06:48PM -0400, Theodore Ts'o wrote:
> On Sat, Mar 16, 2013 at 11:09:23PM +0800, Zheng Liu wrote:
> >
> > I see what's going on. First of all it isn't a bug. :-) Please let me
> > describe why it happens.
> >
> > In this commit (4f42f80a8f), it tries to fix a bug that we never zero
> > out an unwritten extent. So after applied it, when an unwritten extent
> > is converted, it could be zeroed out. In xfstests #285 subtest 08 it
> > preallocates an unwritten extent which is 4MB. Then it writes some data
> > at offset 10 * blocksize, which the length is one blocksize, and calls
> > sync_file_range(2) to flush it.
>
> Specifically, we are now honoring the default setting which sets the
> max_zeroout_kb value to be 32. With a 4k block file system, if we
> were to zeroout the extent, we would have to zero out 40k, which is
> greater than 32k, so resulting file after pwrite(fd, 4096, 40960)
> looks like this:
>
> % filefrag -v /u1/foo08
> Filesystem type is: ef53
> File size of /u1/foo08 is 4194304 (1024 blocks of 4096 bytes)
>  ext:     logical_offset:        physical_offset: length:   expected: flags:
>    0:        0..       9:    1852416..   1852425:     10:             unwritten
>    1:       10..      10:    1852426..   1852426:      1:
>    2:       11..    1023:    1852427..   1853439:   1013:             unwritten,eof
> /u1/foo08: 1 extent found
>
> With a 1k block file system, we only need to zero out 10k, which is
> less than 32k, and so after pwrite(fd, 1024, 10240), the file looks
> like this:
>
> % filefrag -v /mnt/foo08
> Filesystem type is: ef53
> File size of /mnt/foo08 is 4194304 (4096 blocks of 1024 bytes)
>  ext:     logical_offset:        physical_offset: length:   expected: flags:
>    0:        0..      10:      81921..     81931:     11:
>    1:       11..    4095:      81932..     86016:   4085:             unwritten,eof
> /mnt/foo08: 1 extent found
>
> If we run src/seek_sanity_test by hand, we can make it happy by
> setting the following configuration option before we run it:
>
> echo 0 > /sys/fs/ext4/<dev>/extent_max_zeroout_kb
>
> I'm not sure what's the best way to make xfstest #285 happy, though.
>
> One way might be to change the test so that instead of writing the
> data at offset bufsize*10, we change it so it writes the data at
> offset bufsize*40, and change the expected values accordingly. The
> other would be to add some kind of ext4-specific hack to test #285
> which manually sets the extent_max_zeroout_kb tuning parameter after
> the file system is mounted.
>
> I'm not sure which is more likely to be accepted by the xfstests
> maintainers. I suspect the former, but they may not like either
> solution, in which case we might have to disable 285 for ext4 and
> create an ext4-specific test.

It has been on my TODO list for a long time. I will try the former. I
think we just need to disable 285 for ext4 with indirect-based file and
create a new generic test for all file systems.

Regards,
- Zheng

^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: possible dev branch regression - xfstest 285/1k
2013-03-17 3:06 ` Theodore Ts'o
2013-03-17 6:13 ` Zheng Liu
@ 2013-03-18 16:10 ` Eric Sandeen
2013-03-18 16:54 ` gnehzuil.liu
2013-03-18 17:09 ` Theodore Ts'o
1 sibling, 2 replies; 19+ messages in thread
From: Eric Sandeen @ 2013-03-18 16:10 UTC (permalink / raw)
To: Theodore Ts'o; +Cc: Eric Whitney, linux-ext4

On 3/16/13 10:06 PM, Theodore Ts'o wrote:
> On Sat, Mar 16, 2013 at 11:09:23PM +0800, Zheng Liu wrote:
>>
>> I see what's going on. First of all it isn't a bug. :-) Please let me
>> describe why it happens.
>>
>> In this commit (4f42f80a8f), it tries to fix a bug that we never zero
>> out an unwritten extent. So after applied it, when an unwritten extent
>> is converted, it could be zeroed out. In xfstests #285 subtest 08 it
>> preallocates an unwritten extent which is 4MB. Then it writes some data
>> at offset 10 * blocksize, which the length is one blocksize, and calls
>> sync_file_range(2) to flush it.
>
> Specifically, we are now honoring the default setting which sets the
> max_zeroout_kb value to be 32. With a 4k block file system, if we
> were to zeroout the extent, we would have to zero out 40k, which is
> greater than 32k, so resulting file after pwrite(fd, 4096, 40960)
> looks like this:
>
> % filefrag -v /u1/foo08
> Filesystem type is: ef53
> File size of /u1/foo08 is 4194304 (1024 blocks of 4096 bytes)
>  ext:     logical_offset:        physical_offset: length:   expected: flags:
>    0:        0..       9:    1852416..   1852425:     10:             unwritten
>    1:       10..      10:    1852426..   1852426:      1:
>    2:       11..    1023:    1852427..   1853439:   1013:             unwritten,eof
> /u1/foo08: 1 extent found
>
> With a 1k block file system, we only need to zero out 10k, which is
> less than 32k, and so after pwrite(fd, 1024, 10240), the file looks
> like this:
>
> % filefrag -v /mnt/foo08
> Filesystem type is: ef53
> File size of /mnt/foo08 is 4194304 (4096 blocks of 1024 bytes)
>  ext:     logical_offset:        physical_offset: length:   expected: flags:
>    0:        0..      10:      81921..     81931:     11:
>    1:       11..    4095:      81932..     86016:   4085:             unwritten,eof
> /mnt/foo08: 1 extent found
>

So the issue is just that the test is looking for actual holes
in specific locations, but the fs chose to allocate zero-filled
blocks instead?

> If we run src/seek_sanity_test by hand, we can make it happy by
> setting the following configuration option before we run it:
>
> echo 0 > /sys/fs/ext4/<dev>/extent_max_zeroout_kb

The test could do this too, right?

_need_to_be_root

and:

if [ "$FSTYP" == "ext4" ]; then
	ORIG_ZEROOUT_KB=`cat /sys/fs/ext4/$TEST_DEV/extent_max_zeroout_kb`
	echo 0 > /sys/fs/ext4/$TEST_DEV/extent_max_zeroout_kb
fi

and put it back to default in _cleanup:

echo $ORIG_ZEROOUT_KB > /sys/fs/ext4/$TEST_DEV/extent_max_zeroout_kb

That way we'd be testing seek hole correctness w/o being subject to
the vagaries in allocator behavior.

-Eric

> I'm not sure what's the best way to make xfstest #285 happy, though.
>
> One way might be to change the test so that instead of writing the
> data at offset bufsize*10, we change it so it writes the data at
> offset bufsize*40, and change the expected values accordingly. The
> other would be to add some kind of ext4-specific hack to test #285
> which manually sets the extent_max_zeroout_kb tuning parameter after
> the file system is mounted.
>
> I'm not sure which is more likely to be accepted by the xfstests
> maintainers. I suspect the former, but they may not like either
> solution, in which case we might have to disable 285 for ext4 and
> create an ext4-specific test.
>
> - Ted
>

^ permalink raw reply [flat|nested] 19+ messages in thread
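[Editor's sketch] The save-and-restore idea in the message above could be wrapped in small helpers along the following lines. This is a hedged illustration, not the change that went into xfstests: the function names and the SYSFS_ROOT override are hypothetical, and it assumes the directory under /sys/fs/ext4 is named after the short device name (for example sdb1 rather than /dev/sdb1), which would not hold for device-mapper paths such as /dev/mapper/<name>.

```shell
#!/bin/sh
# Hypothetical helpers. SYSFS_ROOT is an illustrative override so the
# sketch can be exercised against a scratch directory instead of the
# real /sys/fs/ext4.
zeroout_knob() {
    # Assumption: the sysfs directory uses the short device name.
    echo "${SYSFS_ROOT:-/sys/fs/ext4}/$(basename "$1")/extent_max_zeroout_kb"
}

disable_zeroout() {
    knob=$(zeroout_knob "$1")
    # Remember the current value so that cleanup can restore it.
    ORIG_ZEROOUT_KB=$(cat "$knob") || return 1
    echo 0 > "$knob"
}

restore_zeroout() {
    # Nothing to do if disable_zeroout never ran successfully.
    [ -n "$ORIG_ZEROOUT_KB" ] || return 0
    echo "$ORIG_ZEROOUT_KB" > "$(zeroout_knob "$1")"
}

# Intended use inside a test (needs root on a real system):
#   [ "$FSTYP" = "ext4" ] && disable_zeroout "$TEST_DEV"
#   ... run src/seek_sanity_test ...
#   [ "$FSTYP" = "ext4" ] && restore_zeroout "$TEST_DEV"
```

Saving the original value instead of hard-coding 32 means a non-default setting on the test machine survives the run.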
* Re: possible dev branch regression - xfstest 285/1k
2013-03-18 16:10 ` Eric Sandeen
@ 2013-03-18 16:54 ` gnehzuil.liu
2013-03-18 17:09 ` Theodore Ts'o
1 sibling, 0 replies; 19+ messages in thread
From: gnehzuil.liu @ 2013-03-18 16:54 UTC (permalink / raw)
To: Eric Sandeen; +Cc: Theodore Ts'o, Eric Whitney, linux-ext4@vger.kernel.org

Hi Eric,

On 2013-3-19, at 12:10 AM, Eric Sandeen <sandeen@redhat.com> wrote:

> On 3/16/13 10:06 PM, Theodore Ts'o wrote:
>> On Sat, Mar 16, 2013 at 11:09:23PM +0800, Zheng Liu wrote:
>>>
>>> I see what's going on. First of all it isn't a bug. :-) Please let me
>>> describe why it happens.
>>>
>>> In this commit (4f42f80a8f), it tries to fix a bug that we never zero
>>> out an unwritten extent. So after applied it, when an unwritten extent
>>> is converted, it could be zeroed out. In xfstests #285 subtest 08 it
>>> preallocates an unwritten extent which is 4MB. Then it writes some data
>>> at offset 10 * blocksize, which the length is one blocksize, and calls
>>> sync_file_range(2) to flush it.
>>
>> Specifically, we are now honoring the default setting which sets the
>> max_zeroout_kb value to be 32. With a 4k block file system, if we
>> were to zeroout the extent, we would have to zero out 40k, which is
>> greater than 32k, so resulting file after pwrite(fd, 4096, 40960)
>> looks like this:
>>
>> % filefrag -v /u1/foo08
>> Filesystem type is: ef53
>> File size of /u1/foo08 is 4194304 (1024 blocks of 4096 bytes)
>>  ext:     logical_offset:        physical_offset: length:   expected: flags:
>>    0:        0..       9:    1852416..   1852425:     10:             unwritten
>>    1:       10..      10:    1852426..   1852426:      1:
>>    2:       11..    1023:    1852427..   1853439:   1013:             unwritten,eof
>> /u1/foo08: 1 extent found
>>
>> With a 1k block file system, we only need to zero out 10k, which is
>> less than 32k, and so after pwrite(fd, 1024, 10240), the file looks
>> like this:
>>
>> % filefrag -v /mnt/foo08
>> Filesystem type is: ef53
>> File size of /mnt/foo08 is 4194304 (4096 blocks of 1024 bytes)
>>  ext:     logical_offset:        physical_offset: length:   expected: flags:
>>    0:        0..      10:      81921..     81931:     11:
>>    1:       11..    4095:      81932..     86016:   4085:             unwritten,eof
>> /mnt/foo08: 1 extent found
>
> So the issue is just that the test is looking for actual holes
> in specific locations, but the fs chose to allocate zero-filled
> blocks instead?

Yes, it is.

>
>> If we run src/seek_sanity_test by hand, we can make it happy by
>> setting the following configuration option before we run it:
>>
>> echo 0 > /sys/fs/ext4/<dev>/extent_max_zeroout_kb
>
> The test could do this too, right?
>
> _need_to_be_root
>
> and:
>
> if [ "$FSTYP" == "ext4" ]; then
> 	ORIG_ZEROOUT_KB=`cat /sys/fs/ext4/$TEST_DEV/extent_max_zeroout_kb`
> 	echo 0 > /sys/fs/ext4/$TEST_DEV/extent_max_zeroout_kb
> fi
>
> and put it back to default in _cleanup:
>
> echo $ORIG_ZEROOUT_KB > /sys/fs/ext4/$TEST_DEV/extent_max_zeroout_kb
>
> That way we'd be testing seek hole correctness w/o being subject to
> the vagaries in allocator behavior.

Good idea. I will try it.

Thanks,
- Zheng

^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: possible dev branch regression - xfstest 285/1k
2013-03-18 16:10 ` Eric Sandeen
2013-03-18 16:54 ` gnehzuil.liu
@ 2013-03-18 17:09 ` Theodore Ts'o
2013-03-18 17:34 ` Eric Sandeen
1 sibling, 1 reply; 19+ messages in thread
From: Theodore Ts'o @ 2013-03-18 17:09 UTC (permalink / raw)
To: Eric Sandeen; +Cc: Eric Whitney, linux-ext4

On Mon, Mar 18, 2013 at 11:10:51AM -0500, Eric Sandeen wrote:
>
> The test could do this too, right?
>
> _need_to_be_root
>
> and:
>
> if [ "$FSTYP" == "ext4" ]; then
> 	ORIG_ZEROOUT_KB=`cat /sys/fs/ext4/$TEST_DEV/extent_max_zeroout_kb`
> 	echo 0 > /sys/fs/ext4/$TEST_DEV/extent_max_zeroout_kb
> fi
>
> and put it back to default in _cleanup:
>
> echo $ORIG_ZEROOUT_KB > /sys/fs/ext4/$TEST_DEV/extent_max_zeroout_kb
>
> That way we'd be testing seek hole correctness w/o being subject to
> the vagaries in allocator behavior.

Yeah, the question is whether it would be more acceptable to put
ext4-specific hacks like this into the test, or to modify
src/seek_sanity_test.c so that it writes the test block-size block
using pwrite at offset blocksize*42 instead of offset blocksize*10.

I had assumed putting hacks which tweaked sysfs tunables into the
xfstest script itself would be frowned upon, but if that's considered
OK, that would be great.

- Ted

^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: possible dev branch regression - xfstest 285/1k
2013-03-18 17:09 ` Theodore Ts'o
@ 2013-03-18 17:34 ` Eric Sandeen
2013-03-18 20:41 ` Ben Myers
0 siblings, 1 reply; 19+ messages in thread
From: Eric Sandeen @ 2013-03-18 17:34 UTC (permalink / raw)
To: Theodore Ts'o; +Cc: Eric Whitney, linux-ext4, xfs-oss

On 3/18/13 12:09 PM, Theodore Ts'o wrote:
> On Mon, Mar 18, 2013 at 11:10:51AM -0500, Eric Sandeen wrote:

<previous discussion thread about test 285 SEEK_HOLE test
breaking on ext4 due to change in opportunistic hole-filling
behavior and how to make it work again on ext4, and mention
of sysctl which makes it pass>

>> The test could do this too, right?
>>
>> _need_to_be_root
>>
>> and:
>>
>> if [ "$FSTYP" == "ext4" ]; then
>> 	ORIG_ZEROOUT_KB=`cat /sys/fs/ext4/$TEST_DEV/extent_max_zeroout_kb`
>> 	echo 0 > /sys/fs/ext4/$TEST_DEV/extent_max_zeroout_kb
>> fi
>>
>> and put it back to default in _cleanup:
>>
>> echo $ORIG_ZEROOUT_KB > /sys/fs/ext4/$TEST_DEV/extent_max_zeroout_kb
>>
>> That way we'd be testing seek hole correctness w/o being subject to
>> the vagaries in allocator behavior.
>
> Yeah, the question is whether it would be more acceptable to put
> ext4-specific hacks like this into the test, or to modify
> src/seek_sanity_test.c so that it writes the test block-size block
> using pwrite at offset blocksize*42 instead of offset blocksize*10.

That seems like more of an obtuse hack, since it depends on current
default behavior, right?

Explicitly setting the zeroout to 0, with a comment as to why, should
make it clear to the reader of the test I think.

I'll have to look, xfs speculative preallocation fills in holes in
some cases as well, I'm not certain how it behaves on this test.
But we could put in a specific tuning for xfs as well if needed.

If it becomes clear that every fs requires tuning to not opportunistically
fill in holes, then maybe we should make it non-generic, and only support
filesystems we've tested or tuned to work with the testcase.

> I had assumed putting hacks which tweaked sysfs tunables into the
> xfstest script itself would be frowned upon, but if that's considered
> OK, that would be great.

I don't see any real problem with it, myself.

cc: xfs list to see if there are any objections...

-Eric

>
> - Ted
>

^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: possible dev branch regression - xfstest 285/1k
2013-03-18 17:34 ` Eric Sandeen
@ 2013-03-18 20:41 ` Ben Myers
2013-03-18 23:12 ` Dave Chinner
0 siblings, 1 reply; 19+ messages in thread
From: Ben Myers @ 2013-03-18 20:41 UTC (permalink / raw)
To: Eric Sandeen; +Cc: Eric Whitney, linux-ext4, Theodore Ts'o, xfs-oss

Hi Eric,

On Mon, Mar 18, 2013 at 12:34:59PM -0500, Eric Sandeen wrote:
> On 3/18/13 12:09 PM, Theodore Ts'o wrote:
> > On Mon, Mar 18, 2013 at 11:10:51AM -0500, Eric Sandeen wrote:
>
> <previous discussion thread about test 285 SEEK_HOLE test
> breaking on ext4 due to change in opportunistic hole-filling
> behavior and how to make it work again on ext4, and mention
> of sysctl which makes it pass>
>
> >> The test could do this too, right?
> >>
> >> _need_to_be_root
> >>
> >> and:
> >>
> >> if [ "$FSTYP" == "ext4" ]; then
> >> 	ORIG_ZEROOUT_KB=`cat /sys/fs/ext4/$TEST_DEV/extent_max_zeroout_kb`
> >> 	echo 0 > /sys/fs/ext4/$TEST_DEV/extent_max_zeroout_kb
> >> fi
> >>
> >> and put it back to default in _cleanup:
> >>
> >> echo $ORIG_ZEROOUT_KB > /sys/fs/ext4/$TEST_DEV/extent_max_zeroout_kb
> >>
> >> That way we'd be testing seek hole correctness w/o being subject to
> >> the vagaries in allocator behavior.
> >
> > Yeah, the question is whether it would be more acceptable to put
> > ext4-specific hacks like this into the test, or to modify
> > src/seek_sanity_test.c so that it writes the test block-size block
> > using pwrite at offset blocksize*42 instead of offset blocksize*10.
>
> That seems like more of an obtuse hack, since it depends on current
> default behavior, right?
>
> Explicitly setting the zeroout to 0, with a comment as to why, should
> make it clear to the reader of the test I think.
>
> I'll have to look, xfs speculative preallocation fills in holes in
> some cases as well, I'm not certain how it behaves on this test.

My impression was that we are not zeroing holes, but I'd also have to look to
be sure. ;)

> But we could put in a specific tuning for xfs as well if needed.
>
> If it becomes clear that every fs requires tuning to not opportunistically
> fill in holes, then maybe we should make it non-generic, and only support
> filesystems we've tested or tuned to work with the testcase.
>
> > I had assumed putting hacks which tweaked sysfs tunables into the xfstest
> > script itself would be frowned upon, but if that's considered OK, that
> > would be great.
>
> I don't see any real problem with it, myself.
>
> cc: xfs list to see if there are any objections...

Seems like the options being discussed so far are:

1) make the test fs specific
2) filesystem specific hacks to disable opportunistic zeroing of holes
3) modify the test output to work with current ext4 default behavior

It might be hard to find a tuning to produce identical output for xfs and ext4
(option 3), and option 1 and 2 are also a bit clunky.

How about option 4) fs-specific test output?

We wouldn't have multiple copies of the same test laying around, and ext4 could
still run with default settings.

Regards,
Ben

^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: possible dev branch regression - xfstest 285/1k
2013-03-18 20:41 ` Ben Myers
@ 2013-03-18 23:12 ` Dave Chinner
2013-03-19 1:40 ` Theodore Ts'o
2013-03-19 1:47 ` Dave Chinner
0 siblings, 2 replies; 19+ messages in thread
From: Dave Chinner @ 2013-03-18 23:12 UTC (permalink / raw)
To: Ben Myers
Cc: Eric Sandeen, Theodore Ts'o, xfs-oss, linux-ext4, Eric Whitney

On Mon, Mar 18, 2013 at 03:41:33PM -0500, Ben Myers wrote:
> Hi Eric,
>
> On Mon, Mar 18, 2013 at 12:34:59PM -0500, Eric Sandeen wrote:
> > On 3/18/13 12:09 PM, Theodore Ts'o wrote:
> > > On Mon, Mar 18, 2013 at 11:10:51AM -0500, Eric Sandeen wrote:
> >
> > <previous discussion thread about test 285 SEEK_HOLE test
> > breaking on ext4 due to change in opportunistic hole-filling
> > behavior and how to make it work again on ext4, and mention
> > of sysctl which makes it pass>
> >
> > >> The test could do this too, right?
> > >>
> > >> _need_to_be_root
> > >>
> > >> and:
> > >>
> > >> if [ "$FSTYP" == "ext4" ]; then
> > >> 	ORIG_ZEROOUT_KB=`cat /sys/fs/ext4/$TEST_DEV/extent_max_zeroout_kb`
> > >> 	echo 0 > /sys/fs/ext4/$TEST_DEV/extent_max_zeroout_kb
> > >> fi
> > >>
> > >> and put it back to default in _cleanup:
> > >>
> > >> echo $ORIG_ZEROOUT_KB > /sys/fs/ext4/$TEST_DEV/extent_max_zeroout_kb
> > >>
> > >> That way we'd be testing seek hole correctness w/o being subject to
> > >> the vagaries in allocator behavior.
> > >
> > > Yeah, the question is whether it would be more acceptable to put
> > > ext4-specific hacks like this into the test, or to modify
> > > src/seek_sanity_test.c so that it writes the test block-size block
> > > using pwrite at offset blocksize*42 instead of offset blocksize*10.
> >
> > That seems like more of an obtuse hack, since it depends on current
> > default behavior, right?
> >
> > Explicitly setting the zeroout to 0, with a comment as to why, should
> > make it clear to the reader of the test I think.
> >
> > I'll have to look, xfs speculative preallocation fills in holes in
> > some cases as well, I'm not certain how it behaves on this test.
>
> My impression was that we are not zeroing holes, but I'd also have to look to
> be sure. ;)

Depends if the holes fall at EOF and there's a speculative delalloc
over them. There isn't for this test, and the recent changes will
prevent it altogether, so there isn't a worry for XFS here.

> > But we could put in a specific tuning for xfs as well if needed.
> >
> > If it becomes clear that every fs requires tuning to not opportunistically
> > fill in holes, then maybe we should make it non-generic, and only support
> > filesystems we've tested or tuned to work with the testcase.
> >
> > > I had assumed putting hacks which tweaked sysfs tunables into the xfstest
> > > script itself would be frowned upon, but if that's considered OK, that
> > > would be great.
> >
> > I don't see any real problem with it, myself.
> >
> > cc: xfs list to see if there are any objections...
>
> Seems like the options being discussed so far are:
>
> 1) make the test fs specific

It was, originally.

> 2) filesystem specific hacks to disable opportunistic zeroing of holes

Not scalable.

> 3) modify the test output to work with current ext4 default behavior

And then we need hacks for each filesystem as they change behaviour.

> It might be hard to find a tuning to produce identical output for xfs and ext4
> (option 3), and option 1 and 2 are a also bit clunky.
>
> How about option 4) fs-specific test output?

No, that's even worse. If we have filesystem specific output, then
write a set of filesystem specific tests that use a common piece of
code to run the test. The only difference between the tests will
then be the _supported_fs line. No need to hack in special output
file handling, etc.

> We wouldn't have multiple copies of the same test laying around, and ext4 could
> still run with default settings.

And when the default settings change, or some other bug fix comes
along?

So, let's step back a moment and ask ourselves what the test is
actually trying to test. Zero-out is not what it is trying to test,
nor is it trying to test specific file layouts. This is a basic
*defragmenter* sanity test. So, we're testing 2 things:

	1. the defragmenter can recognise a fragmented file and fix
	   it; and
	2. the defragmenter can recognise a sparse file and not
	   modify it.

I know that Ted has already asked "what is an extent", but that's
also missing the point. An extent is defined, just like for on-disk
extent records, as a region of a file that is both logically and
physically contiguous. From that, a fragmented file is a file that
is logically contiguous but physically disjointed, and a sparse file
is one that is logically disjointed. i.e. it is the relationship
between extents that defines "sparse" and "fragmented", not the
definition of an extent itself.

Looking at the test itself, then. The backwards synchronous write
trick that is used by 218? That's an underhanded trick to make XFS
create a fragmented file. We are not testing that the defragmenter
knows that it's a backwards written file - we are testing that it
sees the file as logically contiguous and physically disjointed, and
then defragments it successfully.

Similarly, the remaining two tests are checking that a sparse file
with a couple of different layouts are detected and left alone. The
first sparse file will be both logically and physically disjointed,
the second one is logically disjointed but often ends up physically
contiguous.

That's what we are actually testing here. We are not testing that
exact, specific file layouts are handled correctly, we are checking
that the defragmenter recognises the different extent relationships
to determine which it should defragment and those it should leave
alone.

IOWs, we can change this test to create files in any way we want, as
long as the files fit the same 3 categories:

	1. logically contiguous, physically disjointed - successful
	   defragmentation
	2. logically disjointed, physically disjointed - unchanged
	3. logically disjointed, physically contiguous - unchanged.

If that means we need to create the files differently to ensure we
end up with the layouts we need for different filesystems, then so
be it. If we can't do that generically for all the supported
filesystems, then lets split the test apart again into filesystem
specific tests and make 218 an XFS only test again.

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com

^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: possible dev branch regression - xfstest 285/1k
From: Theodore Ts'o @ 2013-03-19 1:40 UTC
To: Dave Chinner; +Cc: Eric Whitney, Eric Sandeen, Ben Myers, linux-ext4, xfs-oss

On Tue, Mar 19, 2013 at 10:12:33AM +1100, Dave Chinner wrote:
> I know that Ted has already asked "what is an extent", but that's
> also missing the point. An extent is defined, just like for on-disk
> extent records, as a region of a file that is both logically and
> physically contiguous. From that, a fragmented file is a file that
> is logically contiguous but physically disjointed, and a sparse file
> is one that is logically disjointed. i.e. it is the relationship
> between extents that defines "sparse" and "fragmented", not the
> definition of an extent itself.

Dave --- I think we're talking about two different tests. This
particular test is xfstest #285.

The test in question is subtest #8, which preallocates a 4MB file, and
then writes a block filled with 'a' which is sized to the file system
block size, at offset 10*fs_block_size. It then checks to make sure
SEEK_HOLE and SEEK_DATA is what it expects.

This is why opportunistic hole filling (to avoid unnecessary expansion
of the extent tree) is making a difference here.

The problem with filesystem specific output is that the output is
different depending on the blocksize. The test is also determining
what's considered good or not as hard-coded logic in
src/seek_sanity_test.c. So there's no fs-specific output at all in
xfstest #285.

> Looking at the test itself, then. The backwards synchronous write
> trick that is used by 218? That's an underhanded trick to make XFS
> create a fragmented file. We are not testing that the defragmenter
> knows that it's a backwards written file - we are testing that it
> sees the file as logically contiguous and physically disjointed, and
> then defragments it successfully.

What I was saying --- in the other mail thread --- is that it's open
to question whether a file which is being written via a random-write
pattern, resulting in a file that is physically contiguous, but not
contiguous from a logical block number point of view, is worth
defragging or not. It all depends on whether the file is likely to be
read sequentially in the future, or whether it will continue to be
accessed via a random access pattern. In the latter case, it might not
be worth defragging the file.

In fact, I tend to agree with the argument we might as well attempt to
make the file logically contiguous so that it's efficient to read the
file sequentially. But the people at Fujitsu who wrote the algorithms
in e2defrag had gone out of their way to detect this case and avoid
defragging the file so long as the physical blocks in use were
contiguous --- and I believe that's also a valid design decision.

Depending on how we resolve this particular design question, we can
then decide whether we need to make test #218 fs specific or not.
There was no thought that design choices made by ext4 should drive
changes in how the defragger works in xfs or btrfs, or vice versa.

So I was looking for discussion by the ext4 developers; I was not
requesting any changes from the XFS developers with respect to test
#218. (Not yet; and perhaps not ever.)

Regards,

						- Ted
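The layout that subtest #8 builds, as described above, can be approximated with a few system calls. This is a rough sketch, not the code in src/seek_sanity_test.c: the helper name `probe_subtest08()` is hypothetical, `posix_fallocate()` and `fsync()` are assumed stand-ins for whatever preallocation and flush calls the real test uses, and the offsets it reports depend on the filesystem and block size.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Linux values, defined here in case the headers did not expose them. */
#ifndef SEEK_DATA
#define SEEK_DATA 3
#endif
#ifndef SEEK_HOLE
#define SEEK_HOLE 4
#endif

#define FILE_SIZE (4 * 1024 * 1024)

/*
 * Hypothetical helper: preallocate a 4MB file, write one block of 'a'
 * at 10 * blksz, then report where SEEK_HOLE and SEEK_DATA land when
 * starting from offset 0. Returns 0 on success, -1 if any step fails
 * or is unsupported. The file is removed again before returning.
 */
static int probe_subtest08(const char *path, off_t blksz, off_t *hole, off_t *data)
{
    char buf[4096];
    int ret = -1;

    assert(blksz > 0 && (size_t)blksz <= sizeof(buf));
    memset(buf, 'a', sizeof(buf));

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* Assumption: posix_fallocate() stands in for the test's preallocation. */
    if (posix_fallocate(fd, 0, FILE_SIZE) != 0)
        goto out;
    if (pwrite(fd, buf, (size_t)blksz, blksz * 10) != (ssize_t)blksz)
        goto out;
    /* Assumption: fsync() stands in for the test's flush step. */
    if (fsync(fd) != 0)
        goto out;

    *hole = lseek(fd, 0, SEEK_HOLE);
    *data = lseek(fd, 0, SEEK_DATA);
    if (*hole >= 0 && *data >= 0)
        ret = 0;
out:
    close(fd);
    unlink(path);
    return ret;
}
```

For the 1k case in the original report, the test accepts a SEEK_HOLE result of 0 or 4194304 and a SEEK_DATA result of 10240 when seeking from offset 0; a filesystem that zeroes out the range in front of the written block reports something else.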
* Re: possible dev branch regression - xfstest 285/1k
From: Dave Chinner @ 2013-03-19 2:07 UTC
To: Theodore Ts'o
Cc: Ben Myers, Eric Sandeen, xfs-oss, linux-ext4, Eric Whitney

On Mon, Mar 18, 2013 at 09:40:14PM -0400, Theodore Ts'o wrote:
> On Tue, Mar 19, 2013 at 10:12:33AM +1100, Dave Chinner wrote:
> > I know that Ted has already asked "what is an extent", but that's
> > also missing the point. An extent is defined, just like for on-disk
> > extent records, as a region of a file that is both logically and
> > physically contiguous. From that, a fragmented file is a file that
> > is logically contiguous but physically disjointed, and a sparse file
> > is one that is logically disjointed. i.e. it is the relationship
> > between extents that defines "sparse" and "fragmented", not the
> > definition of an extent itself.
>
> Dave --- I think we're talking about two different tests. This
> particular test is xfstest #285.

Yeah, I just realised that as I was reading through my ext4 list
feed...

> The test in question is subtest #8, which preallocates a 4MB file, and
> then writes a block filled with 'a' which is sized to the file system
> block size, at offset 10*fs_block_size. It then checks to make sure
> SEEK_HOLE and SEEK_DATA is what it expects.

Yup, and as I just said in reply to myself, this means the same
reasoning applies - we can simply change the file layout to make
holes large enough that zero-out isn't an issue.

> > Looking at the test itself, then. The backwards synchronous write
> > trick that is used by 218? That's an underhanded trick to make XFS
> > create a fragmented file. We are not testing that the defragmenter
> > knows that it's a backwards written file - we are testing that it
> > sees the file as logically contiguous and physically disjointed, and
> > then defragments it successfully.
>
> What I was saying --- in the other mail thread --- is that it's open
> to question whether a file which is being written via a random-write
> pattern, resulting in a file that is physically contiguous, but not
> contiguous from a logical block number point of view, is worth
> defragging or not. It all depends on whether the file is likely to be
> read sequentially in the future, or whether it will continue to be
> accessed via a random access pattern. In the latter case, it might not
> be worth defragging the file.

AFAICT, that's something the defragmenter has no information on.
For example, two files with identical fragmentation patterns may be
accessed differently - how does the defragmenter know about that
and hence treat each file differently?

> In fact, I tend to agree with the argument we might as well attempt to
> make the file logically contiguous so that it's efficient to read the
> file sequentially. But the people at Fujitsu who wrote the algorithms
> in e2defrag had gone out of their way to detect this case and avoid
> defragging the file so long as the physical blocks in use were
> contiguous --- and I believe that's also a valid design decision.

Sure - I never said it wasn't a valid categorisation. What is now
obvious to everyone is that it's a different definition of
fragmentation to what the test (and xfs_fsr) expects. ;)

> Depending on how we resolve this particular design question, we can
> then decide whether we need to make test #218 fs specific or not.
> There was no thought that design choices made by ext4 should drive
> changes in how the defragger works in xfs or btrfs, or vice versa.

Exactly. :)

> So I was looking for discussion by the ext4 developers; I was not
> requesting any changes from the XFS developers with respect to test
> #218. (Not yet; and perhaps not ever.)

I know - what I was trying to do was to make sure that everyone
understood the theory behind the test before the discussion went too
far off the beaten track...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
* Re: possible dev branch regression - xfstest 285/1k
From: Dave Chinner @ 2013-03-19 1:47 UTC
To: Ben Myers
Cc: Eric Sandeen, Theodore Ts'o, xfs-oss, linux-ext4, Eric Whitney

On Tue, Mar 19, 2013 at 10:12:33AM +1100, Dave Chinner wrote:
> On Mon, Mar 18, 2013 at 03:41:33PM -0500, Ben Myers wrote:
> > Hi Eric,
> >
> > On Mon, Mar 18, 2013 at 12:34:59PM -0500, Eric Sandeen wrote:
> > > On 3/18/13 12:09 PM, Theodore Ts'o wrote:
> > > > On Mon, Mar 18, 2013 at 11:10:51AM -0500, Eric Sandeen wrote:
> > still run with default settings.
>
> And when the default settings change, or some other bug fix comes
> along?
>
> So, let's step back a moment and ask ourselves what the test is
> actually trying to test. zero-out is not what it is trying to test,
> nor is it trying to test specific file layouts. This is a basic
> *defragmenter* sanity test. SO, we're testing 2 things:

Sorry about this - I've mixed up my threads about ext4 having
problems with zero-out being re-enabled. I thought this was a
cross-post of the 218 issue....

However, the same reasoning can be applied to 285 - the file sizes,
the size of the holes and the size of the data is all completely
arbitrary. If we make the holes in the files larger, then the
zero-out problem simply goes away.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
* Re: possible dev branch regression - xfstest 285/1k
From: Theodore Ts'o @ 2013-03-19 2:00 UTC
To: Dave Chinner; +Cc: Eric Whitney, Eric Sandeen, Ben Myers, linux-ext4, xfs-oss

On Tue, Mar 19, 2013 at 12:47:18PM +1100, Dave Chinner wrote:
> Sorry about this - I've mixed up my threads about ext4 having
> problems with zero-out being re-enabled. I thought this was a
> cross-post of the 218 issue....
>
> However, the same reasoning can be applied to 285 - the file sizes,
> the size of the holes and the size of the data is all completely
> arbitrary. If we make the holes in the files larger, then the
> zero-out problem simply goes away.

Right. That was my observation. We can either make the holes larger,
by changing:

	pwrite(fd, buf, bufsize, bufsize*10);

to

	pwrite(fd, buf, bufsize, bufsize*42);

... and then changing the expected values returned by
SEEK_HOLE/SEEK_DATA. (By the way; this only matters when we are
testing 1k blocks; if we are using a 4k block size in ext4, the test
currently passes.)

Or we could set some ext4-specific tuning parameters into the #218
shell script, if the file system in question was ext4.

I had assumed that folks would prefer making the holes larger, but
Eric seemed to prefer the second choice as a better one.

Hmm.... Another possibility is to define a directory structure where
each test would look for the existence of some file such as
fscust/<fs>/<test>, and so if fscust/ext4/218 exists, it would get
sourced, and this would define potential hook functions that would get
called after the file system is mounted. This way, the file system
specific stuff is kept out of the way of the test script. Would that
make adding fs-specific tuning/setup for tests more palatable?

Regards,

						- Ted
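Whether a given multiplier is "large enough" can be checked against the zero-out condition discussed elsewhere in this thread, `m_lblk - ee_block + m_len < max_zeroout`. The sketch below is an assumption-laden model, not kernel code: the function names are hypothetical, the 32 KB threshold is inferred from the `max_zeroout` values of 32 (1k blocks) and 8 (4k blocks) reported in the thread, and only that single condition is modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: zero-out threshold of 32 KB, consistent with max_zeroout
 * being 32 blocks at 1k and 8 blocks at 4k as reported in the thread. */
#define ZEROOUT_KB 32

/* Threshold expressed in filesystem blocks. */
static unsigned max_zeroout_blocks(unsigned blocksize)
{
    assert(blocksize >= 1024 && blocksize % 1024 == 0);
    return ZEROOUT_KB / (blocksize / 1024);
}

/* True if a one-block write at block `mult` of an unwritten extent that
 * starts at block 0 satisfies (m_lblk - ee_block + m_len < max_zeroout),
 * i.e. the range in front of the write would be zeroed out. */
static bool hole_gets_filled(unsigned mult, unsigned blocksize)
{
    return mult - 0 + 1 < max_zeroout_blocks(blocksize);
}

/* Smallest block multiplier for which that condition no longer holds. */
static unsigned min_safe_multiplier(unsigned blocksize)
{
    unsigned limit = max_zeroout_blocks(blocksize);
    return limit > 0 ? limit - 1 : 0;
}
```

Under these assumptions a multiplier of 10 is filled in at 1k but not at 4k, while 42 clears the threshold at both block sizes. With 1k blocks the written block would then start at 42 * 1024 = 43008 instead of 10240, which is why the expected SEEK_HOLE/SEEK_DATA values have to change along with the `pwrite()` offset.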
* Re: possible dev branch regression - xfstest 285/1k
From: Dave Chinner @ 2013-03-19 2:22 UTC
To: Theodore Ts'o
Cc: Ben Myers, Eric Sandeen, xfs-oss, linux-ext4, Eric Whitney

On Mon, Mar 18, 2013 at 10:00:56PM -0400, Theodore Ts'o wrote:
> On Tue, Mar 19, 2013 at 12:47:18PM +1100, Dave Chinner wrote:
> > Sorry about this - I've mixed up my threads about ext4 having
> > problems with zero-out being re-enabled. I thought this was a
> > cross-post of the 218 issue....
> >
> > However, the same reasoning can be applied to 285 - the file sizes,
> > the size of the holes and the size of the data is all completely
> > arbitrary. If we make the holes in the files larger, then the
> > zero-out problem simply goes away.
>
> Right. That was my observation. We can either make the holes larger,
> by changing:
>
> 	pwrite(fd, buf, bufsize, bufsize*10);
>
> to
>
> 	pwrite(fd, buf, bufsize, bufsize*42);
>
> ... and then changing the expected values returned by
> SEEK_HOLE/SEEK_DATA. (By the way; this only matters when we are
> testing 1k blocks; if we are using a 4k block size in ext4, the test
> currently passes.)
>
> Or we could set some ext4-specific tuning parameters into the #218
> shell script, if the file system in question was ext4.

Heh, you just mixed up 218 and 285 yourself. I crossed the streams,
and now the universe is going to end. ;)

Seriously, though, I'd prefer we don't need to tweak generic tests
for specific filesystems if changing the file layout will solve the
problem....

> I had assumed that folks would prefer making the holes larger, but
> Eric seemed to prefer the second choice as a better one.
>
>
> Hmm.... Another possibility is to define a directory structure where
> each test would look for the existence of some file such as
> fscust/<fs>/<test>, and so if fscust/ext4/218 exists, it would get
> sourced, and this would define potential hook functions that would get
> called after the file system is mounted. This way, the file system
> specific stuff is kept out of the way of the test script. Would that
> make adding fs-specific tuning/setup for tests more palatable?

From an architectural POV, I think that if we need filesystem
specific tuning, it's not a generic test. If we have a common test
that needs different setup and tunings for each filesystem, then I'd
prefer to think of a test "template" that can be used by the
filesystem specific tests. We already have this sort of structure
for some tests (e.g. _test_generic_punch()) where we have factored
out the common parts of several tests so they can be shared.

Hence if we end up with needing to do this, I'd prefer to see
something like:

	tests/template/foo

and the individual fs tests do:

	tests/fs/foo-test

	<setup test>

	_clean_up()
	{
		....
		<undo fs specific tuning>
	}

	<do fs specific tuning>

	. tests/template/foo

	<run test>

That way we can create shared test templates without needing to add
functions to the common/ directory, and so the common/ directory can
slowly be cleaned up to contain only shared infrastructure code....

Indeed, this makes it easy to run the same test with different
tunings and be able to see which tuning broke just by looking at the
test results...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
* Re: possible dev branch regression - xfstest 285/1k
From: Eric Sandeen @ 2013-03-19 2:28 UTC
To: Theodore Ts'o
Cc: Eric Sandeen, Eric Whitney, xfs-oss, Ben Myers, linux-ext4

On 3/18/13 9:00 PM, Theodore Ts'o wrote:
> On Tue, Mar 19, 2013 at 12:47:18PM +1100, Dave Chinner wrote:
>> Sorry about this - I've mixed up my threads about ext4 having
>> problems with zero-out being re-enabled. I thought this was a
>> cross-post of the 218 issue....
>>
>> However, the same reasoning can be applied to 285 - the file sizes,
>> the size of the holes and the size of the data is all completely
>> arbitrary. If we make the holes in the files larger, then the
>> zero-out problem simply goes away.
>
> Right. That was my observation. We can either make the holes larger,
> by changing:
>
> 	pwrite(fd, buf, bufsize, bufsize*10);
>
> to
>
> 	pwrite(fd, buf, bufsize, bufsize*42);
>
> ... and then changing the expected values returned by
> SEEK_HOLE/SEEK_DATA. (By the way; this only matters when we are
> testing 1k blocks; if we are using a 4k block size in ext4, the test
> currently passes.)
>
> Or we could set some ext4-specific tuning parameters into the #218

285! :)

> shell script, if the file system in question was ext4.
>
> I had assumed that folks would prefer making the holes larger, but
> Eric seemed to prefer the second choice as a better one.

Ok, after the discussion I'm convinced too. Stretching out the allocation
to avoid fill-in probably makes sense. But maybe not "42" -
how about something much larger, so that any "reasonable" filesystem
wouldn't even consider zeroing the range in between?

-Eric
* Re: possible dev branch regression - xfstest 285/1k
From: Lukáš Czerner @ 2013-03-19 8:50 UTC
To: Eric Sandeen
Cc: Eric Sandeen, xfs-oss, Theodore Ts'o, Eric Whitney, Ben Myers, linux-ext4

On Mon, 18 Mar 2013, Eric Sandeen wrote:

> Date: Mon, 18 Mar 2013 21:28:22 -0500
> From: Eric Sandeen <sandeen@sandeen.net>
> To: Theodore Ts'o <tytso@mit.edu>
> Cc: Dave Chinner <david@fromorbit.com>, Eric Whitney <enwlinux@gmail.com>,
>     Eric Sandeen <sandeen@redhat.com>, Ben Myers <bpm@sgi.com>,
>     linux-ext4@vger.kernel.org, xfs-oss <xfs@oss.sgi.com>
> Subject: Re: possible dev branch regression - xfstest 285/1k
>
> On 3/18/13 9:00 PM, Theodore Ts'o wrote:
> > On Tue, Mar 19, 2013 at 12:47:18PM +1100, Dave Chinner wrote:
> >> Sorry about this - I've mixed up my threads about ext4 having
> >> problems with zero-out being re-enabled. I thought this was a
> >> cross-post of the 218 issue....
> >>
> >> However, the same reasoning can be applied to 285 - the file sizes,
> >> the size of the holes and the size of the data is all completely
> >> arbitrary. If we make the holes in the files larger, then the
> >> zero-out problem simply goes away.
> >
> > Right. That was my observation. We can either make the holes larger,
> > by changing:
> >
> > 	pwrite(fd, buf, bufsize, bufsize*10);
> >
> > to
> >
> > 	pwrite(fd, buf, bufsize, bufsize*42);
> >
> > ... and then changing the expected values returned by
> > SEEK_HOLE/SEEK_DATA. (By the way; this only matters when we are
> > testing 1k blocks; if we are using a 4k block size in ext4, the test
> > currently passes.)
> >
> > Or we could set some ext4-specific tuning parameters into the #218
>
> 285! :)
>
> > shell script, if the file system in question was ext4.
> >
> > I had assumed that folks would prefer making the holes larger, but
> > Eric seemed to prefer the second choice as a better one.
>
> Ok, after the discussion I'm convinced too. Stretching out the allocation
> to avoid fill-in probably makes sense. But maybe not "42" -
> how about something much larger, so that any "reasonable" filesystem
> wouldn't even consider zeroing the range in between?

I am actually in favour of 42. 42 is "The answer" here :)

-Lukas

>
> -Eric
* Re: possible dev branch regression - xfstest 285/1k
From: Eric Whitney @ 2013-03-17 3:36 UTC
To: Eric Whitney, linux-ext4, tytso

* Zheng Liu <gnehzuil.liu@gmail.com>:
> On Fri, Mar 15, 2013 at 06:28:18PM -0400, Eric Whitney wrote:
> > I'm seeing Xfstest 285 consistently fail for the 1k test case using the
> > latest dev branch while running on both x86 and ARM. Subtest 08 is
> > the problem. From the test output:
> >
> > 08. Test file with unwritten extents, only have unwritten pages
> > 08.01 SEEK_HOLE expected 0 or 4194304, got 11264. FAIL
> > 08.02 SEEK_HOLE expected 1 or 4194304, got 11264. FAIL
> > 08.03 SEEK_DATA expected 10240 or 10240, got 0. FAIL
> > 08.04 SEEK_DATA expected 10240 or 10240, got 1. FAIL
> >
> > From previous discussions, we expect 285 to fail in the ext3 (nodelalloc,
> > no flex_bg, and no extents) test case, but in subtest 07. It still does
> > that.
> >
> > In the dev branch, reverting 4f42f80a8f - "ext4: use s_extent_max_zeroout_kb
> > value as number of kb" - results in success for 285 in the 1k test case.
>
> Hi Eric,
>
> I see what's going on. First of all it isn't a bug. :-) Please let me
> describe why it happens.
>
> In this commit (4f42f80a8f), it tries to fix a bug that we never zero
> out an unwritten extent. So after applying it, when an unwritten extent
> is converted, it could be zeroed out. In xfstests #285 subtest 08 it
> preallocates an unwritten extent which is 4MB. Then it writes some data
> at offset 10 * blocksize, with a length of one blocksize, and calls
> sync_file_range(2) to flush it. So the call trace looks like:
>
> ext4_fallocate()
>   ->ext4_map_blocks()
>     [one unwritten extent is allocated]
> ext4_file_write()
> ext4_da_writepages()
>   ->ext4_map_blocks() with EXT4_GET_BLOCKS_CREATE flag
>     ->ext4_ext_handle_uninitialized_extents()
>       ->ext4_ext_convert_to_initialized()
>
> In ext4_ext_convert_to_initialized() it tries to zero out the unwritten
> extent if a condition is matched. Let's see what happens.
>
> case a) 1k block size
> max_zeroout: 32
> ee_len: 4096
> allocated: 4086
> m_len: 1
>
> In this case, the following condition is matched.
>
> fs/ext4/extents.c:3310
>
> else if (map->m_lblk - ee_block + map->m_len < max_zeroout)
>          10 - 0 + 1 < 32
>
> So unwritten extent [0,11] will be converted to written. That is why
> 11264 (11 * 1k) is returned when we seek a hole from offset 0 and 1,
> and 0 and 1 are returned when we seek data from offset 0 and 1.
>
> case b) 4k block size
> max_zeroout: 8
> ee_len: 1024
> allocated: 1014
> m_len: 1
>
> In this case, the above condition won't be matched.
>
> else if (map->m_lblk - ee_block + map->m_len < max_zeroout)
>          10 - 0 + 1 < 8
>
> So only one unwritten extent [10, 1] is converted, and the test can
> pass.
>

Hi Zheng:

Thanks very much for taking the time to look at this and for your clear
explanation - much appreciated. I'm happy to hear there's no reason to be
concerned about a regression, and that 4f42f80a8f simply exposed another
problem in xfstest 285 when applied to ext4.

Thanks,
Eric
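The two cases in the quoted analysis can be replayed with a small model of the condition at fs/ext4/extents.c:3310. This is a sketch only: `struct convert_case` and both function names are hypothetical, the field names simply mirror the quoted variables, and just the one quoted branch is modelled rather than the whole of ext4_ext_convert_to_initialized().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Inputs named after the variables in the quoted analysis; all values
 * are in filesystem blocks. Hypothetical struct for illustration. */
struct convert_case {
    uint32_t ee_block;     /* first logical block of the unwritten extent */
    uint32_t m_lblk;       /* first logical block being written */
    uint32_t m_len;        /* number of blocks being written */
    uint32_t max_zeroout;  /* zero-out threshold in blocks */
};

/* Models only the quoted branch: true means the range from the start of
 * the extent up to the end of the write is converted to written. */
static bool zeroes_leading_range(const struct convert_case *c)
{
    assert(c->m_lblk >= c->ee_block);
    return c->m_lblk - c->ee_block + c->m_len < c->max_zeroout;
}

/* Byte offset just past the written block, which is where SEEK_HOLE
 * lands from offset 0 when the leading range has been converted. */
static uint64_t written_end_offset(const struct convert_case *c, uint32_t blocksize)
{
    return (uint64_t)(c->m_lblk + c->m_len) * blocksize;
}
```

Plugging in case a) (`max_zeroout` 32, write at block 10) satisfies the condition and gives an end offset of 11 * 1024 = 11264, the value the failing SEEK_HOLE checks reported; case b) (`max_zeroout` 8) does not satisfy it, so only the written block is converted.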