From: Petr Vorel <pvorel@suse.cz>
To: Eric Sandeen <sandeen@sandeen.net>
Cc: Dave Chinner <david@fromorbit.com>,
linux-block@vger.kernel.org, Jens Axboe <axboe@kernel.dk>,
Jan Kara <jack@suse.cz>, Hannes Reinecke <hare@suse.de>,
linux-xfs@vger.kernel.org, ltp@lists.linux.it
Subject: Re: LTP test df01.sh detected different size of loop device in v5.19
Date: Thu, 18 Aug 2022 17:25:33 +0200 [thread overview]
Message-ID: <Yv5Z7eu5RGnutMly@pevik> (raw)
In-Reply-To: <8d33a7a0-7a7c-47a1-ed84-83fd25089897@sandeen.net>
Hi Eric, all,
> On 8/15/22 4:31 AM, Petr Vorel wrote:
> > Hi Dave,
> >> On Fri, Aug 12, 2022 at 03:20:37PM +0200, Petr Vorel wrote:
> >>> Hi all,
> >>> LTP test df01.sh found different size of loop device in v5.19.
> >>> Test uses loop device formatted on various file systems, only XFS fails.
> >>> It randomly fails during verifying that loop size usage changes:
> >>> grep ${TST_DEVICE} output | grep -q "${total}.*${used}" [1]
> >>> How to reproduce:
> >>> # PATH="/opt/ltp/testcases/bin:$PATH" df01.sh -f xfs # it needs several tries to hit
> >>> df saved output:
> >>> Filesystem 1024-blocks Used Available Capacity Mounted on
> >>> ...
> >>> /dev/loop0 256672 16208 240464 7% /tmp/LTP_df01.1kRwoUCCR7/mntpoint
> >>> df output:
> >>> Filesystem 1024-blocks Used Available Capacity Mounted on
> >>> ...
> >>> tmpfs 201780 0 201780 0% /run/user/0
> >>> /dev/loop0 256672 15160 241512 6% /tmp/LTP_df01.1kRwoUCCR7/mntpoint
> >>> => different size
> >>> df01 4 TFAIL: 'df -k -P' failed, not expected.
> >> Yup, most likely because we changed something in XFS related to
> >> internal block reservation spaces. That is, the test is making
> >> fundamentally flawed assumptions about filesystem used space
> >> accounting.
> >> It is wrong to assuming that the available capacity of a given empty
> >> filesystem will never change. Assumptions like this have been
> >> invalid for decades because the available space can change based on
> >> the underlying configuration or the filesystem. e.g. different
> >> versions of mkfs.xfs set different default parameters and so simply
> >> changing the version of xfsprogs you use between the two comparision
> >> tests will make it fail....
> >> And, well, XFS also has XFS_IOC_{GS}ET_RESBLKS ioctls that allow
> >> userspace to change the amount of reserved blocks. They were
> >> introduced in 1997, and since then we've changed the default
> >> reservation the filesystem takes at least a dozen times.
> > Thanks a lot for valuable info.
> >>>> It might be a false positive / bug in the test, but it's at least a changed behavior.
> >> Yup, any test that assumes "available space" does not change from
> >> kernel version to kernel version is flawed. There is no guarantee
> >> that this ever stays the same, nor that it needs to stay the same.
> > I'm sorry I was not clear. Test [1] does not measure "available space" between
> > kernel releases. It just run df command with parameters, saves it's output
> > and compares "1024-blocks" and "Used" columns of df output with stat output:
> Annotating what these do...
> > local total=$(stat -f mntpoint --printf=%b) # number of blocks allocated
> > local free=$(stat -f mntpoint --printf=%f) # free blocks in filesystem
> > local used=$((total-free)) # (number of blocks free)
> > local bsize=$(stat -f mntpoint --printf=%s) # block size ("for faster transfers")
> > total=$((($total * $bsize + 512)/ 1024)) # number of 1k blocks allocated?
> > used=$((($used * $bsize + 512) / 1024)) # number of 1k blocks used?
> > And comparison with "$used" is what sometimes fails.
> > BTW this happens on both distros when loop device is on tmpfs. I'm trying to
> > trigger it on ext4 and btrfs, not successful so far. Looks like it's tmpfs
> > related.
> > If that's really expected, we might remove this check for used for XFS
> > (not sure if check only for total makes sense).
> It's kind of hard to follow this test, but it seems to be trying to
> ensure that df output is consistent with du (statfs) numbers, before
> and after creating and removing a 1MB file. I guess it's literally
> testing df itself, i.e. that it actually presents the numbers it obtained
> from statfs.
> AFAICT the difference in the failure is 1024 1K blocks, which is the size
> the file that's been created and removed during the test.
> My best guess is that this is xfs inode deferred inode inactivation hanging
> onto the space a little longer than the test expects.
> This is probably because the new-ish xfs inode inactivation no longer blocks
> on inode garbage collection during statfs().
> IOWS, I think the test expects that free space is reflected in statfs numbers
> immediately after a file is removed, and that's no longer the case here. They
> change in between the df check and the statfs check.
> (The test isn't just checking that the values are correct, it is checking that
> the values are /immediately/ correct.)
> Putting a "sleep 1" after the "rm -f" in the test seems to fix it; IIRC
> the max time to wait for inodegc is 1s. This does slow the test down a bit.
Sure, it looks like we can sleep just 50ms on my hw (although better might be to
poll for the result [1]), I just wanted to make sure there is no bug/regression
before hiding it with sleep.
Thanks for your input!
Kind regards,
Petr
[1] https://people.kernel.org/metan/why-sleep-is-almost-never-acceptable-in-tests
> -Eric
+++ testcases/commands/df/df01.sh
@@ -63,6 +63,10 @@ df_test()
tst_res TFAIL "'$cmd' failed."
fi
+ if [ "$DF_FS_TYPE" = xfs ]; then
+ tst_sleep 50ms
+ fi
+
ROD_SILENT rm -rf mntpoint/testimg
# flush file system buffers, then we can get the actual sizes.
WARNING: multiple messages have this Message-ID (diff)
From: Petr Vorel <pvorel@suse.cz>
To: Eric Sandeen <sandeen@sandeen.net>
Cc: Jens Axboe <axboe@kernel.dk>,
linux-xfs@vger.kernel.org, Jan Kara <jack@suse.cz>,
Dave Chinner <david@fromorbit.com>,
linux-block@vger.kernel.org, Hannes Reinecke <hare@suse.de>,
ltp@lists.linux.it
Subject: Re: [LTP] LTP test df01.sh detected different size of loop device in v5.19
Date: Thu, 18 Aug 2022 17:25:33 +0200 [thread overview]
Message-ID: <Yv5Z7eu5RGnutMly@pevik> (raw)
In-Reply-To: <8d33a7a0-7a7c-47a1-ed84-83fd25089897@sandeen.net>
Hi Eric, all,
> On 8/15/22 4:31 AM, Petr Vorel wrote:
> > Hi Dave,
> >> On Fri, Aug 12, 2022 at 03:20:37PM +0200, Petr Vorel wrote:
> >>> Hi all,
> >>> LTP test df01.sh found different size of loop device in v5.19.
> >>> Test uses loop device formatted on various file systems, only XFS fails.
> >>> It randomly fails during verifying that loop size usage changes:
> >>> grep ${TST_DEVICE} output | grep -q "${total}.*${used}" [1]
> >>> How to reproduce:
> >>> # PATH="/opt/ltp/testcases/bin:$PATH" df01.sh -f xfs # it needs several tries to hit
> >>> df saved output:
> >>> Filesystem 1024-blocks Used Available Capacity Mounted on
> >>> ...
> >>> /dev/loop0 256672 16208 240464 7% /tmp/LTP_df01.1kRwoUCCR7/mntpoint
> >>> df output:
> >>> Filesystem 1024-blocks Used Available Capacity Mounted on
> >>> ...
> >>> tmpfs 201780 0 201780 0% /run/user/0
> >>> /dev/loop0 256672 15160 241512 6% /tmp/LTP_df01.1kRwoUCCR7/mntpoint
> >>> => different size
> >>> df01 4 TFAIL: 'df -k -P' failed, not expected.
> >> Yup, most likely because we changed something in XFS related to
> >> internal block reservation spaces. That is, the test is making
> >> fundamentally flawed assumptions about filesystem used space
> >> accounting.
> >> It is wrong to assuming that the available capacity of a given empty
> >> filesystem will never change. Assumptions like this have been
> >> invalid for decades because the available space can change based on
> >> the underlying configuration or the filesystem. e.g. different
> >> versions of mkfs.xfs set different default parameters and so simply
> >> changing the version of xfsprogs you use between the two comparision
> >> tests will make it fail....
> >> And, well, XFS also has XFS_IOC_{GS}ET_RESBLKS ioctls that allow
> >> userspace to change the amount of reserved blocks. They were
> >> introduced in 1997, and since then we've changed the default
> >> reservation the filesystem takes at least a dozen times.
> > Thanks a lot for valuable info.
> >>>> It might be a false positive / bug in the test, but it's at least a changed behavior.
> >> Yup, any test that assumes "available space" does not change from
> >> kernel version to kernel version is flawed. There is no guarantee
> >> that this ever stays the same, nor that it needs to stay the same.
> > I'm sorry I was not clear. Test [1] does not measure "available space" between
> > kernel releases. It just run df command with parameters, saves it's output
> > and compares "1024-blocks" and "Used" columns of df output with stat output:
> Annotating what these do...
> > local total=$(stat -f mntpoint --printf=%b) # number of blocks allocated
> > local free=$(stat -f mntpoint --printf=%f) # free blocks in filesystem
> > local used=$((total-free)) # (number of blocks free)
> > local bsize=$(stat -f mntpoint --printf=%s) # block size ("for faster transfers")
> > total=$((($total * $bsize + 512)/ 1024)) # number of 1k blocks allocated?
> > used=$((($used * $bsize + 512) / 1024)) # number of 1k blocks used?
> > And comparison with "$used" is what sometimes fails.
> > BTW this happens on both distros when loop device is on tmpfs. I'm trying to
> > trigger it on ext4 and btrfs, not successful so far. Looks like it's tmpfs
> > related.
> > If that's really expected, we might remove this check for used for XFS
> > (not sure if check only for total makes sense).
> It's kind of hard to follow this test, but it seems to be trying to
> ensure that df output is consistent with du (statfs) numbers, before
> and after creating and removing a 1MB file. I guess it's literally
> testing df itself, i.e. that it actually presents the numbers it obtained
> from statfs.
> AFAICT the difference in the failure is 1024 1K blocks, which is the size
> the file that's been created and removed during the test.
> My best guess is that this is xfs inode deferred inode inactivation hanging
> onto the space a little longer than the test expects.
> This is probably because the new-ish xfs inode inactivation no longer blocks
> on inode garbage collection during statfs().
> IOWS, I think the test expects that free space is reflected in statfs numbers
> immediately after a file is removed, and that's no longer the case here. They
> change in between the df check and the statfs check.
> (The test isn't just checking that the values are correct, it is checking that
> the values are /immediately/ correct.)
> Putting a "sleep 1" after the "rm -f" in the test seems to fix it; IIRC
> the max time to wait for inodegc is 1s. This does slow the test down a bit.
Sure, it looks like we can sleep just 50ms on my hw (although better might be to
poll for the result [1]), I just wanted to make sure there is no bug/regression
before hiding it with sleep.
Thanks for your input!
Kind regards,
Petr
[1] https://people.kernel.org/metan/why-sleep-is-almost-never-acceptable-in-tests
> -Eric
+++ testcases/commands/df/df01.sh
@@ -63,6 +63,10 @@ df_test()
tst_res TFAIL "'$cmd' failed."
fi
+ if [ "$DF_FS_TYPE" = xfs ]; then
+ tst_sleep 50ms
+ fi
+
ROD_SILENT rm -rf mntpoint/testimg
# flush file system buffers, then we can get the actual sizes.
--
Mailing list info: https://lists.linux.it/listinfo/ltp
next prev parent reply other threads:[~2022-08-18 15:25 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-08-12 13:20 LTP test df01.sh detected different size of loop device in v5.19 Petr Vorel
2022-08-12 13:24 ` Petr Vorel
2022-08-12 14:00 ` Petr Vorel
2022-08-12 14:00 ` [LTP] " Petr Vorel
2022-08-14 22:44 ` Dave Chinner
2022-08-14 22:44 ` [LTP] " Dave Chinner
2022-08-15 9:31 ` Petr Vorel
2022-08-15 9:31 ` [LTP] " Petr Vorel
2022-08-15 20:09 ` Eric Sandeen
2022-08-15 20:09 ` [LTP] " Eric Sandeen
2022-08-18 15:25 ` Petr Vorel [this message]
2022-08-18 15:25 ` Petr Vorel
2022-08-18 16:05 ` Eric Sandeen
2022-08-18 16:05 ` [LTP] " Eric Sandeen
2022-08-18 16:27 ` Darrick J. Wong
2022-08-18 16:27 ` [LTP] " Darrick J. Wong
2022-08-18 17:01 ` Petr Vorel
2022-08-18 17:01 ` [LTP] " Petr Vorel
2022-08-18 21:19 ` Eric Sandeen
2022-08-18 21:19 ` [LTP] " Eric Sandeen
2022-08-19 16:00 ` Petr Vorel
2022-08-19 16:00 ` [LTP] " Petr Vorel
2022-08-19 16:06 ` Bird, Tim
2022-08-19 16:06 ` Bird, Tim
2022-08-19 19:30 ` Petr Vorel
2022-08-19 19:30 ` Petr Vorel
2022-08-18 21:02 ` Dave Chinner
2022-08-18 21:02 ` [LTP] " Dave Chinner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Yv5Z7eu5RGnutMly@pevik \
--to=pvorel@suse.cz \
--cc=axboe@kernel.dk \
--cc=david@fromorbit.com \
--cc=hare@suse.de \
--cc=jack@suse.cz \
--cc=linux-block@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
--cc=ltp@lists.linux.it \
--cc=sandeen@sandeen.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.