public inbox for fstests@vger.kernel.org
 help / color / mirror / Atom feed
* Excessive media wearout caused by generic/129
@ 2023-12-17 15:43 Alexander E. Patrakov
  2023-12-17 17:24 ` [PATCH] generic/129: add a safeguard against media wearout Alexander Patrakov
  2023-12-17 18:45 ` Excessive media wearout caused by generic/129 Alexander E. Patrakov
  0 siblings, 2 replies; 8+ messages in thread
From: Alexander E. Patrakov @ 2023-12-17 15:43 UTC (permalink / raw)
  To: fstests

Hello.

While running xfstests on an SMB share exposed by a Windows Server
2022 VM running on my laptop, I found an unpleasant surprise: a single
run subtracted 1.5% of the write resource of my SSD, which is quite a
lot. Could you please modify the test to avoid this booby trap? Or
actually, let me do it myself...

The root cause is that the test runs this command:

src/looptest -i 10000 -t -r -w -s -b 102400 /mnt/test/looptest/looptest2.tst

strace shows this pattern:

lseek(3, 0, SEEK_SET)                   = 0
ftruncate(3, 0)                         = 0
lseek(3, 1013145600, SEEK_SET)          = 1013145600
write(3, "\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\32\33\34\35\36\37"...,
102400) = 102400
lseek(3, 1013145600, SEEK_SET)          = 1013145600
read(3, "\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\32\33\34\35\36\37"...,
102400) = 102400
lseek(3, 0, SEEK_SET)                   = 0
ftruncate(3, 0)                         = 0
lseek(3, 1013248000, SEEK_SET)          = 1013248000
write(3, "\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\32\33\34\35\36\37"...,
102400) = 102400
lseek(3, 1013248000, SEEK_SET)          = 1013248000
read(3, "\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\32\33\34\35\36\37"...,
102400) = 102400

This is completely OK for filesystems that support sparse files,
however, such 1-gigabyte seeks are emulated by Windows as 1-gigabyte
writes because the NTFS filesystem does not actually support sparse
files. The result is the "looptest2" part of the test writes more than
4 TB of data.

I think that the test should be preceded by a small check of whether
files intended to be sparse are actually sparse. If not, either fail
or skip the test.

I might be able to create a patch in the next few hours...

-- 
Alexander E. Patrakov

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH] generic/129: add a safeguard against media wearout
  2023-12-17 15:43 Excessive media wearout caused by generic/129 Alexander E. Patrakov
@ 2023-12-17 17:24 ` Alexander Patrakov
  2023-12-17 18:45 ` Excessive media wearout caused by generic/129 Alexander E. Patrakov
  1 sibling, 0 replies; 8+ messages in thread
From: Alexander Patrakov @ 2023-12-17 17:24 UTC (permalink / raw)
  To: fstests; +Cc: Alexander Patrakov

_require_sparse_files is implemented as a list of filesystems known not to
support sparse files, and so misses some cases.

If sparse files do not really work (as it is the case with CIFS and a
Windows Server 2022), this test writes to the disk all the zeros that
would normally be free due to sparse files. This amounts to many
terabytes and presents a significant media wearout concern.

Mitigate this by doing a small-scale test first and checking if the
resulting file ends up being not sufficiently sparse.

Signed-off-by: Alexander Patrakov <patrakov@gmail.com>
---
 src/looptest.c    | 9 +++++++--
 tests/generic/129 | 8 ++++++++
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/src/looptest.c b/src/looptest.c
index 7194a97d..2e620a60 100644
--- a/src/looptest.c
+++ b/src/looptest.c
@@ -56,6 +56,7 @@ enum {
     FLAG_FLUSH      = 64,
     FLAG_DELETE     = 128,
     FLAG_DIRECT     = 256,
+    FLAG_LEAVE      = 512,
 };
 
 void
@@ -74,6 +75,7 @@ usage(char *argv0)
         "              -s           = sequential\n"
         "              -f           = flush\n"
         "              -D           = direct-IO\n"
+        "              -L           = skip the final truncation\n"
 	"              -h           = usage\n",
             argv0);
 }
@@ -94,7 +96,7 @@ main(int argc, char *argv[])
     char                *buf            = NULL;
     int64_t 		seek_to 	= 0;
         
-    while ((c = getopt(argc, argv, "i:orwb:svthfFDd?")) != EOF) {
+    while ((c = getopt(argc, argv, "i:orwb:svthfFDLd?")) != EOF) {
 	switch (c) {
 	    case 'i':
 		count = atoi(optarg);
@@ -126,6 +128,9 @@ main(int argc, char *argv[])
             case 'D':
                 flags |= FLAG_DIRECT;
                 break;
+            case 'L':
+                flags |= FLAG_LEAVE;
+                break;
             case 'd':
                 flags |= FLAG_DELETE;
                 break;
@@ -230,7 +235,7 @@ main(int argc, char *argv[])
             }
         }
         
-        if (flags & FLAG_TRUNCATE) {
+        if ((flags & FLAG_TRUNCATE) && !((flags & FLAG_LEAVE) && (i == count - 1))) {
             if (flags & FLAG_VERBOSE)
                 printf("seek 0\n");
 
diff --git a/tests/generic/129 b/tests/generic/129
index 3d3a42a2..d1149b65 100755
--- a/tests/generic/129
+++ b/tests/generic/129
@@ -29,6 +29,14 @@ _scratch_mount "-o nosuid"
 
 mkdir $SCRATCH_MNT/looptest
 
+# looptest2, if the missing sparse file support is not detected, writes more
+# than 4 TB to the device. There was a complaint about media wearout, so do a
+# small-scale test first.
+$here/src/looptest -i 10 -L -t -r -w -s -b 102400 $SCRATCH_MNT/looptest/looptest0.tst
+resulting_file_size_kb=$( du -sk $SCRATCH_MNT/looptest/looptest0.tst | cut -f 1 )
+rm -f $SCRATCH_MNT/looptest/looptest0.tst
+[ $resulting_file_size_kb -gt 256 ] && _notrun "Test skipped due to media wearout concerns - sparse files do not work"
+
 $here/src/looptest -i 100000 -r -w -b 8192 -s $SCRATCH_MNT/looptest/looptest1.tst
 $here/src/looptest -i 10000 -t -r -w -s -b 102400 $SCRATCH_MNT/looptest/looptest2.tst
 $here/src/looptest -i 50000 -r -w -b 256 -s $SCRATCH_MNT/looptest/looptest3.tst
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: Excessive media wearout caused by generic/129
  2023-12-17 15:43 Excessive media wearout caused by generic/129 Alexander E. Patrakov
  2023-12-17 17:24 ` [PATCH] generic/129: add a safeguard against media wearout Alexander Patrakov
@ 2023-12-17 18:45 ` Alexander E. Patrakov
  2023-12-17 21:00   ` [PATCH v2] _require_sparse_files: add a safeguard against media wearout Alexander Patrakov
  1 sibling, 1 reply; 8+ messages in thread
From: Alexander E. Patrakov @ 2023-12-17 18:45 UTC (permalink / raw)
  To: fstests

On Sun, Dec 17, 2023 at 11:43 PM Alexander E. Patrakov
<patrakov@gmail.com> wrote:
>
> Hello.
>
> While running xfstests on an SMB share exposed by a Windows Server
> 2022 VM running on my laptop, I found an unpleasant surprise: a single
> run subtracted 1.5% of the write resource of my SSD, which is quite a
> lot. Could you please modify the test to avoid this booby trap? Or
> actually, let me do it myself...

The same concern, with the same root cause (truncation at a
hundred-megabyte offset being transformed into a hundred-megabyte
write, and then repeated 10000 times), exists for generic/014. At this
point, I have a question: should this be mitigated at the individual
test level, or should the _require_sparse_files check be modified?

-- 
Alexander E. Patrakov

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2] _require_sparse_files: add a safeguard against media wearout
  2023-12-17 18:45 ` Excessive media wearout caused by generic/129 Alexander E. Patrakov
@ 2023-12-17 21:00   ` Alexander Patrakov
  2023-12-18 17:45     ` Darrick J. Wong
  0 siblings, 1 reply; 8+ messages in thread
From: Alexander Patrakov @ 2023-12-17 21:00 UTC (permalink / raw)
  To: fstests; +Cc: Alexander Patrakov

_require_sparse_files is implemented as a list of filesystems known not to
support sparse files, and therefore it misses some cases.

However, if sparse files do not work as expected during a test, the risk
is that the test will write out to the disk all the zeros that would
normally be unwritten. This amounts to at least 4 TB for the generic/129
test, and therefore there is a significant media wearout concern here.

Adding more filesystems to the list of exclusions would not scale and
would not work anyway because CIFS backed by SAMBA is safe, while CIFS
backed by Windows Server 2022 is not.

In other words, Windows reserves the right to sometimes (!) ignore our
intent to create a sparse file.

More discussion: https://lore.kernel.org/fstests/20231206184759.GA3964019@frogsfrogsfrogs/T/#t

Mitigate this risk by doing a small-scale test that reliably triggers
Windows misbehavior and checking if the resulting file ends up being
not sufficiently sparse.

Signed-off-by: Alexander Patrakov <patrakov@gmail.com>
---
 common/rc | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/common/rc b/common/rc
index cc92fe06..5d27602a 100644
--- a/common/rc
+++ b/common/rc
@@ -2871,6 +2871,12 @@ _require_fs_space()
 # Check if the filesystem supports sparse files.
 #
 # Unfortunately there is no better way to do this than a manual black list.
+# However, letting tests expand all the holes and write terabytes of zeros to
+# the media is also not acceptable due to wearout concerns.
+#
+# Note: even though CIFS supports sparse files, this test will mark it as
+# failing the requirement if we can coax the server into allocating and writing
+# the ranges where holes are expected. This happens with Windows servers.
 #
 _require_sparse_files()
 {
@@ -2881,6 +2887,23 @@ _require_sparse_files()
     *)
         ;;
     esac
+
+    local testfile="$TEST_DIR/$$.sparsefiletest"
+    rm -f "$testfile"
+
+    # A small-scale version of looptest - known to trigger Microsoft SMB server
+    # into the decision to write zeros to the disk. Also creates a non-sparse file
+    # on vfat.
+    # See also the discussion at https://lore.kernel.org/fstests/20231206184759.GA3964019@frogsfrogsfrogs/T/#t
+    $XFS_IO_PROG -f \
+	-c 'truncate 0' -c 'pwrite -b 102400 -S 0x61 102400 102400' \
+	-c 'truncate 0' -c 'pwrite -b 102400 -S 0x61 204800 102400' \
+	-c 'truncate 0' -c 'pwrite -b 102400 -S 0x61 307200 102400' \
+	-c 'truncate 0' -c 'pwrite -b 102400 -S 0x61 409600 102400' "$testfile" >/dev/null
+    resulting_file_size_kb=$( du -sk "$testfile" | cut -f 1 )
+    rm -f "$testfile"
+    [ $resulting_file_size_kb -ge 300 ] && \
+	_notrun "Sparse files do not work as expected, skipping test due to media wearout concerns"
 }
 
 _require_debugfs()
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] _require_sparse_files: add a safeguard against media wearout
  2023-12-17 21:00   ` [PATCH v2] _require_sparse_files: add a safeguard against media wearout Alexander Patrakov
@ 2023-12-18 17:45     ` Darrick J. Wong
  2023-12-18 18:27       ` Alexander E. Patrakov
  0 siblings, 1 reply; 8+ messages in thread
From: Darrick J. Wong @ 2023-12-18 17:45 UTC (permalink / raw)
  To: Alexander Patrakov; +Cc: fstests

On Mon, Dec 18, 2023 at 05:00:53AM +0800, Alexander Patrakov wrote:
> _require_sparse_files is implemented as a list of filesystems known not to
> support sparse files, and therefore it misses some cases.
> 
> However, if sparse files do not work as expected during a test, the risk
> is that the test will write out to the disk all the zeros that would
> normally be unwritten. This amounts to at least 4 TB for the generic/129
> test, and therefore there is a significant media wearout concern here.
> 
> Adding more filesystems to the list of exclusions would not scale and
> would not work anyway because CIFS backed by SAMBA is safe, while CIFS
> backed by Windows Server 2022 is not.
> 
> In other words, Windows reserves the right to sometimes (!) ignore our
> intent to create a sparse file.
> 
> More discussion: https://lore.kernel.org/fstests/20231206184759.GA3964019@frogsfrogsfrogs/T/#t
> 
> Mitigate this risk by doing a small-scale test that reliably triggers
> Windows misbehavior and checking if the resulting file ends up being
> not sufficiently sparse.
> 
> Signed-off-by: Alexander Patrakov <patrakov@gmail.com>
> ---
>  common/rc | 23 +++++++++++++++++++++++
>  1 file changed, 23 insertions(+)
> 
> diff --git a/common/rc b/common/rc
> index cc92fe06..5d27602a 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -2871,6 +2871,12 @@ _require_fs_space()
>  # Check if the filesystem supports sparse files.
>  #
>  # Unfortunately there is no better way to do this than a manual black list.

FWIW if the truncate/write/nblocks check strategy below can be made more
general, then the comment above (and the FSTYP-filtering) become
unnecessary, right?

> +# However, letting tests expand all the holes and write terabytes of zeros to
> +# the media is also not acceptable due to wearout concerns.
> +#
> +# Note: even though CIFS supports sparse files, this test will mark it as
> +# failing the requirement if we can coax the server into allocating and writing
> +# the ranges where holes are expected. This happens with Windows servers.
>  #
>  _require_sparse_files()
>  {
> @@ -2881,6 +2887,23 @@ _require_sparse_files()
>      *)
>          ;;
>      esac
> +
> +    local testfile="$TEST_DIR/$$.sparsefiletest"
> +    rm -f "$testfile"
> +
> +    # A small-scale version of looptest - known to trigger Microsoft SMB server
> +    # into the decision to write zeros to the disk. Also creates a non-sparse file
> +    # on vfat.
> +    # See also the discussion at https://lore.kernel.org/fstests/20231206184759.GA3964019@frogsfrogsfrogs/T/#t
> +    $XFS_IO_PROG -f \
> +	-c 'truncate 0' -c 'pwrite -b 102400 -S 0x61 102400 102400' \
> +	-c 'truncate 0' -c 'pwrite -b 102400 -S 0x61 204800 102400' \
> +	-c 'truncate 0' -c 'pwrite -b 102400 -S 0x61 307200 102400' \
> +	-c 'truncate 0' -c 'pwrite -b 102400 -S 0x61 409600 102400' "$testfile" >/dev/null
> +    resulting_file_size_kb=$( du -sk "$testfile" | cut -f 1 )
> +    rm -f "$testfile"
> +    [ $resulting_file_size_kb -ge 300 ] && \

I might be missing something here because I've long forgotten how CIFS
and Windows work, but -- why is it necessary to truncate and write past
eof four times?  Won't the truncates free all the blocks associated with
the file?

Also, why isn't it sufficient to check that the du output doesn't exceed
~110K (adding 10% overhead)?

> +	_notrun "Sparse files do not work as expected, skipping test due to media wearout concerns"

I think the notrun message should be restricted to stating that sparse
files do not work as expected -- the other callers aren't necessarily
worried about wearout.

--D

>  }
>  
>  _require_debugfs()
> -- 
> 2.43.0
> 
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] _require_sparse_files: add a safeguard against media wearout
  2023-12-18 17:45     ` Darrick J. Wong
@ 2023-12-18 18:27       ` Alexander E. Patrakov
  2023-12-18 20:57         ` [PATCH v3] _require_sparse_files: rewrite as a direct test instead of a black list Alexander Patrakov
  0 siblings, 1 reply; 8+ messages in thread
From: Alexander E. Patrakov @ 2023-12-18 18:27 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: fstests

Hello Darrick, and thanks for the feedback.

On Tue, Dec 19, 2023 at 1:45 AM Darrick J. Wong <djwong@kernel.org> wrote:
>
> On Mon, Dec 18, 2023 at 05:00:53AM +0800, Alexander Patrakov wrote:
> > _require_sparse_files is implemented as a list of filesystems known not to
> > support sparse files, and therefore it misses some cases.
> >
> > However, if sparse files do not work as expected during a test, the risk
> > is that the test will write out to the disk all the zeros that would
> > normally be unwritten. This amounts to at least 4 TB for the generic/129
> > test, and therefore there is a significant media wearout concern here.
> >
> > Adding more filesystems to the list of exclusions would not scale and
> > would not work anyway because CIFS backed by SAMBA is safe, while CIFS
> > backed by Windows Server 2022 is not.
> >
> > In other words, Windows reserves the right to sometimes (!) ignore our
> > intent to create a sparse file.
> >
> > More discussion: https://lore.kernel.org/fstests/20231206184759.GA3964019@frogsfrogsfrogs/T/#t
> >
> > Mitigate this risk by doing a small-scale test that reliably triggers
> > Windows misbehavior and checking if the resulting file ends up being
> > not sufficiently sparse.
> >
> > Signed-off-by: Alexander Patrakov <patrakov@gmail.com>
> > ---
> >  common/rc | 23 +++++++++++++++++++++++
> >  1 file changed, 23 insertions(+)
> >
> > diff --git a/common/rc b/common/rc
> > index cc92fe06..5d27602a 100644
> > --- a/common/rc
> > +++ b/common/rc
> > @@ -2871,6 +2871,12 @@ _require_fs_space()
> >  # Check if the filesystem supports sparse files.
> >  #
> >  # Unfortunately there is no better way to do this than a manual black list.
>
> FWIW if the truncate/write/nblocks check strategy below can be made more
> general, then the comment above (and the FSTYP-filtering) become
> unnecessary, right?

I think so. I will retest with exfat and hfsplus, and, if their
failures are caught by the code that I added, I will delete the black
list.

>
> > +# However, letting tests expand all the holes and write terabytes of zeros to
> > +# the media is also not acceptable due to wearout concerns.
> > +#
> > +# Note: even though CIFS supports sparse files, this test will mark it as
> > +# failing the requirement if we can coax the server into allocating and writing
> > +# the ranges where holes are expected. This happens with Windows servers.
> >  #
> >  _require_sparse_files()
> >  {
> > @@ -2881,6 +2887,23 @@ _require_sparse_files()
> >      *)
> >          ;;
> >      esac
> > +
> > +    local testfile="$TEST_DIR/$$.sparsefiletest"
> > +    rm -f "$testfile"
> > +
> > +    # A small-scale version of looptest - known to trigger Microsoft SMB server
> > +    # into the decision to write zeros to the disk. Also creates a non-sparse file
> > +    # on vfat.
> > +    # See also the discussion at https://lore.kernel.org/fstests/20231206184759.GA3964019@frogsfrogsfrogs/T/#t
> > +    $XFS_IO_PROG -f \
> > +     -c 'truncate 0' -c 'pwrite -b 102400 -S 0x61 102400 102400' \
> > +     -c 'truncate 0' -c 'pwrite -b 102400 -S 0x61 204800 102400' \
> > +     -c 'truncate 0' -c 'pwrite -b 102400 -S 0x61 307200 102400' \
> > +     -c 'truncate 0' -c 'pwrite -b 102400 -S 0x61 409600 102400' "$testfile" >/dev/null
> > +    resulting_file_size_kb=$( du -sk "$testfile" | cut -f 1 )
> > +    rm -f "$testfile"
> > +    [ $resulting_file_size_kb -ge 300 ] && \
>
> I might be missing something here because I've long forgotten how CIFS
> and Windows work, but -- why is it necessary to truncate and write past
> eof four times?  Won't the truncates free all the blocks associated with
> the file?

No, Windows has special not-fully-understood logic (or maybe a race)
here. If I leave only the last truncate/pwrite pair, Windows
successfully makes a sparse file here, and then writes multiple
terabytes of real zeros during generic/129, which is what I am trying
to avoid.

In other words, CIFS mounts from Windows do support sparse files in
general, but this particular pattern of writes triggers the "your file
shall not be sparse" logic.

Let me quote from
https://learn.microsoft.com/en-us/windows/win32/api/winioctl/ni-winioctl-fsctl_set_zero_data:

"""
If you use the FSCTL_SET_ZERO_DATA control code to write zeros (0) to
a non-sparse file, zeros (0) are written to the file. The system
allocates disk storage for all of the zero (0) range, which is
equivalent to using the WriteFile function to write zeros (0) to a
file.
"""

I guess here is what happens: initially the file is not created as a
sparse one (although I am not sure why - it would make more sense to
add another iteration at the beginning to ensure that). Then,
apparently, truncation to zero size does not reset the non-sparse
status. Then Windows sees that the file is not sparse, and so any
attempts to seek pas the EOF should, according to the Microsoft rule
quoted above, be translated to writing real zeros.

>
> Also, why isn't it sufficient to check that the du output doesn't exceed
> ~110K (adding 10% overhead)?

The overhead might be larger if the filesystem supports sparse files
but has, e.g., 64 KB native cluster size. Then a single badly-aligned
100 KB blob (which is what happens here) might actually occupy three
clusters and thus use 192 KB on the disk - which is still fine as long
as all preceding zeros are not written.

>
> > +     _notrun "Sparse files do not work as expected, skipping test due to media wearout concerns"
>
> I think the notrun message should be restricted to stating that sparse
> files do not work as expected -- the other callers aren't necessarily
> worried about wearout.

OK, the v3 will have this reworded.

>
> --D
>
> >  }
> >
> >  _require_debugfs()
> > --
> > 2.43.0
> >
> >

-- 
Alexander E. Patrakov

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v3] _require_sparse_files: rewrite as a direct test instead of a black list
  2023-12-18 18:27       ` Alexander E. Patrakov
@ 2023-12-18 20:57         ` Alexander Patrakov
  2023-12-18 22:45           ` Darrick J. Wong
  0 siblings, 1 reply; 8+ messages in thread
From: Alexander Patrakov @ 2023-12-18 20:57 UTC (permalink / raw)
  To: fstests; +Cc: djwong, Alexander Patrakov

_require_sparse_files was implemented as a list of filesystems known not to
support sparse files, and therefore it missed some cases.

However, if sparse files do not work as expected during a test, the risk
is that the test will write out to the disk all the zeros that would
normally be unwritten. This amounts to at least 4 TB for the generic/129
test, and therefore there is a significant media wear-out concern here.

Adding more filesystems to the list of exclusions would not scale and
would not work anyway because CIFS backed by SAMBA is safe, while CIFS
backed by Windows Server 2022 is not (because the specific write
patterns found in generic/014 and generic/129 cause it to ignore the
otherwise-supported request to make a file sparse).

Mitigate this risk by rewriting the check as a small-scale test that
reliably triggers Windows misbehavior. The black list becomes unneeded
because the same test creates and detects non-sparse files on exfat and
hfsplus.

Signed-off-by: Alexander Patrakov <patrakov@gmail.com>
---
 common/rc | 29 +++++++++++++++++++++--------
 1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/common/rc b/common/rc
index cc92fe06..a9e0ba7e 100644
--- a/common/rc
+++ b/common/rc
@@ -2870,17 +2870,30 @@ _require_fs_space()
 #
 # Check if the filesystem supports sparse files.
 #
-# Unfortunately there is no better way to do this than a manual black list.
+# Filesystems (such as CIFS mounted from a Windows server) that generally
+# support sparse files but are tricked into creating a non-sparse file by one
+# of the tests are treated here as not supporting sparse files. This special
+# treatment is done due to media wear-out concerns -- e.g., generic/129 would
+# write multiple terabytes of zeros if allowed to run on a filesystem that
+# ignores the request to make a file sparse.
 #
 _require_sparse_files()
 {
-    case $FSTYP in
-    hfsplus|exfat)
-        _notrun "Sparse files not supported by this filesystem type: $FSTYP"
-	;;
-    *)
-        ;;
-    esac
+    local testfile="$TEST_DIR/$$.sparsefiletest"
+    rm -f "$testfile"
+
+    # The write size and offset are specifically chosen to trick the Windows
+    # SMB server implementation into dishonoring the request to create a sparse
+    # file, while still fitting into the 64 kb SMB1 maximum request size.
+    # This also creates a non-sparse file on vfat, exfat, and hfsplus.
+    $XFS_IO_PROG -f -c 'pwrite -b 51200 -S 0x61 1638400 51200' "$testfile" >/dev/null
+
+    resulting_file_size_kb=$( du -sk "$testfile" | cut -f 1 )
+    rm -f "$testfile"
+
+    # The threshold of 1 MB allows for filesystems with such large clusters.
+    [ $resulting_file_size_kb -gt 1024 ] && \
+	_notrun "Sparse files are not supported or do not work as expected"
 }
 
 _require_debugfs()
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v3] _require_sparse_files: rewrite as a direct test instead of a black list
  2023-12-18 20:57         ` [PATCH v3] _require_sparse_files: rewrite as a direct test instead of a black list Alexander Patrakov
@ 2023-12-18 22:45           ` Darrick J. Wong
  0 siblings, 0 replies; 8+ messages in thread
From: Darrick J. Wong @ 2023-12-18 22:45 UTC (permalink / raw)
  To: Alexander Patrakov; +Cc: fstests

On Tue, Dec 19, 2023 at 04:57:20AM +0800, Alexander Patrakov wrote:
> _require_sparse_files was implemented as a list of filesystems known not to
> support sparse files, and therefore it missed some cases.
> 
> However, if sparse files do not work as expected during a test, the risk
> is that the test will write out to the disk all the zeros that would
> normally be unwritten. This amounts to at least 4 TB for the generic/129
> test, and therefore there is a significant media wear-out concern here.
> 
> Adding more filesystems to the list of exclusions would not scale and
> would not work anyway because CIFS backed by SAMBA is safe, while CIFS
> backed by Windows Server 2022 is not (because the specific write
> patterns found in generic/014 and generic/129 cause it to ignore the
> otherwise-supported request to make a file sparse).
> 
> Mitigate this risk by rewriting the check as a small-scale test that
> reliably triggers Windows misbehavior. The black list becomes unneeded
> because the same test creates and detects non-sparse files on exfat and
> hfsplus.
> 
> Signed-off-by: Alexander Patrakov <patrakov@gmail.com>

Looks good, thanks for replacing the FSTYP test with a functional test.
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> ---
>  common/rc | 29 +++++++++++++++++++++--------
>  1 file changed, 21 insertions(+), 8 deletions(-)
> 
> diff --git a/common/rc b/common/rc
> index cc92fe06..a9e0ba7e 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -2870,17 +2870,30 @@ _require_fs_space()
>  #
>  # Check if the filesystem supports sparse files.
>  #
> -# Unfortunately there is no better way to do this than a manual black list.
> +# Filesystems (such as CIFS mounted from a Windows server) that generally
> +# support sparse files but are tricked into creating a non-sparse file by one
> +# of the tests are treated here as not supporting sparse files. This special
> +# treatment is done due to media wear-out concerns -- e.g., generic/129 would
> +# write multiple terabytes of zeros if allowed to run on a filesystem that
> +# ignores the request to make a file sparse.
>  #
>  _require_sparse_files()
>  {
> -    case $FSTYP in
> -    hfsplus|exfat)
> -        _notrun "Sparse files not supported by this filesystem type: $FSTYP"
> -	;;
> -    *)
> -        ;;
> -    esac
> +    local testfile="$TEST_DIR/$$.sparsefiletest"
> +    rm -f "$testfile"
> +
> +    # The write size and offset are specifically chosen to trick the Windows
> +    # SMB server implementation into dishonoring the request to create a sparse
> +    # file, while still fitting into the 64 kb SMB1 maximum request size.
> +    # This also creates a non-sparse file on vfat, exfat, and hfsplus.
> +    $XFS_IO_PROG -f -c 'pwrite -b 51200 -S 0x61 1638400 51200' "$testfile" >/dev/null
> +
> +    resulting_file_size_kb=$( du -sk "$testfile" | cut -f 1 )
> +    rm -f "$testfile"
> +
> +    # The threshold of 1 MB allows for filesystems with such large clusters.
> +    [ $resulting_file_size_kb -gt 1024 ] && \
> +	_notrun "Sparse files are not supported or do not work as expected"
>  }
>  
>  _require_debugfs()
> -- 
> 2.43.0
> 
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-12-18 22:45 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-12-17 15:43 Excessive media wearout caused by generic/129 Alexander E. Patrakov
2023-12-17 17:24 ` [PATCH] generic/129: add a safeguard against media wearout Alexander Patrakov
2023-12-17 18:45 ` Excessive media wearout caused by generic/129 Alexander E. Patrakov
2023-12-17 21:00   ` [PATCH v2] _require_sparse_files: add a safeguard against media wearout Alexander Patrakov
2023-12-18 17:45     ` Darrick J. Wong
2023-12-18 18:27       ` Alexander E. Patrakov
2023-12-18 20:57         ` [PATCH v3] _require_sparse_files: rewrite as a direct test instead of a black list Alexander Patrakov
2023-12-18 22:45           ` Darrick J. Wong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox