* [PATCH v2 0/5] fstests: add some new LBS inspired tests
@ 2024-06-15 0:29 Luis Chamberlain
2024-06-15 0:29 ` [PATCH v2 1/5] common: move mread() to generic helper _mread() Luis Chamberlain
` (4 more replies)
0 siblings, 5 replies; 8+ messages in thread
From: Luis Chamberlain @ 2024-06-15 0:29 UTC (permalink / raw)
To: patches, fstests
Cc: linux-xfs, linux-mm, linux-fsdevel, akpm, ziy, vbabka, seanjc,
willy, david, hughd, linmiaohe, muchun.song, osalvador, p.raghav,
da.gomez, hare, john.g.garry, mcgrof
While working on LBS we've come across some existing issues; some of them
affect existing kernels without LBS, while the only new corner case specific
to LBS is the xarray bug Willy fixed to help with truncation to larger order
folios and races with writeback.
This adds 3 new tests to help reproduce these issues right away. One test
reproduces an otherwise extremely difficult to reproduce deadlock; we have one
patch fix already merged to help with that deadlock, however the test also
gives us more homework to do, as more deadlocks are still possible with that
test even on v6.10-rc2.
The 3 tests are:
1) mmap():
The mmap page boundary test lets us discover that a patch on the LBS series
fixes the mmap page boundary restriction when huge pages are enabled on tmpfs
with a 4k base page size system (x86). This is a corner case POSIX semantic
issue, so likely not critical to most users.
2) fsstress + compaction
The fsstress + compaction test reproduces a really difficult to reproduce hang
which is possible without some recent fixes. However, the test reveals there is
yet more work left to do to fix all possible deadlocks. To be clear, these
issues are reproducible without LBS, on a plain 4k block size XFS filesystem.
3) stress truncation + writeback
The stress truncation + writeback test is the only test in this series specific
to LBS, but it will likely be useful for other future work in the kernel.
Changes on this v2:
- A few cleanups as suggested
- Renamed routines as suggested
- Used helpers for proc vmstat as suggested
- Made the mmap() test continue on failure so we can just count the number
of failures of the test
- Made the fio test ignore out of space issues; we only care to blast the
page cache and detect write errors or crashes. This test is now also
run against tmpfs.
- Minor commit log enhancements
Luis Chamberlain (5):
common: move mread() to generic helper _mread()
fstests: add mmap page boundary tests
fstests: add fsstress + compaction test
_require_debugfs(): simplify and fix for debian
fstests: add stress truncation + writeback test
common/rc | 54 ++++++++-
tests/generic/574 | 36 +-----
tests/generic/749 | 256 ++++++++++++++++++++++++++++++++++++++++++
tests/generic/749.out | 2 +
tests/generic/750 | 63 +++++++++++
tests/generic/750.out | 2 +
tests/generic/751 | 170 ++++++++++++++++++++++++++++
tests/generic/751.out | 2 +
8 files changed, 552 insertions(+), 33 deletions(-)
create mode 100755 tests/generic/749
create mode 100644 tests/generic/749.out
create mode 100755 tests/generic/750
create mode 100644 tests/generic/750.out
create mode 100755 tests/generic/751
create mode 100644 tests/generic/751.out
--
2.43.0
* [PATCH v2 1/5] common: move mread() to generic helper _mread()
2024-06-15 0:29 [PATCH v2 0/5] fstests: add some new LBS inspired tests Luis Chamberlain
@ 2024-06-15 0:29 ` Luis Chamberlain
2024-06-15 0:29 ` [PATCH v2 2/5] fstests: add mmap page boundary tests Luis Chamberlain
` (3 subsequent siblings)
4 siblings, 0 replies; 8+ messages in thread
From: Luis Chamberlain @ 2024-06-15 0:29 UTC (permalink / raw)
To: patches, fstests
Cc: linux-xfs, linux-mm, linux-fsdevel, akpm, ziy, vbabka, seanjc,
willy, david, hughd, linmiaohe, muchun.song, osalvador, p.raghav,
da.gomez, hare, john.g.garry, mcgrof, Darrick J . Wong
We want a shared way to use mmap so that we can test for the
expected SIGBUS; provide a shared routine which other tests can
leverage.
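A minimal sketch of the intended usage, following the same pattern
generic/574 uses below ($some_file and $file_len are illustrative):

    # Expect the mmap read to SIGBUS; the helper runs xfs_io in a subshell
    # so the shell's "Bus error" message lands on stderr where we can grep it.
    _mread $some_file 0 $file_len >/dev/null 2>$tmp.err
    if ! grep -q 'Bus error' $tmp.err; then
            echo "Didn't see SIGBUS when reading file via mmap"
    fi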
Suggested-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
common/rc | 28 ++++++++++++++++++++++++++++
tests/generic/574 | 36 ++++--------------------------------
2 files changed, 32 insertions(+), 32 deletions(-)
diff --git a/common/rc b/common/rc
index 163041fea5b9..fa7942809d6c 100644
--- a/common/rc
+++ b/common/rc
@@ -52,6 +52,34 @@ _pwrite_byte() {
$XFS_IO_PROG $xfs_io_args -f -c "pwrite -S $pattern $offset $len" "$file"
}
+_round_up_to_page_boundary()
+{
+ local n=$1
+ local page_size=$(_get_page_size)
+
+ echo $(( (n + page_size - 1) & ~(page_size - 1) ))
+}
+
+_mread()
+{
+ local file=$1
+ local offset=$2
+ local length=$3
+ local map_len=$(_round_up_to_page_boundary $(_get_filesize $file))
+
+ # Some callers expect xfs_io to crash with SIGBUS due to the mread,
+ # causing the shell to print "Bus error" to stderr. To allow this
+ # message to be redirected, execute xfs_io in a new shell instance.
+ # However, for this to work reliably, we also need to prevent the new
+ # shell instance from optimizing out the fork and directly exec'ing
+ # xfs_io. The easiest way to do that is to append 'true' to the
+ # commands, so that xfs_io is no longer the last command the shell sees.
+ # Don't let it write core files to the filesystem.
+ bash -c "trap '' SIGBUS; ulimit -c 0; $XFS_IO_PROG -r $file \
+ -c 'mmap -r 0 $map_len' \
+ -c 'mread -v $offset $length'; true"
+}
+
# mmap-write a byte into a range of a file
_mwrite_byte() {
local pattern="$1"
diff --git a/tests/generic/574 b/tests/generic/574
index cb42baaa67aa..d44c23e5abc2 100755
--- a/tests/generic/574
+++ b/tests/generic/574
@@ -52,34 +52,6 @@ setup_zeroed_file()
cmp $fsv_orig_file $fsv_file
}
-round_up_to_page_boundary()
-{
- local n=$1
- local page_size=$(_get_page_size)
-
- echo $(( (n + page_size - 1) & ~(page_size - 1) ))
-}
-
-mread()
-{
- local file=$1
- local offset=$2
- local length=$3
- local map_len=$(round_up_to_page_boundary $(_get_filesize $file))
-
- # Some callers expect xfs_io to crash with SIGBUS due to the mread,
- # causing the shell to print "Bus error" to stderr. To allow this
- # message to be redirected, execute xfs_io in a new shell instance.
- # However, for this to work reliably, we also need to prevent the new
- # shell instance from optimizing out the fork and directly exec'ing
- # xfs_io. The easiest way to do that is to append 'true' to the
- # commands, so that xfs_io is no longer the last command the shell sees.
- # Don't let it write core files to the filesystem.
- bash -c "trap '' SIGBUS; ulimit -c 0; $XFS_IO_PROG -r $file \
- -c 'mmap -r 0 $map_len' \
- -c 'mread -v $offset $length'; true"
-}
-
corruption_test()
{
local block_size=$1
@@ -142,7 +114,7 @@ corruption_test()
fi
# Reading the full file via mmap should fail.
- mread $fsv_file 0 $file_len >/dev/null 2>$tmp.err
+ _mread $fsv_file 0 $file_len >/dev/null 2>$tmp.err
if ! grep -q 'Bus error' $tmp.err; then
echo "Didn't see SIGBUS when reading file via mmap"
cat $tmp.err
@@ -150,7 +122,7 @@ corruption_test()
# Reading just the corrupted part via mmap should fail.
if ! $is_merkle_tree; then
- mread $fsv_file $zap_offset $zap_len >/dev/null 2>$tmp.err
+ _mread $fsv_file $zap_offset $zap_len >/dev/null 2>$tmp.err
if ! grep -q 'Bus error' $tmp.err; then
echo "Didn't see SIGBUS when reading corrupted part via mmap"
cat $tmp.err
@@ -174,10 +146,10 @@ corrupt_eof_block_test()
head -c $zap_len /dev/zero | tr '\0' X \
| _fsv_scratch_corrupt_bytes $fsv_file $file_len
- mread $fsv_file $file_len $zap_len >$tmp.out 2>$tmp.err
+ _mread $fsv_file $file_len $zap_len >$tmp.out 2>$tmp.err
head -c $file_len /dev/zero >$tmp.zeroes
- mread $tmp.zeroes $file_len $zap_len >$tmp.zeroes_out
+ _mread $tmp.zeroes $file_len $zap_len >$tmp.zeroes_out
grep -q 'Bus error' $tmp.err || diff $tmp.out $tmp.zeroes_out
}
--
2.43.0
* [PATCH v2 2/5] fstests: add mmap page boundary tests
2024-06-15 0:29 [PATCH v2 0/5] fstests: add some new LBS inspired tests Luis Chamberlain
2024-06-15 0:29 ` [PATCH v2 1/5] common: move mread() to generic helper _mread() Luis Chamberlain
@ 2024-06-15 0:29 ` Luis Chamberlain
2024-06-18 14:07 ` Zorro Lang
2024-06-15 0:29 ` [PATCH v2 3/5] fstests: add fsstress + compaction test Luis Chamberlain
` (2 subsequent siblings)
4 siblings, 1 reply; 8+ messages in thread
From: Luis Chamberlain @ 2024-06-15 0:29 UTC (permalink / raw)
To: patches, fstests
Cc: linux-xfs, linux-mm, linux-fsdevel, akpm, ziy, vbabka, seanjc,
willy, david, hughd, linmiaohe, muchun.song, osalvador, p.raghav,
da.gomez, hare, john.g.garry, mcgrof
mmap() POSIX compliance says we should zero fill data beyond a file's
size up to the page boundary, and issue a SIGBUS if we go beyond that.
While fsx helps us test zero-fill sometimes, fsstress also lets us
sometimes test for SIGBUS, however that is based on a random value and
it's not likely we always test it. Dedicate a specific test to make
testing this situation easy and to make it easy to expand on other
corner cases.
The only filesystem currently known to fail is tmpfs with huge pages on
a 4k base page size system; on a 64k base page size it does not fail.
The pending upstream patch "filemap: cap PTE range to be created to
allowed zero fill in folio_map_range()" fixes this issue for tmpfs on
4k base page size with huge pages and it also fixes it for LBS support.
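Roughly, the two behaviors the new test checks look like this (a condensed
sketch using the helpers added in this series; the file and length variables
are illustrative):

    # Zero-fill: bytes past EOF up to the page boundary must read back as zeroes
    map_len=$(_round_up_to_page_boundary $(_get_filesize $file))
    $XFS_IO_PROG -r $file -c "mmap -r 0 $map_len" \
            -c "mread -v $filelen $((map_len - filelen))"

    # SIGBUS: mapping past the page boundary and touching it must fault
    _mread $file 0 $((map_len + 10)) $((map_len + 10)) 2>$tmp.err
    grep -q 'Bus error' $tmp.err || echo "expected SIGBUS"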
Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
common/rc | 5 +-
tests/generic/749 | 256 ++++++++++++++++++++++++++++++++++++++++++
tests/generic/749.out | 2 +
3 files changed, 262 insertions(+), 1 deletion(-)
create mode 100755 tests/generic/749
create mode 100644 tests/generic/749.out
diff --git a/common/rc b/common/rc
index fa7942809d6c..e812a2f7cc67 100644
--- a/common/rc
+++ b/common/rc
@@ -60,12 +60,15 @@ _round_up_to_page_boundary()
echo $(( (n + page_size - 1) & ~(page_size - 1) ))
}
+# You can override $map_len but it is optional; by default we use the
+# max allowed size. If you use a length greater than the default you can
+# expect a SIGBUS and test for it.
_mread()
{
local file=$1
local offset=$2
local length=$3
- local map_len=$(_round_up_to_page_boundary $(_get_filesize $file))
+ local map_len=${4:-$(_round_up_to_page_boundary $(_get_filesize $file)) }
# Some callers expect xfs_io to crash with SIGBUS due to the mread,
# causing the shell to print "Bus error" to stderr. To allow this
diff --git a/tests/generic/749 b/tests/generic/749
new file mode 100755
index 000000000000..2dcced4e3c13
--- /dev/null
+++ b/tests/generic/749
@@ -0,0 +1,256 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) Luis Chamberlain. All Rights Reserved.
+#
+# FS QA Test 749
+#
+# As per the POSIX NOTES in mmap(2), mappings are multiples of the system page
+# size, but if the data mapped is not a multiple of the page size the remaining
+# bytes are zeroed out when mapped, and modifications to that region are not
+# written to the file. On Linux, when you write data to such a partial page
+# after the end of the object, the data stays in the page cache even after the
+# file is closed and unmapped, and even though the data is never written to
+# the file itself, subsequent mappings may see the modified content. If you go
+# *beyond* this page, you should get a SIGBUS. This test verifies we zero-fill
+# to the page boundary and ensures we get a SIGBUS if we write beyond that
+# boundary, even if the block size is greater than the system page size.
+. ./common/preamble
+. ./common/rc
+_begin_fstest auto quick prealloc
+
+# Import common functions.
+. ./common/filter
+
+# real QA test starts here
+_supported_fs generic
+_require_scratch_nocheck
+_require_test
+_require_xfs_io_command "truncate"
+_require_xfs_io_command "falloc"
+
+# _fixed_by_git_commit kernel <pending-upstream> \
+# "filemap: cap PTE range to be created to allowed zero fill in folio_map_range()"
+
+filter_xfs_io_data_unique()
+{
+ _filter_xfs_io_offset | sed -e 's| |\n|g' | grep -E -v "\.|XX|\*" | \
+ sort -u | tr -d '\n'
+}
+
+
+setup_zeroed_file()
+{
+ local file_len=$1
+ local sparse=$2
+
+ if $sparse; then
+ $XFS_IO_PROG -f -c "truncate $file_len" $test_file
+ else
+ $XFS_IO_PROG -f -c "falloc 0 $file_len" $test_file
+ fi
+}
+
+mwrite()
+{
+ local file=$1
+ local offset=$2
+ local length=$3
+ local map_len=${4:-$(_round_up_to_page_boundary $(_get_filesize $file)) }
+
+ # Some callers expect xfs_io to crash with SIGBUS due to the mwrite,
+ # causing the shell to print "Bus error" to stderr. To allow this
+ # message to be redirected, execute xfs_io in a new shell instance.
+ # However, for this to work reliably, we also need to prevent the new
+ # shell instance from optimizing out the fork and directly exec'ing
+ # xfs_io. The easiest way to do that is to append 'true' to the
+ # commands, so that xfs_io is no longer the last command the shell sees.
+ bash -c "trap '' SIGBUS; ulimit -c 0; \
+ $XFS_IO_PROG $file \
+ -c 'mmap -w 0 $map_len' \
+ -c 'mwrite $offset $length'; \
+ true"
+}
+
+do_mmap_tests()
+{
+ local block_size=$1
+ local file_len=$2
+ local offset=$3
+ local len=$4
+ local use_sparse_file=${5:-false}
+ local new_filelen=0
+ local map_len=0
+ local csum=0
+ local fs_block_size=$(_get_file_block_size $SCRATCH_MNT)
+ local failed=0
+
+ echo -en "\n\n==> Testing blocksize $block_size " >> $seqres.full
+ echo -en "file_len: $file_len offset: $offset " >> $seqres.full
+ echo -e "len: $len sparse: $use_sparse_file" >> $seqres.full
+
+ if ((fs_block_size != block_size)); then
+ _fail "Block size created ($block_size) doesn't match _get_file_block_size on mount ($fs_block_size)"
+ fi
+
+ rm -f $SCRATCH_MNT/file
+
+ # This lets us also test against sparse files
+ setup_zeroed_file $file_len $use_sparse_file
+
+ # This will overwrite the old data, the file size is the
+ # delta between offset and len now.
+ $XFS_IO_PROG -f -c "pwrite -S 0xaa -b 512 $offset $len" \
+ $test_file >> $seqres.full
+
+ sync
+ new_filelen=$(_get_filesize $test_file)
+ map_len=$(_round_up_to_page_boundary $new_filelen)
+ csum_orig="$(_md5_checksum $test_file)"
+
+ # A couple of mmap() tests:
+ #
+ # We are allowed to mmap() up to the boundary of the page size of a
+ # data object, but there are a few rules to follow that we must check for:
+ #
+ # a) zero-fill test for the data: POSIX says we should zero fill any
+ # partial page after the end of the object. Verify zero-fill.
+ # b) do not write this bogus data to disk: on Linux, if we write data
+ # to a partially filled page, it will stay in the page cache even
+ # after the file is closed and unmapped even if it never reaches the
+ # file. As per mmap(2) subsequent mappings *may* see the modified
+ # content. This means that it also can get other data and we have
+ # no rules about what this data should be. Since the data read after
+ # the actual object data can vary this test just verifies that the
+ # filesize does not change.
+ if [[ $map_len -gt $new_filelen ]]; then
+ zero_filled_data_len=$((map_len - new_filelen))
+ _scratch_cycle_mount
+ expected_zero_data="00"
+ zero_filled_data=$($XFS_IO_PROG -r $test_file \
+ -c "mmap -r 0 $map_len" \
+ -c "mread -v $new_filelen $zero_filled_data_len" \
+ -c "munmap" | \
+ filter_xfs_io_data_unique)
+ if [[ "$zero_filled_data" != "$expected_zero_data" ]]; then
+ let failed=$failed+1
+ echo "Expected data: $expected_zero_data"
+ echo " Actual data: $zero_filled_data"
+ echo "Zero-fill expectations with mmap() not respected"
+ fi
+
+ _scratch_cycle_mount
+ $XFS_IO_PROG $test_file \
+ -c "mmap -w 0 $map_len" \
+ -c "mwrite $new_filelen $zero_filled_data_len" \
+ -c "munmap"
+ sync
+ csum_post="$(_md5_checksum $test_file)"
+ if [[ "$csum_orig" != "$csum_post" ]]; then
+ let failed=$failed+1
+ echo "Expected csum: $csum_orig"
+ echo " Actual csum: $csum_post"
+ echo "mmap() write up to page boundary should not change actual file contents"
+ fi
+
+ local filelen_test=$(_get_filesize $test_file)
+ if [[ "$filelen_test" != "$new_filelen" ]]; then
+ let failed=$failed+1
+ echo "Expected file length: $new_filelen"
+ echo " Actual file length: $filelen_test"
+ echo "mmap() write up to page boundary should not change actual file size"
+ fi
+ fi
+
+ # Now let's ensure we get SIGBUS when we go beyond the page boundary
+ _scratch_cycle_mount
+ new_filelen=$(_get_filesize $test_file)
+ map_len=$(_round_up_to_page_boundary $new_filelen)
+ csum_orig="$(_md5_checksum $test_file)"
+ _mread $test_file 0 $map_len >> $seqres.full 2>$tmp.err
+ if grep -q 'Bus error' $tmp.err; then
+ let failed=$failed+1
+ cat $tmp.err
+ echo "Not expecting SIGBUS when reading up to page boundary"
+ fi
+
+ # This should just work
+ _mread $test_file 0 $map_len >> $seqres.full 2>$tmp.err
+ if [[ $? -ne 0 ]]; then
+ let failed=$failed+1
+ echo "mmap() read up to page boundary should work"
+ fi
+
+ # This should just work
+ mwrite $test_file 0 $map_len >> $seqres.full 2>$tmp.err
+ if [[ $? -ne 0 ]]; then
+ let failed=$failed+1
+ echo "mmap() write up to page boundary should work"
+ fi
+
+ # If we mmap() only up to the page boundary but try to read beyond it,
+ # the read just fails, we don't get a SIGBUS
+ $XFS_IO_PROG -r $test_file \
+ -c "mmap -r 0 $map_len" \
+ -c "mread 0 $((map_len + 10))" >> $seqres.full 2>$tmp.err
+ local mread_err=$?
+ if [[ $mread_err -eq 0 ]]; then
+ let failed=$failed+1
+ echo "mmap() to page boundary works as expected but reading beyond should fail: $mread_err"
+ fi
+
+ $XFS_IO_PROG -w $test_file \
+ -c "mmap -w 0 $map_len" \
+ -c "mwrite 0 $((map_len + 10))" >> $seqres.full 2>$tmp.err
+ local mwrite_err=$?
+ if [[ $mwrite_err -eq 0 ]]; then
+ let failed=$failed+1
+ echo "mmap() to page boundary works as expected but writing beyond should fail: $mwrite_err"
+ fi
+
+ # Now let's go beyond the allowed mmap() page boundary
+ _mread $test_file 0 $((map_len + 10)) $((map_len + 10)) >> $seqres.full 2>$tmp.err
+ if ! grep -q 'Bus error' $tmp.err; then
+ let failed=$failed+1
+ echo "Expected SIGBUS when mmap() reading beyond page boundary"
+ fi
+
+ mwrite $test_file 0 $((map_len + 10)) $((map_len + 10)) >> $seqres.full 2>$tmp.err
+ if ! grep -q 'Bus error' $tmp.err; then
+ let failed=$failed+1
+ echo "Expected SIGBUS when mmap() writing beyond page boundary"
+ fi
+
+ local filelen_test=$(_get_filesize $test_file)
+ if [[ "$filelen_test" != "$new_filelen" ]]; then
+ let failed=$failed+1
+ echo "Expected file length: $new_filelen"
+ echo " Actual file length: $filelen_test"
+ echo "reading or writing beyond file size up to mmap() page boundary should not change file size"
+ fi
+
+ if [[ $failed -gt 0 ]]; then
+ _fail "Test had $failed failures..."
+ fi
+}
+
+test_block_size()
+{
+ local block_size=$1
+
+ do_mmap_tests $block_size 512 3 5
+ do_mmap_tests $block_size 11k 0 $((4096 * 3 + 3))
+ do_mmap_tests $block_size 16k 0 $((16384+3))
+ do_mmap_tests $block_size 16k $((16384-10)) $((16384+20))
+ do_mmap_tests $block_size 64k 0 $((65536+3))
+ do_mmap_tests $block_size 4k 4090 30 true
+}
+
+_scratch_mkfs >> $seqres.full 2>&1 || _fail "mkfs failed"
+_scratch_mount
+test_file=$SCRATCH_MNT/file
+block_size=$(_get_file_block_size "$SCRATCH_MNT")
+test_block_size $block_size
+
+echo "Silence is golden"
+status=0
+exit
diff --git a/tests/generic/749.out b/tests/generic/749.out
new file mode 100644
index 000000000000..24658deddb99
--- /dev/null
+++ b/tests/generic/749.out
@@ -0,0 +1,2 @@
+QA output created by 749
+Silence is golden
--
2.43.0
* [PATCH v2 3/5] fstests: add fsstress + compaction test
2024-06-15 0:29 [PATCH v2 0/5] fstests: add some new LBS inspired tests Luis Chamberlain
2024-06-15 0:29 ` [PATCH v2 1/5] common: move mread() to generic helper _mread() Luis Chamberlain
2024-06-15 0:29 ` [PATCH v2 2/5] fstests: add mmap page boundary tests Luis Chamberlain
@ 2024-06-15 0:29 ` Luis Chamberlain
2024-06-15 0:29 ` [PATCH v2 4/5] _require_debugfs(): simplify and fix for debian Luis Chamberlain
2024-06-15 0:29 ` [PATCH v2 5/5] fstests: add stress truncation + writeback test Luis Chamberlain
4 siblings, 0 replies; 8+ messages in thread
From: Luis Chamberlain @ 2024-06-15 0:29 UTC (permalink / raw)
To: patches, fstests
Cc: linux-xfs, linux-mm, linux-fsdevel, akpm, ziy, vbabka, seanjc,
willy, david, hughd, linmiaohe, muchun.song, osalvador, p.raghav,
da.gomez, hare, john.g.garry, mcgrof, Darrick J . Wong
Running compaction while we run fsstress can crash older kernels as per
korg#218227 [0]. The fix for that has been posted and was merged in
v6.9-rc6 as commit d99e3140a4d3 ("mm: turn folio_test_hugetlb into a
PageType"). This test reproduces that crash right away.
But we have more work to do ...
Even on v6.10-rc2, where this kernel commit is already merged, we can
still deadlock when running fsstress while triggering compaction at the
same time. This is a new issue being reported now through this patch,
and the patch also serves as a high-confidence reproducer: at least for
XFS, running this test ~44 times will deadlock.
If you enable CONFIG_PROVE_LOCKING with the defaults you will end up
with a complaint about needing to increase MAX_LOCKDEP_CHAIN_HLOCKS [1];
if you adjust that you then end up with a few soft lockup complaints and
some possible deadlock candidates to evaluate [2].
Provide a simple reproducer and pave the way so we keep on testing this.
[0] https://bugzilla.kernel.org/show_bug.cgi?id=218227
[1] https://gist.github.com/mcgrof/824913b645892214effeb1631df75072
[2] https://gist.github.com/mcgrof/926e183d21c5c4c55d74ec90197bd77a
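The core of the reproducer is simply racing fsstress against a periodic
compaction trigger, roughly (a condensed sketch of what the test below does):

    runfile="$tmp.compaction"
    touch $runfile
    while [ -e $runfile ]; do
            echo 1 > /proc/sys/vm/compact_memory
            sleep 5
    done &
    $FSSTRESS_PROG -w -d $SCRATCH_MNT -n $nr_ops -p $nr_cpus
    rm -f $runfile
    wait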
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
common/rc | 7 +++++
tests/generic/750 | 63 +++++++++++++++++++++++++++++++++++++++++++
tests/generic/750.out | 2 ++
3 files changed, 72 insertions(+)
create mode 100755 tests/generic/750
create mode 100644 tests/generic/750.out
diff --git a/common/rc b/common/rc
index e812a2f7cc67..18ad25662d5c 100644
--- a/common/rc
+++ b/common/rc
@@ -151,6 +151,13 @@ _require_hugepages()
_notrun "Kernel does not report huge page size"
}
+# Requires CONFIG_COMPACTION
+_require_vm_compaction()
+{
+ if [ ! -f /proc/sys/vm/compact_memory ]; then
+ _notrun "Need compaction enabled CONFIG_COMPACTION=y"
+ fi
+}
# Get hugepagesize in bytes
_get_hugepagesize()
{
diff --git a/tests/generic/750 b/tests/generic/750
new file mode 100755
index 000000000000..3057937d7176
--- /dev/null
+++ b/tests/generic/750
@@ -0,0 +1,63 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2024 Luis Chamberlain. All Rights Reserved.
+#
+# FS QA Test 750
+#
+# fsstress + memory compaction test
+#
+. ./common/preamble
+_begin_fstest auto rw long_rw stress soak smoketest
+
+_cleanup()
+{
+ cd /
+ rm -f $runfile
+ rm -f $tmp.*
+ kill -9 $trigger_compaction_pid > /dev/null 2>&1
+ $KILLALL_PROG -9 fsstress > /dev/null 2>&1
+
+ wait > /dev/null 2>&1
+}
+
+# Import common functions.
+
+# real QA test starts here
+
+_supported_fs generic
+
+_require_scratch
+_require_vm_compaction
+_require_command "$KILLALL_PROG" "killall"
+
+# We still deadlock with this test on v6.10-rc2 and more work is needed,
+# but the commit below makes things better.
+_fixed_by_git_commit kernel d99e3140a4d3 \
+ "mm: turn folio_test_hugetlb into a PageType"
+
+echo "Silence is golden"
+
+_scratch_mkfs > $seqres.full 2>&1
+_scratch_mount >> $seqres.full 2>&1
+
+nr_cpus=$((LOAD_FACTOR * 4))
+nr_ops=$((25000 * nr_cpus * TIME_FACTOR))
+fsstress_args=(-w -d $SCRATCH_MNT -n $nr_ops -p $nr_cpus)
+test -n "$SOAK_DURATION" && fsstress_args+=(--duration="$SOAK_DURATION")
+
+# start a background trigger for memory compaction
+runfile="$tmp.compaction"
+touch $runfile
+while [ -e $runfile ]; do
+ echo 1 > /proc/sys/vm/compact_memory
+ sleep 5
+done &
+trigger_compaction_pid=$!
+
+$FSSTRESS_PROG $FSSTRESS_AVOID "${fsstress_args[@]}" >> $seqres.full
+
+rm -f $runfile
+wait > /dev/null 2>&1
+
+status=0
+exit
diff --git a/tests/generic/750.out b/tests/generic/750.out
new file mode 100644
index 000000000000..bd79507b632e
--- /dev/null
+++ b/tests/generic/750.out
@@ -0,0 +1,2 @@
+QA output created by 750
+Silence is golden
--
2.43.0
* [PATCH v2 4/5] _require_debugfs(): simplify and fix for debian
2024-06-15 0:29 [PATCH v2 0/5] fstests: add some new LBS inspired tests Luis Chamberlain
` (2 preceding siblings ...)
2024-06-15 0:29 ` [PATCH v2 3/5] fstests: add fsstress + compaction test Luis Chamberlain
@ 2024-06-15 0:29 ` Luis Chamberlain
2024-06-15 0:29 ` [PATCH v2 5/5] fstests: add stress truncation + writeback test Luis Chamberlain
4 siblings, 0 replies; 8+ messages in thread
From: Luis Chamberlain @ 2024-06-15 0:29 UTC (permalink / raw)
To: patches, fstests
Cc: linux-xfs, linux-mm, linux-fsdevel, akpm, ziy, vbabka, seanjc,
willy, david, hughd, linmiaohe, muchun.song, osalvador, p.raghav,
da.gomez, hare, john.g.garry, mcgrof, Darrick J . Wong,
Zorro Lang
Using the findmnt -S debugfs argument does not output anything on
Debian, and it is not needed anyway, so drop it.
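For example, with debugfs mounted at /sys/kernel/debug (illustrative output;
the actual mountpoint comes from $DEBUGFS_MNT):

    $ findmnt -rncv -T /sys/kernel/debug -S debugfs -o FSTYPE
    $ findmnt -rncv -T /sys/kernel/debug -o FSTYPE
    debugfs

The -T path lookup already pins us to the debugfs mountpoint, so the extra -S
source filter buys nothing, and on Debian it suppresses the output entirely.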
Fixes: 8e8fb3da709e ("fstests: fix _require_debugfs and call it properly")
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
common/rc | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/common/rc b/common/rc
index 18ad25662d5c..30beef4e5c02 100644
--- a/common/rc
+++ b/common/rc
@@ -3025,7 +3025,7 @@ _require_debugfs()
local type
if [ -d "$DEBUGFS_MNT" ];then
- type=$(findmnt -rncv -T $DEBUGFS_MNT -S debugfs -o FSTYPE)
+ type=$(findmnt -rncv -T $DEBUGFS_MNT -o FSTYPE)
[ "$type" = "debugfs" ] && return 0
fi
--
2.43.0
* [PATCH v2 5/5] fstests: add stress truncation + writeback test
2024-06-15 0:29 [PATCH v2 0/5] fstests: add some new LBS inspired tests Luis Chamberlain
` (3 preceding siblings ...)
2024-06-15 0:29 ` [PATCH v2 4/5] _require_debugfs(): simplify and fix for debian Luis Chamberlain
@ 2024-06-15 0:29 ` Luis Chamberlain
2024-06-18 14:10 ` Zorro Lang
4 siblings, 1 reply; 8+ messages in thread
From: Luis Chamberlain @ 2024-06-15 0:29 UTC (permalink / raw)
To: patches, fstests
Cc: linux-xfs, linux-mm, linux-fsdevel, akpm, ziy, vbabka, seanjc,
willy, david, hughd, linmiaohe, muchun.song, osalvador, p.raghav,
da.gomez, hare, john.g.garry, mcgrof
Stress test folio splits by using the new debugfs interface to target
a new smaller folio order while triggering writeback at the same time.
This is known to create a crash only with min order enabled, for example
with a 16k block size XFS test profile; an xarray fix for that is already
merged. The issue is fixed by kernel commit 2a0774c2886d ("XArray: set the
marks correctly when splitting an entry").
If you want to inspect things more closely, you'll want to enable this on
your kernel boot command line:
dyndbg='file mm/huge_memory.c +p'
Since we want to race large folio splits we also augment the full test
output log $seqres.full with the test-specific number of successful
splits from vmstat thp_split_page and thp_split_page_failed. The larger
the vmstat thp_split_page value, the more we stress this path.
This test reproduces a really hard to reproduce crash immediately.
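The accounting is a simple before/after delta of the vmstat counters, roughly
(a condensed sketch of what the test below logs to $seqres.full, using the
proc_vmstat helper the test defines):

    split_count_before=$(proc_vmstat thp_split_page)
    # ... run the split_huge_pages trigger loop and the fio workload ...
    split_count_after=$(proc_vmstat thp_split_page)
    echo "vmstat thp_split_page: $((split_count_after - split_count_before))" >> $seqres.full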
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
common/rc | 14 ++++
tests/generic/751 | 170 ++++++++++++++++++++++++++++++++++++++++++
tests/generic/751.out | 2 +
3 files changed, 186 insertions(+)
create mode 100755 tests/generic/751
create mode 100644 tests/generic/751.out
diff --git a/common/rc b/common/rc
index 30beef4e5c02..31ad30276ca6 100644
--- a/common/rc
+++ b/common/rc
@@ -158,6 +158,20 @@ _require_vm_compaction()
_notrun "Need compaction enabled CONFIG_COMPACTION=y"
fi
}
+
+# Requires CONFIG_DEBUG_FS and the split_huge_pages knob
+_require_split_huge_pages_knob()
+{
+ if [ ! -f $DEBUGFS_MNT/split_huge_pages ]; then
+ _notrun "Needs CONFIG_DEBUG_FS and split_huge_pages"
+ fi
+}
+
+_split_huge_pages_all()
+{
+ echo 1 > $DEBUGFS_MNT/split_huge_pages
+}
+
# Get hugepagesize in bytes
_get_hugepagesize()
{
diff --git a/tests/generic/751 b/tests/generic/751
new file mode 100755
index 000000000000..ac0ca2f07443
--- /dev/null
+++ b/tests/generic/751
@@ -0,0 +1,170 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (C) 2024 Luis Chamberlain. All Rights Reserved.
+#
+# FS QA Test No. 751
+#
+# stress page cache truncation + writeback
+#
+# This aims at trying to reproduce a difficult to reproduce bug found with
+# min order. The issue was root caused to an xarray bug when we split folios
+# to another order other than 0. This functionality is used to support min
+# order. The crash:
+#
+# https://gist.github.com/mcgrof/d12f586ec6ebe32b2472b5d634c397df
+# Crash excerpt is as follows:
+#
+# BUG: kernel NULL pointer dereference, address: 0000000000000036
+# #PF: supervisor read access in kernel mode
+# #PF: error_code(0x0000) - not-present page
+# PGD 0 P4D 0
+# Oops: 0000 [#1] PREEMPT SMP NOPTI
+# CPU: 7 PID: 2190 Comm: kworker/u38:5 Not tainted 6.9.0-rc5+ #14
+# Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
+# Workqueue: writeback wb_workfn (flush-7:5)
+# RIP: 0010:filemap_get_folios_tag+0xa9/0x200
+# Call Trace:
+# <TASK>
+# writeback_iter+0x17d/0x310
+# write_cache_pages+0x42/0xa0
+# iomap_writepages+0x33/0x50
+# xfs_vm_writepages+0x63/0x90 [xfs]
+# do_writepages+0xcc/0x260
+# __writeback_single_inode+0x3d/0x340
+# writeback_sb_inodes+0x1ed/0x4b0
+# __writeback_inodes_wb+0x4c/0xe0
+# wb_writeback+0x267/0x2d0
+# wb_workfn+0x2a4/0x440
+# process_one_work+0x189/0x3b0
+# worker_thread+0x273/0x390
+# kthread+0xda/0x110
+# ret_from_fork+0x2d/0x50
+# ret_from_fork_asm+0x1a/0x30
+# </TASK>
+#
+# This may also find future truncation bugs, as truncating any mapped file
+# through the collateral of using echo 1 > split_huge_pages will always
+# respect the min order. Truncating to a larger order is then exercised when
+# this test is run against any filesystem LBS profile or an LBS device.
+#
+# If you're enabling this and want to check underneath the hood you may want to
+# enable:
+#
+# dyndbg='file mm/huge_memory.c +p'
+#
+# This test aims at increasing the rate of successful truncations so we want
+# to increase the value of thp_split_page in $seqres.full. Using echo 1 >
+# split_huge_pages is extremely aggressive, and even accounts for anonymous
+# memory on a system, however we accept that tradeoff for the efficiency of
+# doing the work in-kernel for any mapped file too. Our general goal here is to
+# race with folio truncation + writeback.
+
+. ./common/preamble
+
+_begin_fstest auto long_rw stress soak smoketest
+
+# Override the default cleanup function.
+_cleanup()
+{
+ cd /
+ rm -f $tmp.*
+ rm -f $runfile
+ kill -9 $split_huge_pages_files_pid > /dev/null 2>&1
+}
+
+fio_config=$tmp.fio
+fio_out=$tmp.fio.out
+fio_err=$tmp.fio.err
+
+# real QA test starts here
+_supported_fs generic
+_require_test
+_require_scratch
+_require_debugfs
+_require_split_huge_pages_knob
+_require_command "$KILLALL_PROG" "killall"
+_fixed_by_git_commit kernel 2a0774c2886d \
+ "XArray: set the marks correctly when splitting an entry"
+
+proc_vmstat()
+{
+ awk -v name="$1" '{if ($1 ~ name) {print($2)}}' /proc/vmstat | head -1
+}
+
+# we need buffered IO to force truncation races with writeback in the
+# page cache
+cat >$fio_config <<EOF
+[force_large_large_folio_parallel_writes]
+ignore_error=ENOSPC
+nrfiles=10
+direct=0
+bs=4M
+group_reporting=1
+filesize=1GiB
+readwrite=write
+fallocate=none
+numjobs=$(nproc)
+directory=$SCRATCH_MNT
+runtime=100*${TIME_FACTOR}
+time_based
+EOF
+
+_require_fio $fio_config
+
+echo "Silence is golden"
+
+_scratch_mkfs >>$seqres.full 2>&1
+_scratch_mount >> $seqres.full 2>&1
+
+# used to let our loops know when to stop
+runfile="$tmp.keep.running.loop"
+touch $runfile
+
+# The background ops run unbounded, the goal is to race them with the fio writes below.
+
+# Force folio split if possible, this seems to be screaming for MADV_NOHUGEPAGE
+# for large folios.
+while [ -e $runfile ]; do
+ _split_huge_pages_all >/dev/null 2>&1
+done &
+split_huge_pages_files_pid=$!
+
+split_count_before=0
+split_count_failed_before=0
+
+if grep -q thp_split_page /proc/vmstat; then
+ split_count_before=$(proc_vmstat thp_split_page)
+ split_count_failed_before=$(proc_vmstat thp_split_page_failed)
+else
+ echo "no thp_split_page in /proc/vmstat" >> $seqres.full
+fi
+
+# we blast away with large writes to force large folio writes when
+# possible.
+echo -e "Running fio with config:\n" >> $seqres.full
+cat $fio_config >> $seqres.full
+$FIO_PROG $fio_config --alloc-size=$(( $(nproc) * 8192 )) \
+ --output=$fio_out 2> $fio_err
+FIO_ERR=$?
+
+rm -f $runfile
+
+wait > /dev/null 2>&1
+
+if grep -q thp_split_page /proc/vmstat; then
+ split_count_after=$(proc_vmstat thp_split_page)
+ split_count_failed_after=$(proc_vmstat thp_split_page_failed)
+ thp_split_page=$((split_count_after - split_count_before))
+ thp_split_page_failed=$((split_count_failed_after - split_count_failed_before))
+
+ echo "vmstat thp_split_page: $thp_split_page" >> $seqres.full
+ echo "vmstat thp_split_page_failed: $thp_split_page_failed" >> $seqres.full
+fi
+
+# exitall_on_error=ENOSPC does not work as it should, so we need this eyesore
+if [[ $FIO_ERR -ne 0 ]] && ! grep -q "No space left on device" $fio_err; then
+ _fail "fio failed with err: $FIO_ERR"
+fi
+
+status=0
+exit
diff --git a/tests/generic/751.out b/tests/generic/751.out
new file mode 100644
index 000000000000..6479fa6f1404
--- /dev/null
+++ b/tests/generic/751.out
@@ -0,0 +1,2 @@
+QA output created by 751
+Silence is golden
--
2.43.0
* Re: [PATCH v2 2/5] fstests: add mmap page boundary tests
2024-06-15 0:29 ` [PATCH v2 2/5] fstests: add mmap page boundary tests Luis Chamberlain
@ 2024-06-18 14:07 ` Zorro Lang
0 siblings, 0 replies; 8+ messages in thread
From: Zorro Lang @ 2024-06-18 14:07 UTC (permalink / raw)
To: Luis Chamberlain
Cc: patches, fstests, linux-xfs, linux-mm, linux-fsdevel, akpm, ziy,
vbabka, seanjc, willy, david, hughd, linmiaohe, muchun.song,
osalvador, p.raghav, da.gomez, hare, john.g.garry
On Fri, Jun 14, 2024 at 05:29:31PM -0700, Luis Chamberlain wrote:
> mmap() POSIX compliance says we should zero fill data beyond a file's
> size up to the page boundary, and issue a SIGBUS if we go beyond that.
> While fsx helps us test zero-fill sometimes, fsstress also lets us
> sometimes test for SIGBUS, however that is based on a random value and
> it's not likely we always test it. Dedicate a specific test to make
> testing this situation easy and to make it easy to expand on other
> corner cases.
>
> The only filesystem currently known to fail is tmpfs with huge pages on
> a 4k base page size system; on a 64k base page size it does not fail.
> The pending upstream patch "filemap: cap PTE range to be created to
> allowed zero fill in folio_map_range()" fixes this issue for tmpfs on
> 4k base page size with huge pages and it also fixes it for LBS support.
>
> Suggested-by: Matthew Wilcox <willy@infradead.org>
> Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> ---
Good to me,
Reviewed-by: Zorro Lang <zlang@redhat.com>
* Re: [PATCH v2 5/5] fstests: add stress truncation + writeback test
2024-06-15 0:29 ` [PATCH v2 5/5] fstests: add stress truncation + writeback test Luis Chamberlain
@ 2024-06-18 14:10 ` Zorro Lang
0 siblings, 0 replies; 8+ messages in thread
From: Zorro Lang @ 2024-06-18 14:10 UTC (permalink / raw)
To: Luis Chamberlain
Cc: patches, fstests, linux-xfs, linux-mm, linux-fsdevel, akpm, ziy,
vbabka, seanjc, willy, david, hughd, linmiaohe, muchun.song,
osalvador, p.raghav, da.gomez, hare, john.g.garry
On Fri, Jun 14, 2024 at 05:29:34PM -0700, Luis Chamberlain wrote:
> Stress test folio splits by using the new debugfs interface to target
> a new smaller folio order while triggering writeback at the same time.
>
> This is known to create a crash only with min order enabled, for example
> with a 16k block size XFS test profile; an xarray fix for that is already
> merged. The issue is fixed by kernel commit 2a0774c2886d ("XArray: set the
> marks correctly when splitting an entry").
>
> If you want to inspect things more closely, you'll want to enable this on
> your kernel boot command line:
>
> dyndbg='file mm/huge_memory.c +p'
>
> Since we want to race large folio splits we also augment the full test
> output log $seqres.full with the test-specific number of successful
> splits from vmstat thp_split_page and thp_split_page_failed. The larger
> the vmstat thp_split_page value, the more we stress this path.
>
> This test reproduces a really hard to reproduce crash immediately.
>
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> ---
Good to me,
Reviewed-by: Zorro Lang <zlang@redhat.com>