From: Luis Chamberlain <mcgrof@kernel.org>
To: patches@lists.linux.dev, fstests@vger.kernel.org
Cc: linux-xfs@vger.kernel.org, linux-mm@kvack.org,
linux-fsdevel@vger.kernel.org, akpm@linux-foundation.org,
ziy@nvidia.com, vbabka@suse.cz, seanjc@google.com,
willy@infradead.org, david@redhat.com, hughd@google.com,
linmiaohe@huawei.com, muchun.song@linux.dev, osalvador@suse.de,
p.raghav@samsung.com, da.gomez@samsung.com, hare@suse.de,
john.g.garry@oracle.com, mcgrof@kernel.org
Subject: [PATCH v2 5/5] fstests: add stress truncation + writeback test
Date: Fri, 14 Jun 2024 17:29:34 -0700 [thread overview]
Message-ID: <20240615002935.1033031-6-mcgrof@kernel.org> (raw)
In-Reply-To: <20240615002935.1033031-1-mcgrof@kernel.org>
Stress test folio splits by using the new debugfs interface to target a
smaller folio order while triggering writeback at the same time. The
crash this reproduces is only known to occur with a minimum folio order
enabled, for example with a 16k block size XFS test profile. The xarray
fix for it is already merged: kernel commit 2a0774c2886d ("XArray: set
the marks correctly when splitting an entry").
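For reference, the knob can also be driven by hand; a minimal sketch,
assuming debugfs is mounted at the usual path (fstests resolves the real
path as $DEBUGFS_MNT, and the per-range form with a trailing order
argument only exists on kernels with non-zero-order split support):

```shell
#!/bin/sh
# Sketch: split all THPs system-wide via the debugfs knob.
# The DEBUGFS path here is an assumption for illustration.
DEBUGFS=${DEBUGFS:-/sys/kernel/debug}
knob="$DEBUGFS/split_huge_pages"
if [ -w "$knob" ]; then
	# Split every THP on the system (what _split_huge_pages_all does).
	echo 1 > "$knob"
	# Newer kernels also accept <pid>,<vaddr_start>,<vaddr_end>[,<new_order>]
	# to split only a mapped range to a target order.
else
	echo "split_huge_pages knob unavailable (need CONFIG_DEBUG_FS)" >&2
fi
```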
To inspect more closely, you will want to enable this on your kernel
boot command line:

  dyndbg='file mm/huge_memory.c +p'
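The same prints can also be toggled at runtime, assuming
CONFIG_DYNAMIC_DEBUG and a mounted debugfs (a sketch, not part of the
test):

```shell
#!/bin/sh
# Runtime equivalent of the dyndbg= boot parameter; path assumed.
CTL=/sys/kernel/debug/dynamic_debug/control
if [ -w "$CTL" ]; then
	echo 'file mm/huge_memory.c +p' > "$CTL" ||
		echo "enabling failed (no matching callsites?)" >&2
else
	echo "dynamic debug control not writable; use the boot parameter" >&2
fi
```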
Since we want to race large folio splits, we also augment the full test
output log $seqres.full with the test's number of successful and failed
splits, taken from the vmstat thp_split_page and thp_split_page_failed
counters. The larger thp_split_page is, the harder we stress this path.

This test reproduces an otherwise very hard to hit crash immediately.
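The delta logged to $seqres.full boils down to subtracting two
/proc/vmstat snapshots; a self-contained sketch of that computation (the
sample counter values below are made up):

```shell
#!/bin/sh
# Reads a vmstat-style snapshot from stdin and prints the named counter.
proc_vmstat() {
	awk -v name="$1" '$1 == name { print $2; exit }'
}

# Hypothetical before/after snapshots of the two counters we log.
before="thp_split_page 10
thp_split_page_failed 2"
after="thp_split_page 250
thp_split_page_failed 7"

b=$(printf '%s\n' "$before" | proc_vmstat thp_split_page)
a=$(printf '%s\n' "$after" | proc_vmstat thp_split_page)
echo "vmstat thp_split_page: $((a - b))"    # -> vmstat thp_split_page: 240
```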
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
common/rc | 14 ++++
tests/generic/751 | 170 ++++++++++++++++++++++++++++++++++++++++++
tests/generic/751.out | 2 +
3 files changed, 186 insertions(+)
create mode 100755 tests/generic/751
create mode 100644 tests/generic/751.out
diff --git a/common/rc b/common/rc
index 30beef4e5c02..31ad30276ca6 100644
--- a/common/rc
+++ b/common/rc
@@ -158,6 +158,20 @@ _require_vm_compaction()
_notrun "Need compaction enabled CONFIG_COMPACTION=y"
fi
}
+
+# Requires CONFIG_DEBUG_FS and the split_huge_pages knob
+_require_split_huge_pages_knob()
+{
+ if [ ! -f $DEBUGFS_MNT/split_huge_pages ]; then
+		_notrun "Needs CONFIG_DEBUG_FS and the split_huge_pages knob"
+ fi
+}
+
+_split_huge_pages_all()
+{
+ echo 1 > $DEBUGFS_MNT/split_huge_pages
+}
+
# Get hugepagesize in bytes
_get_hugepagesize()
{
diff --git a/tests/generic/751 b/tests/generic/751
new file mode 100755
index 000000000000..ac0ca2f07443
--- /dev/null
+++ b/tests/generic/751
@@ -0,0 +1,170 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (C) 2024 Luis Chamberlain. All Rights Reserved.
+#
+# FS QA Test No. 751
+#
+# stress page cache truncation + writeback
+#
+# This aims at trying to reproduce a difficult to reproduce bug found with
+# min order. The issue was root caused to an xarray bug when we split folios
+# to another order other than 0. This functionality is used to support min
+# order. The crash:
+#
+# https://gist.github.com/mcgrof/d12f586ec6ebe32b2472b5d634c397df
+# Crash excerpt is as follows:
+#
+# BUG: kernel NULL pointer dereference, address: 0000000000000036
+# #PF: supervisor read access in kernel mode
+# #PF: error_code(0x0000) - not-present page
+# PGD 0 P4D 0
+# Oops: 0000 [#1] PREEMPT SMP NOPTI
+# CPU: 7 PID: 2190 Comm: kworker/u38:5 Not tainted 6.9.0-rc5+ #14
+# Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
+# Workqueue: writeback wb_workfn (flush-7:5)
+# RIP: 0010:filemap_get_folios_tag+0xa9/0x200
+# Call Trace:
+# <TASK>
+# writeback_iter+0x17d/0x310
+# write_cache_pages+0x42/0xa0
+# iomap_writepages+0x33/0x50
+# xfs_vm_writepages+0x63/0x90 [xfs]
+# do_writepages+0xcc/0x260
+# __writeback_single_inode+0x3d/0x340
+# writeback_sb_inodes+0x1ed/0x4b0
+# __writeback_inodes_wb+0x4c/0xe0
+# wb_writeback+0x267/0x2d0
+# wb_workfn+0x2a4/0x440
+# process_one_work+0x189/0x3b0
+# worker_thread+0x273/0x390
+# kthread+0xda/0x110
+# ret_from_fork+0x2d/0x50
+# ret_from_fork_asm+0x1a/0x30
+# </TASK>
+#
+# This may also catch future truncation bugs, as truncating any mapped
+# file through the collateral of using echo 1 > split_huge_pages will
+# always respect the min order. Truncating to a larger order is then
+# exercised when this test is run against any filesystem LBS profile or
+# on an LBS device.
+#
+# If you're running this and want to look under the hood you may want to
+# enable:
+#
+# dyndbg='file mm/huge_memory.c +p'
+#
+# This test aims at increasing the rate of successful splits, so we want
+# a large value of thp_split_page in $seqres.full. Using echo 1 >
+# split_huge_pages is extremely aggressive, and also splits anonymous
+# memory on the system; however, we accept that tradeoff for the
+# efficiency of doing the work in-kernel for any mapped file too. Our
+# general goal here is to race folio splits with truncation + writeback.
+
+. ./common/preamble
+
+_begin_fstest auto long_rw stress soak smoketest
+
+# Override the default cleanup function.
+_cleanup()
+{
+ cd /
+ rm -f $tmp.*
+ rm -f $runfile
+ kill -9 $split_huge_pages_files_pid > /dev/null 2>&1
+}
+
+fio_config=$tmp.fio
+fio_out=$tmp.fio.out
+fio_err=$tmp.fio.err
+
+# real QA test starts here
+_supported_fs generic
+_require_test
+_require_scratch
+_require_debugfs
+_require_split_huge_pages_knob
+_require_command "$KILLALL_PROG" "killall"
+_fixed_by_git_commit kernel 2a0774c2886d \
+ "XArray: set the marks correctly when splitting an entry"
+
+proc_vmstat()
+{
+	awk -v name="$1" '$1 == name { print $2; exit }' /proc/vmstat
+}
+
+# we need buffered IO to force truncation races with writeback in the
+# page cache
+cat >$fio_config <<EOF
+[force_large_folio_parallel_writes]
+ignore_error=ENOSPC
+nrfiles=10
+direct=0
+bs=4M
+group_reporting=1
+filesize=1GiB
+readwrite=write
+fallocate=none
+numjobs=$(nproc)
+directory=$SCRATCH_MNT
+runtime=100*${TIME_FACTOR}
+time_based
+EOF
+
+_require_fio $fio_config
+
+echo "Silence is golden"
+
+_scratch_mkfs >>$seqres.full 2>&1
+_scratch_mount >> $seqres.full 2>&1
+
+# used to let our loops know when to stop
+runfile="$tmp.keep.running.loop"
+touch $runfile
+
+# The background ops are run out of band; the goal is to race them with fio.
+
+# Force folio splits when possible; this seems to be screaming for
+# MADV_NOHUGEPAGE support for large folios.
+while [ -e $runfile ]; do
+ _split_huge_pages_all >/dev/null 2>&1
+done &
+split_huge_pages_files_pid=$!
+
+split_count_before=0
+split_count_failed_before=0
+
+if grep -q thp_split_page /proc/vmstat; then
+ split_count_before=$(proc_vmstat thp_split_page)
+ split_count_failed_before=$(proc_vmstat thp_split_page_failed)
+else
+ echo "no thp_split_page in /proc/vmstat" >> $seqres.full
+fi
+
+# we blast away with large writes to force large folio writes when
+# possible.
+echo -e "Running fio with config:\n" >> $seqres.full
+cat $fio_config >> $seqres.full
+$FIO_PROG $fio_config --alloc-size=$(( $(nproc) * 8192 )) \
+ --output=$fio_out 2> $fio_err
+FIO_ERR=$?
+
+rm -f $runfile
+
+wait > /dev/null 2>&1
+
+if grep -q thp_split_page /proc/vmstat; then
+ split_count_after=$(proc_vmstat thp_split_page)
+ split_count_failed_after=$(proc_vmstat thp_split_page_failed)
+ thp_split_page=$((split_count_after - split_count_before))
+ thp_split_page_failed=$((split_count_failed_after - split_count_failed_before))
+
+ echo "vmstat thp_split_page: $thp_split_page" >> $seqres.full
+ echo "vmstat thp_split_page_failed: $thp_split_page_failed" >> $seqres.full
+fi
+
+# exitall_on_error=ENOSPC does not work as it should, so we need this eyesore
+if [[ $FIO_ERR -ne 0 ]] && ! grep -q "No space left on device" $fio_err; then
+ _fail "fio failed with err: $FIO_ERR"
+fi
+
+status=0
+exit
diff --git a/tests/generic/751.out b/tests/generic/751.out
new file mode 100644
index 000000000000..6479fa6f1404
--- /dev/null
+++ b/tests/generic/751.out
@@ -0,0 +1,2 @@
+QA output created by 751
+Silence is golden
--
2.43.0