public inbox for kdevops@lists.linux.dev
* [PATCH] generic/551: prevent OOM when running on tmpfs with low memory
@ 2025-06-18 13:00 Daniel Gomez
  2025-06-18 18:47 ` Zorro Lang
From: Daniel Gomez @ 2025-06-18 13:00 UTC
  To: fstests, Hugh Dickins
  Cc: Luis Chamberlain, kdevops, Daniel Gomez, Chuck Lever, gost.dev,
	Daniel Gomez

From: Daniel Gomez <da.gomez@samsung.com>

Running generic/551 on a tmpfs filesystem with less than roughly 10 GB
of RAM can exhaust system memory, triggering the kernel's OOM killer,
which then terminates the aio-dio-write-v process.

Fix generic/551 by subtracting the memory allocated to the tmpfs
scratch device from the total available memory.

Reported-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Daniel Gomez <da.gomez@samsung.com>
---
While integrating tmpfs support for xfstests in kdevops CI [1], we noticed
that generic/551 can trigger the OOM killer when the scratch device is
tmpfs, because the test does not properly account for available system
memory.
Fix the test for tmpfs by subtracting the memory allocated to the
scratch tmpfs mount from the total available memory, ensuring the test
runs within safe limits.
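
For illustration only, a minimal sketch of the adjustment. It assumes
free_size_k holds the tmpfs scratch size in KiB, as the multiplication
by 1024 in the patch suggests. The helper name tmpfs_adjusted_avail_mem
is hypothetical and is not part of fstests; the numbers are made up.

```shell
# Hypothetical helper, not part of fstests: print the memory budget in
# bytes that remains once the tmpfs scratch allocation is subtracted
# from the available memory.
# $1: available memory in bytes
# $2: tmpfs scratch size in KiB (assumed unit of free_size_k)
tmpfs_adjusted_avail_mem()
{
	local avail_mem=$1
	local free_size_k=$2

	echo $((avail_mem - free_size_k * 1024))
}

# Made-up example: 8 GiB available, 4 GiB reserved for the tmpfs scratch.
tmpfs_adjusted_avail_mem $((8 * 1024 * 1024 * 1024)) $((4 * 1024 * 1024))
```

With these inputs the budget left for the AIO buffers is half of the
available memory, instead of all of it.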

[1]
https://lore.kernel.org/all/20250615-ci-workflow-v1-0-53b267cd2f0a@samsung.com/

These are the kernel oom-killer logs for generic/551 run on a system
with less than 10G of RAM:

run fstests generic/551 at 2025-06-18 11:42:44
aio-dio-write-v invoked oom-killer: gfp_mask=0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), order=0, oom_score_adj=250
CPU: 5 UID: 0 PID: 1717 Comm: aio-dio-write-v Not tainted 6.16.0-rc2-00049-g52da431bf03b #10 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 2025.02-6 04/08/2025
{...}
Tasks state (memory values in pages):
[  pid  ]   uid  tgid total_vm      rss rss_anon rss_file rss_shmem pgtables_bytes swapents oom_score_adj name
{...}
[   1717]     0  1717   876600   875978   875945       33         0 7065600        0           250 aio-dio-write-v
oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/fstests-generic-551.scope,task=aio-dio-write-v,pid=1717,uid=0
Out of memory: Killed process 1717 (aio-dio-write-v) total-vm:3506400kB, anon-rss:3503780kB, file-rss:132kB, shmem-rss:0kB, UID:0 pgtables:6900kB oom_score_adj:250
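
As a cross-check of the log above, the per-task values (reported in
pages) can be converted to the kB figures in the "Killed process" line.
This sketch assumes 4 KiB pages, which the log does not state
explicitly; pages_to_kb is a hypothetical helper name.

```shell
# Assumption: 4 KiB pages (not stated in the log itself).
page_kb=4

# Hypothetical helper: convert a page count to kB.
pages_to_kb()
{
	echo $(($1 * page_kb))
}

pages_to_kb 876600		# total_vm
pages_to_kb 875945		# rss_anon
pages_to_kb 33			# rss_file
echo $((7065600 / 1024))	# pgtables_bytes converted to kB
```

Under that assumption the four values come out as 3506400, 3503780, 132
and 6900, matching total-vm, anon-rss, file-rss and pgtables in the log:
the killed process held about 3.5 GB of anonymous memory.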

Results collected with kdevops on the following tmpfs profiles
(before/after the changes):

diff --git a/workflows/fstests/results/last-run/6.16.0-rc2-00049-g52da431bf03b/xunit_results.txt b/workflows/fstests/results/last-run/6.16.0-rc2-00049-g52da431bf03b/xunit_results.txt
index 02a1f09..0229294 100644
--- a/workflows/fstests/results/last-run/6.16.0-rc2-00049-g52da431bf03b/xunit_results.txt
+++ b/workflows/fstests/results/last-run/6.16.0-rc2-00049-g52da431bf03b/xunit_results.txt
@@ -1,21 +1,21 @@
 KERNEL:    6.16.0-rc2-00049-g52da431bf03b
 CPUS:      8

-tmpfs_noswap_huge_never: 1 tests, 1 failures, 9 seconds
-  generic/551  Failed   8s
-tmpfs_default: 1 tests, 1 failures, 5 seconds
-  generic/551  Failed   4s
-tmpfs_noswap_huge_within_size: 1 tests, 1 failures, 8 seconds
-  generic/551  Failed   8s
-tmpfs_huge_always: 1 tests, 1 failures, 11 seconds
-  generic/551  Failed   11s
-tmpfs_huge_within_size: 1 tests, 1 failures, 8 seconds
-  generic/551  Failed   8s
-tmpfs_noswap_huge_always: 1 tests, 1 failures, 6 seconds
-  generic/551  Failed   6s
-tmpfs_noswap_huge_advise: 1 tests, 1 failures, 8 seconds
-  generic/551  Failed   7s
-tmpfs_huge_advise: 1 tests, 1 failures, 8 seconds
-  generic/551  Failed   7s
-Totals: 8 tests, 0 skipped, 8 failures, 0 errors, 59s
+tmpfs_noswap_huge_advise: 1 tests, 134 seconds
+  generic/551  Pass     134s
+tmpfs_noswap_huge_never: 1 tests, 141 seconds
+  generic/551  Pass     141s
+tmpfs_huge_advise: 1 tests, 142 seconds
+  generic/551  Pass     142s
+tmpfs_default: 1 tests, 139 seconds
+  generic/551  Pass     139s
+tmpfs_noswap_huge_always: 1 tests, 109 seconds
+  generic/551  Pass     108s
+tmpfs_noswap_huge_within_size: 1 tests, 116 seconds
+  generic/551  Pass     115s
+tmpfs_huge_within_size: 1 tests, 112 seconds
+  generic/551  Pass     111s
+tmpfs_huge_always: 1 tests, 145 seconds
+  generic/551  Pass     145s
+Totals: 8 tests, 0 skipped, 0 failures, 0 errors, 1035s
---
 tests/generic/551 | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/generic/551 b/tests/generic/551
index 4a7f0a638235e272ef55ffeb3b3e548707568379..6a7376d7a8e3580bee0a5c98eacdaf93c60c8d5c 100755
--- a/tests/generic/551
+++ b/tests/generic/551
@@ -38,6 +38,7 @@ do_test()
 	local truncsize
 	local total_size=0
 	local avail_mem=`_available_memory_bytes`
+	[ "$FSTYP" = "tmpfs" ] && avail_mem=$((avail_mem - free_size_k * 1024))
 
 	# the number of AIO write operation
 	num_oper=$((RANDOM % 64 + 1))

---
base-commit: b7680adf9ff7bdc962fb95b5cbd304abd3137b69
change-id: 20250618-fix-tmpfs-generic-551-4c74b15d4c25

Best regards,
-- 
Daniel Gomez <da.gomez@samsung.com>



Thread overview: 3+ messages
2025-06-18 13:00 [PATCH] generic/551: prevent OOM when running on tmpfs with low memory Daniel Gomez
2025-06-18 18:47 ` Zorro Lang
2025-06-18 18:59   ` Daniel Gomez
