* [PATCH] generic/551: prevent OOM on NFS on systems with no swap memory
From: Yongcheng Yang @ 2026-03-25 3:58 UTC
To: fstests; +Cc: smayhew, zlang
From: Scott Mayhew <smayhew@redhat.com>
We have frequently observed the oom-killer killing aio-dio-write-verify
when generic/551 is run on NFS filesystems on virtual machines in AWS.

Virtual machines in AWS typically don't have a swap partition, so check
for that condition when testing NFS and only use 90% of available memory
when generating the list of write operations passed to
aio-dio-write-verify.
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
---
common/rc | 5 +++++
tests/generic/551 | 7 +++++++
2 files changed, 12 insertions(+)
diff --git a/common/rc b/common/rc
index 92cb6982..f2a4496f 100644
--- a/common/rc
+++ b/common/rc
@@ -1201,6 +1201,11 @@ _available_memory_bytes()
 	fi
 }

+_total_swap_bytes()
+{
+	free -b | awk '/^Swap/ { print $2 }'
+}
+
 _check_minimal_fs_size()
 {
 	local fssize=$1
diff --git a/tests/generic/551 b/tests/generic/551
index 267c57ec..9c5ec4b2 100755
--- a/tests/generic/551
+++ b/tests/generic/551
@@ -38,11 +38,18 @@ do_test()
 	local truncsize
 	local total_size=0
 	local avail_mem=`_available_memory_bytes`
+	local total_swap=`_total_swap_bytes`

 	# To avoid OOM on tmpfs, subtract the amount of available memory
 	# allocated for the tmpfs
 	[ "$FSTYP" = "tmpfs" ] && avail_mem=$((avail_mem - free_size_k * 1024))

+	# To avoid OOM on NFS on systems with no swap memory, only use 90%
+	# of available memory when generating the list of write operations
+	if [ "$FSTYP" = "nfs" -a $total_swap -eq 0 ]; then
+		avail_mem=$((avail_mem - avail_mem / 10))
+	fi
+
 	# the number of AIO write operation
 	num_oper=$((RANDOM % 64 + 1))

--
2.52.0
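
For readers following the logic of the patch, here is a minimal standalone sketch of the same check. It is not the code as submitted. It assumes a Linux /proc/meminfo with a SwapTotal field reported in kB, and the helper names total_swap_bytes and adjust_avail_mem are hypothetical (they do not exist in fstests). Unlike the patch, it treats a missing or empty swap value as zero, so the numeric comparison never sees an empty operand.

```shell
# Hypothetical helpers for illustration only; not part of fstests.

# Print total swap in bytes. Assumes the SwapTotal line in the given
# meminfo-style file (default /proc/meminfo) is expressed in kB.
# Prints 0 when the field cannot be read.
total_swap_bytes()
{
	local meminfo=${1:-/proc/meminfo}
	local kb

	kb=$(awk '/^SwapTotal:/ { print $2 }' "$meminfo" 2>/dev/null)
	echo $(( ${kb:-0} * 1024 ))
}

# Print the memory budget in bytes: 90% of avail (integer arithmetic,
# same formula as the patch) when testing NFS with no swap, otherwise
# avail unchanged.
adjust_avail_mem()
{
	local fstyp=$1
	local avail=$2
	local swap=${3:-0}

	if [ "$fstyp" = "nfs" ] && [ "$swap" -eq 0 ]; then
		avail=$((avail - avail / 10))
	fi
	echo "$avail"
}

adjust_avail_mem nfs 1000 0	# prints 900
```

On a real system the third argument would come from "$(total_swap_bytes)". Using two separate [ ] tests joined with && avoids the -a operator, which POSIX marks as obsolescent.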
* Re: [PATCH] generic/551: prevent OOM on NFS on systems with no swap memory
From: Christoph Hellwig @ 2026-03-25 5:52 UTC
To: Yongcheng Yang; +Cc: fstests, smayhew, zlang, linux-nfs
On Wed, Mar 25, 2026 at 11:58:22AM +0800, Yongcheng Yang wrote:
> From: Scott Mayhew <smayhew@redhat.com>
>
> We have frequently observed the oom-killer killing aio-dio-write-verify
> when generic/551 is run on NFS filesystems on virtual machines in AWS.
>
> Virtual machines in AWS typically don't have a swap partition, so check
> for that condition when testing NFS and only use 90% of available memory
> when generating the list of write operations passed to
> aio-dio-write-verify.
I don't think this is a good idea. The proper fix is to reduce whatever
crazy large memory allocations this workload causes in NFS. I suspect
it might be page lists or similar, and just breaking them into somewhat
smaller chunks and/or using potentially failing allocations to
dynamically adjust would help.
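
As a rough userspace analogy for the approach suggested above (smaller chunks, with fallible requests driving the chunk size), the following sketch halves the request whenever an attempt fails. It is not NFS or kernel code: try_alloc is a hypothetical stand-in that "succeeds" only for requests up to max_ok units, and process_in_chunks is likewise an illustrative name.

```shell
# Hypothetical stand-in for an allocation that is allowed to fail:
# succeeds only when the requested size is at most $max_ok units.
try_alloc()
{
	[ "$1" -le "$max_ok" ]
}

# Cover `total` units in chunks, starting at `chunk` units and halving
# the chunk size each time a request fails. Prints every chunk size that
# was granted; returns 1 if even a single unit cannot be obtained.
process_in_chunks()
{
	local total=$1
	local chunk=$2
	local covered=0

	while [ "$covered" -lt "$total" ]; do
		# Never request more than what is still outstanding.
		if [ "$chunk" -gt $((total - covered)) ]; then
			chunk=$((total - covered))
		fi
		if try_alloc "$chunk"; then
			echo "$chunk"
			covered=$((covered + chunk))
		elif [ "$chunk" -gt 1 ]; then
			chunk=$((chunk / 2))
		else
			return 1
		fi
	done
}

max_ok=16
process_in_chunks 100 64
```

With max_ok=16, the requests for 64 and 32 units fail, then six chunks of 16 and a final chunk of 4 are granted, so the work completes without ever needing one large allocation.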
* Re: [PATCH] generic/551: prevent OOM on NFS on systems with no swap memory
From: Darrick J. Wong @ 2026-03-25 15:09 UTC
To: Christoph Hellwig; +Cc: Yongcheng Yang, fstests, smayhew, zlang, linux-nfs
On Tue, Mar 24, 2026 at 10:52:05PM -0700, Christoph Hellwig wrote:
> On Wed, Mar 25, 2026 at 11:58:22AM +0800, Yongcheng Yang wrote:
> > From: Scott Mayhew <smayhew@redhat.com>
> >
> > We have frequently observed the oom-killer killing aio-dio-write-verify
> > when generic/551 is run on NFS filesystems on virtual machines in AWS.
> >
> > Virtual machines in AWS typically don't have a swap partition, so check
> > for that condition when testing NFS and only use 90% of available memory
> > when generating the list of write operations passed to
> > aio-dio-write-verify.
>
> I don't think this is a good idea. The proper fix is to reduce whatever
> crazy large memory allocations this workload causes in NFS. I suspect
> it might be page lists or similar, and just breaking them into somewhat
> smaller chunks and/or using potentially failing allocations to
> dynamically adjust would help.
I run fstests on XFS every night on a fleet of VMs with no swap and
never hit OOM.
Or at least I didn't until IT mandated CrowdStrike last week and now
it's anyone's guess if the test results are valid. <grumble>
--D