From: "Darrick J. Wong" <djwong@kernel.org>
To: Joanne Koong <joannelkoong@gmail.com>
Cc: Zorro Lang <zlang@redhat.com>,
fstests@vger.kernel.org, linux-fsdevel@vger.kernel.org,
bfoster@redhat.com, nirjhar@linux.ibm.com, kernel-team@meta.com
Subject: Re: [PATCH v5 1/2] fsx: support reads/writes from buffers backed by hugepages
Date: Mon, 3 Feb 2025 11:41:47 -0800 [thread overview]
Message-ID: <20250203194147.GA134498@frogsfrogsfrogs> (raw)
In-Reply-To: <CAJnrk1bX27KAOxChMs5pRNmrjjuxjYu11GG==vTN0sa8Qf2Uqw@mail.gmail.com>
On Mon, Feb 03, 2025 at 11:23:20AM -0800, Joanne Koong wrote:
> On Mon, Feb 3, 2025 at 10:59 AM Darrick J. Wong <djwong@kernel.org> wrote:
> >
> > On Mon, Feb 03, 2025 at 10:04:04AM -0800, Joanne Koong wrote:
> > > On Sun, Feb 2, 2025 at 6:25 AM Zorro Lang <zlang@redhat.com> wrote:
> > > >
> > > > On Tue, Jan 21, 2025 at 01:56:40PM -0800, Joanne Koong wrote:
> > > > > Add support for reads/writes from buffers backed by hugepages.
> > > > > This can be enabled through the '-h' flag. This flag should only be used
> > > > > on systems where THP capabilities are enabled.
> > > > >
> > > > > This is motivated by a recent bug that was due to faulty handling of
> > > > > userspace buffers backed by hugepages. This patch is a mitigation
> > > > > against problems like this in the future.
> > > > >
> > > > > Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> > > > > Reviewed-by: Brian Foster <bfoster@redhat.com>
> > > > > Reviewed-by: Darrick J. Wong <djwong@kernel.org>
> > > > > ---
> > > >
> > > > Those two test cases fail on old system which doesn't support
> > > > MADV_COLLAPSE:
> > > >
> > > > fsx -N 10000 -l 500000 -h
> > > > -fsx -N 10000 -o 8192 -l 500000 -h
> > > > -fsx -N 10000 -o 128000 -l 500000 -h
> > > > +MADV_COLLAPSE not supported. Can't support -h
> > > >
> > > > and
> > > >
> > > > fsx -N 10000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -h
> > > > -fsx -N 10000 -o 8192 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -h
> > > > -fsx -N 10000 -o 128000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -h
> > > > +mapped writes DISABLED
> > > > +MADV_COLLAPSE not supported. Can't support -h
> > > >
> > > > I'm wondering ...
> > > >
> > > > > ltp/fsx.c | 166 +++++++++++++++++++++++++++++++++++++++++++++++++-----
> > > > > 1 file changed, 153 insertions(+), 13 deletions(-)
> > > > >
> > > > > diff --git a/ltp/fsx.c b/ltp/fsx.c
> > > > > index 41933354..3be383c6 100644
> > > > > --- a/ltp/fsx.c
> > > > > +++ b/ltp/fsx.c
> > > > > static struct option longopts[] = {
> > > > > {"replay-ops", required_argument, 0, 256},
> > > > > {"record-ops", optional_argument, 0, 255},
> > > > > @@ -2883,7 +3023,7 @@ main(int argc, char **argv)
> > > > > setvbuf(stdout, (char *)0, _IOLBF, 0); /* line buffered stdout */
> > > > >
> > > > > while ((ch = getopt_long(argc, argv,
> > > > > - "0b:c:de:fg:i:j:kl:m:no:p:qr:s:t:uw:xyABD:EFJKHzCILN:OP:RS:UWXZ",
> > > > > + "0b:c:de:fg:hi:j:kl:m:no:p:qr:s:t:uw:xyABD:EFJKHzCILN:OP:RS:UWXZ",
> > > > > longopts, NULL)) != EOF)
> > > > > switch (ch) {
> > > > > case 'b':
> > > > > @@ -2916,6 +3056,14 @@ main(int argc, char **argv)
> > > > > case 'g':
> > > > > filldata = *optarg;
> > > > > break;
> > > > > + case 'h':
> > > > > +#ifndef MADV_COLLAPSE
> > > > > + fprintf(stderr, "MADV_COLLAPSE not supported. "
> > > > > + "Can't support -h\n");
> > > > > + exit(86);
> > > > > +#endif
> > > > > + hugepages = 1;
> > > > > + break;
> > > >
> > > > ...
> > > > if we could change this part to:
> > > >
> > > > #ifdef MADV_COLLAPSE
> > > > hugepages = 1;
> > > > #endif
> > > > break;
> > > >
> > > > to avoid the test failures on old systems.
> > > >
> > > > Or any better ideas from you :)
> > >
> > > Hi Zorro,
> > >
> > > It looks like MADV_COLLAPSE was introduced in kernel version 6.1. What
> > > do you think about skipping generic/758 and generic/759 if the kernel
> > > version is older than 6.1? That to me seems more preferable than the
> > > paste above, as the paste above would execute the test as if it did
> > > test hugepages when in reality it didn't, which would be misleading.
> >
> > Now that I've gotten to try this out --
> >
> > There's a couple of things going on here. The first is that generic/759
> > and 760 need to check if invoking fsx -h causes it to spit out the
> > "MADV_COLLAPSE not supported" error and _notrun the test.
> >
> > The second thing is that userspace programs can ensure the existence of
> > MADV_COLLAPSE in multiple ways. The first way is through sys/mman.h,
> > which requires that the underlying C library headers are new enough to
> > include a definition. glibc 2.37 is new enough, but even things like
> > Debian 12 and RHEL 9 aren't new enough to have that. Other C libraries
> > might not follow glibc's practice of wrapping and/or redefining symbols
> > in a way that you hope is the same as...
> >
> > The second way is through linux/mman.h, which comes from the kernel
> > headers package; and the third way is for the program to define it
> > itself if nobody else does.
> >
> > So I think the easiest way to fix the fsx.c build is to include
> > linux/mman.h in addition to sys/mman.h. Sorry I didn't notice these
>
> Thanks for your input. Do we still need sys/mman.h if linux/mman.h is added?
Yes, because glibc provides the mmap() function that wraps
syscall(__NR_mmap, ...);
> For generic/758 and 759, does it suffice to gate this on whether the
> kernel version if 6.1+ and _notrun if not? My understanding is that
> any kernel version 6.1 or newer will have MADV_COLLAPSE in its kernel
> headers package and support the feature.
No, because some (most?) vendors backport new features into existing
kernels without revving the version number of that kernel.
Maybe the following fixes things?
--D
generic/759,760: fix MADV_COLLAPSE detection and inclusion
On systems with "old" C libraries such as glibc 2.36 in Debian 12, the
MADV_COLLAPSE flag might not be defined in any of the header files
pulled in by sys/mman.h, which means that the fsx binary might not get
built with any of the MADV_COLLAPSE code. If the kernel supports THP,
the test will fail with:
> QA output created by 760
> fsx -N 10000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -h
> -fsx -N 10000 -o 8192 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -h
> -fsx -N 10000 -o 128000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -h
> +mapped writes DISABLED
> +MADV_COLLAPSE not supported. Can't support -h
Fix both tests to detect fsx binaries that don't support MADV_COLLAPSE,
then fix fsx.c to include the mman.h from the kernel headers (aka
linux/mman.h) so that we can actually test IOs to and from THPs if the
kernel is newer than the rest of userspace.
Cc: <fstests@vger.kernel.org> # v2025.02.02
Fixes: 627289232371e3 ("generic: add tests for read/writes from hugepages-backed buffers")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
ltp/fsx.c | 1 +
tests/generic/759 | 3 +++
tests/generic/760 | 3 +++
3 files changed, 7 insertions(+)
diff --git a/ltp/fsx.c b/ltp/fsx.c
index 634c496ffe9317..cf9502a74c17a7 100644
--- a/ltp/fsx.c
+++ b/ltp/fsx.c
@@ -20,6 +20,7 @@
#include <strings.h>
#include <sys/file.h>
#include <sys/mman.h>
+#include <linux/mman.h>
#include <sys/uio.h>
#include <stdbool.h>
#ifdef HAVE_ERR_H
diff --git a/tests/generic/759 b/tests/generic/759
index 6c74478aa8a0e0..3549c5ed6a9299 100755
--- a/tests/generic/759
+++ b/tests/generic/759
@@ -14,6 +14,9 @@ _begin_fstest rw auto quick
_require_test
_require_thp
+$here/ltp/fsx -N 0 -h $TEST_DIR 2>&1 | grep -q 'MADV_COLLAPSE not supported' && \
+ _notrun "fsx binary does not support MADV_COLLAPSE"
+
run_fsx -N 10000 -l 500000 -h
run_fsx -N 10000 -o 8192 -l 500000 -h
run_fsx -N 10000 -o 128000 -l 500000 -h
diff --git a/tests/generic/760 b/tests/generic/760
index c71a196222ad3b..2fbd28502ae678 100755
--- a/tests/generic/760
+++ b/tests/generic/760
@@ -15,6 +15,9 @@ _require_test
_require_odirect
_require_thp
+$here/ltp/fsx -N 0 -h $TEST_DIR 2>&1 | grep -q 'MADV_COLLAPSE not supported' && \
+ _notrun "fsx binary does not support MADV_COLLAPSE"
+
psize=`$here/src/feature -s`
bsize=`$here/src/min_dio_alignment $TEST_DIR $TEST_DEV`
next prev parent reply other threads:[~2025-02-03 19:41 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-21 21:56 [PATCH v5 0/2] fstests: test reads/writes from hugepages-backed buffers Joanne Koong
2025-01-21 21:56 ` [PATCH v5 1/2] fsx: support reads/writes from buffers backed by hugepages Joanne Koong
2025-02-02 14:25 ` Zorro Lang
2025-02-03 18:04 ` Joanne Koong
2025-02-03 18:59 ` Darrick J. Wong
2025-02-03 19:23 ` Joanne Koong
2025-02-03 19:41 ` Darrick J. Wong [this message]
2025-02-03 19:57 ` Joanne Koong
2025-02-03 20:01 ` Darrick J. Wong
2025-02-03 21:40 ` Joanne Koong
2025-02-03 21:54 ` Darrick J. Wong
2025-02-04 4:21 ` Darrick J. Wong
2025-02-04 20:50 ` Joanne Koong
2025-02-04 21:31 ` Darrick J. Wong
2025-02-04 21:52 ` Joanne Koong
2025-02-04 17:05 ` Zorro Lang
2025-02-04 20:54 ` Joanne Koong
2025-01-21 21:56 ` [PATCH v5 2/2] generic: add tests for read/writes from hugepages-backed buffers Joanne Koong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250203194147.GA134498@frogsfrogsfrogs \
--to=djwong@kernel.org \
--cc=bfoster@redhat.com \
--cc=fstests@vger.kernel.org \
--cc=joannelkoong@gmail.com \
--cc=kernel-team@meta.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=nirjhar@linux.ibm.com \
--cc=zlang@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox