FS/XFS testing framework
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <djwong@kernel.org>
To: Joanne Koong <joannelkoong@gmail.com>
Cc: Zorro Lang <zlang@redhat.com>,
	fstests@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	bfoster@redhat.com, nirjhar@linux.ibm.com, kernel-team@meta.com
Subject: Re: [PATCH v5 1/2] fsx: support reads/writes from buffers backed by hugepages
Date: Mon, 3 Feb 2025 12:01:49 -0800	[thread overview]
Message-ID: <20250203200149.GC134490@frogsfrogsfrogs> (raw)
In-Reply-To: <CAJnrk1Y_eDFOnob3N78O3jcRoHy6Y0jaxnXbgVT0okBjwJue3g@mail.gmail.com>

On Mon, Feb 03, 2025 at 11:57:23AM -0800, Joanne Koong wrote:
> On Mon, Feb 3, 2025 at 11:41 AM Darrick J. Wong <djwong@kernel.org> wrote:
> >
> > On Mon, Feb 03, 2025 at 11:23:20AM -0800, Joanne Koong wrote:
> > > On Mon, Feb 3, 2025 at 10:59 AM Darrick J. Wong <djwong@kernel.org> wrote:
> > > >
> > > > On Mon, Feb 03, 2025 at 10:04:04AM -0800, Joanne Koong wrote:
> > > > > On Sun, Feb 2, 2025 at 6:25 AM Zorro Lang <zlang@redhat.com> wrote:
> > > > > >
> > > > > > On Tue, Jan 21, 2025 at 01:56:40PM -0800, Joanne Koong wrote:
> > > > > > > Add support for reads/writes from buffers backed by hugepages.
> > > > > > > This can be enabled through the '-h' flag. This flag should only be used
> > > > > > > on systems where THP capabilities are enabled.
> > > > > > >
> > > > > > > This is motivated by a recent bug that was due to faulty handling of
> > > > > > > userspace buffers backed by hugepages. This patch is a mitigation
> > > > > > > against problems like this in the future.
> > > > > > >
> > > > > > > Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> > > > > > > Reviewed-by: Brian Foster <bfoster@redhat.com>
> > > > > > > Reviewed-by: Darrick J. Wong <djwong@kernel.org>
> > > > > > > ---
> > > > > >
> > > > > > Those two test cases fail on old system which doesn't support
> > > > > > MADV_COLLAPSE:
> > > > > >
> > > > > >    fsx -N 10000 -l 500000 -h
> > > > > >   -fsx -N 10000 -o 8192 -l 500000 -h
> > > > > >   -fsx -N 10000 -o 128000 -l 500000 -h
> > > > > >   +MADV_COLLAPSE not supported. Can't support -h
> > > > > >
> > > > > > and
> > > > > >
> > > > > >    fsx -N 10000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -h
> > > > > >   -fsx -N 10000 -o 8192 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -h
> > > > > >   -fsx -N 10000 -o 128000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -h
> > > > > >   +mapped writes DISABLED
> > > > > >   +MADV_COLLAPSE not supported. Can't support -h
> > > > > >
> > > > > > I'm wondering ...
> > > > > >
> > > > > > >  ltp/fsx.c | 166 +++++++++++++++++++++++++++++++++++++++++++++++++-----
> > > > > > >  1 file changed, 153 insertions(+), 13 deletions(-)
> > > > > > >
> > > > > > > diff --git a/ltp/fsx.c b/ltp/fsx.c
> > > > > > > index 41933354..3be383c6 100644
> > > > > > > --- a/ltp/fsx.c
> > > > > > > +++ b/ltp/fsx.c
> > > > > > >  static struct option longopts[] = {
> > > > > > >       {"replay-ops", required_argument, 0, 256},
> > > > > > >       {"record-ops", optional_argument, 0, 255},
> > > > > > > @@ -2883,7 +3023,7 @@ main(int argc, char **argv)
> > > > > > >       setvbuf(stdout, (char *)0, _IOLBF, 0); /* line buffered stdout */
> > > > > > >
> > > > > > >       while ((ch = getopt_long(argc, argv,
> > > > > > > -                              "0b:c:de:fg:i:j:kl:m:no:p:qr:s:t:uw:xyABD:EFJKHzCILN:OP:RS:UWXZ",
> > > > > > > +                              "0b:c:de:fg:hi:j:kl:m:no:p:qr:s:t:uw:xyABD:EFJKHzCILN:OP:RS:UWXZ",
> > > > > > >                                longopts, NULL)) != EOF)
> > > > > > >               switch (ch) {
> > > > > > >               case 'b':
> > > > > > > @@ -2916,6 +3056,14 @@ main(int argc, char **argv)
> > > > > > >               case 'g':
> > > > > > >                       filldata = *optarg;
> > > > > > >                       break;
> > > > > > > +             case 'h':
> > > > > > > +#ifndef MADV_COLLAPSE
> > > > > > > +                     fprintf(stderr, "MADV_COLLAPSE not supported. "
> > > > > > > +                             "Can't support -h\n");
> > > > > > > +                     exit(86);
> > > > > > > +#endif
> > > > > > > +                     hugepages = 1;
> > > > > > > +                     break;
> > > > > >
> > > > > > ...
> > > > > > if we could change this part to:
> > > > > >
> > > > > > #ifdef MADV_COLLAPSE
> > > > > >         hugepages = 1;
> > > > > > #endif
> > > > > >         break;
> > > > > >
> > > > > > to avoid the test failures on old systems.
> > > > > >
> > > > > > Or any better ideas from you :)
> > > > >
> > > > > Hi Zorro,
> > > > >
> > > > > It looks like MADV_COLLAPSE was introduced in kernel version 6.1. What
> > > > > do you think about skipping generic/758 and generic/759 if the kernel
> > > > > version is older than 6.1? That to me seems more preferable than the
> > > > > paste above, as the paste above would execute the test as if it did
> > > > > test hugepages when in reality it didn't, which would be misleading.
> > > >
> > > > Now that I've gotten to try this out --
> > > >
> > > > There's a couple of things going on here.  The first is that generic/759
> > > > and 760 need to check if invoking fsx -h causes it to spit out the
> > > > "MADV_COLLAPSE not supported" error and _notrun the test.
> > > >
> > > > The second thing is that userspace programs can ensure the existence of
> > > > MADV_COLLAPSE in multiple ways.  The first way is through sys/mman.h,
> > > > which requires that the underlying C library headers are new enough to
> > > > include a definition.  glibc 2.37 is new enough, but even things like
> > > > Debian 12 and RHEL 9 aren't new enough to have that.  Other C libraries
> > > > might not follow glibc's practice of wrapping and/or redefining symbols
> > > > in a way that you hope is the same as...
> > > >
> > > > The second way is through linux/mman.h, which comes from the kernel
> > > > headers package; and the third way is for the program to define it
> > > > itself if nobody else does.
> > > >
> > > > So I think the easiest way to fix the fsx.c build is to include
> > > > linux/mman.h in addition to sys/mman.h.  Sorry I didn't notice these
> > >
> > > Thanks for your input. Do we still need sys/mman.h if linux/mman.h is added?
> >
> > Yes, because glibc provides the mmap() function that wraps
> > syscall(__NR_mmap, ...);
> >
> > > For generic/758 and 759, does it suffice to gate this on whether the
> > > kernel version if 6.1+ and _notrun if not? My understanding is that
> > > any kernel version 6.1 or newer will have MADV_COLLAPSE in its kernel
> > > headers package and support the feature.
> >
> > No, because some (most?) vendors backport new features into existing
> > kernels without revving the version number of that kernel.
> 
> Oh okay, I see. That makes sense, thanks for the explanation.
> 
> >
> > Maybe the following fixes things?
> >
> > --D
> >
> > generic/759,760: fix MADV_COLLAPSE detection and inclusion
> >
> > On systems with "old" C libraries such as glibc 2.36 in Debian 12, the
> > MADV_COLLAPSE flag might not be defined in any of the header files
> > pulled in by sys/mman.h, which means that the fsx binary might not get
> > built with any of the MADV_COLLAPSE code.  If the kernel supports THP,
> > the test will fail with:
> >
> > >  QA output created by 760
> > >  fsx -N 10000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -h
> > > -fsx -N 10000 -o 8192 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -h
> > > -fsx -N 10000 -o 128000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -h
> > > +mapped writes DISABLED
> > > +MADV_COLLAPSE not supported. Can't support -h
> >
> > Fix both tests to detect fsx binaries that don't support MADV_COLLAPSE,
> > then fix fsx.c to include the mman.h from the kernel headers (aka
> > linux/mman.h) so that we can actually test IOs to and from THPs if the
> > kernel is newer than the rest of userspace.
> >
> > Cc: <fstests@vger.kernel.org> # v2025.02.02
> > Fixes: 627289232371e3 ("generic: add tests for read/writes from hugepages-backed buffers")
> > Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> > ---
> >  ltp/fsx.c         |    1 +
> >  tests/generic/759 |    3 +++
> >  tests/generic/760 |    3 +++
> >  3 files changed, 7 insertions(+)
> >
> > diff --git a/ltp/fsx.c b/ltp/fsx.c
> > index 634c496ffe9317..cf9502a74c17a7 100644
> > --- a/ltp/fsx.c
> > +++ b/ltp/fsx.c
> > @@ -20,6 +20,7 @@
> >  #include <strings.h>
> >  #include <sys/file.h>
> >  #include <sys/mman.h>
> > +#include <linux/mman.h>
> >  #include <sys/uio.h>
> >  #include <stdbool.h>
> >  #ifdef HAVE_ERR_H
> > diff --git a/tests/generic/759 b/tests/generic/759
> > index 6c74478aa8a0e0..3549c5ed6a9299 100755
> > --- a/tests/generic/759
> > +++ b/tests/generic/759
> > @@ -14,6 +14,9 @@ _begin_fstest rw auto quick
> >  _require_test
> >  _require_thp
> >
> > +$here/ltp/fsx -N 0 -h $TEST_DIR 2>&1 | grep -q 'MADV_COLLAPSE not supported' && \
> > +       _notrun "fsx binary does not support MADV_COLLAPSE"
> > +
> >  run_fsx -N 10000            -l 500000 -h
> >  run_fsx -N 10000  -o 8192   -l 500000 -h
> >  run_fsx -N 10000  -o 128000 -l 500000 -h
> > diff --git a/tests/generic/760 b/tests/generic/760
> > index c71a196222ad3b..2fbd28502ae678 100755
> > --- a/tests/generic/760
> > +++ b/tests/generic/760
> > @@ -15,6 +15,9 @@ _require_test
> >  _require_odirect
> >  _require_thp
> >
> > +$here/ltp/fsx -N 0 -h $TEST_DIR 2>&1 | grep -q 'MADV_COLLAPSE not supported' && \
> > +       _notrun "fsx binary does not support MADV_COLLAPSE"
> > +
> 
> I tried this out locally and it works for me:
> 
> generic/759 8s ... [not run] fsx binary does not support MADV_COLLAPSE
> Ran: generic/759
> Not run: generic/759
> Passed all 1 tests
> 
> SECTION       -- fuse
> =========================
> Ran: generic/759
> Not run: generic/759
> Passed all 1 tests

Does the test actually run if you /do/ have kernel/libc headers that
define MADV_COLLAPSE?  And if so, does that count as a Tested-by?

--D

> 
> Thanks,
> Joanne
> 
> >  psize=`$here/src/feature -s`
> >  bsize=`$here/src/min_dio_alignment $TEST_DIR $TEST_DEV`
> >
> 

  reply	other threads:[~2025-02-03 20:01 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-21 21:56 [PATCH v5 0/2] fstests: test reads/writes from hugepages-backed buffers Joanne Koong
2025-01-21 21:56 ` [PATCH v5 1/2] fsx: support reads/writes from buffers backed by hugepages Joanne Koong
2025-02-02 14:25   ` Zorro Lang
2025-02-03 18:04     ` Joanne Koong
2025-02-03 18:59       ` Darrick J. Wong
2025-02-03 19:23         ` Joanne Koong
2025-02-03 19:41           ` Darrick J. Wong
2025-02-03 19:57             ` Joanne Koong
2025-02-03 20:01               ` Darrick J. Wong [this message]
2025-02-03 21:40                 ` Joanne Koong
2025-02-03 21:54                   ` Darrick J. Wong
2025-02-04  4:21                     ` Darrick J. Wong
2025-02-04 20:50                       ` Joanne Koong
2025-02-04 21:31                         ` Darrick J. Wong
2025-02-04 21:52                           ` Joanne Koong
2025-02-04 17:05       ` Zorro Lang
2025-02-04 20:54         ` Joanne Koong
2025-01-21 21:56 ` [PATCH v5 2/2] generic: add tests for read/writes from hugepages-backed buffers Joanne Koong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250203200149.GC134490@frogsfrogsfrogs \
    --to=djwong@kernel.org \
    --cc=bfoster@redhat.com \
    --cc=fstests@vger.kernel.org \
    --cc=joannelkoong@gmail.com \
    --cc=kernel-team@meta.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=nirjhar@linux.ibm.com \
    --cc=zlang@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox