Re: [PATCH v1] generic/476: requires 27GB scratch size

From: "Theodore Ts'o" <tytso@mit.edu>
To: Chuck Lever III <chuck.lever@oracle.com>
Cc: Boyang Xue <bxue@redhat.com>,
	"Darrick J. Wong" <djwong@kernel.org>,
	"fstests@vger.kernel.org" <fstests@vger.kernel.org>
Subject: Re: [PATCH v1] generic/476: requires 27GB scratch size
Date: Thu, 21 Jul 2022 22:18:58 -0400
Message-ID: <YtoJElcucHMtIYQT@mit.edu>
In-Reply-To: <75AAE413-4142-45F2-BE8A-7A7867CC0DA4@oracle.com>

On Thu, Jul 21, 2022 at 06:13:48PM +0000, Chuck Lever III wrote:
> 
> I agree that Q/A and dev testing needs are distinct, so a dev might
> have a simpler series of tests to run and fewer resources to run them
> in.
> 
> That said, I've had it on my to-do list for some time to find an easy
> way to run automated multi-host tests, and I've been told it should be
> straight-forward, but I've been swamped lately.

Yeah, in a cloud or VM environment it shouldn't be *that* hard.
Especially for cloud setup, it's just a matter of launching another
cloud VM with a metadata flag that says, "please provide an NFS server
using the specified file system as the backing store", noting the IP
address, and then passing it to the client VM.  The only slightly
tricky part is monitoring and saving the serial console of the
server as a test artifact, in case the server oopses or triggers a
BUG or WARN_ON.

Unfortunately, like you, I've been swamped lately.  :-/

> Many of us in the NFS community actually don't run the tests that
> require a scratch dev, because many of them don't seem relevant to
> NFS, or they take a long time to run. Someday we should sort through
> all that :-)

It doesn't take *that* long.  In loopback mode, as well as using the
GCE Filestore Basic product as the remote NFS server, it takes between
2.5 and 3 hours to run the auto group with the test and scratch device
sized to 5GB:

nfs/filestore: 785 tests, 3 failures, 323 skipped, 9922 seconds
  Failures: generic/258 generic/444 generic/551
nfs/loopback: 814 tests, 2 failures, 342 skipped, 9364 seconds
  Failures: generic/426 generic/551

That's the same order of magnitude as for ext4 or xfs running -g auto,
and at least for me "gce-xfstests -c nfs/default -g auto" is a fire and
forget kind of thing.  2-3 hours later, the results show up in my
inbox.  It's actually *analyzing* the test failures which takes time
and NFS expertise, both of which I don't have a lot of at the moment.
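
As a quick check of the "between 2.5 and 3 hours" figure, the run times
from the two summary lines above can be converted to hours.  This is
only arithmetic on the numbers already quoted; the helper name to_hours
is made up for the illustration.

```python
# Run times in seconds, copied from the two summary lines above.
runs = {"nfs/filestore": 9922, "nfs/loopback": 9364}

def to_hours(seconds):
    """Convert a duration in seconds to hours."""
    return seconds / 3600

for name, secs in runs.items():
    print(f"{name}: {to_hours(secs):.2f} hours")
# nfs/filestore: 2.76 hours
# nfs/loopback: 2.60 hours
```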

> For the particular issue with generic/476, I would like to see if
> there's a reason that test takes a long time and fails with a small
> scratch dev before agreeing that excluding it is the proper response.

At the moment, my test runner setup assumes that if a single test
takes more than an hour, the system under test is hung and should be
killed.  So if generic/476 is taking ~400 seconds on pre-5.10 LTS
kernels, but over 24 hours on 5.10+ kernels when the watchdog safety
timer isn't in use, I need to exclude it in my test runner, at least
for now.
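
The one-hour rule can be sketched roughly as below.  This is not the
actual test runner code: the real watchdog kills the whole system under
test, while this sketch only kills a local child process.  The names
run_with_watchdog and TEST_TIMEOUT_SECS are hypothetical.

```python
import subprocess

TEST_TIMEOUT_SECS = 3600  # one hour, per the rule described above

def run_with_watchdog(cmd, timeout=TEST_TIMEOUT_SECS):
    """Run cmd and return (status, returncode); status is 'done' or 'hung'."""
    try:
        proc = subprocess.run(cmd, timeout=timeout)
        return "done", proc.returncode
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child and reaps it before raising,
        # so there is nothing left to clean up here.
        return "hung", None
```

A test that is reported as "hung" is then a candidate for the exclude
list rather than something to wait on for 24 hours.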

Once it's fixed, I can use a Linux-versioned #ifdef to only exclude
the test if the fix is not present.
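
The same version gate, written out as a Python sketch rather than an
#ifdef.  The 5.10 lower bound comes from the discussion above; the
first_fixed version is unknown at this point, so it is a placeholder
parameter, and the function names are hypothetical.

```python
import re

def kernel_version(release):
    """Parse a release string such as '5.10.0-rc1' into (major, minor, patch)."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if m is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return tuple(int(g or 0) for g in m.groups())

def should_exclude(release, first_bad=(5, 10, 0), first_fixed=None):
    """Exclude the test on kernels from first_bad up to (not including) first_fixed.

    first_fixed=None means no fix is known yet, so every kernel at or
    above first_bad is excluded.
    """
    v = kernel_version(release)
    if v < first_bad:
        return False
    return first_fixed is None or v < first_fixed
```

A pure version check like this cannot tell whether an LTS kernel has
picked up a backported fix, which is why it is only a stopgap.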

(Also on my todo wishlist is to have some way to automatically exclude
a test if a specified fix commit isn't present on the tested kernel,
but to run it automatically once the fix commit is present.
Unfortunately, I don't have the time or the business case to put
someone on it as a work project...)

					- Ted
