From: Leah Rumancik <leah.rumancik@gmail.com>
To: Luis Chamberlain <mcgrof@kernel.org>
Cc: Amir Goldstein <amir73il@gmail.com>,
linux-xfs <linux-xfs@vger.kernel.org>,
"Darrick J. Wong" <djwong@kernel.org>,
Theodore Tso <tytso@mit.edu>
Subject: Re: [PATCH 5.15 00/15] xfs stable candidate patches for 5.15.y
Date: Wed, 8 Jun 2022 15:16:41 -0700
Message-ID: <YqEfyR0DBbQEFv9s@google.com>
In-Reply-To: <Yp5V80/7KuM3sdiW@bombadil.infradead.org>
On Mon, Jun 06, 2022 at 12:30:59PM -0700, Luis Chamberlain wrote:
> On Mon, Jun 06, 2022 at 11:57:08AM -0700, Leah Rumancik wrote:
> > On Mon, Jun 06, 2022 at 08:55:24AM -0700, Luis Chamberlain wrote:
> > > On Sat, Jun 04, 2022 at 11:38:35AM +0300, Amir Goldstein wrote:
> > > > On Sat, Jun 4, 2022 at 6:53 AM Leah Rumancik <leah.rumancik@gmail.com> wrote:
> > > > >
> > > > > From: Leah Rumancik <lrumancik@google.com>
> > > > >
> > > > > This first round of patches aims to take care of the easy cases - patches
> > > > > with the Fixes tag that apply cleanly. I have ~30 more patches identified
> > > > > which will be tested next, thanks everyone for the various suggestions
> > > > > for tracking down more bug fixes. No regressions were seen during
> > > > > testing when running fstests 3 times per config with the following configs:
> > >
> > > Leah,
> > >
> > > It is great to see this work move forward.
> > >
> > > How many times was fstests run *without* the patches to establish the
> > > baseline? Do you have a baseline for known failures published somewhere?
> >
> > Currently, the tests are being run 10x per config without the patches.
> > If a failure is seen with the patches, the tests are rerun on the
> > baseline several hundred times to see if the failure was a regression or
> > to determine the baseline failure rate.
>
> This is certainly one way to go about it. This just means that you have
> to do this work then as a second step. Whereas if you first have a high
> confidence in a baseline you then are pretty certain you have a
> regression once a test fails after you start testing deltas on
> a stable release.
>
> Average failure rates for non-deterministic tests tend to be about
> 1/2 - 1/30. Rates such as 1/60 do exist, but anything beyond 1/100
> is *very* rare. So running fstests just 10 times seems to me
> rather low to have any sort of high confidence in a baseline.
>
Unfortunately, I am seeing some failures pop up with a fail rate of
~0.5-2% :( I typically end up rerunning failing tests up to 1000 times to
be confident about the failure rate on both the baseline and the backports
branch. Running each test 1000 times from the start is a bit much, but I
upped the test runs on both the baseline and backports branches to 100
runs per test to hopefully filter out some of the tests that fail more
consistently.
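
As a rough illustration of why run counts matter at these failure rates,
here is a minimal sketch. It assumes every run of a test is independent
and fails with a fixed probability, which real flaky tests may not obey.
The function names are made up for this example and are not part of
fstests or any test runner.

```python
import math


def runs_needed(fail_rate: float, confidence: float = 0.95) -> int:
    """Smallest number of runs n such that a test failing with
    probability fail_rate per run fails at least once with the given
    confidence, i.e. 1 - (1 - fail_rate)**n >= confidence.

    Assumes independent runs with a constant failure probability.
    """
    if not 0.0 < fail_rate < 1.0:
        raise ValueError("fail_rate must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - fail_rate))


def prob_seen(fail_rate: float, runs: int) -> float:
    """Probability of observing at least one failure in `runs` runs."""
    return 1.0 - (1.0 - fail_rate) ** runs


if __name__ == "__main__":
    for rate in (0.02, 0.005):
        print(f"rate={rate}: {runs_needed(rate)} runs for 95% confidence")
        for runs in (10, 100, 1000):
            print(f"  {runs:5d} runs -> P(seen) = {prob_seen(rate, runs):.3f}")
```

Under those assumptions, a 2% failure rate needs 149 runs to show up at
least once with 95% confidence, and a 0.5% failure rate needs 598. Ten
runs would catch a 2% failure only about 18% of the time.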
> > >
> > > As discussed at LSFMM, is there a chance we can collaborate on a baseline
> > > together? One way I had suggested we could do this for different test
> > > runners is to have a git subtree with the expunges, which we can all
> > > share across different test runners.
> > >
> >
> > Could you elaborate on this a bit? Are you hoping to gain insight from
> > comparing 5.10.y baseline with 5.15.y baseline or are you hoping to
> > allow people working on the same stable branch to have a joint record of
> > test run output?
>
> Not output, but to share failures known to exist per kernel release and
> per filesystem, and even Linux distribution. We can share these as
> an expunge file, which can be used as input when running
> fstests so that those tests are skipped for the release.
>
> Annotations can be made with comments; you can see an existing list here:
>
> https://github.com/linux-kdevops/kdevops/tree/master/workflows/fstests/expunges/
>
> I currently track *.bad and *.dmesg outputs in gists and refer to them
> with a URL. Likewise, when possible I annotate the failure rate.
>
> *If* it makes sense to collaborate on that front I can extract *just*
> the expunges directory and make it its own git tree, which kdevops then
> uses as a git subtree. Other test runners can then do the same.
Personally, I don't think I would have much use for a git subtree. I have
been using expunges very sparingly - only for tests which cause crashes -
as I like to run even the failing tests to keep tabs on the failure rates.
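
To make the expunge idea concrete, here is a minimal sketch of reading
such a file while keeping its annotations. It assumes one test name per
line with an optional trailing "#" comment; the test names and notes in
the sample are hypothetical, and the helper names are invented for this
example rather than taken from kdevops or fstests.

```python
from typing import Dict, Iterable, List, Tuple


def parse_expunges(text: str) -> Dict[str, str]:
    """Map each expunged test name to its annotation (may be empty).

    Assumed format: one test per line, optional '# comment' after it.
    Blank lines and lines that are only a comment are ignored.
    """
    entries: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, _, note = line.partition("#")
        entries[name.strip()] = note.strip()
    return entries


def split_tests(
    tests: Iterable[str], expunges: Dict[str, str]
) -> Tuple[List[str], List[str]]:
    """Split tests into (to_run, skipped) according to the expunge map."""
    to_run: List[str] = []
    skipped: List[str] = []
    for test in tests:
        (skipped if test in expunges else to_run).append(test)
    return to_run, skipped


if __name__ == "__main__":
    # Hypothetical sample; not a real record of failures.
    sample = """
    # crashes only
    generic/001 # hypothetical: crashes the test VM
    xfs/002
    """
    expunges = parse_expunges(sample)
    to_run, skipped = split_tests(["generic/001", "generic/003", "xfs/002"], expunges)
    print("run:", to_run)
    print("skip:", {t: expunges[t] for t in skipped})
```

Keeping the annotation next to the test name means a runner can skip
only the crashing tests while still recording why each one is listed.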
>
> Luis
Best,
Leah