* Re: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)
  [not found] <20220616182749.1200971-1-leah.rumancik@gmail.com>
@ 2022-06-22  0:07 ` Luis Chamberlain
  2022-06-22 21:44   ` Theodore Ts'o
  2022-06-22 21:52   ` [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1) Leah Rumancik
  0 siblings, 2 replies; 17+ messages in thread
From: Luis Chamberlain @ 2022-06-22  0:07 UTC (permalink / raw)
To: Leah Rumancik, Amir Goldstein, Josef Bacik, Chuck Lever, chandanrmail, Sweet Tea Dorminy, Pankaj Raghav
Cc: linux-xfs, fstests

On Thu, Jun 16, 2022 at 11:27:41AM -0700, Leah Rumancik wrote:
> https://gist.github.com/lrumancik/5a9d85d2637f878220224578e173fc23.

The coverage for XFS is using profiles which seem to be inspired by ext4's different mkfs configurations. Long ago (2019) I had asked that we strive to address popular configurations for XFS so that what was then oscheck (now kdevops) could cover them for stable XFS patch candidate test consideration. That was so long ago no one should be surprised you didn't get the memo:

https://lkml.kernel.org/r/20190208194829.GJ11489@garbanzo.do-not-panic.com

This has grown to cover more now:

https://github.com/linux-kdevops/kdevops/blob/master/playbooks/roles/fstests/templates/xfs/xfs.config

For instance xfs_bigblock and xfs_reflink_normapbt.

My litmus test back then *and* today is to ensure we have no regressions on the test sections supported by kdevops for XFS as reflected above. Without that confidence I'd be really reluctant to support stable efforts.

If you use kdevops, it should be easy to set up even if you are not using local virtualization technologies. For instance I just fired up an AWS cloud m5ad.4xlarge image which has 2 nvme drives, which matches the requirements for the methodology of using loopback files:

https://github.com/linux-kdevops/kdevops/blob/master/docs/seeing-more-issues.md

GCE is supported as well, as are Azure and OpenStack, and even custom openstack solutions...
Also, I see on the above URL you posted there is a TODO in the gist which says, "find a better route for publishing these". If you were to use kdevops for this it would have the immediate gain that kdevops users could reproduce your findings and help augment them. However, if using kdevops as a landing home for this is too large a step for you, we could use a new git tree which just tracks expunges, and then kdevops can use it as a git subtree, as I had suggested at LSFMM. The benefit of using a git subtree is that any runner can make use of it. And note that we track both fstests and blktests.

The downside of kdevops using a new git subtree is just that kdevops developers would have to work with two trees: one for code changes just for kdevops and one for the git subtree for expunges. That workflow would be new. I don't suspect it would be a really big issue other than addressing the initial growing pains to adapt. I have used git subtrees before extensively and the best rule of thumb is just to ensure you keep the code for the git subtree in its own directory. You can either immediately upstream your delta or carry the delta until you are ready to try to push those changes.

Right now kdevops uses the directory workflows/fstests/expunges/ for expunges. Your runner could use whatever it wishes. We should discuss whether we also want to add the respective *.bad, *.dmesg, and *.all files found for expunged entries, or if we should be pushing these out to a new shared storage area. Right now kdevops keeps track of results in the directory workflows/fstests/results/ but this is a path on .gitignore.

If we *do* want to use github and a shared git subtree, perhaps a workflows/fstests/artifacts/kdevops/ directory would make sense for the kdevops runner? Then that namespace allows other runners to also add files, but we all share expunges / tribal knowledge.

Thoughts?

  Luis

^ permalink raw reply	[flat|nested] 17+ messages in thread
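[Editor's sketch] The subtree idea above can be illustrated with plain git plumbing; `git read-tree --prefix` performs the same subtree merge that the `git subtree` helper automates. The repository names, branch name, and expunge file below are hypothetical stand-ins, not the actual kdevops layout:

```shell
set -e
tmp=$(mktemp -d); cd "$tmp"

# A hypothetical shared expunges repository.
git init -q expunges-repo
( cd expunges-repo
  git config user.email test@example.com
  git config user.name test
  mkdir -p 5.15.y/xfs
  echo 'generic/475 # flaky' > 5.15.y/xfs/unassigned.txt
  git add -A && git commit -qm 'initial expunges'
  git branch -M main )

# A runner repo (standing in for kdevops) pulling the expunges in
# as a subtree under its own directory, per the rule of thumb above.
git init -q runner
cd runner
git config user.email test@example.com
git config user.name test
echo runner-code > README && git add -A && git commit -qm init

git remote add expunges ../expunges-repo
git fetch -q expunges
git read-tree --prefix=workflows/fstests/expunges/ -u expunges/main
git commit -qm 'merge shared expunges as a subtree'
```

After this, the shared expunges live under workflows/fstests/expunges/ in the runner's tree and can be updated or pushed back independently.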
* Re: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)
  2022-06-22  0:07 ` [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1) Luis Chamberlain
@ 2022-06-22 21:44   ` Theodore Ts'o
  2022-06-23  5:31     ` Amir Goldstein
  2022-06-23 21:31     ` Luis Chamberlain
  2022-06-22 21:52   ` [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1) Leah Rumancik
  1 sibling, 2 replies; 17+ messages in thread
From: Theodore Ts'o @ 2022-06-22 21:44 UTC (permalink / raw)
To: Luis Chamberlain
Cc: Leah Rumancik, Amir Goldstein, Josef Bacik, Chuck Lever, chandanrmail, Sweet Tea Dorminy, Pankaj Raghav, linux-xfs, fstests

On Tue, Jun 21, 2022 at 05:07:10PM -0700, Luis Chamberlain wrote:
> On Thu, Jun 16, 2022 at 11:27:41AM -0700, Leah Rumancik wrote:
> > https://gist.github.com/lrumancik/5a9d85d2637f878220224578e173fc23.
>
> The coverage for XFS is using profiles which seem to be inspired by ext4's different mkfs configurations.

That's not correct, actually. It's using the gce-xfstests test framework which is part of the xfstests-bld[1][2] system that I maintain, yes. However, the actual config profiles were obtained via discussions with Darrick and represent the actual configs which the XFS maintainer uses to test the upstream XFS tree before deciding to push to Linus. We figure if it's good enough for the XFS Maintainer, it's good enough for us. :-)

[1] https://thunk.org/gce-xfstests
[2] https://github.com/tytso/xfstests-bld

If you think the XFS Maintainer should be running more configs, I invite you to have that conversation with Darrick.

> GCE is supported as well, as are Azure and OpenStack, and even custom openstack solutions...

The way kdevops works is quite different from how gce-xfstests works, since gce-xfstests is a VM native solution.
Which is to say, when we kick off a test, VM's are launched, one per config, which provides for better parallelization, and then once everything is completed, the VM's are automatically shut down and they go away; so it's far more efficient in terms of using cloud resources. The Lightweight Test Manager will then take the JUnit XML files, plus all of the test artifacts, and these get combined into a single test report. The lightweight test manager runs in a small VM, and this is the only VM which is consuming resources until we ask it to do some work.

For example:

	gce-xfstests ltm -c xfs --repo stable.git --commit v5.18.6 -c xfs/all -g auto

That single command will result in the LTM launching a large builder VM which quickly builds the kernel. (And it uses ccache, and a persistent cache disk, but even if we've never built the kernel, it can complete the build in a few minutes.) Then we launch 12 VM's, one for each config, and since they don't need to be optimized for fast builds, we can run most of the VM's with a smaller amount of memory, to better stress test the file system. (But for the dax config, we'll launch a VM with more memory, since we need to simulate the PMEM device using raw memory.)

Once each VM completes its test run, it uploads its test artifacts and results XML file to Google Cloud Storage. When all of the VM's complete, the LTM VM will download all of the results files from GCS, combine them into a single result file, and then send e-mail with a summary of the results.

It's optimized for developers, and for our use cases. I'm sure kdevops is much more general, since it can work for hardware-based test machines, as well as many other cloud stacks, and it's also optimized for the QA department --- not surprising, given where kdevops came from.

> Also, I see on the above URL you posted there is a TODO in the gist which says, "find a better route for publishing these".
> If you were to use kdevops for this it would have the immediate gain that kdevops users could reproduce your findings and help augment them.

Sure, but with our system, kvm-xfstests and gce-xfstests users can *easily* reproduce our findings and can help augment it. :-)

As far as sharing expunge files, as I've observed before, these files tend to be very specific to the test configuration --- the number of CPU's, the amount of memory, the characteristics of the storage device, etc. So what works for one developer's test setup will not necessarily work for others --- and I'm not convinced that trying to get everyone standardized on the One True Test Setup is actually an advantage. Some people may be using large RAID Arrays; some might be using fast flash; some might be using some kind of emulated log structured block device; some might be using eMMC flash. And that's a *good* thing.

We also have a very different philosophy about how to use expunge files. In particular, if there is a test which is only failing 0.5% of the time, I don't think it makes sense to put that test into an expunge file.

In general, we are only placing tests into expunge files when a test causes the system under test to crash, or it takes *WAAAY* too long, or it's a clear test bug that is too hard to fix for real, so we just suppress the test for that config for now. (Example: tests in xfstests for quota don't understand clustered allocation.)

So we want to run the tests, even if we know they will fail, and have a way of annotating that a test is known to fail for a particular kernel version, or if it's a flaky test, what the expected flake percentage is for that particular test. For flaky tests, we'd like to be able to automatically retry running the test, so we can flag when a flaky test has become a hard failure, or a flaky test has radically changed how often it fails. We haven't implemented all of this yet, but this is a design space we're exploring at the moment.
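[Editor's sketch] The annotate-rather-than-expunge philosophy above could look something like the following; the annotated file format, file names, reasons, and flake rates are entirely hypothetical, not an existing fstests or xfstests-bld feature:

```shell
# Hypothetical annotated format: <test> <reason>[=<flake-rate-%>]
cat > xfs_reflink.annotated <<'EOF'
generic/475 flaky=0.5
generic/019 flaky=12.0
xfs/191 crash
xfs/084 oom
generic/561 too_slow
EOF

# Only hard exclusions (crash/oom/too_slow) become the expunge list
# fed to the runner; flaky entries stay in the run and are retried
# and rate-checked instead of being suppressed.
awk '$2 !~ /^flaky=/ { print $1 }' xfs_reflink.annotated > exclude.list
awk -F'[ =]' '$2 == "flaky" { printf "%s expected %s%%\n", $1, $3 }' \
    xfs_reflink.annotated > retry.list

cat exclude.list retry.list
```

A runner could then diff observed failure rates against the `expected` column to flag a flaky test that has turned into a hard failure.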
More generally, I think competition is a good thing, and for areas where we are still exploring the best way to automate tests, not just from a QA department's perspective, but from a file system developer's perspective, having multiple systems where we can explore these ideas can be a good thing. Cheers, - Ted ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1) 2022-06-22 21:44 ` Theodore Ts'o @ 2022-06-23 5:31 ` Amir Goldstein 2022-06-23 21:39 ` Luis Chamberlain 2022-06-23 21:31 ` Luis Chamberlain 1 sibling, 1 reply; 17+ messages in thread From: Amir Goldstein @ 2022-06-23 5:31 UTC (permalink / raw) To: Theodore Ts'o Cc: Luis Chamberlain, Leah Rumancik, Josef Bacik, Chuck Lever, chandanrmail, Sweet Tea Dorminy, Pankaj Raghav, linux-xfs, fstests > It's optimized for developers, and for our use cases. I'm sure > kdevops is much more general, since it can work for hardware-based > test machines, as well as many other cloud stacks, and it's also > optimized for the QA department --- not surprising, since where > kdevops has come from. > [...] > > We also have a very different philosophy about how to use expunge > files. In paticular, if there is test which is only failing 0.5% of > the time, I don't think it makes sense to put that test into an > expunge file. > > In general, we are only placing tests into expunge files when > it causes the system under test to crash, or it takes *WAAAY* too > long, or it's a clear test bug that is too hard to fix for real, so we > just suppress the test for that config for now. (Example: tests in > xfstests for quota don't understand clustered allocation.) > > So we want to run the tests, even if we know it will fail, and have a > way of annotating that a test is known to fail for a particular kernel > version, or if it's a flaky test, what the expected flake percentage > is for that particular test. For flaky tests, we'd like to be able > automatically retry running the test, and so we can flag when a flaky > test has become a hard failure, or a flaky test has radically changed > how often it fails. We haven't implemented all of this yet, but this > is something that we're exploring the design space at the moment. 
>
> More generally, I think competition is a good thing, and for areas where we are still exploring the best way to automate tests, not just from a QA department's perspective, but from a file system developer's perspective, having multiple systems where we can explore these ideas can be a good thing.

I very much agree with Ted on that point. As a user and big fan of both kdevops and xfstests-bld I wouldn't want to have to choose one over the other, not even to choose a unified expunge list. I think we are still at a point where this diversity makes our ecosystem stronger rather than causing duplicate work.

To put it in more blunt terms, the core test suite, fstests, is not very reliable. Neither kdevops nor xfstests-bld address all the reliability issues (and they contribute some of their own). So we need the community to run both to get better and more reliable filesystem test coverage.

Nevertheless, we should continue to share as much experience and data points as we can during this co-opetition stage in order to improve both systems.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)
  2022-06-23  5:31 ` Amir Goldstein
@ 2022-06-23 21:39   ` Luis Chamberlain
  0 siblings, 0 replies; 17+ messages in thread
From: Luis Chamberlain @ 2022-06-23 21:39 UTC (permalink / raw)
To: Amir Goldstein
Cc: Theodore Ts'o, Leah Rumancik, Josef Bacik, Chuck Lever, chandanrmail, Sweet Tea Dorminy, Pankaj Raghav, linux-xfs, fstests

On Thu, Jun 23, 2022 at 08:31:30AM +0300, Amir Goldstein wrote:
> To put it in more blunt terms, the core test suite, fstests, is not very reliable. Neither kdevops nor xfstests-bld address all the reliability issues (and they contribute some of their own). So we need the community to run both to get better and more reliable filesystem test coverage.

The generic pains with fstests / blktests surely can be shared and perhaps that is just a thing we need to start doing more regularly at LSFMM, more so than a one-off thing.

> Nevertheless, we should continue to share as much experience and data points as we can during this co-opetition stage in order to improve both systems.

Yes, my point was not about killing something off, it was about sharing data points, and I think we should at least share configs. I personally see value in sharing expunges, but indeed if we do we'd have to decide whether to put them up on github with just the expunge list alone, or whether we also want to upload artifacts in the same tree. Or should we dump all the artifacts into a storage pool somewhere.

Some artifacts can grow to insane sizes if a test is bogus, I ran into one once which was at least 2 GiB of output in a *.bad file, the error just repeating over and over. I think IIRC it was for ZNS for btrfs or for a blktests zbd test. We could just have a size limit on these. And if experience is to show us anything, perhaps adopt an epoch scheme if we use git.

  Luis

^ permalink raw reply	[flat|nested] 17+ messages in thread
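[Editor's sketch] The size limit Luis proposes above could be as simple as capping oversized artifacts before they are committed, and keeping a compact summary of the repeating error instead of the raw dump. The 1 MiB limit, file name, and error text below are made up for illustration:

```shell
# Simulate a runaway .bad artifact full of one repeating error line.
yes 'XFS (loop0): metadata I/O error' | head -n 100000 > generic-999.out.bad

# Cap stored artifacts at a size limit (here 1 MiB) before committing.
limit=$((1024 * 1024))
for f in *.out.bad; do
  if [ "$(wc -c < "$f")" -gt "$limit" ]; then
    head -c "$limit" "$f" > "$f.capped" && mv "$f.capped" "$f"
    echo "[truncated at $limit bytes]" >> "$f"
  fi
done

# A frequency summary is often more useful than the raw repetition.
sort generic-999.out.bad | uniq -c | sort -rn | head -5
```

The truncation marker makes it obvious to anyone reading the shared artifact that the original output was larger.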
* Re: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1) 2022-06-22 21:44 ` Theodore Ts'o 2022-06-23 5:31 ` Amir Goldstein @ 2022-06-23 21:31 ` Luis Chamberlain 2022-06-24 5:32 ` Theodore Ts'o 1 sibling, 1 reply; 17+ messages in thread From: Luis Chamberlain @ 2022-06-23 21:31 UTC (permalink / raw) To: Theodore Ts'o, Darrick J. Wong Cc: Leah Rumancik, Amir Goldstein, Josef Bacik, Chuck Lever, chandanrmail, Sweet Tea Dorminy, Pankaj Raghav, linux-xfs, fstests On Wed, Jun 22, 2022 at 05:44:30PM -0400, Theodore Ts'o wrote: > On Tue, Jun 21, 2022 at 05:07:10PM -0700, Luis Chamberlain wrote: > > On Thu, Jun 16, 2022 at 11:27:41AM -0700, Leah Rumancik wrote: > > > https://gist.github.com/lrumancik/5a9d85d2637f878220224578e173fc23. > > > > The coverage for XFS is using profiles which seem to come inspired > > by ext4's different mkfs configurations. > > That's not correct, actually. It's using the gce-xfstests test > framework which is part of the xfstests-bld[1][2] system that I > maintain, yes. However, the actual config profiles were obtained via > discussions from Darrick and represent the actual configs which the > XFS maintainer uses to test the upstream XFS tree before deciding to > push to Linus. We figure if it's good enough for the XFS Maintainer, > it's good enough for us. :-) > > [1] https://thunk.org/gce-xfstests > [2] https://github.com/tytso/xfstests-bld > > If you think the XFS Maintainer should be running more configs, I > invite you to have that conversation with Darrick. Sorry, I did not realize that the test configurations for XFS were already agreed upon with Darrick for stable for the v5.15 effort. 
Darrick, long ago when I started to test xfs for stable I had published what I had suggested and it seemed to cover the grounds back then in 2019:

https://lore.kernel.org/all/20190208194829.GJ11489@garbanzo.do-not-panic.com/T/#m14e299ce476de104f9ee2038b8d002001e579515

If there is something missing from what we use on kdevops for stable consideration I'd like to augment it. Note that kdevops supports many sections and some of them are optional for the distribution, each distribution can opt-in, but likewise we can make sensible defaults for stable kernels, and per release too. The list of configurations supported is:

https://github.com/linux-kdevops/kdevops/blob/master/playbooks/roles/fstests/templates/xfs/xfs.config

For stable today we use all sections except xfs_bigblock and xfs_realtimedev. Do you have any advice on what to stick to for both v5.10 and v5.15 for stable, for both kdevops and gce-xfstests? It would seem just odd if we are not testing the same set of profiles as a minimum requirement. Likewise, the same question applies to Linus' tree and linux-next, as my hope is that in the future we get to the point where kdevops *will* send out notices for newly detected regressions.

> > GCE is supported as well, as are Azure and OpenStack, and even custom openstack solutions...
>
> The way kdevops works is quite different from how gce-xfstests works, since gce-xfstests is a VM native solution.

To be clear, you seem to suggest gce-xfstests is a VM native solution. I'd also like to clarify that kdevops supports native VMs, cloud and baremetal. With kdevops you pick your bringup method.

> Which is to <-- a description of how gce-xfstests works -->

Today all artifacts are gathered by kdevops locally, they are not uploaded anywhere. Consumption of this is yet to be determined, but typically I put the output into a gist manually and then refer to the URL of the gist on the expunge entry. Uploading them can, and probably should, be an option, but it is not clear yet where to upload them to.
A team will soon be looking into doing some more parsing of the results into a pretty flexible form / introspection.

> It's optimized for developers, and for our use cases. I'm sure kdevops is much more general, since it can work for hardware-based test machines, as well as many other cloud stacks, and it's also optimized for the QA department --- not surprising, given where kdevops came from.

kdevops started as an effort for kernel development and filesystems testing. It is why the initial guest configuration was to use 8 GiB of RAM and 4 vcpus; that suffices to do local builds / development. I always did kernel development on guests back in the day and still do to this day. It also has support for email reports and you get the xunit summary output *and* a git diff output of the expunges should a new regression be found. A QA team was never involved other than later learning it existed and that the kernel team was using it to proactively find issues. Later kdevops was used to report bugs proactively as it was finding a lot more issues than typical fstests QA setups find.

> > Also, I see on the above URL you posted there is a TODO in the gist which says, "find a better route for publishing these". If you were to use kdevops for this it would have the immediate gain that kdevops users could reproduce your findings and help augment them.
>
> Sure, but with our system, kvm-xfstests and gce-xfstests users can *easily* reproduce our findings and can help augment it. :-)

Sure, the TODO item on the URL seemed to indicate there was a desire to find a better place to put failures.

> As far as sharing expunge files, as I've observed before, these files tend to be very specific to the test configuration --- the number of CPU's, the amount of memory, the characteristics of the storage device, etc.

And as I noted also at LSFMM, it is not an impossibility to address this either, if we want to.
We can simply use a namespace for the test runner and a generic test configuration. A parent directory simply would represent the test runner. We have two main ones for stable:

  * gce-xfstests
  * kdevops

So they can just be the parent directory. Then I think we can probably agree upon 4 GiB RAM / 4 vcpus per guest on x86_64 for a typical standard requirement. So something like x86_64_mem4g_cpus4. Then there is the drive setup. kdevops defaults to loopback files on nvme drives for both cloud and native KVM guests. So that can be nvme_loopback. It is not clear what gce-xfstests uses, but this can probably be described just as well.

> So what works for one developer's test setup will not necessarily work for others

True, but it does not mean we cannot automate setup of an agreed upon setup. Especially if you want to enable folks to reproduce. We can.

> --- and I'm not convinced that trying to get everyone standardized on the One True Test Setup is actually an advantage.

That is not a goal, the goal is to allow variability! And to share results in the most efficient way. It just turns out that an extremely simple setup we *can* enable *many* folks to set up easily with local VMs to reproduce *more* issues today is nvme drives + loopback files. You are probably correct that this methodology was perhaps not as well tested before as it is today, and this is probably *why* we find more issues today. But so far it is true that:

  * all issues found are real and sometimes hard to reproduce with direct drives
  * this methodology is easy to bring up
  * it is finding more issues

This is why this is just today's default for kdevops. It does not mean you can't *grow* to add support for other drive setups. In fact this is needed for testing ZNS drives.

> Some people may be using large RAID Arrays; some might be using fast flash; some might be using some kind of emulated log structured block device; some might be using eMMC flash. And that's a *good* thing.

Absolutely!
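[Editor's sketch] The runner/config namespace described above could be laid out as plain directories. Every path component below, apart from the two runner names from the email, is a hypothetical convention, not something kdevops or gce-xfstests implements today:

```shell
# Hypothetical shared-expunges namespace:
#   expunges/<runner>/<hw profile>/<drive setup>/<kernel>/<fs>/<section>.txt
for runner in kdevops gce-xfstests; do
  mkdir -p "expunges/$runner/x86_64_mem4g_cpus4/nvme_loopback/5.15.y/xfs"
done

# An example entry contributed by one runner; the test and note are made up.
echo 'xfs/074 # needs more scratch space than this profile provides' \
  > expunges/kdevops/x86_64_mem4g_cpus4/nvme_loopback/5.15.y/xfs/xfs_reflink.txt

find expunges -type f
```

Because each runner writes only under its own parent directory, runners never collide, while the expunges themselves remain browsable side by side.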
> We also have a very different philosophy about how to use expunge files.

Yes, but it does not mean we can't share them. And the variability which exists today *can* also be expressed.

> In particular, if there is a test which is only failing 0.5% of the time, I don't think it makes sense to put that test into an expunge file.

This preference can be expressed through kconfig, and support can be added for it.

> More generally, I think competition is a good thing, and for areas where we are still exploring the best way to automate tests, not just from a QA department's perspective, but from a file system developer's perspective, having multiple systems where we can explore these ideas can be a good thing.

Sure, sure, but again, it does not mean we can't or shouldn't consider sharing some things. Differences in strategy on how to process expunge files can be discussed so that later I can add support for them. I still think we can share at the very least configurations and expunges with known failure rates (even if they are runner/config specific).

  Luis

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)
  2022-06-23 21:31 ` Luis Chamberlain
@ 2022-06-24  5:32   ` Theodore Ts'o
  2022-06-24 22:54     ` Luis Chamberlain
  0 siblings, 1 reply; 17+ messages in thread
From: Theodore Ts'o @ 2022-06-24 5:32 UTC (permalink / raw)
To: Luis Chamberlain
Cc: Darrick J. Wong, Leah Rumancik, Amir Goldstein, Josef Bacik, Chuck Lever, chandanrmail, Sweet Tea Dorminy, Pankaj Raghav, linux-xfs, fstests

On Thu, Jun 23, 2022 at 02:31:12PM -0700, Luis Chamberlain wrote:
> To be clear, you seem to suggest gce-xfstests is a VM native solution. I'd also like to clarify that kdevops supports native VMs, cloud and baremetal. With kdevops you pick your bringup method.

Yes, that was my point. Because gce-xfstests is a VM native solution, it has some advantages, such as the ability to take advantage of the fact that it's trivially easy to start up multiple cloud VM's which can run in parallel --- and then the VM's shut themselves down once they are done running the test, which saves cost and is more efficient.

It is *because* we are a VM-native solution that we can optimize in certain ways, because we don't have to also support a bare metal setup. So yes, the fact that kdevops also supports bare metal is certainly granted. That kind of flexibility is certainly an advantage for kdevops; but being able to fully take advantage of the unique attributes of cloud VM's can also be a good thing.

(I've already made offers to folks working at other cloud vendors that if they are interested in adding support for other cloud systems beyond GCE, I'm happy to work with them to enable the use of other XXX-xfstests test-appliance runners.)

> kdevops started as an effort for kernel development and filesystems testing. It is why the initial guest configuration was to use 8 GiB of RAM and 4 vcpus; that suffices to do local builds / development.
> I always did kernel development on guests back in the day and still do to this day.

For kvm-xfstests, the default RAM size for the VM is 2GB. One of the reasons why I was interested in low-memory configurations is because ext4 is often used in smaller devices (such as embedded systems and mobile handsets) --- and running in memory constrained environments can turn up bugs that otherwise are much harder to reproduce on a system with more memory.

Separating the kernel build system from the test VM's means that the build can take place on a really powerful machine (either my desktop with 48 cores and gobs and gobs of memory, or a build VM if you are using the Lightweight Test Manager's Kernel Compilation Service) so builds go much faster. And then, of course, we can launch a dozen VM's, one for each test config. If you force the build to be done on the test VM, then you either give up parallelism, or you waste time by building the kernel N times on N test VM's.

And in the case of android-xfstests, which communicates with a phone or tablet over a debugging serial cable and Android's fastboot protocol, of *course* it would be insane to want to build the kernel on the system under test!

So I've ***always*** done the kernel build on a machine or VM separate from the System Under Test. At least for my use cases, it just makes a heck of a lot more sense.

And that's fine. I'm *not* trying to convince everyone to standardize on my test infrastructure. Which, quite frankly, I sometimes think you have been evangelizing. I believe very strongly that the choice of test infrastructures is a personal choice, which is heavily dependent on each developer's workflow, and trying to get everyone to standardize on a single test infrastructure is likely going to work as well as trying to get everyone to standardize on a single text editor. (Although obviously emacs is the one true editor.
:-)

> Sure, the TODO item on the URL seemed to indicate there was a desire to find a better place to put failures.

I'm not convinced the "better place" is expunge files. I suspect it may need to be some kind of database. Darrick tells me that he stores his test results in a postgres database. (Which is way better than what I'm doing, which is an mbox file and mail search tools.) Currently, Leah is using flat text files for the XFS 5.15 stable backports effort, plus some tools that parse and analyze those text files.

I'll also note that the number of baseline kernel versions is much smaller if you are primarily testing an enterprise Linux distribution, such as SLES. And if you are working with stable kernels, you can probably get away with updating the baseline for each LTS kernel every so often. But for upstream kernel development the number of kernel versions for which a developer might want to track flaky percentages is far greater, and will need to be updated at least once every kernel development cycle, and possibly more frequently than that. Which is why I'm not entirely sure a flat text file, such as an expunge file, is really the right answer. I can completely understand why Darrick is using a Postgres database. So there is clearly more thought and design required here, in my opinion.

> That is not a goal, the goal is to allow variability! And to share results in the most efficient way.

Sure, but are expunge files the most efficient way to "share results"? If we have a huge amount of variability, such that we have a large number of directories with different test configs and different hardware configs, each with different expunge files, I'm not sure how useful that actually is. Are we expecting users to do a "git clone", and then start browsing all of these different expunge files by hand?
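[Editor's sketch] Tracking flake percentages per kernel version, as discussed above, can be prototyped even with flat files before reaching for Postgres. The per-run log format below is invented for illustration; the point is that a flaky test turning into a hard failure becomes visible as a 100% rate:

```shell
# Hypothetical flat per-run result log: <kernel> <test> <pass|fail>
cat > runs.log <<'EOF'
v5.15.46 generic/475 pass
v5.15.46 generic/475 fail
v5.15.46 generic/475 pass
v5.15.46 generic/475 pass
v5.18.6 generic/475 fail
v5.18.6 generic/475 fail
EOF

# Compute the flake rate per (kernel, test) pair.
awk '{ runs[$1" "$2]++ }
     $3 == "fail" { fails[$1" "$2]++ }
     END { for (k in runs) printf "%s %.0f%%\n", k, 100 * fails[k] / runs[k] }' \
    runs.log | sort > rates.txt

cat rates.txt
```

Here generic/475 flakes at 25% on v5.15.46 but fails every run on v5.18.6, which is exactly the kind of regression signal Ted describes wanting to flag automatically.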
It might perhaps be useful to get a bit more clarity about how we expect the shared results would be used, because that might drive some of the design decisions about the best way to store these "results". Cheers, - Ted ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)
  2022-06-24  5:32 ` Theodore Ts'o
@ 2022-06-24 22:54   ` Luis Chamberlain
  2022-06-25  2:21     ` Theodore Ts'o
  2022-06-25  7:28     ` sharing fstests results (Was: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)) Amir Goldstein
  0 siblings, 2 replies; 17+ messages in thread
From: Luis Chamberlain @ 2022-06-24 22:54 UTC (permalink / raw)
To: Theodore Ts'o
Cc: Darrick J. Wong, Leah Rumancik, Amir Goldstein, Josef Bacik, Chuck Lever, chandanrmail, Sweet Tea Dorminy, Pankaj Raghav, Zorro Lang, linux-xfs, fstests

On Fri, Jun 24, 2022 at 01:32:23AM -0400, Theodore Ts'o wrote:
> On Thu, Jun 23, 2022 at 02:31:12PM -0700, Luis Chamberlain wrote:
> > To be clear, you seem to suggest gce-xfstests is a VM native solution. I'd also like to clarify that kdevops supports native VMs, cloud and baremetal. With kdevops you pick your bringup method.
>
> Yes, that was my point. Because gce-xfstests is a VM native solution, it has some advantages, such as the ability to take advantage of the fact that it's trivially easy to start up multiple cloud VM's which can run in parallel --- and then the VM's shut themselves down once they are done running the test, which saves cost and is more efficient.

Perhaps I am not understanding what you are suggesting with a VM native solution. What do you mean by that? A full KVM VM inside the cloud?

Anyway, kdevops has support to bring up whatever type of node you want in the cloud providers: GCE, AWS, Azure, and OpenStack, and even custom OpenStack solutions. That could be a VM or a high end bare metal node. It does this by using terraform and providing the variability through kconfig. The initial 'make bringup' brings nodes up, and then all work runs on each in parallel for fstests as you run 'make fstests-baseline'. At the end you just run 'make destroy'.
> It is *because* we are a VM-native solution that we can optimize in certain ways, because we don't have to also support a bare metal setup. So yes, the fact that kdevops also supports bare metal is certainly granted. That kind of flexibility is certainly an advantage for kdevops; but being able to fully take advantage of the unique attributes of cloud VM's can also be a good thing.

Yes, agreed. That is why I focused on technology that would support all cloud providers, not just one. I had not touched the AWS code, for example, in 2 years; I just went and tried a bringup and it worked in 10 minutes, most of the time being getting my .aws/credentials file set up with information from the website.

> > kdevops started as an effort for kernel development and filesystems testing. It is why the initial guest configuration was to use 8 GiB of RAM and 4 vcpus; that suffices to do local builds / development. I always did kernel development on guests back in the day and still do to this day.
>
> For kvm-xfstests, the default RAM size for the VM is 2GB. One of the reasons why I was interested in low-memory configurations is because ext4 is often used in smaller devices (such as embedded systems and mobile handsets) --- and running in memory constrained environments can turn up bugs that otherwise are much harder to reproduce on a system with more memory.

Yes, I agree. We started with 8 GiB. Long ago, while at SUSE, I tried 2 GiB and ran into the xfs/074 issue of requiring more due to xfs_scratch. Then later Amir ran into snags with xfs/084 and generic/627 due to the OOMs. So in terms of XFS, to avoid OOMs with just the tests we need 3 GiB.
> Separating the kernel build system from the test VM's means that the build can take place on a really powerful machine (either my desktop with 48 cores and gobs and gobs of memory, or a build VM if you are using the Lightweight Test Manager's Kernel Compilation Service) so builds go much faster. And then, of course, we can launch a dozen VM's, one for each test config. If you force the build to be done on the test VM, then you either give up parallelism, or you waste time by building the kernel N times on N test VM's.

The build is done once, but I agree this can be optimized for kdevops. Right now in kdevops the git clone and build of the kernel do take place on each guest, and that requires at least 3 GiB of RAM. Shallow git clone support was added as an option to help here, but the ideal thing would be to just build locally, or perhaps, as you suggest, a dedicated build VM.

> And in the case of the android-xfstests, which communicates with a phone or tablet over a debugging serial cable and Android's fastboot protocol, of *course* it would be insane to want to build the kernel on the system under test!
>
> So I've ***always*** done the kernel build on a machine or VM separate from the System Under Test. At least for my use cases, it just makes a heck of a lot more sense.

Support for this will be added to kdevops.

> And that's fine. I'm *not* trying to convince everyone that they should standardize on my test infrastructure. Which, quite frankly, I sometimes think you have been evangelizing. I believe very strongly that the choice of test infrastructures is a personal choice, which is heavily dependent on each developer's workflow, and trying to get everyone to standardize on a single test infrastructure is likely going to work as well as trying to get everyone to standardize on a single text editor.

What I think we *should* standardize on is at least configurations for testing.
And now the dialog of how / if we track / share failures is also important. What runner you use is up to you.

> (Although obviously emacs is the one true editor. :-)
>
> > Sure, the TODO item on the URL seemed to indicate there was a desire to find a better place to put failures.
>
> I'm not convinced the "better place" is expunge files. I suspect it may need to be some kind of database. Darrick tells me that he stores his test results in a postgres database. (Which is way better than what I'm doing, which is an mbox file and using mail search tools.)
>
> Currently, Leah is using flat text files for the XFS 5.15 stable backports effort, plus some tools that parse and analyze those text files.

Where does not matter yet; what I'd like to refocus on is *if* sharing is desirable by folks. We can discuss *how* and *where* if we do think it is worth sharing.

If folks would like to evaluate this I'd encourage doing so perhaps after a specific distro release moving forward, and to not backtrack. But for stable kernels I'd imagine it may be easier to see value in sharing.

> I'll also note that the number of baseline kernel versions is much smaller if you are primarily testing an enterprise Linux distribution, such as SLES.

Much smaller than what? Android? If so then perhaps. Just recall that enterprise distributions support kernels for at least 10 years.

> And if you are working with stable kernels, you can probably get away with updating the baseline for each LTS kernel every so often. But for upstream kernel development the number of kernel versions for which a developer might want to track flaky percentages is far greater, and will need to be updated at least once every kernel development cycle, and possibly more frequently than that. Which is why I'm not entirely sure a flat text file, such as an expunge file, is really the right answer. I can completely understand why Darrick is using a Postgres database.
> So there is clearly more thought and design required here, in my opinion.

Sure, let's talk about it, *if* we do find it valuable to share. kdevops already has stuff in a format which is consistent, and that can change or be ported. We first just need to decide if we as a community want to share. The flakiness annotations are important too, and we have a thread about that, which I have to go and get back to at some point.

> > That is not a goal, the goal is to allow variability! And share results in the most efficient way.
>
> Sure, but are expunge files the most efficient way to "share results"?

There are three things we want to do if we are going to talk about sharing results:

a) Consuming expunges so check.sh for the Node Under Test (NUT) can expand on the expunges given a criteria (flakiness, crash requirements)

b) Sharing updates to expunges per kernel / distro / runner / node-config and making patches to this easy.

c) Making updates for failures easy to read for a developer / community. These would be in the form of an email or results file for a test run through some sort of kernel-ci.

Let's start with a): we can adapt runners to use anything. My gut tells me postgres is a bit large unless we need socket communication. I can think of two ways to go here then. Perhaps others have some other ideas?

1) We go lightweight on the db, maybe sqlite3? And embrace the same postgres db schema as used by Darrick, if he sees value in sharing this. If we do this I think it doesn't make sense to *require* sqlite3 on the NUT (nodes), for many reasons, so parsing the db on the host into a flat file to be used by the node does seem ideal.

2) Keep postgres and provide a REST API for queries from the host to this server so it can then construct a flat file / directory interpretation of expunges for the nodes under test (NUT).

Given the minimum requirements desirable on the NUTs, I think in the end a flat file hierarchy is nice so as to not incur some new dependency on them.
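For what it's worth, option 1) could stay this simple on the host side. The sketch below uses a plain text dump as a stand-in for the real sqlite3/postgres schema (the columns here — kernel, fstype, test, crashes — and the file names are invented for illustration), and shows the host flattening it into per-kernel expunge files so the node under test never needs a database:

```shell
# Sketch only: a text dump stands in for the real sqlite3/postgres
# schema; the columns (kernel, fstype, test, crashes) are assumptions.
cat > /tmp/expunge-db.txt <<'EOF'
5.15.49 xfs xfs/074 Y
5.15.49 xfs xfs/084 N
5.18.6 xfs xfs/084 N
EOF

outdir=$(mktemp -d)

# Flatten the "db" into one expunge file per kernel/fstype, so the
# node under test only ever sees a plain file.
awk -v out="$outdir" '{ print $3 > (out "/" $1 "." $2 ".expunged") }' /tmp/expunge-db.txt

cat "$outdir/5.15.49.xfs.expunged"
```

Whatever the backing store ends up being, this kind of export step is what keeps the NUT dependency-free.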
Determinism is important for tests though, so snapshotting an interpretation of the expunges at a specific point in time is also important. So the database would need to be versioned per update, so that a test run is checkpointed against a specific version of the expunge db.

If we come to some sort of consensus then this code for parsing an expunge set can be used directly from fstests's check script, so the interpretation and use can be done in one place for all test runners. We also have additional criteria which we may want for the expunges. For instance, if we had the flakiness percentage annotated somehow, then fstests's check could be passed an argument to only include expunges above a certain flakiness level of some sort, or for example only include expunges for tests which are known to crash. Generating the files from a db is nice. But what gains do we have with using a db then?

Now let's move on to b), sharing the expunges and sending patches for updates. I think sending a patch against a flat file reads a lot easier, except for the comments / flakiness levels / crash consideration / and artifacts. For kdevops' purposes this reads well today as we don't upload artifacts anywhere and just refer to them on github gists as best effort / optional. There is no convention yet for expressing flakiness but some tests do mention "failure rate" in one way or another.

So we want to evaluate if we want to share not only expunges but other metadata associated with why a new test can be expunged or removed:

* flakiness percentage
* causes a kernel crash?
* bogus test?
* expunged due to a slew of other reasons, some of them perhaps categorized and shared, some of them not

And do we want to share artifacts? If so, how? Perhaps an optional URL, with another component describing what it is: gist, tarball, etc.

Then for the last part, c), making failures easy to read for a developer, let's review what could be done. I gather gce-xfstests explains the xunit results summary.
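To make the flakiness criterion concrete: if entries carried a "failure rate" annotation (the comment convention below is an assumption, not an existing fstests format), check could be handed a threshold and derive the effective skip list with nothing fancier than awk:

```shell
# Hypothetical annotated expunge list; the "failure rate: NN%" comment
# convention is illustrative, not an existing fstests standard.
cat > /tmp/expunges.txt <<'EOF'
xfs/074 # failure rate: 100% needs a larger scratch device
xfs/084 # failure rate: 20% OOMs on low-memory nodes
generic/627 # failure rate: 5% rare OOM
EOF

min_rate=10   # only skip tests failing at least 10% of the time

awk -v min="$min_rate" '
	match($0, /failure rate: *[0-9]+%/) {
		rate = substr($0, RSTART, RLENGTH)
		gsub(/[^0-9]/, "", rate)       # keep just the number
		if (rate + 0 >= min)
			print $1               # emit only the test name
	}' /tmp/expunges.txt
```

With the sample data above this emits xfs/074 and xfs/084 and drops the rarely-failing generic/627, which is the sort of policy knob a structured annotation would enable.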
Right now kdevops' kernel-ci stuff just sends an email with the same, but also a diff to the expunge file hierarchy augmented for the target kernel directory being tested. The developer would just go and edit the line with metadata as a comment, but that is just because we lack a structure for it. If we strive to share an expunge list I think it would be wise to consider structure for this metadata. Perhaps:

<test> # <crashes>|<flakiness-percent>|<fs-skip-reason>|<artifact-type>|<artifact-dir-url>|<comments>

Where:

test: xfs/123 or btrfs/234
crashes: can be either Y or N
flakiness-percent: expressed as a percentage, e.g. 80%
fs-skip-reason: can be an enum to represent a series of fs specific reasons why a test may not be applicable or should be skipped
artifact-type: optional, if present the type of artifact, can be an enum to represent a gist test description, or a tarball
artifact-dir-url: optional, path to the artifact
comments: additional comments

All the above considered, a), b) and c), yes I think a flat file model works well as an option. I'd love to hear others' feedback.

> If we have a huge amount of variability, such that we have a large number of directories with different test configs and different hardware configs, each with different expunge files, I'm not sure how useful that actually is.

*If* you want to share I think it would be useful. At least kdevops uses a flat file model with no artifacts, just the expunges and comments, and over time it has been very useful, even for reviewing historic issues on older kernels: simply using something like 'git grep xfs/123' gives me a quick sense of the history of issues for a test.

> Are we expecting users to do a "git clone", and then start browsing all of these different expunge files by hand?
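Entries in the shape proposed above would not have to be browsed by hand; a line splits mechanically on "#" and "|". A sketch in plain sh (every field value below is invented):

```shell
# One hypothetical entry in the proposed format; all values invented.
line='xfs/123 # N|80%|none|gist|https://example.com/artifact|flaky on small nodes'

test_name=${line%% *}    # everything before the first space
meta=${line#*# }         # everything after "# "

# Split the pipe-separated metadata into the proposed fields; the
# last variable (comments) soaks up any remaining text.
IFS='|' read -r crashes flakiness skip_reason artifact_type artifact_url comments <<EOF
$meta
EOF

echo "test=$test_name crashes=$crashes flakiness=$flakiness artifact=$artifact_type"
```

So tooling (or check itself) could filter, sort, or report on any of these fields without a database in the loop.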
If we want to extend the fstests check script to look for this, it could be an optional directory, and an argument could be passed to check to enable its hunt for it, so that if passed it would look for the runner / kernel / host-type. For instance, today we already have a function on initialization of the check script which looks for the fstests config file as follows:

known_hosts()
{
	[ "$HOST_CONFIG_DIR" ] || HOST_CONFIG_DIR=`pwd`/configs

	[ -f /etc/xfsqa.config ] && export HOST_OPTIONS=/etc/xfsqa.config
	[ -f $HOST_CONFIG_DIR/$HOST ] && export HOST_OPTIONS=$HOST_CONFIG_DIR/$HOST
	[ -f $HOST_CONFIG_DIR/$HOST.config ] && export HOST_OPTIONS=$HOST_CONFIG_DIR/$HOST.config
}

We could have something similar look for an expunge directory, with say an --expunge-auto-look option, and that could be something like:

process_expunge_dir()
{
	[ "$HOST_EXPUNGE_DIR" ] || HOST_EXPUNGE_DIR=`pwd`/expunges

	[ -d /etc/fstests/expunges/$HOST ] && export HOST_EXPUNGES=/etc/fstests/expunges/$HOST
	[ -d $HOST_EXPUNGE_DIR/$HOST ] && export HOST_EXPUNGES=$HOST_EXPUNGE_DIR/$HOST
}

The runner could be specified, and the host-type:

./check --runner <gce-xfstests|kdevops|whatever> --host-type <kvm-8vcpus-2gb>

And so we can have it look for these directories, and if any of these are present they are processed (cumulative):

* HOST_EXPUNGES/any/$fstype/ - regardless of kernel, host type and runner
* HOST_EXPUNGES/$kernel/$fstype/any - common between runners for any host type
* HOST_EXPUNGES/$kernel/$fstype/$hostype - common between runners for a host type
* HOST_EXPUNGES/$kernel/$fstype/$hostype/$runner - only present for the runner

The aggregate set of expunges is used. Additional criteria could be passed to check so as to ensure that only expunges that meet the criteria are used to skip tests for the run, provided we can agree on some metadata for that.
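The cumulative lookup described above can be mocked up end to end; the directory names, the file name ("expunged") and the contents below are all illustrative, not from an existing layout:

```shell
# Mock up the proposed hierarchy; names and contents are invented.
base=$(mktemp -d)
kernel=5.15.49 fstype=xfs hostype=kvm-8vcpus-2gb runner=kdevops

mkdir -p "$base/any/$fstype" \
	"$base/$kernel/$fstype/any" \
	"$base/$kernel/$fstype/$hostype/$runner"

echo generic/019 > "$base/any/$fstype/expunged"
echo xfs/074     > "$base/$kernel/$fstype/any/expunged"
echo xfs/084     > "$base/$kernel/$fstype/$hostype/$runner/expunged"

# The aggregate set is the union of all matching levels, deduplicated.
cat "$base/any/$fstype/expunged" \
	"$base/$kernel/$fstype/any/expunged" \
	"$base/$kernel/$fstype/$hostype/$runner/expunged" 2>/dev/null | sort -u
```

The union-of-levels semantics is the key design point: a more specific directory can only add expunges, never silently drop the common ones.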
> It might perhaps be useful to get a bit more clarity about how we expect the shared results would be used, because that might drive some of the design decisions about the best way to store these "results".

Sure.

  Luis

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)
  2022-06-24 22:54 ` Luis Chamberlain
@ 2022-06-25 2:21 ` Theodore Ts'o
  2022-06-25 18:49 ` Luis Chamberlain
  2022-06-25 7:28 ` sharing fstests results (Was: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)) Amir Goldstein
  1 sibling, 1 reply; 17+ messages in thread
From: Theodore Ts'o @ 2022-06-25 2:21 UTC (permalink / raw)
To: Luis Chamberlain
Cc: Darrick J. Wong, Leah Rumancik, Amir Goldstein, Josef Bacik,
	Chuck Lever, chandanrmail, Sweet Tea Dorminy, Pankaj Raghav,
	Zorro Lang, linux-xfs, fstests

On Fri, Jun 24, 2022 at 03:54:44PM -0700, Luis Chamberlain wrote:
>
> Perhaps I am not understanding what you are suggesting with a VM native solution. What do you mean by that? A full KVM VM inside the cloud?

"Cloud native" is the better way to put things. Cloud VM's are designed to be ephemeral, so the concept of "node bringup" really doesn't enter into the picture.

When I run the "build-appliance" command, this creates a test appliance image. Which is to say, we create a root file system image, and then "freeze" it into a VM image. For kvm-xfstests this is a qcow image which is run in snapshot mode, which means that if any changes are made to the root file system, those changes disappear when the VM exits. For gce-xfstests, we create an image which can be used to quickly bring up a VM which contains a block device with a copy of that image as the root file system.

What's so special about this? I can create a dozen, or a hundred VM's, all with a copy of that same image. So I can do something like

gce-xfstests ltm -c ext4/all -g full gs://gce-xfstests/bzImage-5.4.200

and this will launch a dozen VM's, with each VM testing a single test configuration with the kernel found at gs://gce-xfstests/bzImage-5.4.200 in Google Cloud Storage, GCS (the rough equivalent of AWS's S3).
And then I can run

gce-xfstests ltm -c ext4/all -g full --repo stable.git --commit v5.10.124

And this will launch a build VM which is nice and powerful to *quickly* build the 5.10.124 kernel as found in the stable git tree, and then launch a dozen additional VM's to test that built kernel against all of the test configs defined for ext4/all, one VM for each fs config. And after running

gce-xfstests ltm -c ext4/all -g full --repo stable.git --commit v5.15.49
gce-xfstests ltm -c ext4/all -g full --repo stable.git --commit v5.18.6
...

now there will be ~50 VM's all running tests in parallel. So this is far faster than doing a "node bringup", and since I am running all of the tests in parallel, I will get the test results back in a much shorter amount of wall clock time. And as each test config's run completes, the VM's will disappear (after first uploading the test results into GCS), and I will stop getting charged for them.

And if I were to launch additional test runs, each containing their own set of VM's:

gce-xfstests ltm -c xfs/all -g full --repo stable.git --commit v5.15.49
gce-xfstests ltm -c xfs/all -g full --repo stable.git --commit v5.18.6
gce-xfstests ltm -c f2fs/all -g full --repo stable.git --commit v5.15.49

I can very quickly have over 100 test VM's running in parallel, and as the tests complete, they are automatically shut down and destroyed --- which means that we don't store state in the VM. Instead the state is stored in a Google Cloud Storage (Amazon S3) bucket, with e-mail sent with a summary of the results.

VM's can get started much more quickly than "make bringup", since we're not running puppet or ansible to configure each node. Instead, we get a clone of the test appliance:

% gce-xfstests describe-image
archiveSizeBytes: '1315142848'
creationTimestamp: '2022-06-20T21:46:24.797-07:00'
description: Linux Kernel File System Test Appliance
diskSizeGb: '10'
family: xfstests
...
labels:
  blktests: gaf97b55
  fio: fio-3_30
  fsverity: v1_5
  ima-evm-utils: v1_3_2
  nvme-cli: v1_16
  quota: v4_05-43-gd2256ac
  util-linux: v2_38
  xfsprogs: v5_18_0
  xfstests: v2022_06_05-13-gbc442c4b
  xfstests-bld: g8548bd11
  zz_build-distro: bullseye
...

And since these images are cheap to keep around (5-6 cents/month), I can keep a bunch of older versions of test appliances around, in case I want to see if a test regression might be caused by a newer version of the test appliance. So I can run "gce-xfstests -I xfstests-202001021302" and this will create a VM using the test appliance that I built on January 2, 2020. It also means that I can release a new test appliance to the xfstests-cloud project for public use, and if someone wants to pin their testing to a known version of the test appliance, they can do that.

So the test appliance VM's can be much more dynamic than kdevops nodes, because they can be created and deleted without a care in the world. This is enabled by the fact that there isn't any state which is stored on the VM.

In contrast, in order to harvest test results from a kdevops node, you have to ssh into the node and try to find the test results. I can instead just run "gce-xfstests ls-results" to see all of the results that have been saved to GCS, and I can fetch a particular test result to my laptop via a single command: "gce-xfstests get-results tytso-20220624210238". No need to ssh to a host node, and then ssh to the kdevops test node, yadda, yadda, yadda --- and if you run "make destroy" you lose all of the test result history on that node, right?

Speaking of saving the test result history, a full set of test results/artifacts for a dozen ext4 configs is around 12MB for the tar.xz file, and Google Cloud Storage is a penny/GB/month for nearline storage, and 0.4 cents/GB/month for coldline storage, so I can afford to keep a *lot* of test results/artifacts for quite a while, which can occasionally be handy for doing some historic research.
See the difference?

	- Ted

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)
  2022-06-25 2:21 ` Theodore Ts'o
@ 2022-06-25 18:49 ` Luis Chamberlain
  2022-06-25 21:14 ` Theodore Ts'o
  0 siblings, 1 reply; 17+ messages in thread
From: Luis Chamberlain @ 2022-06-25 18:49 UTC (permalink / raw)
To: Theodore Ts'o
Cc: Darrick J. Wong, Leah Rumancik, Amir Goldstein, Josef Bacik,
	Chuck Lever, chandanrmail, Sweet Tea Dorminy, Pankaj Raghav,
	Zorro Lang, linux-xfs, fstests

On Fri, Jun 24, 2022 at 10:21:50PM -0400, Theodore Ts'o wrote:
> On Fri, Jun 24, 2022 at 03:54:44PM -0700, Luis Chamberlain wrote:
> >
> > Perhaps I am not understanding what you are suggesting with a VM native solution. What do you mean by that? A full KVM VM inside the cloud?
>
> "Cloud native" is the better way to put things. Cloud VM's are designed to be ephemeral, so the concept of "node bringup" really doesn't enter into the picture.
>
> When I run the "build-appliance" command, this creates a test appliance image. Which is to say, we create a root file system image, and then "freeze" it into a VM image.

So this seems to build an image from a base distro image, is that right? And it would seem your goal is to then store that image so it can be re-used.

> For kvm-xfstests this is a qcow image which is run in snapshot mode, which means that if any changes are made to the root file system, those changes disappear when the VM exits.

Sure, so you build the image once and use it from there on, makes sense.

You are optimizing usage for GCE. That makes sense. The goal behind kdevops was to use technology which can *enable* any optimizations in a cloud agnostic way. What APIs become public is up to the cloud provider, and one cloud agnostic way to manage cloud solutions using open source tools is with terraform, and so that is used today. If an API is not yet available through terraform, kdevops could simply use whatever cloud tooling is needed for additional hooks.
But having the ability to ramp up regardless of cloud provider was extremely important to me from the beginning. Optimizing is certainly possible, always :)

Likewise, if you are using local virtualization, we can save vagrant images in the Vagrant Cloud, if we wanted, which would allow pre-built setups to be saved:

https://app.vagrantup.com/boxes/search

That could reduce bringup time for local KVM / VirtualBox guests. In fact, since vagrant images are also just tarballs with qcow2 files, I do wonder if they can also be leveraged for cloud deployments. Or if the inverse is true, whether your qcow2 images can be used for vagrant purposes as well. If you're curious:

https://github.com/linux-kdevops/kdevops/blob/master/docs/custom-vagrant-boxes.md

What approach you use is up to you. From a Linux distribution perspective, being able to do reproducible builds was important too, and so that is why a lot of effort was put into ensuring that how you cook up a final state from an initial distro release is supported.

> I can very quickly have over 100 test VM's running in parallel, and as the tests complete, they are automatically shutdown and destroyed ---- which means that we don't store state in the VM. Instead the state is stored in a Google Cloud Storage (Amazon S3) bucket, with e-mail sent with a summary of results.

Using cloud object storage is certainly nice if you can afford it. I think it is valuable, but likewise it should be optional. And so with kdevops support is welcomed should someone want to do that. So what you describe is not impossible with kdevops, it is just not done today, but could be enabled.

> VM's can get started much more quickly than "make bringup", since we're not running puppet or ansible to configure each node.

You can easily just use pre-built images as well, instead of doing the build from a base distro release, just as you could use custom vagrant images for local KVM guests.
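To back up the claim about the box format: a libvirt vagrant .box really is just a gzipped tarball, typically carrying the qcow2 disk plus some metadata. A sketch that fakes one up to show the structure (the file names are typical but illustrative, and the disk image here is an empty stand-in):

```shell
work=$(mktemp -d)
cd "$work" || exit 1

# A libvirt .box typically contains the disk image plus metadata;
# box.img here is an empty stand-in for a real qcow2 file.
: > box.img
printf '{"provider": "libvirt", "format": "qcow2"}\n' > metadata.json

tar czf demo.box box.img metadata.json

# Listing the "box" shows there is no magic beyond tar + qcow2.
tar tzf demo.box
```

Which is what makes the idea of repacking the same qcow2 for both vagrant and cloud image uploads at least plausible.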
The usage of ansible to *build* and install fstests can be done once too, and that image saved, exported, etc., and then re-used. The kernel config I maintain in kdevops has been tested to work on local KVM virtualization setups, but also on all supported cloud providers as well.

So I think there is certainly value in learning from the ways you optimize cloud usage for GCE and generalizing that for *any* cloud provider. The steps to *build* an image from a base distro release are glossed over, but that alone takes effort, and it is made pretty well distro agnostic within kdevops too.

> In contrast, I can just run "gce-xfstests ls-results" to see all of the results that have been saved to GCS, and I can fetch a particular test result to my laptop via a single command: "gce-xfstests get-results tytso-20220624210238". No need to ssh to a host node, and then ssh to the kdevops test node, yadda, yadda, yadda --- and if you run "make destroy" you lose all of the test result history on that node, right?

Actually all the *.bad and *.dmesg files, as well as the final xunit results, for all nodes and failed tests are copied over locally to the host which is running kdevops. Xunit files are also merged to represent a final full set of results too. So no, they are not destroyed. If you wanted to keep all files even for non-failed tests we can add that as a new Kconfig bool.

Support for stashing results into object storage sure would be nice, agreed.

> See the difference?

Yes, you have optimized usage of GCE. Good stuff, lots to learn from that effort! Thanks for sharing the details!

Luis

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)
  2022-06-25 18:49 ` Luis Chamberlain
@ 2022-06-25 21:14 ` Theodore Ts'o
  2022-07-01 23:08 ` Luis Chamberlain
  0 siblings, 1 reply; 17+ messages in thread
From: Theodore Ts'o @ 2022-06-25 21:14 UTC (permalink / raw)
To: Luis Chamberlain
Cc: Darrick J. Wong, Leah Rumancik, Amir Goldstein, Josef Bacik,
	Chuck Lever, chandanrmail, Sweet Tea Dorminy, Pankaj Raghav,
	Zorro Lang, linux-xfs, fstests

On Sat, Jun 25, 2022 at 11:49:54AM -0700, Luis Chamberlain wrote:
> You are optimizing usage for GCE. That makes sense.

This particular usage model is not unique to GCE. A very similar thing can be done using Microsoft Azure, Amazon Web Services and Oracle Cloud Services. And I've talked to some folks who might be interested in taking the Test Appliance that is currently built for use with KVM, Android, and GCE, and extending it to support other Cloud infrastructures. So the concept of these optimizations is not unique to GCE, which is why I've been calling this approach "cloud native".

Perhaps one other difference is that I make the test appliance images available, so people don't *have* to build them from scratch. They can just download the qcow2 image from:

https://www.kernel.org/pub/linux/kernel/people/tytso/kvm-xfstests

And for GCE, there is the public image project, xfstests-cloud, just like there are public images for debian in the debian-cloud project, for Fedora in the fedora-cloud project, etc.
One of the things which I am trying to do is to make the "out of box" experience as simple as possible, which means I don't want to force users to build the test appliance or run "make bringup" if they don't have to. Of course, someone who is doing xfstests development will need to learn how to build their own test appliance. But for someone who is just getting started, the goal is to make the learning curve as flat as possible.

One of the other things that was an important design principle for me was that I didn't want to require that the VM's have networking access, nor did I want to require users to run random scripts via sudo or as root. (Some of this was because of corporate security requirements at the time.) This also had the benefit that I'm not asking the user to set up ssh keys if they are using kvm-xfstests, but instead relying on the serial console.

> The goal behind kdevops was to use technology which can *enable* any optimizations in a cloud agnostic way.

Fair enough. My goal for kvm-xfstests and gce-xfstests was to make developer velocity the primary goal. Portability to different cloud systems took a back seat. I don't apologize for this, since over the many years that I've been personally using {kvm,gce}-xfstests, the fact that I can use my native kernel development environment, and have the test environment pluck the kernel straight out of my build tree, has paid for itself many times over.

If I had to push test/debug kernel code to a public git tree just so the test VM can pull down the code and build it in the test VM a second time --- I'd say, "no thank you, absolutely not." Having to do this would slow me down, and as I said, developer velocity is king. I want to be able to save a patch from my mail user agent, apply the patch, and then give the code a test, *without* having to interact with a public git tree.

Maybe you can do that with kdevops --- but it's not at all obvious how.
With kvm-xfstests, I have a quickstart doc which gives instructions, and then it's just a matter of running the command "kvm-xfstests smoke" or "kvm-xfstests shell" from the developer's kernel tree. No muss, no fuss, no dirty dishes....

> In fact since vagrant images are also just tarballs with qcow2 files, I do wonder if they can also be leveraged for cloud deployments. Or if the inverse is true, if your qcow2 images can be used for vagrant purposes as well.

Well, my qcow2 images don't come with ssh keys, since they are optimized to be launched from the kvm-xfstests script, where the tests to be run are passed in via the boot command line:

% kvm-xfstests smoke --no-action
Detected kbuild config; using /build/ext4-4.14 for kernel
Using kernel /build/ext4-4.14/arch/x86/boot/bzImage
Networking disabled.
Would execute:
     ionice -n 5 /usr/bin/kvm -boot order=c -net none -machine type=pc,accel=kvm:tcg \
	-cpu host -drive file=/usr/projects/xfstests-bld/build-64/test-appliance/root_fs.img,if=virtio,snapshot=on \
	....
	-gdb tcp:localhost:7499 --kernel /build/ext4-4.14/arch/x86/boot/bzImage \
	--append "quiet loglevel=0 root=/dev/vda console=ttyS0,115200 nokaslr fstestcfg=4k fstestset=-g,quick fstestopt=aex fstesttz=America/New_York fstesttyp=ext4 fstestapi=1.5 orig_cmdline=c21va2UgLS1uby1hY3Rpb24="

The boot command line options "fstestcfg=4k", "fstestset=-g,quick", "fstesttyp=ext4", etc. are how the test appliance knows which tests to run. So that means *all* the developer needs to do is to type the command "kvm-xfstests smoke".

(By the way, it's a simple config option in ~/.config/kvm-xfstests if you are a btrfs or xfs developer and you want the default file system type to be btrfs or xfs. Of course you can explicitly specify a test config if you are an ext4 developer and you want to test how a test runs on xfs: "kvm-xfstests -c xfs/4k generic/223".)

There's no need to set up ssh keys, push the kernel to a public git tree, ssh into the test VM, yadda, yadda, yadda.
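The fstest*= convention above implies the appliance simply walks the kernel command line at boot. The real xfstests-bld init scripts are not shown here; this is only a sketch of the idea, with the cmdline hardcoded instead of read from /proc/cmdline:

```shell
# Sketch: a hardcoded stand-in for /proc/cmdline; the real test
# appliance init scripts in xfstests-bld differ.
cmdline='quiet loglevel=0 root=/dev/vda console=ttyS0,115200 fstestcfg=4k fstestset=-g,quick fstesttyp=ext4'

for word in $cmdline; do
	case "$word" in
	fstestcfg=*) cfg=${word#fstestcfg=} ;;
	fstestset=*) testset=${word#fstestset=} ;;
	fstesttyp=*) fstyp=${word#fstesttyp=} ;;
	esac
done

# Turn the comma-separated test set back into check arguments.
check_args=$(printf %s "$testset" | tr ',' ' ')

echo "FSTYP=$fstyp cfg=$cfg check $check_args"
```

Passing everything on the cmdline is what lets the appliance run with no networking and no ssh keys at all.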
Just one single command line and you're *done*. This is what I meant by the fact that kvm-xfstests is optimized for a file system developer's workflow, which I claim is very different from what a QA department might want.

I added that capability to gce-xfstests later, but it's very separate from the very simple command lines for a file system developer. If I want the lightweight test manager to watch a git tree, kick off a build whenever a branch changes, and then run a set of tests, I can do that, but that's a *very* different command and a very different use case, and I've optimized for that separately:

gce-xfstests ltm -c ext4/all -g auto --repo ext4.dev --watch dev

This is what I call the QA department's workflow. Which is also totally valid. But I believe in optimizing for each workflow separately, and being somewhat opinionated in my choices. For example, the test appliance uses Debian. Period. And that's because I didn't see the point of investing time in making that flexible. My test infrastructure is optimized for a ***kernel*** developer, and from that perspective, the distro for the test environment is totally irrelevant.

I understand that if you are working for SuSE, then maybe you would want to insist on a test environment based on OpenSuSE, or if you're working for Red Hat, you'd want to use Fedora. If so, then kvm-xfstests is not for you. I'd much rather optimize for a *kernel* developer, not a Linux distribution's QA department. They can use kdevops if they want, for that use case. :-)

Cheers,

	- Ted

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)
  2022-06-25 21:14 ` Theodore Ts'o
@ 2022-07-01 23:08 ` Luis Chamberlain
  0 siblings, 0 replies; 17+ messages in thread
From: Luis Chamberlain @ 2022-07-01 23:08 UTC (permalink / raw)
To: Theodore Ts'o
Cc: Darrick J. Wong, Leah Rumancik, Amir Goldstein, Josef Bacik,
	Chuck Lever, chandanrmail, Sweet Tea Dorminy, Pankaj Raghav,
	Zorro Lang, linux-xfs, fstests

On Sat, Jun 25, 2022 at 05:14:17PM -0400, Theodore Ts'o wrote:
> On Sat, Jun 25, 2022 at 11:49:54AM -0700, Luis Chamberlain wrote:
> > You are optimizing usage for GCE. That makes sense.
>
> This particular usage model is not unique to GCE. A very similar thing can be done using Microsoft Azure, Amazon Web Services and Oracle Cloud Services. And I've talked to some folks who might be interested in taking the Test Appliance that is currently built for use with KVM, Android, and GCE, and extending it to support other Cloud infrastructures. So the concept of these optimizations is not unique to GCE, which is why I've been calling this approach "cloud native".

I think we have similar goals. I'd like to eventually generalize what you have done for enablement through *any* cloud. And I suspect this may be useful beyond kernel development too, so there is value in that for other things.

> Perhaps one other difference is that I make the test appliance images available, so people don't *have* to build them from scratch. They can just download the qcow2 image from:
>
> https://www.kernel.org/pub/linux/kernel/people/tytso/kvm-xfstests

It may make sense for us to consider containers for some of this. If a distro doesn't have one, for example, well then we just have to do the build-it-all step.

> And for GCE, there is the public image project, xfstests-cloud, just like there are public images for debian in the debian-cloud project, for Fedora in the fedora-cloud project, etc.
Of course, for full GPL > compliance, how to build these images from source is fully available, > which is why the images are carefully tagged so all of the git commit > versions and the automated scripts used to build the image are fully > available for anyone who wants to replicate the build. *BUT*, they > don't have to build the test environment if they are just getting > started. > > One of the things which I am trying to do is to make the "out of box" > experience as simple as possible, which means I don't want to force > users to build the test appliance or run "make bringup" if they don't > have to. You are misunderstanding the goal with 'make bringup': if you already have pre-built images you can use them and you have less to do. You *don't* have to run 'make fstests' if you already have that set up. 'make bringup' just abstracts general initial stage nodes, whether on cloud or local virt. 'make linux' however does get / build / install linux. And for local virtualization, where vagrant images are used, one could enhance these further too. They are just compressed tarballs with a qcow2 file, at least when libvirt is used. Since kdevops works off of these, you can then also use pre-built images with all kernels/modules needed and even binaries. I've extended docs recently to help folks who wish to optimize on that front: https://github.com/linux-kdevops/kdevops/blob/master/docs/custom-vagrant-boxes.md Each stage has its own reproducible builds aspect to it. So if one *had* these enhanced vagrant images with kernels, one could just skip the build stage and jump straight to testing after bringup. I do wonder if we could share similar qcow2 images for cloud testing too and for vagrant. If we could... there is a pretty big win. > Of course, someone who is doing xfstests development will need to > learn how to build their own test appliance. But for someone who is > just getting started, the goal is to make the learning curve as flat > as possible. Yup. 
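The vagrant box format mentioned above — a compressed tarball carrying a qcow2 disk plus a little metadata — can be sketched as follows. The disk is faked with an empty file purely to show the packaging; the file names and metadata fields are illustrative, not a kdevops or vagrant-libvirt interface guarantee.

```shell
# Sketch: package a minimal libvirt-style vagrant box. The disk image
# is an empty stand-in here; a real box would carry a full qcow2 image.
workdir=$(mktemp -d)
cd "$workdir"
: > box.img        # stand-in for the real qcow2 disk image
printf '{"provider": "libvirt", "format": "qcow2", "virtual_size": 32}\n' \
        > metadata.json
tar czf custom.box metadata.json box.img
tar tzf custom.box        # lists the two members of the box tarball
```

Since the box is just a tarball, swapping in a qcow2 that already has a kernel and modules installed is what the "enhanced vagrant images" idea above amounts to.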
> One of the other things that was an important design principle for me > was I didn't want to require that the VM's have networking access, nor > did I want to require users to have to run random scripts > via sudo or as root. (Some of this was because of corporate security > requirements at the time.) This also had the benefit that I'm not > asking the user to set up ssh keys if they are using kvm-xfstests, but > instead rely on the serial console. Philosophy. > > The goal behind kdevops was to use technology which can *enable* any > > optimizations in a cloud agnostic way. > > Fair enough. My goal for kvm-xfstests and gce-xfstests was to make > developer velocity the primary goal. Portability to different cloud > systems took a back seat. I don't apologize for this, since over the > many years that I've been personally using {kvm,gce}-xfstests, the > fact that I can use my native kernel development environment, and have > the test environment pluck the kernel straight out of my build tree, > has paid for itself many times over. Yes I realize that. No one typically has time to do that. Which is why, when I had requirements from a prior $employer to make the tech cloud agnostic, I decided it was tech best shared. It was not easy. > If I had to push test/debug kernel code to a public git tree just so > the test VM can pull down the code and build it in the test VM a > second time --- I'd say, "no thank you, absolutely not." Having to do > this would slow me down, and as I said, developer velocity is king. I > want to be able to save a patch from my mail user agent, apply the > patch, and then give the code a test, *without* having to interact > with a public git tree. Every developer may have a different way to work and do Linux kernel development. > Maybe you can do that with kdevops --- but it's not at all obvious > how. The above just explained what you *don't* want to do, not what you want. 
But you explained to me in private a while ago that you expect to get local test builds to the guest fast. I think you're just missing that the goal is to support variability and enable that variability. If such variability is not supported then it's just a matter of adding a few kconfig options and then adding support for it. So yes it's possible, and it's a matter of taking a bit of time to support that workflow. My kdev workflow was to just work with large guests before, and use 'localmodconfig' kernels which are very small, and so build time is fast, especially after the first build. The other workflow I then supported was the distro world one where we tested a "kernel of the day" which is a kernel on a repo somewhere. So upgrading is just ensuring you have a repo and `zypper in` the kernel, reboot and test. To support the workflow you have, I'd like to evaluate both a local virt solution and cloud (for any cloud vendor). For local virt using 9p seems to make sense. For cloud, not so sure. I think we really digress from the subject at hand though. This conversation is useful but it really is just noise to a lot of people. Luis ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: sharing fstests results (Was: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)) 2022-06-24 22:54 ` Luis Chamberlain 2022-06-25 2:21 ` Theodore Ts'o @ 2022-06-25 7:28 ` Amir Goldstein 2022-06-25 19:35 ` Luis Chamberlain 1 sibling, 1 reply; 17+ messages in thread From: Amir Goldstein @ 2022-06-25 7:28 UTC (permalink / raw) To: Luis Chamberlain Cc: Theodore Ts'o, Darrick J. Wong, Leah Rumancik, Josef Bacik, Chuck Lever, chandanrmail, Sweet Tea Dorminy, Pankaj Raghav, Zorro Lang, linux-xfs, fstests [Subject change was long due...] On Sat, Jun 25, 2022 at 1:54 AM Luis Chamberlain <mcgrof@kernel.org> wrote: > > On Fri, Jun 24, 2022 at 01:32:23AM -0400, Theodore Ts'o wrote: [...] > > > > > Sure, the TODO item on the URL seemed to indicate there was a desire to > > > find a better place to put failures. > > > > I'm not convinced the "better place" is expunge files. I suspect it > > may need to be some kind of database. Darrick tells me that he stores > > his test results in a postgres database. (Which is way better than > > what I'm doing which is an mbox file and using mail search tools.) > > > > Currently, Leah is using flat text files for the XFS 5.15 stable > > backports effort, plus some tools that parse and analyze those text > > files. > > Where does not matter yet, what I'd like to refocus on is *if* sharing > is desirable by folks. We can discuss *how* and *where* if we do think > it is worth to share. > > If folks would like to evaluate this I'd encourage to do so perhaps > after a specific distro release moving forward, and to not backtrack. > > But for stable kernels I'd imagine it may be easier to see value in > sharing. > > > I'll also note that the number of baseline kernel versions is much > > smaller if you are primarily testing an enterprise Linux distribution, > > such as SLES. > > Much smaller than what? Android? If so then perhaps. Just recall that > Enterprise supports kernels for at least 10 years. 
> > And if you are working with stable kernels, you can > probably get away with having to update the baseline for each LTS > kernel every so often. But for upstream kernel development the > number of kernel versions for which a developer might want to track > flaky percentages is far greater, and will need to be updated at > least once every kernel development cycle, and possibly more > frequently than that. Which is why I'm not entirely sure a flat text > file, such as an expunge file, is really the right answer. I can > completely understand why Darrick is using a Postgres database. > > So there is clearly more thought and design required here, in my > opinion. > > Sure, let's talk about it, *if* we do find it valuable to share. > kdevops already has stuff in a format which is consistent, that > can change or be ported. We first just need to decide if we > as a community want to share. > > The flakiness annotations are important too, and we have a thread > about that, which I have to go and get back to at some point. > > > That is not a goal, the goal is to allow variability! And share results > > > in the most efficient way. > > > > Sure, but are expunge files the most efficient way to "share results"? > > There are three things we want to do if we are going to talk about > sharing results: > > a) Consuming expunges so check.sh for the Node Under Test (NUT) can expand > on the expunges given criteria (flakiness, crash requirements) > > b) Sharing updates to expunges per kernel / distro / runner / node-config > and making patches to this easy. > > c) Making updates for failures easy to read for a developer / community. > These would be in the form of an email or results file for a test > run through some sort of kernel-ci. > > Let's start with a): > > We can adopt runners to use anything. My gut tells me postgres is > a bit large unless we need socket communication. I can think of two > ways to go here then. 
Perhaps others have some other ideas? > > 1) We go lightweight on the db, maybe sqlite3 ? And embrace the same > postgres db schema as used by Darrick if he sees value in sharing > this. If we do this I think it doesn't make sense to *require* > sqlite3 on the NUT (nodes), for many reasons, so parsing the db > on the host to a flat file to be used by the node does seem > ideal. > > 2) Keep postgres and provide a REST api for queries from the host to > this server so it can then construct a flat file / directory > interpretation of expunges for the nodes under test (NUT). > > Given the minimum requirements desirable on the NUTs I think in the end > a flat file hierarchy is nice so as to not incur some new dependency on > them. > > Determinism is important for tests though, so snapshotting an > interpretation of expunges at a specific point in time is also important. > So the database would need to be versioned per updates, so a test is > checkpointed against a specific version of the expunge db. Using the terminology "expunge db" is wrong here because it suggests that flakey tests (which are obviously part of that db) should be in an expunge list as is done in kdevops and that is not how Josef/Ted/Darrick treat the flakey tests. The discussion should be around sharing fstests "results" not expunge lists. Sharing expunge lists for tests that should not be run at all with certain kernel/distro/xfsprogs has great value on its own and I think the kdevops hierarchical expunge lists are a very good place to share this *deterministic* information, but only as long as those lists absolutely do not contain non-deterministic test expunges. For example, this is a deterministic expunge list that may be worth sharing: https://github.com/linux-kdevops/kdevops/blob/master/workflows/fstests/expunges/any/xfs/reqs-xfsprogs-5.10.txt Because for all the tests (it's just one), the failure is analysed and found to be deterministic and related to the topic of the expunge. 
However, this is also a classic example of an expunge list that could be auto-generated by the test runner if xfs/540 had the annotations: _fixed_in_version xfsprogs 5.13 _fixed_by_git_commit xfsprogs 5f062427 \ "xfs_repair: validate alignment of inherited rt extent hints" > > If we come to some sort of consensus then this code for parsing an > expunge set can be used directly from fstests's check script, so the > interpretation and use can be done in one place for all test runners. > We also have additional criteria which we may want for the expunges. > For instance, if we had flakiness percentage annotated somehow then > fstests's check could be passed an argument to only include expunges > given a certain flakiness level of some sort, or for example only > include expunges for tests which are known to crash. > > Generating the files from a db is nice. But what gains do we have > with using a db then? > > Now let's move on to b) sharing the expunges and sending patches for > updates. I think sending a patch against a flat file reads a lot easier > except for the comments / flakiness levels / crash consideration / and > artifacts. For kdevops' purposes this reads well today as we don't > upload artifacts anywhere and just refer to them on github gists as best > effort / optional. There is no convention yet on expression of flakiness > but some tests do mention "failure rate" in one way or another. > > So we want to evaluate if we want to share not only expunges but other > metadata associated with why a new test can be expunged or removed: > > * flakiness percentage > * cause a kernel crash? > * bogus test? > * expunged due to a slew of other reasons, some of them maybe > categorized and shared, some of them not > > And do we want to share artifacts? If so how? Perhaps an optional URL, > with another component describing what it is, gist, or a tarball, etc. 
> > Then for the last part c) making failures easy to read for a developer > let's review what could be done. I gather gce-xfstests explains the > xunit results summary. Right now kdevops' kernel-ci stuff just sends > an email with the same but also a diff to the expunge file hierarchy > augmented for the target kernel directory being tested. The developer > would just go and edit the line with metadata as a comment, but that > is just because we lack a structure for it. If we strive to share > an expunge list I think it would be wise to consider structure for > this metadata. > > Perhaps: > > <test> # <crashes>|<flakiness-percent>|<fs-skip-reason>|<artifact-type>|<artifact-dir-url>|<comments> > > Where: > > test: xfs/123 or btrfs/234 > crashes: can be either Y or N > flakiness-percent: 80% > fs-skip-reason: can be an enum to represent a series of > fs specific reasons why a test may not be > applicable or should be skipped > artifact-type: optional, if present the type of artifact, > can be enum to represent a gist test > description, or a tarball > artifact-dir-url: optional, path to the artifact > comments: additional comments > > All the above considered, a) b) and c), yes I think a flat file > model works well as an option. I'd love to hear others' feedback. > > > If we have a huge amount of variability, such that we have a large > > number of directories with different test configs and different > > hardware configs, each with different expunge files, I'm not sure how > > useful that actually is. > > *If* you want to share I think it would be useful. > > At least kdevops uses a flat file model with no artifacts, just the > expunges and comments, and over time it has been very useful, even to be > able to review historic issues on older kernels: simply using > something like 'git grep xfs/123' gives me a quick sense of the history of > issues with a test. 
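To make the proposal concrete: the pipe-separated metadata format sketched above is simple enough that check (or any runner) could filter on one field with a few lines of awk. This is only a sketch against the *proposed* format — none of these fields exist in fstests today, and the function name is invented.

```shell
# Sketch: keep only expunge entries whose <crashes> field is "Y", given
# lines in the proposed (hypothetical) format:
#   xfs/123 # Y|80%|skip-reason|artifact-type|artifact-url|comment
filter_crashing_expunges()
{
	awk -F'#' '/^[a-z0-9]+\/[0-9]+/ {
		split($2, meta, "|")
		gsub(/[ \t]/, "", meta[1])	# meta[1] is the <crashes> flag
		if (meta[1] == "Y") {
			sub(/[ \t]+$/, "", $1)
			print $1
		}
	}' "$1"
}
```

The same split could select on flakiness percentage or skip reason instead, which is the "additional criteria passed to check" idea discussed above.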
> > Are we expecting users to do a "git clone", > and then start browsing all of these different expunge files by hand? > > If we want to extend the fstests check script to look for this, it could > be an optional directory and an argument could be passed to check > to enable its hunt for it, so that if passed it would look for the > runner / kernel / host-type. For instance today we already have > a function on initialization for the check script which looks for > the fstests' config file as follows: > > known_hosts() > { > [ "$HOST_CONFIG_DIR" ] || HOST_CONFIG_DIR=`pwd`/configs > > [ -f /etc/xfsqa.config ] && export HOST_OPTIONS=/etc/xfsqa.config > [ -f $HOST_CONFIG_DIR/$HOST ] && export HOST_OPTIONS=$HOST_CONFIG_DIR/$HOST > [ -f $HOST_CONFIG_DIR/$HOST.config ] && export HOST_OPTIONS=$HOST_CONFIG_DIR/$HOST.config > } > > We could have something similar look for an expunge directory via say > --expunge-auto-look and that could be something like: > > process_expunge_dir() > { > [ "$HOST_EXPUNGE_DIR" ] || HOST_EXPUNGE_DIR=`pwd`/expunges > > [ -d /etc/fstests/expunges/$HOST ] && export HOST_EXPUNGES=/etc/fstests/expunges/$HOST > [ -d $HOST_EXPUNGE_DIR/$HOST ] && export HOST_EXPUNGES=$HOST_EXPUNGE_DIR/$HOST > } > > The runner could be specified, and the host-type > > ./check --runner <gce-xfstests|kdevops|whatever> --host-type <kvm-8vcpus-2gb> > > And so we can have it look for these directories and process any that are present (cumulative): > > * HOST_EXPUNGES/any/$fstype/ - regardless of kernel, host type and runner > * HOST_EXPUNGES/$kernel/$fstype/any - common between runners for any host type > * HOST_EXPUNGES/$kernel/$fstype/$hostype - common between runners for a host type > * HOST_EXPUNGES/$kernel/$fstype/$hostype/$runner - only present for the runner > > The aggregate set of expunges is used. 
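A cumulative lookup over that proposed directory layout could be as small as the sketch below. The layout and variable names follow the proposal above; nothing like this exists in the check script today, and the function name is hypothetical.

```shell
# Sketch: aggregate expunge entries from the proposed hierarchy, most
# generic to most specific, stripping comments and de-duplicating.
collect_expunges()
{
	base="$1"; kernel="$2"; fstype="$3"; hosttype="$4"; runner="$5"
	{ cat "$base/any/$fstype"/*.txt \
	      "$base/$kernel/$fstype/any"/*.txt \
	      "$base/$kernel/$fstype/$hosttype"/*.txt \
	      "$base/$kernel/$fstype/$hosttype/$runner"/*.txt \
	      2>/dev/null || true; } |
		awk '{ sub(/[ \t]*#.*/, ""); if (NF) print $1 }' | sort -u
}
```

Missing directories simply contribute nothing, which is what makes the scheme cumulative: a runner-specific directory only ever adds to the shared base set.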
> > Additional criteria could be passed to check to ensure that only > certain expunges that meet the criteria are used to skip tests for the > run, provided we can agree on some metadata for that. > > > It might perhaps be useful to get a bit more clarity about how we > > expect the shared results would be used, because that might drive some > > of the design decisions about the best way to store these "results". > As a requirement, what I am looking for is a way to search for anything known to the community about failures in test FS/NNN. Because when I get an alert on a possible regression, that's the fastest way for me to triage and understand how much effort I should put into the investigation of that failure and which directions I should look into. Right now, I look at the test header comment and git log, I grep the kdevops expunge lists to look for juicy details and I search lore for mentions of that test. In fact, I already have an auto-generated index of lore fstests mentions in xfs patch discussions [1] that I just grep for failures found when testing xfs. For LTS testing, I found it to be the best way to find candidate fix patches that I may have missed. I would love to have more sources to get search results from. There doesn't even need to be a standard form for the search or results. If Leah, Darrick, Ted and Josef would provide me with a script to search their home-brewed fstests db, I would just run all those scripts and see what they have to tell me about FS/NNN in some form of human-readable format that I can understand. Going forward, we can try to standardize the search and results format, but for getting better requirements you first need users! Thanks, Amir. [1] https://github.com/amir73il/b4/blob/xfs-5.10.y/xfs-5.10..5.17-rn.rst ^ permalink raw reply [flat|nested] 17+ messages in thread
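Even without a standard format, the kind of search described here can start as a dumb grep over whatever flat files each project shares. The function below is a hypothetical sketch of such a "search script" — the name and directory layout are invented, not anything kdevops or the other home-brewed databases provide.

```shell
# Sketch: report everything a tree of shared expunge/results files says
# about one test, or say so explicitly if nothing is recorded.
search_test_history()
{
	# $1: test name like xfs/442, $2: root of the shared files
	grep -rn "$1" "$2" 2>/dev/null || echo "nothing recorded for $1"
}
```

That is essentially the 'git grep xfs/123' workflow mentioned earlier in the thread, wrapped so multiple sources could be queried with one command per source.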
* Re: sharing fstests results (Was: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)) 2022-06-25 7:28 ` sharing fstests results (Was: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)) Amir Goldstein @ 2022-06-25 19:35 ` Luis Chamberlain 2022-06-25 21:50 ` Theodore Ts'o 0 siblings, 1 reply; 17+ messages in thread From: Luis Chamberlain @ 2022-06-25 19:35 UTC (permalink / raw) To: Amir Goldstein Cc: Theodore Ts'o, Darrick J. Wong, Leah Rumancik, Josef Bacik, Chuck Lever, chandanrmail, Sweet Tea Dorminy, Pankaj Raghav, Zorro Lang, linux-xfs, fstests On Sat, Jun 25, 2022 at 10:28:32AM +0300, Amir Goldstein wrote: > On Sat, Jun 25, 2022 at 1:54 AM Luis Chamberlain <mcgrof@kernel.org> wrote: > > Determinism is important for tests though, so snapshotting an > > interpretation of expunges at a specific point in time is also important. > > So the database would need to be versioned per updates, so a test is > > checkpointed against a specific version of the expunge db. > > Using the terminology "expunge db" is wrong here because it suggests > that flakey tests (which are obviously part of that db) should be in > an expunge list as is done in kdevops and that is not how Josef/Ted/Darrick > treat the flakey tests. There are flaky tests which can cause a crash, and that is why I started to expunge these. Not all flaky tests cause a crash though. And so, this is why in the format I suggested you can specify metadata such as whether a test caused a crash. At this point I agree that the way kdevops simply skips flaky tests which do not cause a crash should be changed, and if a test is just known to fail non-deterministically but without a crash, it would be good to simply not treat that failure as fatal at the end. If however the failure rate does change, it would be useful to update that information. Without metadata one cannot process that sort of stuff. 
> The discussion should be around sharing fstests "results" not expunge > lists. Sharing expunge lists for tests that should not be run at all > with certain kernel/distro/xfsprogs has great value on its own and I > think the kdevops hierarchical expunge lists are a very good place to > share this *deterministic* information, but only as long as those lists > absolutely do not contain non-deterministic test expunges. The way the expunge list is processed could simply be modified in kdevops so that non-deterministic tests are not expunged but also not treated as fatal at the end. But think about it, the exception is if the non-deterministic failure does not lead to a crash, no? > > > It might perhaps be useful to get a bit more clarity about how we > > > expect the shared results would be used, because that might drive some > > > of the design decisions about the best way to store these "results". > > > > As a requirement, what I am looking for is a way to search for anything > known to the community about failures in test FS/NNN. Here's the thing though. Not all developers have incentives to share. For a while SLE didn't have public expunges; that changed after OpenSUSE Leap 15.3, as it has binary compatibility with SLE 15.3, and so the same failures on workflows/fstests/expunges/opensuse-leap/15.3/ are applicable. It is up to each distro if they wish to share, and without a public vehicle to do so why would they, or how would they? For upstream and stable I would hope there are more incentives to share. But again, no shared home ever existed before. And I don't think there was ever dialog before about sharing a home for these. > Because when I get an alert on a possible regression, that's the fastest > way for me to triage and understand how much effort I should put into > the investigation of that failure and which directions I should look into. 
> > Right now, I look at the test header comment and git log, I grep the > kdevops expunge lists to look for juicy details and I search lore for > mentions of that test. > > In fact, I already have an auto-generated index of lore fstests > mentions in xfs patch discussions [1] that I just grep for failures found > when testing xfs. For LTS testing, I found it to be the best way to > find candidate fix patches that I may have missed. This effort is valuable and thanks for doing all this. > Going forward, we can try to standardize the search and results > format, but for getting better requirements you first need users! As you are witness to it, running fstests against any fs takes a lot of time and patience, and as I have noted, not many have incentives to share. So the best I could do is provide the solution to enable folks to reproduce testing as fast and as easily as possible, and let folks who are interested in sharing do so. And obviously at least I did get a major enterprise distro to share some results. I hope others follow. So I expect the format for sharing to be led by those who have a clear incentive to do so. Folks working on upstream, or stable stakeholders, seem like obvious candidates. And then it is just volunteer work. Luis ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: sharing fstests results (Was: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)) 2022-06-25 19:35 ` Luis Chamberlain @ 2022-06-25 21:50 ` Theodore Ts'o 2022-07-01 23:13 ` Luis Chamberlain 0 siblings, 1 reply; 17+ messages in thread From: Theodore Ts'o @ 2022-06-25 21:50 UTC (permalink / raw) To: Luis Chamberlain Cc: Amir Goldstein, Darrick J. Wong, Leah Rumancik, Josef Bacik, Chuck Lever, chandanrmail, Sweet Tea Dorminy, Pankaj Raghav, Zorro Lang, linux-xfs, fstests On Sat, Jun 25, 2022 at 12:35:50PM -0700, Luis Chamberlain wrote: > > The way the expunge list is processed could simply be modified in kdevops > so that non-deterministic tests are not expunged but also not treated as > fatal at the end. But think about it, the exception is if the non-deterministic > failure does not lead to a crash, no? That's what I'm doing today, but once we have a better test analysis system, I think the only things which should be excluded are: a) bugs which cause the kernel to crash b) test bugs c) tests which take ***forever*** for a particular configuration (and for which we probably get enough coverage through other configs) If we have a non-deterministic failure, which is due to a kernel bug, I don't see any reason why we should skip the test. We just need to have a fully-featured enough test results analyzer so that we can distinguish between known failures, known flaky failures, and new test regressions. So for example, the new tests generic/681, generic/682, and generic/692 are causing deterministic failures for the ext4/encrypt config. Right now, this is being tracked manually in a flat text file: generic/68[12] encrypt Failure percentage: 100% The directory does grow, but blocks aren't charged to either root or the non-privileged users' quota. So this appears to be a real bug. Testing shows this goes all the way back to at least 4.14. It's currently not tagged by kernel version, because I mostly only care about upstream. 
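The analyzer behavior described here — separating known flaky failures from new regressions instead of expunging them — reduces, at its simplest, to comparing a run's failures against a known-flaky list. The sketch below is illustrative only; the file formats and function name are invented and are not how {kvm,gce}-xfstests actually stores results.

```shell
# Sketch: bucket a run's failing tests into known-flaky vs regression,
# rather than hiding flaky-but-real bugs behind an expunge list.
classify_failures()
{
	# $1: failing tests from this run, one per line
	# $2: known flaky tests, one per line
	while read -r t; do
		if grep -qx "$t" "$2"; then
			echo "known-flaky: $t"
		else
			echo "regression: $t"
		fi
	done < "$1"
}
```

A real analyzer would add the third bucket (known deterministic failures) and flakiness rates per kernel version, but the point stands: the flaky test keeps running and stays visible instead of being expunged.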
So once it's fixed upstream, I stop caring about it. In the ideal world, we'd track the kernel commit which fixed the test failure, and when the fix propagated to the various stable kernels, etc. I've also resisted putting it in an expunge file, since if it did, I would ignore it forever. If it stays in my face, I'm more likely to fix it, even if it's on my personal time. > Here's the thing though. Not all developers have incentives to share. Part of this is the amount of *time* that it takes to share this information. Right now, a lot of sharing takes place on the weekly ext4 conference call. It doesn't take Eric Whitney a lot of time to mention that he's seeing a particular test failure, and I can quickly search my test summary Unix mbox file and say, "yep, I've seen this fail a couple of times before, starting in February 2020 --- but it's super rare." And since Darrick attends the weekly ext4 video chats, once or twice we've asked him about some test failures on some esoteric xfs config, such as realtime with an external logdev, and he might say, "oh yeah, that's a known test bug. pull this branch from my public xfstests tree, I just haven't had time to push those fixes upstream yet." (And I don't blame him for that; I just recently pushed some ext4 test bug fixes, some of which I had initially sent to the list in late April --- but on code review, changes were requested, and I just didn't have *time* to clean up fixes in response to the code reviews. So the fix which was good enough to suppress the failures sat in my tree, but didn't go upstream since it was deemed not ready for upstream. I'm all for decreasing tech debt in xfstests; but do understand that sometimes this means fixes to known test bugs will stay in developers' git trees, since we're all overloaded.) It's a similar problem with test failures. Simply reporting a test failure isn't *that* hard. 
But the analysis, even if it's something like: generic/68[12] encrypt Failure percentage: 100% The directory does grow, but blocks aren't charged to either root or the non-privileged users' quota..... ... is the critical bit that people *really* want, and it takes real developer time to come up with that kind of information. In the ideal world, I'd have an army of trained minions to run down this kind of stuff. In the real world, sometimes this stuff happens after midnight, local time, on a Friday night. (Note that Android and Chrome OS, both of which are big users of fscrypt, don't use quota. So if I were to open a bug tracker entry on it, the bug would get prioritized to P2 or P3, and never be heard from again, since there's no business reason to prioritize fixing it. Which is why some of this happens on personal time.) - Ted ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: sharing fstests results (Was: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)) 2022-06-25 21:50 ` Theodore Ts'o @ 2022-07-01 23:13 ` Luis Chamberlain 0 siblings, 0 replies; 17+ messages in thread From: Luis Chamberlain @ 2022-07-01 23:13 UTC (permalink / raw) To: Theodore Ts'o Cc: Amir Goldstein, Darrick J. Wong, Leah Rumancik, Josef Bacik, Chuck Lever, chandanrmail, Sweet Tea Dorminy, Pankaj Raghav, Zorro Lang, linux-xfs, fstests On Sat, Jun 25, 2022 at 05:50:26PM -0400, Theodore Ts'o wrote: > On Sat, Jun 25, 2022 at 12:35:50PM -0700, Luis Chamberlain wrote: > > Here's the thing though. Not all developers have incentives to share. > > Part of this is the amount of *time* that it takes to share this > information. There's many reasons. In the end we keep digressing, but I see no expressed interest to share, and so we can just keep on moving with how things are. Luis ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1) 2022-06-22 0:07 ` [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1) Luis Chamberlain 2022-06-22 21:44 ` Theodore Ts'o @ 2022-06-22 21:52 ` Leah Rumancik 2022-06-23 21:40 ` Luis Chamberlain 1 sibling, 1 reply; 17+ messages in thread From: Leah Rumancik @ 2022-06-22 21:52 UTC (permalink / raw) To: Luis Chamberlain Cc: Amir Goldstein, Josef Bacik, Chuck Lever, chandanrmail, Sweet Tea Dorminy, Pankaj Raghav, linux-xfs, fstests On Tue, Jun 21, 2022 at 05:07:10PM -0700, Luis Chamberlain wrote: > On Thu, Jun 16, 2022 at 11:27:41AM -0700, Leah Rumancik wrote: > > https://gist.github.com/lrumancik/5a9d85d2637f878220224578e173fc23. > > The coverage for XFS is using profiles which seem to come inspired > by ext4's different mkfs configurations. The configs I am using for the backports testing were developed with Darrick's help. If you guys agree on a different set of configs, I'd be happy to update my configs moving forward. As there has been testing of these patches on both 5.10 with those configs as well as on 5.15 with my configs, I don't think this should be blocking for this set of patches. - Leah > > Long ago (2019) I had asked we strive to address popular configurations > for XFS so that what would be back then oscheck (now kdevops) can cover > them for stable XFS patch candidate test consideration. That was so long > ago no one should be surprised you didn't get the memo: > > https://lkml.kernel.org/r/20190208194829.GJ11489@garbanzo.do-not-panic.com > > This has grown to cover more now: > > https://github.com/linux-kdevops/kdevops/blob/master/playbooks/roles/fstests/templates/xfs/xfs.config > > For instance xfs_bigblock and xfs_reflink_normapbt. > > My litmus test back then *and* today is to ensure we have no regressions > on the test sections supported by kdevops for XFS as reflected above. 
> Without that confidence I'd be really reluctant to support stable > efforts. > > If you use kdevops, it should be easy to set up even if you are not > using local virtualization technologies. For instance I just fired > up an AWS cloud m5ad.4xlarge image which has 2 nvme drives, which > mimics the reqs for the methodology of using loopback files: > > https://github.com/linux-kdevops/kdevops/blob/master/docs/seeing-more-issues.md > > GCE is supported as well, so is Azure and OpenStack, and even custom > openstack solutions... > > Also, I see on the above URL you posted there is a TODO in the gist which > says, "find a better route for publishing these". If you were to use > kdevops for this it would have the immediate gain in that kdevops users > could reproduce your findings and help augment it. > > However if using kdevops as a landing home for this is too large for you, > we could use a new git tree which just tracks expunges and then kdevops can > use it as a git subtree as I had suggested at LSFMM. The benefit of using a > git subtree is then any runner can make use of it. And note that we > track both fstests and blktests. > > The downside of kdevops using a new git subtree is just that kdevops > developers would have to use two trees to work on, one for code changes just > for kdevops and one for the git subtree for expunges. That workflow would be > new. I don't suspect it would be a really big issue other than addressing the > initial growing pains to adapt. I have used git subtrees before extensively > and the best rule of thumb is just to ensure you keep the code for the git > subtree in its own directory. You can either immediately upstream your > delta or carry the delta until you are ready to try to push those > changes. Right now kdevops uses the directory workflows/fstests/expunges/ > for expunges. Your runner could use whatever it wishes. 
> We should discuss whether we also want to add the respective *.bad,
> *.dmesg, and *.all result files for expunged entries, or whether we
> should push these out to a new shared storage area. Right now kdevops
> keeps track of results in the directory workflows/fstests/results/,
> but that path is in .gitignore. If we *do* want to use GitHub and a
> shared git subtree, perhaps a workflows/fstests/artifacts/kdevops/
> directory would make sense for the kdevops runner? That namespace
> would then allow other runners to also add files, while we all share
> expunges and tribal knowledge.
>
> Thoughts?
>
>   Luis

^ permalink raw reply	[flat|nested] 17+ messages in thread
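To make the proposed split concrete, here is a small, self-contained sketch of how a runner could consume a shared expunge tree laid out like kdevops' workflows/fstests/expunges/ directory. Every path, file name, and test name below is invented for illustration; the git-subtree invocations shown in the comments use a placeholder URL, not a real repository.

```shell
#!/bin/sh
# Sketch only: the layout mimics kdevops' workflows/fstests/expunges/
# convention, but the kernel version, section, file, and test names here
# are made up for the example.
#
# A runner would first vendor the shared expunges repo, e.g.:
#   git subtree add --prefix=workflows/fstests/expunges \
#       https://example.org/fstests-expunges.git main --squash
# and refresh it later with `git subtree pull` using the same arguments.
set -e
d=$(mktemp -d)
mkdir -p "$d/workflows/fstests/expunges/5.15.y/xfs_reflink"
cat > "$d/workflows/fstests/expunges/5.15.y/xfs_reflink/unassigned.txt" <<'EOF'
generic/475 # flaky: log recovery
xfs/297
EOF

# Reduce the expunge files for one kernel to a plain, deduplicated skip
# list: strip trailing comments and blank lines, then sort.
skiplist=$(find "$d/workflows/fstests/expunges/5.15.y" -name '*.txt' \
    -exec sed -e 's/[[:space:]]*#.*$//' -e 's/[[:space:]]*$//' {} + \
    | sed '/^$/d' | sort -u)
printf '%s\n' "$skiplist"
rm -rf "$d"
```

A list in this shape maps directly onto existing tooling: fstests' ./check can take an exclude file, so any runner, not just kdevops, could consume the shared tree this way.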
* Re: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)
  2022-06-22 21:52 ` [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1) Leah Rumancik
@ 2022-06-23 21:40   ` Luis Chamberlain
  0 siblings, 0 replies; 17+ messages in thread

From: Luis Chamberlain @ 2022-06-23 21:40 UTC (permalink / raw)
To: Leah Rumancik
Cc: Amir Goldstein, Josef Bacik, Chuck Lever, chandanrmail,
    Sweet Tea Dorminy, Pankaj Raghav, linux-xfs, fstests

On Wed, Jun 22, 2022 at 02:52:18PM -0700, Leah Rumancik wrote:
> On Tue, Jun 21, 2022 at 05:07:10PM -0700, Luis Chamberlain wrote:
> > On Thu, Jun 16, 2022 at 11:27:41AM -0700, Leah Rumancik wrote:
> > > https://gist.github.com/lrumancik/5a9d85d2637f878220224578e173fc23.
> >
> > The coverage for XFS is using profiles which seem to be inspired by
> > ext4's different mkfs configurations.
>
> The configs I am using for the backports testing were developed with
> Darrick's help.

Sorry for the noise then.

> If you guys agree on a different set of configs, I'd be happy to
> update my configs moving forward.

Indeed, it would be great to unify on target test configs at the very
least.

  Luis

^ permalink raw reply	[flat|nested] 17+ messages in thread
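For readers following the "configs" discussion above: fstests drives each configuration from a section in its config file, and that is what kdevops' test-section names refer to. A minimal sketch is below; the section names xfs_reflink_normapbt and xfs_bigblock come from the thread itself, but the device paths and MKFS_OPTIONS values are illustrative guesses, not copied from kdevops' real xfs.config (see the URL quoted earlier for the actual file).

```ini
[default]
FSTYP=xfs
TEST_DEV=/dev/loop16          # hypothetical loopback test device
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/loop5 /dev/loop6 /dev/loop7"
SCRATCH_MNT=/media/scratch

[xfs_reflink]
MKFS_OPTIONS='-f -m reflink=1,rmapbt=1'

[xfs_reflink_normapbt]
MKFS_OPTIONS='-f -m reflink=1,rmapbt=0'

[xfs_bigblock]
# 64k blocks; only valid on architectures with a 64k page size
MKFS_OPTIONS='-f -b size=64k'
```

fstests' ./check can then run each section in turn, or a single one via -s <section>, which is why results and expunge lists in this thread are tracked per section.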
end of thread, other threads: [~2022-07-01 23:14 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <20220616182749.1200971-1-leah.rumancik@gmail.com>
2022-06-22 0:07 ` [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1) Luis Chamberlain
2022-06-22 21:44 ` Theodore Ts'o
2022-06-23 5:31 ` Amir Goldstein
2022-06-23 21:39 ` Luis Chamberlain
2022-06-23 21:31 ` Luis Chamberlain
2022-06-24 5:32 ` Theodore Ts'o
2022-06-24 22:54 ` Luis Chamberlain
2022-06-25 2:21 ` Theodore Ts'o
2022-06-25 18:49 ` Luis Chamberlain
2022-06-25 21:14 ` Theodore Ts'o
2022-07-01 23:08 ` Luis Chamberlain
2022-06-25 7:28 ` sharing fstests results (Was: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)) Amir Goldstein
2022-06-25 19:35 ` Luis Chamberlain
2022-06-25 21:50 ` Theodore Ts'o
2022-07-01 23:13 ` Luis Chamberlain
2022-06-22 21:52 ` [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1) Leah Rumancik
2022-06-23 21:40 ` Luis Chamberlain