* Re: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)
  [not found] <20220616182749.1200971-1-leah.rumancik@gmail.com>
@ 2022-06-22  0:07 ` Luis Chamberlain
  2022-06-22 21:44   ` Theodore Ts'o
  2022-06-22 21:52   ` [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1) Leah Rumancik
  0 siblings, 2 replies; 17+ messages in thread
From: Luis Chamberlain @ 2022-06-22  0:07 UTC (permalink / raw)
To: Leah Rumancik, Amir Goldstein, Josef Bacik, Chuck Lever, chandanrmail, Sweet Tea Dorminy, Pankaj Raghav
Cc: linux-xfs, fstests

On Thu, Jun 16, 2022 at 11:27:41AM -0700, Leah Rumancik wrote:
> https://gist.github.com/lrumancik/5a9d85d2637f878220224578e173fc23.

The coverage for XFS is using profiles which seem to be inspired by ext4's different mkfs configurations. Long ago (2019) I had asked that we strive to address popular configurations for XFS so that what was then oscheck (now kdevops) could cover them for stable XFS patch candidate test consideration. That was so long ago no one should be surprised you didn't get the memo:

https://lkml.kernel.org/r/20190208194829.GJ11489@garbanzo.do-not-panic.com

This has grown to cover more now:

https://github.com/linux-kdevops/kdevops/blob/master/playbooks/roles/fstests/templates/xfs/xfs.config

For instance xfs_bigblock and xfs_reflink_normapbt.

My litmus test back then *and* today is to ensure we have no regressions on the test sections supported by kdevops for XFS as reflected above. Without that confidence I'd be really reluctant to support stable efforts.

If you use kdevops, it should be easy to set up even if you are not using local virtualization technologies. For instance I just fired up an AWS cloud m5ad.4xlarge image which has 2 nvme drives, which matches the requirements for the methodology of using loopback files:

https://github.com/linux-kdevops/kdevops/blob/master/docs/seeing-more-issues.md

GCE is supported as well, as are Azure and OpenStack, and even custom openstack solutions...
Also, I see on the above URL you posted there is a TODO in the gist which says, "find a better route for publishing these". If you were to use kdevops for this it would have the immediate gain that kdevops users could reproduce your findings and help augment them. However, if using kdevops as a landing home for this is too large a step for you, we could use a new git tree which just tracks expunges, and then kdevops can use it as a git subtree, as I had suggested at LSFMM. The benefit of using a git subtree is that any runner can make use of it. And note that we track both fstests and blktests.

The downside of kdevops using a new git subtree is just that kdevops developers would have to work with two trees: one for code changes just for kdevops and one for the git subtree for expunges. That workflow would be new. I don't suspect it would be a really big issue other than addressing the initial growing pains to adapt. I have used git subtrees before extensively and the best rule of thumb is just to ensure you keep the code for the git subtree in its own directory. You can either immediately upstream your delta or carry the delta until you are ready to try to push those changes.

Right now kdevops uses the directory workflows/fstests/expunges/ for expunges. Your runner could use whatever it wishes. We should discuss whether we also want to add the respective *.bad, *.dmesg, and *.all files found for expunged entries, or if we should be pushing these out to a new shared storage area. Right now kdevops keeps track of results in the directory workflows/fstests/results/ but this is a path on .gitignore.

If we *do* want to use github and a shared git subtree, perhaps a workflows/fstests/artifacts/kdevops/ directory would make sense for the kdevops runner? Then that namespace allows other runners to also add files, but we all share expunges / tribal knowledge.

Thoughts?

  Luis

^ permalink raw reply	[flat|nested] 17+ messages in thread
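[Editor's sketch] The subtree idea above can be illustrated with plain git plumbing; `git read-tree --prefix` performs the same subtree merge that the `git subtree` helper automates. The repository names, branch name, and expunge file below are hypothetical stand-ins, not the actual kdevops layout:

```shell
set -e
tmp=$(mktemp -d); cd "$tmp"

# A hypothetical shared expunges repository.
git init -q expunges-repo
( cd expunges-repo
  git config user.email test@example.com
  git config user.name test
  mkdir -p 5.15.y/xfs
  echo 'generic/475 # flaky' > 5.15.y/xfs/unassigned.txt
  git add -A && git commit -qm 'initial expunges'
  git branch -M main )

# A runner repo (standing in for kdevops) pulling the expunges in
# as a subtree under its own directory, per the rule of thumb above.
git init -q runner
cd runner
git config user.email test@example.com
git config user.name test
echo runner-code > README && git add -A && git commit -qm init

git remote add expunges ../expunges-repo
git fetch -q expunges
git read-tree --prefix=workflows/fstests/expunges/ -u expunges/main
git commit -qm 'merge shared expunges as a subtree'
```

After this, the shared expunges live under workflows/fstests/expunges/ in the runner's tree and can be updated or pushed back independently.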
* Re: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)
  2022-06-22  0:07 ` [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1) Luis Chamberlain
@ 2022-06-22 21:44   ` Theodore Ts'o
  2022-06-23  5:31     ` Amir Goldstein
  2022-06-23 21:31     ` Luis Chamberlain
  2022-06-22 21:52   ` [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1) Leah Rumancik
  1 sibling, 2 replies; 17+ messages in thread
From: Theodore Ts'o @ 2022-06-22 21:44 UTC (permalink / raw)
To: Luis Chamberlain
Cc: Leah Rumancik, Amir Goldstein, Josef Bacik, Chuck Lever, chandanrmail, Sweet Tea Dorminy, Pankaj Raghav, linux-xfs, fstests

On Tue, Jun 21, 2022 at 05:07:10PM -0700, Luis Chamberlain wrote:
> On Thu, Jun 16, 2022 at 11:27:41AM -0700, Leah Rumancik wrote:
> > https://gist.github.com/lrumancik/5a9d85d2637f878220224578e173fc23.
>
> The coverage for XFS is using profiles which seem to be inspired by ext4's different mkfs configurations.

That's not correct, actually. It's using the gce-xfstests test framework which is part of the xfstests-bld[1][2] system that I maintain, yes. However, the actual config profiles were obtained via discussions with Darrick and represent the actual configs which the XFS maintainer uses to test the upstream XFS tree before deciding to push to Linus. We figure if it's good enough for the XFS Maintainer, it's good enough for us. :-)

[1] https://thunk.org/gce-xfstests
[2] https://github.com/tytso/xfstests-bld

If you think the XFS Maintainer should be running more configs, I invite you to have that conversation with Darrick.

> GCE is supported as well, as are Azure and OpenStack, and even custom openstack solutions...

The way kdevops works is quite different from how gce-xfstests works, since gce-xfstests is a VM native solution.
Which is to say, when we kick off a test, VM's are launched, one per config, which provides for better parallelization, and then once everything is completed, the VM's are automatically shut down and they go away; so it's far more efficient in terms of using cloud resources. The Lightweight Test Manager will then take the JUnit XML files, plus all of the test artifacts, and these get combined into a single test report. The lightweight test manager runs in a small VM, and this is the only VM which is consuming resources until we ask it to do some work.

For example:

	gce-xfstests ltm -c xfs --repo stable.git --commit v5.18.6 -c xfs/all -g auto

That single command will result in the LTM launching a large builder VM which quickly builds the kernel. (And it uses ccache, and a persistent cache disk, but even if we've never built the kernel, it can complete the build in a few minutes.) Then we launch 12 VM's, one for each config, and since they don't need to be optimized for fast builds, we can run most of the VM's with a smaller amount of memory, to better stress test the file system. (But for the dax config, we'll launch a VM with more memory, since we need to simulate the PMEM device using raw memory.)

Once each VM completes its test run, it uploads its test artifacts and results XML file to Google Cloud Storage. When all of the VM's complete, the LTM VM will download all of the results files from GCS, combine them into a single result file, and then send e-mail with a summary of the results.

It's optimized for developers, and for our use cases. I'm sure kdevops is much more general, since it can work for hardware-based test machines, as well as many other cloud stacks, and it's also optimized for the QA department --- not surprising, given where kdevops came from.

> Also, I see on the above URL you posted there is a TODO in the gist which says, "find a better route for publishing these".
> If you were to use kdevops for this it would have the immediate gain that kdevops users could reproduce your findings and help augment them.

Sure, but with our system, kvm-xfstests and gce-xfstests users can *easily* reproduce our findings and can help augment it. :-)

As far as sharing expunge files, as I've observed before, these files tend to be very specific to the test configuration --- the number of CPU's, the amount of memory, the characteristics of the storage device, etc. So what works for one developer's test setup will not necessarily work for others --- and I'm not convinced that trying to get everyone standardized on the One True Test Setup is actually an advantage. Some people may be using large RAID Arrays; some might be using fast flash; some might be using some kind of emulated log structured block device; some might be using eMMC flash. And that's a *good* thing.

We also have a very different philosophy about how to use expunge files. In particular, if there is a test which is only failing 0.5% of the time, I don't think it makes sense to put that test into an expunge file.

In general, we are only placing tests into expunge files when a test causes the system under test to crash, or it takes *WAAAY* too long, or it's a clear test bug that is too hard to fix for real, so we just suppress the test for that config for now. (Example: tests in xfstests for quota don't understand clustered allocation.)

So we want to run the tests, even if we know they will fail, and have a way of annotating that a test is known to fail for a particular kernel version, or if it's a flaky test, what the expected flake percentage is for that particular test. For flaky tests, we'd like to be able to automatically retry running the test, so we can flag when a flaky test has become a hard failure, or a flaky test has radically changed how often it fails. We haven't implemented all of this yet, but this is a design space we're exploring at the moment.
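[Editor's sketch] The annotate-rather-than-expunge philosophy above could look something like the following; the annotated file format, file names, reasons, and flake rates are entirely hypothetical, not an existing fstests or xfstests-bld feature:

```shell
# Hypothetical annotated format: <test> <reason>[=<flake-rate-%>]
cat > xfs_reflink.annotated <<'EOF'
generic/475 flaky=0.5
generic/019 flaky=12.0
xfs/191 crash
xfs/084 oom
generic/561 too_slow
EOF

# Only hard exclusions (crash/oom/too_slow) become the expunge list
# fed to the runner; flaky entries stay in the run and are retried
# and rate-checked instead of being suppressed.
awk '$2 !~ /^flaky=/ { print $1 }' xfs_reflink.annotated > exclude.list
awk -F'[ =]' '$2 == "flaky" { printf "%s expected %s%%\n", $1, $3 }' \
    xfs_reflink.annotated > retry.list

cat exclude.list retry.list
```

A runner could then diff observed failure rates against the `expected` column to flag a flaky test that has turned into a hard failure.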
More generally, I think competition is a good thing, and for areas where we are still exploring the best way to automate tests, not just from a QA department's perspective, but from a file system developer's perspective, having multiple systems where we can explore these ideas can be a good thing. Cheers, - Ted ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1) 2022-06-22 21:44 ` Theodore Ts'o @ 2022-06-23 5:31 ` Amir Goldstein 2022-06-23 21:39 ` Luis Chamberlain 2022-06-23 21:31 ` Luis Chamberlain 1 sibling, 1 reply; 17+ messages in thread From: Amir Goldstein @ 2022-06-23 5:31 UTC (permalink / raw) To: Theodore Ts'o Cc: Luis Chamberlain, Leah Rumancik, Josef Bacik, Chuck Lever, chandanrmail, Sweet Tea Dorminy, Pankaj Raghav, linux-xfs, fstests > It's optimized for developers, and for our use cases. I'm sure > kdevops is much more general, since it can work for hardware-based > test machines, as well as many other cloud stacks, and it's also > optimized for the QA department --- not surprising, since where > kdevops has come from. > [...] > > We also have a very different philosophy about how to use expunge > files. In paticular, if there is test which is only failing 0.5% of > the time, I don't think it makes sense to put that test into an > expunge file. > > In general, we are only placing tests into expunge files when > it causes the system under test to crash, or it takes *WAAAY* too > long, or it's a clear test bug that is too hard to fix for real, so we > just suppress the test for that config for now. (Example: tests in > xfstests for quota don't understand clustered allocation.) > > So we want to run the tests, even if we know it will fail, and have a > way of annotating that a test is known to fail for a particular kernel > version, or if it's a flaky test, what the expected flake percentage > is for that particular test. For flaky tests, we'd like to be able > automatically retry running the test, and so we can flag when a flaky > test has become a hard failure, or a flaky test has radically changed > how often it fails. We haven't implemented all of this yet, but this > is something that we're exploring the design space at the moment. 
>
> More generally, I think competition is a good thing, and for areas where we are still exploring the best way to automate tests, not just from a QA department's perspective, but from a file system developer's perspective, having multiple systems where we can explore these ideas can be a good thing.

I very much agree with Ted on that point. As a user and big fan of both kdevops and xfstests-bld I wouldn't want to have to choose one over the other, not even to choose a unified expunge list. I think we are still at a point where this diversity makes our ecosystem stronger rather than causing duplicate work.

To put it in more blunt terms, the core test suite, fstests, is not very reliable. Neither kdevops nor xfstests-bld address all the reliability issues (and they contribute some of their own). So we need the community to run both to get better and more reliable filesystem test coverage.

Nevertheless, we should continue to share as much experience and data points as we can during this co-opetition stage in order to improve both systems.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)
  2022-06-23  5:31 ` Amir Goldstein
@ 2022-06-23 21:39   ` Luis Chamberlain
  0 siblings, 0 replies; 17+ messages in thread
From: Luis Chamberlain @ 2022-06-23 21:39 UTC (permalink / raw)
To: Amir Goldstein
Cc: Theodore Ts'o, Leah Rumancik, Josef Bacik, Chuck Lever, chandanrmail, Sweet Tea Dorminy, Pankaj Raghav, linux-xfs, fstests

On Thu, Jun 23, 2022 at 08:31:30AM +0300, Amir Goldstein wrote:
> To put it in more blunt terms, the core test suite, fstests, is not very reliable. Neither kdevops nor xfstests-bld address all the reliability issues (and they contribute some of their own). So we need the community to run both to get better and more reliable filesystem test coverage.

The generic pains with fstests / blktests surely can be shared and perhaps that is just a thing we need to start doing more regularly at LSFMM, more so than a one-off thing.

> Nevertheless, we should continue to share as much experience and data points as we can during this co-opetition stage in order to improve both systems.

Yes, my point was not about killing something off, it was about sharing data points, and I think we should at least share configs. I personally see value in sharing expunges, but indeed if we do we'd have to decide whether to put them up on github with just the expunge list alone, or whether we also want to upload artifacts in the same tree. Or should we dump all the artifacts into a storage pool somewhere.

Some artifacts can grow to insane sizes if a test is bogus, I ran into one once which was at least 2 GiB of output in a *.bad file, the error just repeating over and over. I think IIRC it was for ZNS for btrfs or for a blktests zbd test. We could just have a size limit on these. And if experience is to show us anything, perhaps adopt an epoch scheme if we use git.

  Luis

^ permalink raw reply	[flat|nested] 17+ messages in thread
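[Editor's sketch] The size limit Luis proposes above could be as simple as capping oversized artifacts before they are committed, and keeping a compact summary of the repeating error instead of the raw dump. The 1 MiB limit, file name, and error text below are made up for illustration:

```shell
# Simulate a runaway .bad artifact full of one repeating error line.
yes 'XFS (loop0): metadata I/O error' | head -n 100000 > generic-999.out.bad

# Cap stored artifacts at a size limit (here 1 MiB) before committing.
limit=$((1024 * 1024))
for f in *.out.bad; do
  if [ "$(wc -c < "$f")" -gt "$limit" ]; then
    head -c "$limit" "$f" > "$f.capped" && mv "$f.capped" "$f"
    echo "[truncated at $limit bytes]" >> "$f"
  fi
done

# A frequency summary is often more useful than the raw repetition.
sort generic-999.out.bad | uniq -c | sort -rn | head -5
```

The truncation marker makes it obvious to anyone reading the shared artifact that the original output was larger.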
* Re: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1) 2022-06-22 21:44 ` Theodore Ts'o 2022-06-23 5:31 ` Amir Goldstein @ 2022-06-23 21:31 ` Luis Chamberlain 2022-06-24 5:32 ` Theodore Ts'o 1 sibling, 1 reply; 17+ messages in thread From: Luis Chamberlain @ 2022-06-23 21:31 UTC (permalink / raw) To: Theodore Ts'o, Darrick J. Wong Cc: Leah Rumancik, Amir Goldstein, Josef Bacik, Chuck Lever, chandanrmail, Sweet Tea Dorminy, Pankaj Raghav, linux-xfs, fstests On Wed, Jun 22, 2022 at 05:44:30PM -0400, Theodore Ts'o wrote: > On Tue, Jun 21, 2022 at 05:07:10PM -0700, Luis Chamberlain wrote: > > On Thu, Jun 16, 2022 at 11:27:41AM -0700, Leah Rumancik wrote: > > > https://gist.github.com/lrumancik/5a9d85d2637f878220224578e173fc23. > > > > The coverage for XFS is using profiles which seem to come inspired > > by ext4's different mkfs configurations. > > That's not correct, actually. It's using the gce-xfstests test > framework which is part of the xfstests-bld[1][2] system that I > maintain, yes. However, the actual config profiles were obtained via > discussions from Darrick and represent the actual configs which the > XFS maintainer uses to test the upstream XFS tree before deciding to > push to Linus. We figure if it's good enough for the XFS Maintainer, > it's good enough for us. :-) > > [1] https://thunk.org/gce-xfstests > [2] https://github.com/tytso/xfstests-bld > > If you think the XFS Maintainer should be running more configs, I > invite you to have that conversation with Darrick. Sorry, I did not realize that the test configurations for XFS were already agreed upon with Darrick for stable for the v5.15 effort. 
Darrick, long ago when I started to test xfs for stable I had published what I had suggested and it seemed to cover the grounds back then in 2019:

https://lore.kernel.org/all/20190208194829.GJ11489@garbanzo.do-not-panic.com/T/#m14e299ce476de104f9ee2038b8d002001e579515

If there is something missing from what we use on kdevops for stable consideration I'd like to augment it. Note that kdevops supports many sections and some of them are optional for the distribution, each distribution can opt-in, but likewise we can make sensible defaults for stable kernels, and per release too. The list of configurations supported is:

https://github.com/linux-kdevops/kdevops/blob/master/playbooks/roles/fstests/templates/xfs/xfs.config

For stable today we use all sections except xfs_bigblock and xfs_realtimedev. Do you have any advice on what to stick to for both v5.10 and v5.15 for stable, for both kdevops and gce-xfstests? It would seem just odd if we are not testing the same set of profiles as a minimum requirement. Likewise, the same question applies to Linus' tree and linux-next, as my hope is that in the future we get to the point where kdevops *will* send out notices for newly detected regressions.

> > GCE is supported as well, as are Azure and OpenStack, and even custom openstack solutions...
>
> The way kdevops works is quite different from how gce-xfstests works, since gce-xfstests is a VM native solution.

To be clear, you seem to suggest gce-xfstests is a VM native solution. I'd also like to clarify that kdevops supports native VMs, cloud and baremetal. With kdevops you pick your bringup method.

> Which is to <-- a description of how gce-xfstests works -->

Today all artifacts are gathered by kdevops locally, they are not uploaded anywhere. Consumption of this is yet to be determined, but typically I put the output into a gist manually and then refer to the URL of the gist on the expunge entry. Uploading them can, and probably should, be an option, but it is not clear yet where to upload them to.
A team will soon be looking into doing some more parsing of the results into a pretty flexible form / introspection.

> It's optimized for developers, and for our use cases. I'm sure kdevops is much more general, since it can work for hardware-based test machines, as well as many other cloud stacks, and it's also optimized for the QA department --- not surprising, given where kdevops came from.

kdevops started as an effort for kernel development and filesystems testing. It is why the initial guest configuration was to use 8 GiB of RAM and 4 vcpus; that suffices to do local builds / development. I always did kernel development on guests back in the day and still do to this day. It also has support for email reports and you get the xunit summary output *and* a git diff output of the expunges should a new regression be found. A QA team was never involved other than later learning it existed and that the kernel team was using it to proactively find issues. Later kdevops was used to report bugs proactively as it was finding a lot more issues than typical fstests QA setups find.

> > Also, I see on the above URL you posted there is a TODO in the gist which says, "find a better route for publishing these". If you were to use kdevops for this it would have the immediate gain that kdevops users could reproduce your findings and help augment them.
>
> Sure, but with our system, kvm-xfstests and gce-xfstests users can *easily* reproduce our findings and can help augment it. :-)

Sure, the TODO item on the URL seemed to indicate there was a desire to find a better place to put failures.

> As far as sharing expunge files, as I've observed before, these files tend to be very specific to the test configuration --- the number of CPU's, the amount of memory, the characteristics of the storage device, etc.

And as I noted also at LSFMM, it is not an impossibility to address this either, if we want to.
We can simply use a namespace for the test runner and a generic test configuration. A parent directory simply would represent the test runner. We have two main ones for stable:

  * gce-xfstests
  * kdevops

So they can just be the parent directory. Then I think we can probably agree upon 4 GiB RAM / 4 vcpus per guest on x86_64 for a typical standard requirement. So something like x86_64_mem4g_cpus4. Then there is the drive setup. kdevops defaults to loopback files on nvme drives for both cloud and native KVM guests. So that can be nvme_loopback. It is not clear what gce-xfstests uses, but this can probably be described just as well.

> So what works for one developer's test setup will not necessarily work for others

True, but it does not mean we cannot automate setup of an agreed upon setup. Especially if you want to enable folks to reproduce. We can.

> --- and I'm not convinced that trying to get everyone standardized on the One True Test Setup is actually an advantage.

That is not a goal, the goal is to allow variability! And to share results in the most efficient way. It just turns out that an extremely simple setup we *can* enable *many* folks to set up easily with local VMs to reproduce *more* issues today is nvme drives + loopback files. You are probably correct that this methodology was perhaps not as well tested before as it is today, and this is probably *why* we find more issues today. But so far it is true that:

  * all issues found are real and sometimes hard to reproduce with direct drives
  * this methodology is easy to bring up
  * it is finding more issues

This is why this is just today's default for kdevops. It does not mean you can't *grow* to add support for other drive setups. In fact this is needed for testing ZNS drives.

> Some people may be using large RAID Arrays; some might be using fast flash; some might be using some kind of emulated log structured block device; some might be using eMMC flash. And that's a *good* thing.

Absolutely!
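[Editor's sketch] The runner/config namespace described above could be laid out as plain directories. Every path component below, apart from the two runner names from the email, is a hypothetical convention, not something kdevops or gce-xfstests implements today:

```shell
# Hypothetical shared-expunges namespace:
#   expunges/<runner>/<hw profile>/<drive setup>/<kernel>/<fs>/<section>.txt
for runner in kdevops gce-xfstests; do
  mkdir -p "expunges/$runner/x86_64_mem4g_cpus4/nvme_loopback/5.15.y/xfs"
done

# An example entry contributed by one runner; the test and note are made up.
echo 'xfs/074 # needs more scratch space than this profile provides' \
  > expunges/kdevops/x86_64_mem4g_cpus4/nvme_loopback/5.15.y/xfs/xfs_reflink.txt

find expunges -type f
```

Because each runner writes only under its own parent directory, runners never collide, while the expunges themselves remain browsable side by side.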
> We also have a very different philosophy about how to use expunge files.

Yes, but it does not mean we can't share them. And the variability which exists today *can* also be expressed.

> In particular, if there is a test which is only failing 0.5% of the time, I don't think it makes sense to put that test into an expunge file.

This preference can be expressed through kconfig, and support can be added for it.

> More generally, I think competition is a good thing, and for areas where we are still exploring the best way to automate tests, not just from a QA department's perspective, but from a file system developer's perspective, having multiple systems where we can explore these ideas can be a good thing.

Sure, sure, but again, it does not mean we can't or shouldn't consider sharing some things. Differences in strategy on how to process expunge files can be discussed so that later I can add support for them. I still think we can share at the very least configurations and expunges with known failure rates (even if they are runner/config specific).

  Luis

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)
  2022-06-23 21:31 ` Luis Chamberlain
@ 2022-06-24  5:32   ` Theodore Ts'o
  2022-06-24 22:54     ` Luis Chamberlain
  0 siblings, 1 reply; 17+ messages in thread
From: Theodore Ts'o @ 2022-06-24 5:32 UTC (permalink / raw)
To: Luis Chamberlain
Cc: Darrick J. Wong, Leah Rumancik, Amir Goldstein, Josef Bacik, Chuck Lever, chandanrmail, Sweet Tea Dorminy, Pankaj Raghav, linux-xfs, fstests

On Thu, Jun 23, 2022 at 02:31:12PM -0700, Luis Chamberlain wrote:
> To be clear, you seem to suggest gce-xfstests is a VM native solution. I'd also like to clarify that kdevops supports native VMs, cloud and baremetal. With kdevops you pick your bringup method.

Yes, that was my point. Because gce-xfstests is a VM native solution, it has some advantages, such as the ability to take advantage of the fact that it's trivially easy to start up multiple cloud VM's which can run in parallel --- and then the VM's shut themselves down once they are done running the test, which saves cost and is more efficient.

It is *because* we are a VM-native solution that we can optimize in certain ways, because we don't have to also support a bare metal setup. So yes, the fact that kdevops also supports bare metal is certainly granted. That kind of flexibility is certainly an advantage for kdevops; but being able to fully take advantage of the unique attributes of cloud VM's can also be a good thing.

(I've already made offers to folks working at other cloud vendors that if they are interested in adding support for other cloud systems beyond GCE, I'm happy to work with them to enable the use of other XXX-xfstests test-appliance runners.)

> kdevops started as an effort for kernel development and filesystems testing. It is why the initial guest configuration was to use 8 GiB of RAM and 4 vcpus; that suffices to do local builds / development.
> I always did kernel development on guests back in the day and still do to this day.

For kvm-xfstests, the default RAM size for the VM is 2GB. One of the reasons why I was interested in low-memory configurations is because ext4 is often used in smaller devices (such as embedded systems and mobile handsets) --- and running in memory constrained environments can turn up bugs that otherwise are much harder to reproduce on a system with more memory.

Separating the kernel build system from the test VM's means that the build can take place on a really powerful machine (either my desktop with 48 cores and gobs and gobs of memory, or a build VM if you are using the Lightweight Test Manager's Kernel Compilation Service) so builds go much faster. And then, of course, we can launch a dozen VM's, one for each test config. If you force the build to be done on the test VM, then you either give up parallelism, or you waste time by building the kernel N times on N test VM's.

And in the case of android-xfstests, which communicates with a phone or tablet over a debugging serial cable and Android's fastboot protocol, of *course* it would be insane to want to build the kernel on the system under test!

So I've ***always*** done the kernel build on a machine or VM separate from the System Under Test. At least for my use cases, it just makes a heck of a lot more sense.

And that's fine. I'm *not* trying to convince everyone to standardize on my test infrastructure. Which, quite frankly, I sometimes think you have been evangelizing. I believe very strongly that the choice of test infrastructures is a personal choice, which is heavily dependent on each developer's workflow, and trying to get everyone to standardize on a single test infrastructure is likely going to work as well as trying to get everyone to standardize on a single text editor. (Although obviously emacs is the one true editor.
:-)

> Sure, the TODO item on the URL seemed to indicate there was a desire to find a better place to put failures.

I'm not convinced the "better place" is expunge files. I suspect it may need to be some kind of database. Darrick tells me that he stores his test results in a postgres database. (Which is way better than what I'm doing, which is an mbox file and mail search tools.) Currently, Leah is using flat text files for the XFS 5.15 stable backports effort, plus some tools that parse and analyze those text files.

I'll also note that the number of baseline kernel versions is much smaller if you are primarily testing an enterprise Linux distribution, such as SLES. And if you are working with stable kernels, you can probably get away with updating the baseline for each LTS kernel every so often. But for upstream kernel development the number of kernel versions for which a developer might want to track flaky percentages is far greater, and will need to be updated at least once every kernel development cycle, and possibly more frequently than that. Which is why I'm not entirely sure a flat text file, such as an expunge file, is really the right answer. I can completely understand why Darrick is using a Postgres database. So there is clearly more thought and design required here, in my opinion.

> That is not a goal, the goal is to allow variability! And to share results in the most efficient way.

Sure, but are expunge files the most efficient way to "share results"? If we have a huge amount of variability, such that we have a large number of directories with different test configs and different hardware configs, each with different expunge files, I'm not sure how useful that actually is. Are we expecting users to do a "git clone", and then start browsing all of these different expunge files by hand?
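[Editor's sketch] Tracking flake percentages per kernel version, as discussed above, can be prototyped even with flat files before reaching for Postgres. The per-run log format below is invented for illustration; the point is that a flaky test turning into a hard failure becomes visible as a 100% rate:

```shell
# Hypothetical flat per-run result log: <kernel> <test> <pass|fail>
cat > runs.log <<'EOF'
v5.15.46 generic/475 pass
v5.15.46 generic/475 fail
v5.15.46 generic/475 pass
v5.15.46 generic/475 pass
v5.18.6 generic/475 fail
v5.18.6 generic/475 fail
EOF

# Compute the flake rate per (kernel, test) pair.
awk '{ runs[$1" "$2]++ }
     $3 == "fail" { fails[$1" "$2]++ }
     END { for (k in runs) printf "%s %.0f%%\n", k, 100 * fails[k] / runs[k] }' \
    runs.log | sort > rates.txt

cat rates.txt
```

Here generic/475 flakes at 25% on v5.15.46 but fails every run on v5.18.6, which is exactly the kind of regression signal Ted describes wanting to flag automatically.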
It might perhaps be useful to get a bit more clarity about how we expect the shared results would be used, because that might drive some of the design decisions about the best way to store these "results". Cheers, - Ted ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)
  2022-06-24  5:32 ` Theodore Ts'o
@ 2022-06-24 22:54   ` Luis Chamberlain
  2022-06-25  2:21     ` Theodore Ts'o
  2022-06-25  7:28     ` sharing fstests results (Was: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)) Amir Goldstein
  0 siblings, 2 replies; 17+ messages in thread
From: Luis Chamberlain @ 2022-06-24 22:54 UTC (permalink / raw)
To: Theodore Ts'o
Cc: Darrick J. Wong, Leah Rumancik, Amir Goldstein, Josef Bacik, Chuck Lever, chandanrmail, Sweet Tea Dorminy, Pankaj Raghav, Zorro Lang, linux-xfs, fstests

On Fri, Jun 24, 2022 at 01:32:23AM -0400, Theodore Ts'o wrote:
> On Thu, Jun 23, 2022 at 02:31:12PM -0700, Luis Chamberlain wrote:
> > To be clear, you seem to suggest gce-xfstests is a VM native solution. I'd also like to clarify that kdevops supports native VMs, cloud and baremetal. With kdevops you pick your bringup method.
>
> Yes, that was my point. Because gce-xfstests is a VM native solution, it has some advantages, such as the ability to take advantage of the fact that it's trivially easy to start up multiple cloud VM's which can run in parallel --- and then the VM's shut themselves down once they are done running the test, which saves cost and is more efficient.

Perhaps I am not understanding what you are suggesting with a VM native solution. What do you mean by that? A full KVM VM inside the cloud?

Anyway, kdevops has support to bring up whatever type of node you want in the cloud providers: GCE, AWS, Azure, and OpenStack, and even custom OpenStack solutions. That could be a VM or a high end bare metal node. It does this by using terraform and providing the variability through kconfig. The initial 'make bringup' brings nodes up, and then all work runs on each in parallel for fstests as you run 'make fstests-baseline'. At the end you just run 'make destroy'.
> It is *because* we are a VM-native solution that we can optimize in certain ways, because we don't have to also support a bare metal setup. So yes, the fact that kdevops also supports bare metal is certainly granted. That kind of flexibility is certainly an advantage for kdevops; but being able to fully take advantage of the unique attributes of cloud VM's can also be a good thing.

Yes, agreed. That is why I focused on technology that would support all cloud providers, not just one. I had not touched the AWS code, for example, in 2 years; I just went and tried a bringup and it worked in 10 minutes, most of the time being getting my .aws/credentials file set up with information from the website.

> > kdevops started as an effort for kernel development and filesystems testing. It is why the initial guest configuration was to use 8 GiB of RAM and 4 vcpus; that suffices to do local builds / development. I always did kernel development on guests back in the day and still do to this day.
>
> For kvm-xfstests, the default RAM size for the VM is 2GB. One of the reasons why I was interested in low-memory configurations is because ext4 is often used in smaller devices (such as embedded systems and mobile handsets) --- and running in memory constrained environments can turn up bugs that otherwise are much harder to reproduce on a system with more memory.

Yes, I agree. We started with 8 GiB. Long ago, while at SUSE, I tried 2 GiB and ran into the xfs/074 issue of requiring more due to xfs_scratch. Then later Amir ran into snags with xfs/084 and generic/627 due to the OOMs. So in terms of XFS, to avoid OOMs with just the tests we need 3 GiB.
> Separating the kernel build system from the test VM's means that the build can take place on a really powerful machine (either my desktop with 48 cores and gobs and gobs of memory, or a build VM if you are using the Lightweight Test Manager's Kernel Compilation Service) so builds go much faster. And then, of course, we can launch a dozen VM's, one for each test config. If you force the build to be done on the test VM, then you either give up parallelism, or you waste time by building the kernel N times on N test VM's.

The build is done once, but I agree this can be optimized for kdevops. Right now in kdevops the git clone and build of the kernel do take place on each guest, and that requires at least 3 GiB of RAM. Shallow git clone support was added as an option to help here, but the ideal thing would be to just build locally, or perhaps, as you suggest, a dedicated build VM.

> And in the case of the android-xfstests, which communicates with a phone or tablet over a debugging serial cable and Android's fastboot protocol, of *course* it would be insane to want to build the kernel on the system under test!
>
> So I've ***always*** done the kernel build on a machine or VM separate from the System Under Test. At least for my use cases, it just makes a heck of a lot more sense.

Support for this will be added to kdevops.

> And that's fine. I'm *not* trying to convince everyone that they should standardize on my test infrastructure. Which, quite frankly, I sometimes think you have been evangelizing. I believe very strongly that the choice of test infrastructures is a personal choice, which is heavily dependent on each developer's workflow, and trying to get everyone to standardize on a single test infrastructure is likely going to work as well as trying to get everyone to standardize on a single text editor.

What I think we *should* standardize on is at least configurations for testing.
And now the dialog of how / if we track / share failures is also important. What runner you use is up to you.

> (Although obviously emacs is the one true editor. :-)
>
> > Sure, the TODO item on the URL seemed to indicate there was a desire to find a better place to put failures.
>
> I'm not convinced the "better place" is expunge files. I suspect it may need to be some kind of database. Darrick tells me that he stores his test results in a postgres database. (Which is way better than what I'm doing, which is an mbox file and using mail search tools.)
>
> Currently, Leah is using flat text files for the XFS 5.15 stable backports effort, plus some tools that parse and analyze those text files.

Where does not matter yet; what I'd like to refocus on is *if* sharing is desirable by folks. We can discuss *how* and *where* if we do think it is worth sharing.

If folks would like to evaluate this I'd encourage doing so perhaps after a specific distro release moving forward, and to not backtrack. But for stable kernels I'd imagine it may be easier to see value in sharing.

> I'll also note that the number of baseline kernel versions is much smaller if you are primarily testing an enterprise Linux distribution, such as SLES.

Much smaller than what? Android? If so then perhaps. Just recall that enterprise distributions support kernels for at least 10 years.

> And if you are working with stable kernels, you can probably get away with updating the baseline for each LTS kernel every so often. But for upstream kernel development the number of kernel versions for which a developer might want to track flaky percentages is far greater, and will need to be updated at least once every kernel development cycle, and possibly more frequently than that. Which is why I'm not entirely sure a flat text file, such as an expunge file, is really the right answer. I can completely understand why Darrick is using a Postgres database.
> So there is clearly more thought and design required here, in my opinion.

Sure, let's talk about it, *if* we do find it valuable to share. kdevops already has stuff in a format which is consistent, and that can change or be ported. We first just need to decide if we as a community want to share. The flakiness annotations are important too, and we have a thread about that, which I have to go and get back to at some point.

> > That is not a goal, the goal is to allow variability! And share results in the most efficient way.
>
> Sure, but are expunge files the most efficient way to "share results"?

There are three things we want to do if we are going to talk about sharing results:

a) Consuming expunges so check.sh for the Node Under Test (NUT) can expand on the expunges given a criteria (flakiness, crash requirements)

b) Sharing updates to expunges per kernel / distro / runner / node-config and making patches to this easy.

c) Making updates for failures easy to read for a developer / community. These would be in the form of an email or results file for a test run through some sort of kernel-ci.

Let's start with a): we can adapt runners to use anything. My gut tells me postgres is a bit large unless we need socket communication. I can think of two ways to go here then. Perhaps others have some other ideas?

1) We go lightweight on the db, maybe sqlite3? And embrace the same postgres db schema as used by Darrick, if he sees value in sharing this. If we do this I think it doesn't make sense to *require* sqlite3 on the NUT (nodes), for many reasons, so parsing the db on the host into a flat file to be used by the node does seem ideal.

2) Keep postgres and provide a REST API for queries from the host to this server so it can then construct a flat file / directory interpretation of expunges for the nodes under test (NUT).

Given the minimum requirements desirable on the NUTs, I think in the end a flat file hierarchy is nice so as to not incur some new dependency on them.
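For what it's worth, option 1) could stay this simple on the host side. The sketch below uses a plain text dump as a stand-in for the real sqlite3/postgres schema (the columns here — kernel, fstype, test, crashes — and the file names are invented for illustration), and shows the host flattening it into per-kernel expunge files so the node under test never needs a database:

```shell
# Sketch only: a text dump stands in for the real sqlite3/postgres
# schema; the columns (kernel, fstype, test, crashes) are assumptions.
cat > /tmp/expunge-db.txt <<'EOF'
5.15.49 xfs xfs/074 Y
5.15.49 xfs xfs/084 N
5.18.6 xfs xfs/084 N
EOF

outdir=$(mktemp -d)

# Flatten the "db" into one expunge file per kernel/fstype, so the
# node under test only ever sees a plain file.
awk -v out="$outdir" '{ print $3 > (out "/" $1 "." $2 ".expunged") }' /tmp/expunge-db.txt

cat "$outdir/5.15.49.xfs.expunged"
```

Whatever the backing store ends up being, this kind of export step is what keeps the NUT dependency-free.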
Determinism is important for tests though, so snapshotting an interpretation of the expunges at a specific point in time is also important. So the database would need to be versioned per update, so that a test run is checkpointed against a specific version of the expunge db.

If we come to some sort of consensus then this code for parsing an expunge set can be used directly from fstests's check script, so the interpretation and use can be done in one place for all test runners. We also have additional criteria which we may want for the expunges. For instance, if we had the flakiness percentage annotated somehow, then fstests's check could be passed an argument to only include expunges above a certain flakiness level of some sort, or for example only include expunges for tests which are known to crash. Generating the files from a db is nice. But what gains do we have with using a db then?

Now let's move on to b), sharing the expunges and sending patches for updates. I think sending a patch against a flat file reads a lot easier, except for the comments / flakiness levels / crash consideration / and artifacts. For kdevops' purposes this reads well today as we don't upload artifacts anywhere and just refer to them on github gists as best effort / optional. There is no convention yet for expressing flakiness but some tests do mention "failure rate" in one way or another.

So we want to evaluate if we want to share not only expunges but other metadata associated with why a new test can be expunged or removed:

* flakiness percentage
* causes a kernel crash?
* bogus test?
* expunged due to a slew of other reasons, some of them perhaps categorized and shared, some of them not

And do we want to share artifacts? If so, how? Perhaps an optional URL, with another component describing what it is: gist, tarball, etc.

Then for the last part, c), making failures easy to read for a developer, let's review what could be done. I gather gce-xfstests explains the xunit results summary.
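To make the flakiness criterion concrete: if entries carried a "failure rate" annotation (the comment convention below is an assumption, not an existing fstests format), check could be handed a threshold and derive the effective skip list with nothing fancier than awk:

```shell
# Hypothetical annotated expunge list; the "failure rate: NN%" comment
# convention is illustrative, not an existing fstests standard.
cat > /tmp/expunges.txt <<'EOF'
xfs/074 # failure rate: 100% needs a larger scratch device
xfs/084 # failure rate: 20% OOMs on low-memory nodes
generic/627 # failure rate: 5% rare OOM
EOF

min_rate=10   # only skip tests failing at least 10% of the time

awk -v min="$min_rate" '
	match($0, /failure rate: *[0-9]+%/) {
		rate = substr($0, RSTART, RLENGTH)
		gsub(/[^0-9]/, "", rate)       # keep just the number
		if (rate + 0 >= min)
			print $1               # emit only the test name
	}' /tmp/expunges.txt
```

With the sample data above this emits xfs/074 and xfs/084 and drops the rarely-failing generic/627, which is the sort of policy knob a structured annotation would enable.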
Right now kdevops' kernel-ci stuff just sends an email with the same, but also a diff to the expunge file hierarchy augmented for the target kernel directory being tested. The developer would just go and edit the line with metadata as a comment, but that is just because we lack a structure for it. If we strive to share an expunge list I think it would be wise to consider structure for this metadata. Perhaps:

<test> # <crashes>|<flakiness-percent>|<fs-skip-reason>|<artifact-type>|<artifact-dir-url>|<comments>

Where:

test: xfs/123 or btrfs/234
crashes: can be either Y or N
flakiness-percent: expressed as a percentage, e.g. 80%
fs-skip-reason: can be an enum to represent a series of fs specific reasons why a test may not be applicable or should be skipped
artifact-type: optional, if present the type of artifact, can be an enum to represent a gist test description, or a tarball
artifact-dir-url: optional, path to the artifact
comments: additional comments

All the above considered, a), b) and c), yes I think a flat file model works well as an option. I'd love to hear others' feedback.

> If we have a huge amount of variability, such that we have a large number of directories with different test configs and different hardware configs, each with different expunge files, I'm not sure how useful that actually is.

*If* you want to share I think it would be useful. At least kdevops uses a flat file model with no artifacts, just the expunges and comments, and over time it has been very useful, even for reviewing historic issues on older kernels: simply using something like 'git grep xfs/123' gives me a quick sense of the history of issues for a test.

> Are we expecting users to do a "git clone", and then start browsing all of these different expunge files by hand?
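Entries in the shape proposed above would not have to be browsed by hand; a line splits mechanically on "#" and "|". A sketch in plain sh (every field value below is invented):

```shell
# One hypothetical entry in the proposed format; all values invented.
line='xfs/123 # N|80%|none|gist|https://example.com/artifact|flaky on small nodes'

test_name=${line%% *}    # everything before the first space
meta=${line#*# }         # everything after "# "

# Split the pipe-separated metadata into the proposed fields; the
# last variable (comments) soaks up any remaining text.
IFS='|' read -r crashes flakiness skip_reason artifact_type artifact_url comments <<EOF
$meta
EOF

echo "test=$test_name crashes=$crashes flakiness=$flakiness artifact=$artifact_type"
```

So tooling (or check itself) could filter, sort, or report on any of these fields without a database in the loop.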
If we want to extend the fstests check script to look for this, it could be an optional directory, and an argument could be passed to check to enable its hunt for it, so that if passed it would look for the runner / kernel / host-type. For instance, today we already have a function on initialization of the check script which looks for the fstests config file as follows:

known_hosts()
{
	[ "$HOST_CONFIG_DIR" ] || HOST_CONFIG_DIR=`pwd`/configs

	[ -f /etc/xfsqa.config ] && export HOST_OPTIONS=/etc/xfsqa.config
	[ -f $HOST_CONFIG_DIR/$HOST ] && export HOST_OPTIONS=$HOST_CONFIG_DIR/$HOST
	[ -f $HOST_CONFIG_DIR/$HOST.config ] && export HOST_OPTIONS=$HOST_CONFIG_DIR/$HOST.config
}

We could have something similar look for an expunge directory, with say an --expunge-auto-look option, and that could be something like:

process_expunge_dir()
{
	[ "$HOST_EXPUNGE_DIR" ] || HOST_EXPUNGE_DIR=`pwd`/expunges

	[ -d /etc/fstests/expunges/$HOST ] && export HOST_EXPUNGES=/etc/fstests/expunges/$HOST
	[ -d $HOST_EXPUNGE_DIR/$HOST ] && export HOST_EXPUNGES=$HOST_EXPUNGE_DIR/$HOST
}

The runner could be specified, and the host-type:

./check --runner <gce-xfstests|kdevops|whatever> --host-type <kvm-8vcpus-2gb>

And so we can have it look for these directories, and if any of these are present they are processed (cumulative):

* HOST_EXPUNGES/any/$fstype/ - regardless of kernel, host type and runner
* HOST_EXPUNGES/$kernel/$fstype/any - common between runners for any host type
* HOST_EXPUNGES/$kernel/$fstype/$hostype - common between runners for a host type
* HOST_EXPUNGES/$kernel/$fstype/$hostype/$runner - only present for the runner

The aggregate set of expunges is used. Additional criteria could be passed to check so as to ensure that only expunges that meet the criteria are used to skip tests for the run, provided we can agree on some metadata for that.
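The cumulative lookup described above can be mocked up end to end; the directory names, the file name ("expunged") and the contents below are all illustrative, not from an existing layout:

```shell
# Mock up the proposed hierarchy; names and contents are invented.
base=$(mktemp -d)
kernel=5.15.49 fstype=xfs hostype=kvm-8vcpus-2gb runner=kdevops

mkdir -p "$base/any/$fstype" \
	"$base/$kernel/$fstype/any" \
	"$base/$kernel/$fstype/$hostype/$runner"

echo generic/019 > "$base/any/$fstype/expunged"
echo xfs/074     > "$base/$kernel/$fstype/any/expunged"
echo xfs/084     > "$base/$kernel/$fstype/$hostype/$runner/expunged"

# The aggregate set is the union of all matching levels, deduplicated.
cat "$base/any/$fstype/expunged" \
	"$base/$kernel/$fstype/any/expunged" \
	"$base/$kernel/$fstype/$hostype/$runner/expunged" 2>/dev/null | sort -u
```

The union-of-levels semantics is the key design point: a more specific directory can only add expunges, never silently drop the common ones.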
> It might perhaps be useful to get a bit more clarity about how we expect the shared results would be used, because that might drive some of the design decisions about the best way to store these "results".

Sure.

  Luis

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)
  2022-06-24 22:54 ` Luis Chamberlain
@ 2022-06-25 2:21 ` Theodore Ts'o
  2022-06-25 18:49 ` Luis Chamberlain
  2022-06-25 7:28 ` sharing fstests results (Was: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)) Amir Goldstein
  1 sibling, 1 reply; 17+ messages in thread
From: Theodore Ts'o @ 2022-06-25 2:21 UTC (permalink / raw)
To: Luis Chamberlain
Cc: Darrick J. Wong, Leah Rumancik, Amir Goldstein, Josef Bacik,
	Chuck Lever, chandanrmail, Sweet Tea Dorminy, Pankaj Raghav,
	Zorro Lang, linux-xfs, fstests

On Fri, Jun 24, 2022 at 03:54:44PM -0700, Luis Chamberlain wrote:
>
> Perhaps I am not understanding what you are suggesting with a VM native solution. What do you mean by that? A full KVM VM inside the cloud?

"Cloud native" is the better way to put things. Cloud VM's are designed to be ephemeral, so the concept of "node bringup" really doesn't enter into the picture.

When I run the "build-appliance" command, this creates a test appliance image. Which is to say, we create a root file system image, and then "freeze" it into a VM image. For kvm-xfstests this is a qcow image which is run in snapshot mode, which means that if any changes are made to the root file system, those changes disappear when the VM exits. For gce-xfstests, we create an image which can be used to quickly bring up a VM which contains a block device with a copy of that image as the root file system.

What's so special about this? I can create a dozen, or a hundred VM's, all with a copy of that same image. So I can do something like

gce-xfstests ltm -c ext4/all -g full gs://gce-xfstests/bzImage-5.4.200

and this will launch a dozen VM's, with each VM testing a single test configuration with the kernel found at gs://gce-xfstests/bzImage-5.4.200 in Google Cloud Storage, GCS (the rough equivalent of AWS's S3).
And then I can run

gce-xfstests ltm -c ext4/all -g full --repo stable.git --commit v5.10.124

And this will launch a build VM which is nice and powerful to *quickly* build the 5.10.124 kernel as found in the stable git tree, and then launch a dozen additional VM's to test that built kernel against all of the test configs defined for ext4/all, one VM for each fs config. And after running

gce-xfstests ltm -c ext4/all -g full --repo stable.git --commit v5.15.49
gce-xfstests ltm -c ext4/all -g full --repo stable.git --commit v5.18.6
...

now there will be ~50 VM's all running tests in parallel. So this is far faster than doing a "node bringup", and since I am running all of the tests in parallel, I will get the test results back in a much shorter amount of wall clock time. And as each test config's run completes, the VM's will disappear (after first uploading the test results into GCS), and I will stop getting charged for them.

And if I were to launch additional test runs, each containing their own set of VM's:

gce-xfstests ltm -c xfs/all -g full --repo stable.git --commit v5.15.49
gce-xfstests ltm -c xfs/all -g full --repo stable.git --commit v5.18.6
gce-xfstests ltm -c f2fs/all -g full --repo stable.git --commit v5.15.49

I can very quickly have over 100 test VM's running in parallel, and as the tests complete, they are automatically shut down and destroyed --- which means that we don't store state in the VM. Instead the state is stored in a Google Cloud Storage (Amazon S3) bucket, with e-mail sent with a summary of the results.

VM's can get started much more quickly than "make bringup", since we're not running puppet or ansible to configure each node. Instead, we get a clone of the test appliance:

% gce-xfstests describe-image
archiveSizeBytes: '1315142848'
creationTimestamp: '2022-06-20T21:46:24.797-07:00'
description: Linux Kernel File System Test Appliance
diskSizeGb: '10'
family: xfstests
...
labels:
  blktests: gaf97b55
  fio: fio-3_30
  fsverity: v1_5
  ima-evm-utils: v1_3_2
  nvme-cli: v1_16
  quota: v4_05-43-gd2256ac
  util-linux: v2_38
  xfsprogs: v5_18_0
  xfstests: v2022_06_05-13-gbc442c4b
  xfstests-bld: g8548bd11
  zz_build-distro: bullseye
...

And since these images are cheap to keep around (5-6 cents/month), I can keep a bunch of older versions of test appliances around, in case I want to see if a test regression might be caused by a newer version of the test appliance. So I can run "gce-xfstests -I xfstests-202001021302" and this will create a VM using the test appliance that I built on January 2, 2020. It also means that I can release a new test appliance to the xfstests-cloud project for public use, and if someone wants to pin their testing to a known version of the test appliance, they can do that.

So the test appliance VM's can be much more dynamic than kdevops nodes, because they can be created and deleted without a care in the world. This is enabled by the fact that there isn't any state which is stored on the VM.

In contrast, in order to harvest test results from a kdevops node, you have to ssh into the node and try to find the test results. I can instead just run "gce-xfstests ls-results" to see all of the results that have been saved to GCS, and I can fetch a particular test result to my laptop via a single command: "gce-xfstests get-results tytso-20220624210238". No need to ssh to a host node, and then ssh to the kdevops test node, yadda, yadda, yadda --- and if you run "make destroy" you lose all of the test result history on that node, right?

Speaking of saving the test result history, a full set of test results/artifacts for a dozen ext4 configs is around 12MB for the tar.xz file, and Google Cloud Storage is a penny/GB/month for nearline storage, and 0.4 cents/GB/month for coldline storage, so I can afford to keep a *lot* of test results/artifacts for quite a while, which can occasionally be handy for doing some historic research.
See the difference?

	- Ted

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)
  2022-06-25 2:21 ` Theodore Ts'o
@ 2022-06-25 18:49 ` Luis Chamberlain
  2022-06-25 21:14 ` Theodore Ts'o
  0 siblings, 1 reply; 17+ messages in thread
From: Luis Chamberlain @ 2022-06-25 18:49 UTC (permalink / raw)
To: Theodore Ts'o
Cc: Darrick J. Wong, Leah Rumancik, Amir Goldstein, Josef Bacik,
	Chuck Lever, chandanrmail, Sweet Tea Dorminy, Pankaj Raghav,
	Zorro Lang, linux-xfs, fstests

On Fri, Jun 24, 2022 at 10:21:50PM -0400, Theodore Ts'o wrote:
> On Fri, Jun 24, 2022 at 03:54:44PM -0700, Luis Chamberlain wrote:
> >
> > Perhaps I am not understanding what you are suggesting with a VM native solution. What do you mean by that? A full KVM VM inside the cloud?
>
> "Cloud native" is the better way to put things. Cloud VM's are designed to be ephemeral, so the concept of "node bringup" really doesn't enter into the picture.
>
> When I run the "build-appliance" command, this creates a test appliance image. Which is to say, we create a root file system image, and then "freeze" it into a VM image.

So this seems to build an image from a base distro image, is that right? And it would seem your goal is to then store that image so it can be re-used.

> For kvm-xfstests this is a qcow image which is run in snapshot mode, which means that if any changes are made to the root file system, those changes disappear when the VM exits.

Sure, so you build the image once and use it from there on, makes sense.

You are optimizing usage for GCE. That makes sense. The goal behind kdevops was to use technology which can *enable* any optimizations in a cloud agnostic way. What APIs become public is up to the cloud provider, and one cloud agnostic way to manage cloud solutions using open source tools is with terraform, and so that is used today. If an API is not yet available through terraform, kdevops could simply use whatever cloud tooling is needed for additional hooks.
But having the ability to ramp up regardless of cloud provider was extremely important to me from the beginning. Optimizing is certainly possible, always :)

Likewise, if you are using local virtualization, we can save vagrant images in the Vagrant Cloud, if we wanted, which would allow pre-built setups to be saved:

https://app.vagrantup.com/boxes/search

That could reduce bringup time for local KVM / VirtualBox guests. In fact, since vagrant images are also just tarballs with qcow2 files, I do wonder if they can also be leveraged for cloud deployments. Or if the inverse is true, whether your qcow2 images can be used for vagrant purposes as well. If you're curious:

https://github.com/linux-kdevops/kdevops/blob/master/docs/custom-vagrant-boxes.md

What approach you use is up to you. From a Linux distribution perspective, being able to do reproducible builds was important too, and so that is why a lot of effort was put into ensuring that how you cook up a final state from an initial distro release is supported.

> I can very quickly have over 100 test VM's running in parallel, and as the tests complete, they are automatically shutdown and destroyed ---- which means that we don't store state in the VM. Instead the state is stored in a Google Cloud Storage (Amazon S3) bucket, with e-mail sent with a summary of results.

Using cloud object storage is certainly nice if you can afford it. I think it is valuable, but likewise it should be optional. And so with kdevops support is welcomed should someone want to do that. So what you describe is not impossible with kdevops, it is just not done today, but could be enabled.

> VM's can get started much more quickly than "make bringup", since we're not running puppet or ansible to configure each node.

You can easily just use pre-built images as well, instead of doing the build from a base distro release, just as you could use custom vagrant images for local KVM guests.
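To back up the claim about the box format: a libvirt vagrant .box really is just a gzipped tarball, typically carrying the qcow2 disk plus some metadata. A sketch that fakes one up to show the structure (the file names are typical but illustrative, and the disk image here is an empty stand-in):

```shell
work=$(mktemp -d)
cd "$work" || exit 1

# A libvirt .box typically contains the disk image plus metadata;
# box.img here is an empty stand-in for a real qcow2 file.
: > box.img
printf '{"provider": "libvirt", "format": "qcow2"}\n' > metadata.json

tar czf demo.box box.img metadata.json

# Listing the "box" shows there is no magic beyond tar + qcow2.
tar tzf demo.box
```

Which is what makes the idea of repacking the same qcow2 for both vagrant and cloud image uploads at least plausible.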
The usage of ansible to *build* and install fstests can be done once too, and that image saved, exported, etc., and then re-used. The kernel config I maintain in kdevops has been tested to work on local KVM virtualization setups, but also on all supported cloud providers as well.

So I think there is certainly value in learning from the ways you optimize cloud usage for GCE and generalizing that for *any* cloud provider. The steps to *build* an image from a base distro release are glossed over, but that alone takes effort, and it is made pretty well distro agnostic within kdevops too.

> In contrast, I can just run "gce-xfstests ls-results" to see all of the results that have been saved to GCS, and I can fetch a particular test result to my laptop via a single command: "gce-xfstests get-results tytso-20220624210238". No need to ssh to a host node, and then ssh to the kdevops test node, yadda, yadda, yadda --- and if you run "make destroy" you lose all of the test result history on that node, right?

Actually all the *.bad and *.dmesg files, as well as the final xunit results, for all nodes and failed tests are copied over locally to the host which is running kdevops. Xunit files are also merged to represent a final full set of results too. So no, they are not destroyed. If you wanted to keep all files even for non-failed tests we can add that as a new Kconfig bool.

Support for stashing results into object storage sure would be nice, agreed.

> See the difference?

Yes, you have optimized usage of GCE. Good stuff, lots to learn from that effort! Thanks for sharing the details!

Luis

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)
  2022-06-25 18:49 ` Luis Chamberlain
@ 2022-06-25 21:14 ` Theodore Ts'o
  2022-07-01 23:08 ` Luis Chamberlain
  0 siblings, 1 reply; 17+ messages in thread
From: Theodore Ts'o @ 2022-06-25 21:14 UTC (permalink / raw)
To: Luis Chamberlain
Cc: Darrick J. Wong, Leah Rumancik, Amir Goldstein, Josef Bacik,
	Chuck Lever, chandanrmail, Sweet Tea Dorminy, Pankaj Raghav,
	Zorro Lang, linux-xfs, fstests

On Sat, Jun 25, 2022 at 11:49:54AM -0700, Luis Chamberlain wrote:
> You are optimizing usage for GCE. That makes sense.

This particular usage model is not unique to GCE. A very similar thing can be done using Microsoft Azure, Amazon Web Services and Oracle Cloud Services. And I've talked to some folks who might be interested in taking the Test Appliance that is currently built for use with KVM, Android, and GCE, and extending it to support other Cloud infrastructures. So the concept of these optimizations is not unique to GCE, which is why I've been calling this approach "cloud native".

Perhaps one other difference is that I make the test appliance images available, so people don't *have* to build them from scratch. They can just download the qcow2 image from:

https://www.kernel.org/pub/linux/kernel/people/tytso/kvm-xfstests

And for GCE, there is the public image project, xfstests-cloud, just like there are public images for debian in the debian-cloud project, for Fedora in the fedora-cloud project, etc.
One of the things which I am trying to do is to make the "out of box" experience as simple as possible, which means I don't want to force users to build the test appliance or run "make bringup" if they don't have to. Of course, someone who is doing xfstests development will need to learn how to build their own test appliance. But for someone who is just getting started, the goal is to make the learning curve as flat as possible.

One of the other things that was an important design principle for me was that I didn't want to require that the VM's have networking access, nor did I want to require users to run random scripts via sudo or as root. (Some of this was because of corporate security requirements at the time.) This also had the benefit that I'm not asking the user to set up ssh keys if they are using kvm-xfstests, but instead relying on the serial console.

> The goal behind kdevops was to use technology which can *enable* any optimizations in a cloud agnostic way.

Fair enough. My goal for kvm-xfstests and gce-xfstests was to make developer velocity the primary goal. Portability to different cloud systems took a back seat. I don't apologize for this, since over the many years that I've been personally using {kvm,gce}-xfstests, the fact that I can use my native kernel development environment, and have the test environment pluck the kernel straight out of my build tree, has paid for itself many times over.

If I had to push test/debug kernel code to a public git tree just so the test VM can pull down the code and build it in the test VM a second time --- I'd say, "no thank you, absolutely not." Having to do this would slow me down, and as I said, developer velocity is king. I want to be able to save a patch from my mail user agent, apply the patch, and then give the code a test, *without* having to interact with a public git tree.

Maybe you can do that with kdevops --- but it's not at all obvious how.
With kvm-xfstests, I have a quickstart doc which gives instructions, and then it's just a matter of running the command "kvm-xfstests smoke" or "kvm-xfstests shell" from the developer's kernel tree. No muss, no fuss, no dirty dishes....

> In fact since vagrant images are also just tarballs with qcow2 files, I do wonder if they can also be leveraged for cloud deployments. Or if the inverse is true, if your qcow2 images can be used for vagrant purposes as well.

Well, my qcow2 images don't come with ssh keys, since they are optimized to be launched from the kvm-xfstests script, where the tests to be run are passed in via the boot command line:

% kvm-xfstests smoke --no-action
Detected kbuild config; using /build/ext4-4.14 for kernel
Using kernel /build/ext4-4.14/arch/x86/boot/bzImage
Networking disabled.
Would execute:
     ionice -n 5 /usr/bin/kvm -boot order=c -net none -machine type=pc,accel=kvm:tcg \
	-cpu host -drive file=/usr/projects/xfstests-bld/build-64/test-appliance/root_fs.img,if=virtio,snapshot=on \
	....
	-gdb tcp:localhost:7499 --kernel /build/ext4-4.14/arch/x86/boot/bzImage \
	--append "quiet loglevel=0 root=/dev/vda console=ttyS0,115200 nokaslr fstestcfg=4k fstestset=-g,quick fstestopt=aex fstesttz=America/New_York fstesttyp=ext4 fstestapi=1.5 orig_cmdline=c21va2UgLS1uby1hY3Rpb24="

The boot command line options "fstestcfg=4k", "fstestset=-g,quick", "fstesttyp=ext4", etc. are how the test appliance knows which tests to run. So that means *all* the developer needs to do is to type the command "kvm-xfstests smoke".

(By the way, it's a simple config option in ~/.config/kvm-xfstests if you are a btrfs or xfs developer and you want the default file system type to be btrfs or xfs. Of course you can explicitly specify a test config if you are an ext4 developer and you want to test how a test runs on xfs: "kvm-xfstests -c xfs/4k generic/223".)

There's no need to set up ssh keys, push the kernel to a public git tree, ssh into the test VM, yadda, yadda, yadda.
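The fstest*= convention above implies the appliance simply walks the kernel command line at boot. The real xfstests-bld init scripts are not shown here; this is only a sketch of the idea, with the cmdline hardcoded instead of read from /proc/cmdline:

```shell
# Sketch: a hardcoded stand-in for /proc/cmdline; the real test
# appliance init scripts in xfstests-bld differ.
cmdline='quiet loglevel=0 root=/dev/vda console=ttyS0,115200 fstestcfg=4k fstestset=-g,quick fstesttyp=ext4'

for word in $cmdline; do
	case "$word" in
	fstestcfg=*) cfg=${word#fstestcfg=} ;;
	fstestset=*) testset=${word#fstestset=} ;;
	fstesttyp=*) fstyp=${word#fstesttyp=} ;;
	esac
done

# Turn the comma-separated test set back into check arguments.
check_args=$(printf %s "$testset" | tr ',' ' ')

echo "FSTYP=$fstyp cfg=$cfg check $check_args"
```

Passing everything on the cmdline is what lets the appliance run with no networking and no ssh keys at all.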
Just one single command line and you're *done*. This is what I meant by the fact that kvm-xfstests is optimized for a file system developer's workflow, which I claim is very different from what a QA department might want.

I added that capability to gce-xfstests later, but it's very separate from the very simple command lines for a file system developer. If I want the lightweight test manager to watch a git tree, kick off a build whenever a branch changes, and then run a set of tests, I can do that, but that's a *very* different command and a very different use case, and I've optimized for that separately:

gce-xfstests ltm -c ext4/all -g auto --repo ext4.dev --watch dev

This is what I call the QA department's workflow. Which is also totally valid. But I believe in optimizing for each workflow separately, and being somewhat opinionated in my choices. For example, the test appliance uses Debian. Period. And that's because I didn't see the point of investing time in making that flexible. My test infrastructure is optimized for a ***kernel*** developer, and from that perspective, the distro for the test environment is totally irrelevant.

I understand that if you are working for SuSE, then maybe you would want to insist on a test environment based on OpenSuSE, or if you're working for Red Hat, you'd want to use Fedora. If so, then kvm-xfstests is not for you. I'd much rather optimize for a *kernel* developer, not a Linux distribution's QA department. They can use kdevops if they want, for that use case. :-)

Cheers,

	- Ted

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)
  2022-06-25 21:14 ` Theodore Ts'o
@ 2022-07-01 23:08 ` Luis Chamberlain
  0 siblings, 0 replies; 17+ messages in thread
From: Luis Chamberlain @ 2022-07-01 23:08 UTC (permalink / raw)
To: Theodore Ts'o
Cc: Darrick J. Wong, Leah Rumancik, Amir Goldstein, Josef Bacik,
	Chuck Lever, chandanrmail, Sweet Tea Dorminy, Pankaj Raghav,
	Zorro Lang, linux-xfs, fstests

On Sat, Jun 25, 2022 at 05:14:17PM -0400, Theodore Ts'o wrote:
> On Sat, Jun 25, 2022 at 11:49:54AM -0700, Luis Chamberlain wrote:
> > You are optimizing usage for GCE. That makes sense.
>
> This particular usage model is not unique to GCE. A very similar thing can be done using Microsoft Azure, Amazon Web Services and Oracle Cloud Services. And I've talked to some folks who might be interested in taking the Test Appliance that is currently built for use with KVM, Android, and GCE, and extending it to support other Cloud infrastructures. So the concept of these optimizations is not unique to GCE, which is why I've been calling this approach "cloud native".

I think we have similar goals. I'd like to eventually generalize what you have done for enablement through *any* cloud. And I suspect this may be useful beyond kernel development too, so there is value in that for other things.

> Perhaps one other difference is that I make the test appliance images available, so people don't *have* to build them from scratch. They can just download the qcow2 image from:
>
> https://www.kernel.org/pub/linux/kernel/people/tytso/kvm-xfstests

It may make sense for us to consider containers for some of this. If a distro doesn't have one, for example, well then we just have to do the build-it-all step.

> And for GCE, there is the public image project, xfstests-cloud, just like there are public images for debian in the debian-cloud project, for Fedora in the fedora-cloud project, etc.
Of course, for full GPL > compliance, how to build these images from source is fully available, > which is why the images are carefully tagged so all of the git commit > versions and the automated scripts used to build the image are fully > available for anyone who wants to replicate the build. *BUT*, they > don't have to build the test environment if they are just getting > started. > > One of the things which I am trying to do is to make the "out of box" > experience as simple as possible, which means I don't want to force > users to build the test appliance or run "make bringup" if they don't > have to. You are misunderstanding the goal with 'make bringup': if you already have pre-built images you can use them and you have less to do. You *don't* have to run 'make fstests' if you already have that set up. 'make bringup' just abstracts general initial stage nodes, whether on cloud or local virt. 'make linux' however does get / build / install linux. And for local virtualization, where vagrant images are used, one could enhance these further too. They are just compressed tarballs with a qcow2 file, at least when libvirt is used. Since kdevops works off of these, you can then also use pre-built images with all kernels/modules needed and even binaries. I've extended docs recently to help folks who wish to optimize on that front: https://github.com/linux-kdevops/kdevops/blob/master/docs/custom-vagrant-boxes.md Each stage has its own reproducible builds aspect to it. So if one *had* these enhanced vagrant images with kernels, one could just skip the build stage and jump straight to testing after bringup. I do wonder if we could share similar qcow2 images for cloud testing too and for vagrant. If we could... there is a pretty big win. > Of course, someone who is doing xfstests development will need to > learn how to build their own test appliance. But for someone who is > just getting started, the goal is to make the learning curve as flat > as possible. Yup. 
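The vagrant box format mentioned above — a compressed tarball carrying a qcow2 disk plus a little metadata — can be sketched as follows. The disk is faked with an empty file purely to show the packaging; the file names and metadata fields are illustrative, not a kdevops or vagrant-libvirt interface guarantee.

```shell
# Sketch: package a minimal libvirt-style vagrant box. The disk image
# is an empty stand-in here; a real box would carry a full qcow2 image.
workdir=$(mktemp -d)
cd "$workdir"
: > box.img        # stand-in for the real qcow2 disk image
printf '{"provider": "libvirt", "format": "qcow2", "virtual_size": 32}\n' \
        > metadata.json
tar czf custom.box metadata.json box.img
tar tzf custom.box        # lists the two members of the box tarball
```

Since the box is just a tarball, swapping in a qcow2 that already has a kernel and modules installed is what the "enhanced vagrant images" idea above amounts to.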
> One of the other things that was an important design principle for me > was I didn't want to require that the VM's have networking access, nor > did I want to require users to have to run random scripts > via sudo or as root. (Some of this was because of corporate security > requirements at the time.) This also had the benefit that I'm not > asking the user to set up ssh keys if they are using kvm-xfstests, but > instead rely on the serial console. Philosophy. > > The goal behind kdevops was to use technology which can *enable* any > > optimizations in a cloud agnostic way. > > Fair enough. My goal for kvm-xfstests and gce-xfstests was to make > developer velocity the primary goal. Portability to different cloud > systems took a back seat. I don't apologize for this, since over the > many years that I've been personally using {kvm,gce}-xfstests, the > fact that I can use my native kernel development environment, and have > the test environment pluck the kernel straight out of my build tree, > has paid for itself many times over. Yes I realize that. No one typically has time to do that. Which is why, when I had requirements from a prior $employer to make the tech cloud agnostic, I decided it was tech best shared. It was not easy. > If I had to push test/debug kernel code to a public git tree just so > the test VM can pull down the code and build it in the test VM a > second time --- I'd say, "no thank you, absolutely not." Having to do > this would slow me down, and as I said, developer velocity is king. I > want to be able to save a patch from my mail user agent, apply the > patch, and then give the code a test, *without* having to interact > with a public git tree. Every developer may have a different way to work and do Linux kernel development. > Maybe you can do that with kdevops --- but it's not at all obvious > how. The above just explained what you *don't* want to do, not what you want. 
But you explained to me in private a while ago that you expect to get local test builds to the guest fast. I think you're just missing that the goal is to support variability and enable that variability. If such variability is not supported then it's just a matter of adding a few kconfig options and then adding support for it. So yes it's possible, and it's a matter of taking a bit of time to support that workflow. My kdev workflow was to just work with large guests before, and use 'localmodconfig' kernels which are very small, and so build time is fast, especially after the first build. The other workflow I then supported was the distro world one where we tested a "kernel of the day" which is a kernel on a repo somewhere. So upgrading is just ensuring you have a repo and `zypper in` the kernel, reboot and test. To support the workflow you have, I'd like to evaluate both a local virt solution and cloud (for any cloud vendor). For local virt using 9p seems to make sense. For cloud, not so sure. I think we really digress from the subject at hand though. This conversation is useful but it really is just noise to a lot of people. Luis ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: sharing fstests results (Was: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)) 2022-06-24 22:54 ` Luis Chamberlain 2022-06-25 2:21 ` Theodore Ts'o @ 2022-06-25 7:28 ` Amir Goldstein 2022-06-25 19:35 ` Luis Chamberlain 1 sibling, 1 reply; 17+ messages in thread From: Amir Goldstein @ 2022-06-25 7:28 UTC (permalink / raw) To: Luis Chamberlain Cc: Theodore Ts'o, Darrick J. Wong, Leah Rumancik, Josef Bacik, Chuck Lever, chandanrmail, Sweet Tea Dorminy, Pankaj Raghav, Zorro Lang, linux-xfs, fstests [Subject change was long due...] On Sat, Jun 25, 2022 at 1:54 AM Luis Chamberlain <mcgrof@kernel.org> wrote: > > On Fri, Jun 24, 2022 at 01:32:23AM -0400, Theodore Ts'o wrote: [...] > > > > > Sure, the TODO item on the URL seemed to indicate there was a desire to > > > find a better place to put failures. > > > > I'm not convinced the "better place" is expunge files. I suspect it > > may need to be some kind of database. Darrick tells me that he stores > > his test results in a postgres database. (Which is way better than > > what I'm doing which is an mbox file and using mail search tools.) > > > > Currently, Leah is using flat text files for the XFS 5.15 stable > > backports effort, plus some tools that parse and analyze those text > > files. > > Where does not matter yet, what I'd like to refocus on is *if* sharing > is desirable by folks. We can discuss *how* and *where* if we do think > it is worth to share. > > If folks would like to evaluate this I'd encourage to do so perhaps > after a specific distro release moving forward, and to not backtrack. > > But for stable kernels I'd imagine it may be easier to see value in > sharing. > > > I'll also note that the number of baseline kernel versions is much > > smaller if you are primarily testing an enterprise Linux distribution, > > such as SLES. > > Much smaller than what? Android? If so then perhaps. Just recall that > Enterprise supports kernels for at least 10 years. 
> > And if you are working with stable kernels, you can > probably get away with having to update the baseline for each LTS > kernel every so often. But for upstream kernel development the > number of kernel versions for which a developer might want to track > flaky percentages is far greater, and will need to be updated at > least once every kernel development cycle, and possibly more > frequently than that. Which is why I'm not entirely sure a flat text > file, such as an expunge file, is really the right answer. I can > completely understand why Darrick is using a Postgres database. > > So there is clearly more thought and design required here, in my > opinion. > > Sure, let's talk about it, *if* we do find it valuable to share. > kdevops already has stuff in a format which is consistent, that > can change or be ported. We first just need to decide if we > as a community want to share. > > The flakiness annotations are important too, and we have a thread > about that, which I have to go and get back to at some point. > > > That is not a goal, the goal is to allow variability! And share results > > > in the most efficient way. > > > > Sure, but are expunge files the most efficient way to "share results"? > > There are three things we want to do if we are going to talk about > sharing results: > > a) Consuming expunges so check.sh for the Node Under Test (NUT) can expand > on the expunges given criteria (flakiness, crash requirements) > > b) Sharing updates to expunges per kernel / distro / runner / node-config > and making patches to this easy. > > c) Making updates for failures easy to read for a developer / community. > These would be in the form of an email or results file for a test > run through some sort of kernel-ci. > > Let's start with a): > > We can adopt runners to use anything. My gut tells me postgres is > a bit large unless we need socket communication. I can think of two > ways to go here then. 
Perhaps others have some other ideas? > > 1) We go lightweight on the db, maybe sqlite3 ? And embrace the same > postgres db schema as used by Darrick if he sees value in sharing > this. If we do this I think it doesn't make sense to *require* > sqlite3 on the NUT (nodes), for many reasons, so parsing the db > on the host to a flat file to be used by the node does seem > ideal. > > 2) Keep postgres and provide a REST api for queries from the host to > this server so it can then construct a flat file / directory > interpretation of expunges for the nodes under test (NUT). > > Given the minimum requirements desirable on the NUTs I think in the end > a flat file hierarchy is nice so as to not incur some new dependency on > them. > > Determinism is important for tests though, so snapshotting an > interpretation of expunges at a specific point in time is also important. > So the database would need to be versioned per updates, so a test is > checkpointed against a specific version of the expunge db. Using the terminology "expunge db" is wrong here because it suggests that flakey tests (which are obviously part of that db) should be in an expunge list as is done in kdevops and that is not how Josef/Ted/Darrick treat the flakey tests. The discussion should be around sharing fstests "results" not expunge lists. Sharing expunge lists for tests that should not be run at all with certain kernel/distro/xfsprogs has great value on its own and I think the kdevops hierarchical expunge lists are a very good place to share this *deterministic* information, but only as long as those lists absolutely do not contain non-deterministic test expunges. For example, this is a deterministic expunge list that may be worth sharing: https://github.com/linux-kdevops/kdevops/blob/master/workflows/fstests/expunges/any/xfs/reqs-xfsprogs-5.10.txt Because for all the tests (it's just one), the failure is analysed and found to be deterministic and related to the topic of the expunge. 
However, this is also a classic example of an expunge list that could be auto-generated by the test runner if xfs/540 had the annotations: _fixed_in_version xfsprogs 5.13 _fixed_by_git_commit xfsprogs 5f062427 \ "xfs_repair: validate alignment of inherited rt extent hints" > > If we come to some sort of consensus then this code for parsing an > expunge set can be used directly from fstests's check script, so the > interpretation and use can be done in one place for all test runners. > We also have additional criteria which we may want for the expunges. > For instance, if we had flakiness percentage annotated somehow then > fstests's check could be passed an argument to only include expunges > given a certain flakiness level of some sort, or for example only > include expunges for tests which are known to crash. > > Generating the files from a db is nice. But what gains do we have > with using a db then? > > Now let's move on to b) sharing the expunges and sending patches for > updates. I think sending a patch against a flat file reads a lot easier > except for the comments / flakiness levels / crash consideration / and > artifacts. For kdevops' purposes this reads well today as we don't > upload artifacts anywhere and just refer to them on github gists as best > effort / optional. There is no convention yet on expression of flakiness > but some tests do mention "failure rate" in one way or another. > > So we want to evaluate if we want to share not only expunges but other > metadata associated with why a new test can be expunged or removed: > > * flakiness percentage > * cause a kernel crash? > * bogus test? > * expunged due to a slew of other reasons, some of them maybe > categorized and shared, some of them not > > And do we want to share artifacts? If so how? Perhaps an optional URL, > with another component describing what it is, gist, or a tarball, etc. 
> > Then for the last part c) making failures easy to read for a developer > let's review what could be done. I gather gce-xfstests explains the > xunit results summary. Right now kdevops' kernel-ci stuff just sends > an email with the same but also a diff to the expunge file hierarchy > augmented for the target kernel directory being tested. The developer > would just go and edit the line with metadata as a comment, but that > is just because we lack a structure for it. If we strive to share > an expunge list I think it would be wise to consider structure for > this metadata. > > Perhaps: > > <test> # <crashes>|<flakiness-percent>|<fs-skip-reason>|<artifact-type>|<artifact-dir-url>|<comments> > > Where: > > test: xfs/123 or btrfs/234 > crashes: can be either Y or N > flakiness-percent: 80% > fs-skip-reason: can be an enum to represent a series of > fs specific reasons why a test may not be > applicable or should be skipped > artifact-type: optional, if present the type of artifact, > can be enum to represent a gist test > description, or a tarball > artifact-dir-url: optional, path to the artifact > comments: additional comments > > All the above considered, a) b) and c), yes I think a flat file > model works well as an option. I'd love to hear others' feedback. > > > If we have a huge amount of variability, such that we have a large > > number of directories with different test configs and different > > hardware configs, each with different expunge files, I'm not sure how > > useful that actually is. > > *If* you want to share I think it would be useful. > > At least kdevops uses a flat file model with no artifacts, just the > expunges and comments, and over time it has been very useful, even to be > able to review historic issues on older kernels: simply using > something like 'git grep xfs/123' gives me a quick sense of the history of > issues with a test. 
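To make the proposal concrete: the pipe-separated metadata format sketched above is simple enough that check (or any runner) could filter on one field with a few lines of awk. This is only a sketch against the *proposed* format — none of these fields exist in fstests today, and the function name is invented.

```shell
# Sketch: keep only expunge entries whose <crashes> field is "Y", given
# lines in the proposed (hypothetical) format:
#   xfs/123 # Y|80%|skip-reason|artifact-type|artifact-url|comment
filter_crashing_expunges()
{
	awk -F'#' '/^[a-z0-9]+\/[0-9]+/ {
		split($2, meta, "|")
		gsub(/[ \t]/, "", meta[1])	# meta[1] is the <crashes> flag
		if (meta[1] == "Y") {
			sub(/[ \t]+$/, "", $1)
			print $1
		}
	}' "$1"
}
```

The same split could select on flakiness percentage or skip reason instead, which is the "additional criteria passed to check" idea discussed above.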
> > Are we expecting users to do a "git clone", > and then start browsing all of these different expunge files by hand? > > If we want to extend the fstests check script to look for this, it could > be an optional directory and an argument could be passed to check > to enable its hunt for it, so that if passed it would look for the > runner / kernel / host-type. For instance today we already have > a function on initialization for the check script which looks for > the fstests' config file as follows: > > known_hosts() > { > [ "$HOST_CONFIG_DIR" ] || HOST_CONFIG_DIR=`pwd`/configs > > [ -f /etc/xfsqa.config ] && export HOST_OPTIONS=/etc/xfsqa.config > [ -f $HOST_CONFIG_DIR/$HOST ] && export HOST_OPTIONS=$HOST_CONFIG_DIR/$HOST > [ -f $HOST_CONFIG_DIR/$HOST.config ] && export HOST_OPTIONS=$HOST_CONFIG_DIR/$HOST.config > } > > We could have something similar look for an expunge directory via say > --expunge-auto-look and that could be something like: > > process_expunge_dir() > { > [ "$HOST_EXPUNGE_DIR" ] || HOST_EXPUNGE_DIR=`pwd`/expunges > > [ -d /etc/fstests/expunges/$HOST ] && export HOST_EXPUNGES=/etc/fstests/expunges/$HOST > [ -d $HOST_EXPUNGE_DIR/$HOST ] && export HOST_EXPUNGES=$HOST_EXPUNGE_DIR/$HOST > } > > The runner could be specified, and the host-type > > ./check --runner <gce-xfstests|kdevops|whatever> --host-type <kvm-8vcpus-2gb> > > And so we can have it look for these directories and process any that are present (cumulative): > > * HOST_EXPUNGES/any/$fstype/ - regardless of kernel, host type and runner > * HOST_EXPUNGES/$kernel/$fstype/any - common between runners for any host type > * HOST_EXPUNGES/$kernel/$fstype/$hostype - common between runners for a host type > * HOST_EXPUNGES/$kernel/$fstype/$hostype/$runner - only present for the runner > > The aggregate set of expunges is used. 
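A cumulative lookup over that proposed directory layout could be as small as the sketch below. The layout and variable names follow the proposal above; nothing like this exists in the check script today, and the function name is hypothetical.

```shell
# Sketch: aggregate expunge entries from the proposed hierarchy, most
# generic to most specific, stripping comments and de-duplicating.
collect_expunges()
{
	base="$1"; kernel="$2"; fstype="$3"; hosttype="$4"; runner="$5"
	{ cat "$base/any/$fstype"/*.txt \
	      "$base/$kernel/$fstype/any"/*.txt \
	      "$base/$kernel/$fstype/$hosttype"/*.txt \
	      "$base/$kernel/$fstype/$hosttype/$runner"/*.txt \
	      2>/dev/null || true; } |
		awk '{ sub(/[ \t]*#.*/, ""); if (NF) print $1 }' | sort -u
}
```

Missing directories simply contribute nothing, which is what makes the scheme cumulative: a runner-specific directory only ever adds to the shared base set.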
> > Additional criteria could be passed to check to ensure that only > certain expunges that meet the criteria are used to skip tests for the > run, provided we can agree on some metadata for that. > > > It might perhaps be useful to get a bit more clarity about how we > > expect the shared results would be used, because that might drive some > > of the design decisions about the best way to store these "results". > As a requirement, what I am looking for is a way to search for anything known to the community about failures in test FS/NNN. Because when I get an alert on a possible regression, that's the fastest way for me to triage and understand how much effort I should put into the investigation of that failure and which directions I should look into. Right now, I look at the test header comment and git log, I grep the kdevops expunge lists to look for juicy details and I search lore for mentions of that test. In fact, I already have an auto-generated index of lore fstests mentions in xfs patch discussions [1] that I just grep for failures found when testing xfs. For LTS testing, I found it to be the best way to find candidate fix patches that I may have missed. I would love to have more sources to get search results from. There doesn't even need to be a standard form for the search or results. If Leah, Darrick, Ted and Josef would provide me with a script to search their home-brewed fstests db, I would just run all those scripts and see what they have to tell me about FS/NNN in some form of human-readable format that I can understand. Going forward, we can try to standardize the search and results format, but for getting better requirements you first need users! Thanks, Amir. [1] https://github.com/amir73il/b4/blob/xfs-5.10.y/xfs-5.10..5.17-rn.rst ^ permalink raw reply [flat|nested] 17+ messages in thread
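Even without a standard format, the kind of search described here can start as a dumb grep over whatever flat files each project shares. The function below is a hypothetical sketch of such a "search script" — the name and directory layout are invented, not anything kdevops or the other home-brewed databases provide.

```shell
# Sketch: report everything a tree of shared expunge/results files says
# about one test, or say so explicitly if nothing is recorded.
search_test_history()
{
	# $1: test name like xfs/442, $2: root of the shared files
	grep -rn "$1" "$2" 2>/dev/null || echo "nothing recorded for $1"
}
```

That is essentially the 'git grep xfs/123' workflow mentioned earlier in the thread, wrapped so multiple sources could be queried with one command per source.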
* Re: sharing fstests results (Was: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)) 2022-06-25 7:28 ` sharing fstests results (Was: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)) Amir Goldstein @ 2022-06-25 19:35 ` Luis Chamberlain 2022-06-25 21:50 ` Theodore Ts'o 0 siblings, 1 reply; 17+ messages in thread From: Luis Chamberlain @ 2022-06-25 19:35 UTC (permalink / raw) To: Amir Goldstein Cc: Theodore Ts'o, Darrick J. Wong, Leah Rumancik, Josef Bacik, Chuck Lever, chandanrmail, Sweet Tea Dorminy, Pankaj Raghav, Zorro Lang, linux-xfs, fstests On Sat, Jun 25, 2022 at 10:28:32AM +0300, Amir Goldstein wrote: > On Sat, Jun 25, 2022 at 1:54 AM Luis Chamberlain <mcgrof@kernel.org> wrote: > > Determinism is important for tests though, so snapshotting an > > interpretation of expunges at a specific point in time is also important. > > So the database would need to be versioned per updates, so a test is > > checkpointed against a specific version of the expunge db. > > Using the terminology "expunge db" is wrong here because it suggests > that flakey tests (which are obviously part of that db) should be in > an expunge list as is done in kdevops and that is not how Josef/Ted/Darrick > treat the flakey tests. There are flaky tests which can cause a crash, and that is why I started to expunge these. Not all flaky tests cause a crash though. And so, this is why in the format I suggested you can specify metadata such as whether a test caused a crash. At this point I agree that the way kdevops simply skips flaky tests which do not cause a crash should be changed, and if a test is just known to fail non-deterministically but without a crash, it would be good to simply not treat that failure as fatal at the end. If however the failure rate does change, it would be useful to update that information. Without metadata one cannot process that sort of stuff. 
> The discussion should be around sharing fstests "results" not expunge > lists. Sharing expunge lists for tests that should not be run at all > with certain kernel/distro/xfsprogs has great value on its own and I > think the kdevops hierarchical expunge lists are a very good place to > share this *deterministic* information, but only as long as those lists > absolutely do not contain non-deterministic test expunges. The way the expunge list is processed could simply be modified in kdevops so that non-deterministic tests are not expunged but also not treated as fatal at the end. But think about it, the exception is if the non-deterministic failure does not lead to a crash, no? > > > It might perhaps be useful to get a bit more clarity about how we > > > expect the shared results would be used, because that might drive some > > > of the design decisions about the best way to store these "results". > > > > As a requirement, what I am looking for is a way to search for anything > known to the community about failures in test FS/NNN. Here's the thing though. Not all developers have incentives to share. For a while SLE didn't have public expunges; that changed after OpenSUSE Leap 15.3, as it has binary compatibility with SLE 15.3, and so the same failures on workflows/fstests/expunges/opensuse-leap/15.3/ are applicable. It is up to each distro if they wish to share, and without a public vehicle to do so why would they, or how would they? For upstream and stable I would hope there are more incentives to share. But again, no shared home ever existed before. And I don't think there was ever dialog before about sharing a home for these. > Because when I get an alert on a possible regression, that's the fastest > way for me to triage and understand how much effort I should put into > the investigation of that failure and which directions I should look into. 
> > Right now, I look at the test header comment and git log, I grep the > kdevops expunge lists to look for juicy details and I search lore for > mentions of that test. > > In fact, I already have an auto-generated index of lore fstests > mentions in xfs patch discussions [1] that I just grep for failures found > when testing xfs. For LTS testing, I found it to be the best way to > find candidate fix patches that I may have missed. This effort is valuable and thanks for doing all this. > Going forward, we can try to standardize the search and results > format, but for getting better requirements you first need users! As you are witness to it, running fstests against any fs takes a lot of time and patience, and as I have noted, not many have incentives to share. So the best I could do is provide the solution to enable folks to reproduce testing as fast and as easily as possible, and let folks who are interested in sharing do so. And obviously at least I did get a major enterprise distro to share some results. I hope others follow. So I expect the format for sharing to be led by those who have a clear incentive to do so. Folks working on upstream, or stable stakeholders, seem like obvious candidates. And then it is just volunteer work. Luis ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: sharing fstests results (Was: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)) 2022-06-25 19:35 ` Luis Chamberlain @ 2022-06-25 21:50 ` Theodore Ts'o 2022-07-01 23:13 ` Luis Chamberlain 0 siblings, 1 reply; 17+ messages in thread From: Theodore Ts'o @ 2022-06-25 21:50 UTC (permalink / raw) To: Luis Chamberlain Cc: Amir Goldstein, Darrick J. Wong, Leah Rumancik, Josef Bacik, Chuck Lever, chandanrmail, Sweet Tea Dorminy, Pankaj Raghav, Zorro Lang, linux-xfs, fstests On Sat, Jun 25, 2022 at 12:35:50PM -0700, Luis Chamberlain wrote: > > The way the expunge list is processed could simply be modified in kdevops > so that non-deterministic tests are not expunged but also not treated as > fatal at the end. But think about it, the exception is if the non-deterministic > failure does not lead to a crash, no? That's what I'm doing today, but once we have a better test analysis system, I think the only things which should be excluded are: a) bugs which cause the kernel to crash b) test bugs c) tests which take ***forever*** for a particular configuration (and for which we probably get enough coverage through other configs) If we have a non-deterministic failure, which is due to a kernel bug, I don't see any reason why we should skip the test. We just need to have a fully-featured enough test results analyzer so that we can distinguish between known failures, known flaky failures, and new test regressions. So for example, the new tests generic/681, generic/682, and generic/692 are causing deterministic failures for the ext4/encrypt config. Right now, this is being tracked manually in a flat text file: generic/68[12] encrypt Failure percentage: 100% The directory does grow, but blocks aren't charged to either root or the non-privileged users' quota. So this appears to be a real bug. Testing shows this goes all the way back to at least 4.14. It's currently not tagged by kernel version, because I mostly only care about upstream. 
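The analyzer behavior described here — separating known flaky failures from new regressions instead of expunging them — reduces, at its simplest, to comparing a run's failures against a known-flaky list. The sketch below is illustrative only; the file formats and function name are invented and are not how {kvm,gce}-xfstests actually stores results.

```shell
# Sketch: bucket a run's failing tests into known-flaky vs regression,
# rather than hiding flaky-but-real bugs behind an expunge list.
classify_failures()
{
	# $1: failing tests from this run, one per line
	# $2: known flaky tests, one per line
	while read -r t; do
		if grep -qx "$t" "$2"; then
			echo "known-flaky: $t"
		else
			echo "regression: $t"
		fi
	done < "$1"
}
```

A real analyzer would add the third bucket (known deterministic failures) and flakiness rates per kernel version, but the point stands: the flaky test keeps running and stays visible instead of being expunged.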
So once it's fixed upstream, I stop caring about it. In the ideal world, we'd track the kernel commit which fixed the test failure, and when the fix propagated to the various stable kernels, etc. I've also resisted putting it in an expunge file, since if it did, I would ignore it forever. If it stays in my face, I'm more likely to fix it, even if it's on my personal time. > Here's the thing though. Not all developers have incentives to share. Part of this is the amount of *time* that it takes to share this information. Right now, a lot of sharing takes place on the weekly ext4 conference call. It doesn't take Eric Whitney a lot of time to mention that he's seeing a particular test failure, and I can quickly search my test summary Unix mbox file and say, "yep, I've seen this fail a couple of times before, starting in February 2020 --- but it's super rare." And since Darrick attends the weekly ext4 video chats, once or twice we've asked him about some test failures on some esoteric xfs config, such as realtime with an external logdev, and he might say, "oh yeah, that's a known test bug. pull this branch from my public xfstests tree, I just haven't had time to push those fixes upstream yet." (And I don't blame him for that; I just recently pushed some ext4 test bug fixes, some of which I had initially sent to the list in late April --- but on code review, changes were requested, and I just didn't have *time* to clean up fixes in response to the code reviews. So the fix which was good enough to suppress the failures sat in my tree, but didn't go upstream since it was deemed not ready for upstream. I'm all for decreasing tech debt in xfstests; but do understand that sometimes this means fixes to known test bugs will stay in developers' git trees, since we're all overloaded.) It's a similar problem with test failures. Simply reporting a test failure isn't *that* hard. 
But the analysis, even if it's something like: generic/68[12] encrypt Failure percentage: 100% The directory does grow, but blocks aren't charged to either root or the non-privileged users' quota..... ... is the critical bit that people *really* want, and it takes real developer time to come up with that kind of information. In the ideal world, I'd have an army of trained minions to run down this kind of stuff. In the real world, sometimes this stuff happens after midnight, local time, on a Friday night. (Note that Android and Chrome OS, both of which are big users of fscrypt, don't use quota. So if I were to open a bug tracker entry on it, the bug would get prioritized to P2 or P3, and never be heard from again, since there's no business reason to prioritize fixing it. Which is why some of this happens on personal time.) - Ted ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: sharing fstests results (Was: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)) 2022-06-25 21:50 ` Theodore Ts'o @ 2022-07-01 23:13 ` Luis Chamberlain 0 siblings, 0 replies; 17+ messages in thread From: Luis Chamberlain @ 2022-07-01 23:13 UTC (permalink / raw) To: Theodore Ts'o Cc: Amir Goldstein, Darrick J. Wong, Leah Rumancik, Josef Bacik, Chuck Lever, chandanrmail, Sweet Tea Dorminy, Pankaj Raghav, Zorro Lang, linux-xfs, fstests On Sat, Jun 25, 2022 at 05:50:26PM -0400, Theodore Ts'o wrote: > On Sat, Jun 25, 2022 at 12:35:50PM -0700, Luis Chamberlain wrote: > > Here's the thing though. Not all developers have incentives to share. > > Part of this is the amount of *time* that it takes to share this > information. There's many reasons. In the end we keep digressing, but I see no expressed interest to share, and so we can just keep on moving with how things are. Luis ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1) 2022-06-22 0:07 ` [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1) Luis Chamberlain 2022-06-22 21:44 ` Theodore Ts'o @ 2022-06-22 21:52 ` Leah Rumancik 2022-06-23 21:40 ` Luis Chamberlain 1 sibling, 1 reply; 17+ messages in thread From: Leah Rumancik @ 2022-06-22 21:52 UTC (permalink / raw) To: Luis Chamberlain Cc: Amir Goldstein, Josef Bacik, Chuck Lever, chandanrmail, Sweet Tea Dorminy, Pankaj Raghav, linux-xfs, fstests On Tue, Jun 21, 2022 at 05:07:10PM -0700, Luis Chamberlain wrote: > On Thu, Jun 16, 2022 at 11:27:41AM -0700, Leah Rumancik wrote: > > https://gist.github.com/lrumancik/5a9d85d2637f878220224578e173fc23. > > The coverage for XFS is using profiles which seem to come inspired > by ext4's different mkfs configurations. The configs I am using for the backports testing were developed with Darrick's help. If you guys agree on a different set of configs, I'd be happy to update my configs moving forward. As there has been testing of these patches on both 5.10 with those configs as well as on 5.15 with my configs, I don't think this should be blocking for this set of patches. - Leah > > Long ago (2019) I had asked we strive to address popular configurations > for XFS so that what would be back then oscheck (now kdevops) can cover > them for stable XFS patch candidate test consideration. That was so long > ago no one should be surprised you didn't get the memo: > > https://lkml.kernel.org/r/20190208194829.GJ11489@garbanzo.do-not-panic.com > > This has grown to cover more now: > > https://github.com/linux-kdevops/kdevops/blob/master/playbooks/roles/fstests/templates/xfs/xfs.config > > For instance xfs_bigblock and xfs_reflink_normapbt. > > My litmus test back then *and* today is to ensure we have no regressions > on the test sections supported by kdevops for XFS as reflected above. 
> Without that confidence I'd be really reluctant to support stable > efforts. > > If you use kdevops, it should be easy to set up even if you are not > using local virtualization technologies. For instance I just fired > up an AWS cloud m5ad.4xlarge image which has 2 nvme drives, which > mimics the reqs for the methodology of using loopback files: > > https://github.com/linux-kdevops/kdevops/blob/master/docs/seeing-more-issues.md > > GCE is supported as well, so is Azure and OpenStack, and even custom > openstack solutions... > > Also, I see on the above URL you posted there is a TODO in the gist which > says, "find a better route for publishing these". If you were to use > kdevops for this it would have the immediate gain in that kdevops users > could reproduce your findings and help augment it. > > However if using kdevops as a landing home for this is too large for you, > we could use a new git tree which just tracks expunges and then kdevops can > use it as a git subtree as I had suggested at LSFMM. The benefit of using a > git subtree is then any runner can make use of it. And note that we > track both fstests and blktests. > > The downside of kdevops using a new git subtree is just that kdevops > developers would have to use two trees to work on, one for code changes just > for kdevops and one for the git subtree for expunges. That workflow would be > new. I don't suspect it would be a really big issue other than addressing the > initial growing pains to adapt. I have used git subtrees before extensively > and the best rule of thumb is just to ensure you keep the code for the git > subtree in its own directory. You can either immediately upstream your > delta or carry the delta until you are ready to try to push those > changes. Right now kdevops uses the directory workflows/fstests/expunges/ > for expunges. Your runner could use whatever it wishes. 
> We should discuss whether we also want to add the respective *.bad,
> *.dmesg, and *.all result files for expunged entries, or whether we
> should push these out to a new shared storage area. Right now kdevops
> keeps track of results in the directory workflows/fstests/results/,
> but that path is in .gitignore. If we *do* want to use GitHub and a
> shared git subtree, perhaps a workflows/fstests/artifacts/kdevops/
> directory would make sense for the kdevops runner? That namespace
> would then allow other runners to also add files, while we all share
> expunges and tribal knowledge.
>
> Thoughts?
>
>   Luis

^ permalink raw reply	[flat|nested] 17+ messages in thread
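To make the proposed split concrete, here is a small, self-contained sketch of how a runner could consume a shared expunge tree laid out like kdevops' workflows/fstests/expunges/ directory. Every path, file name, and test name below is invented for illustration; the git-subtree invocations shown in the comments use a placeholder URL, not a real repository.

```shell
#!/bin/sh
# Sketch only: the layout mimics kdevops' workflows/fstests/expunges/
# convention, but the kernel version, section, file, and test names here
# are made up for the example.
#
# A runner would first vendor the shared expunges repo, e.g.:
#   git subtree add --prefix=workflows/fstests/expunges \
#       https://example.org/fstests-expunges.git main --squash
# and refresh it later with `git subtree pull` using the same arguments.
set -e
d=$(mktemp -d)
mkdir -p "$d/workflows/fstests/expunges/5.15.y/xfs_reflink"
cat > "$d/workflows/fstests/expunges/5.15.y/xfs_reflink/unassigned.txt" <<'EOF'
generic/475 # flaky: log recovery
xfs/297
EOF

# Reduce the expunge files for one kernel to a plain, deduplicated skip
# list: strip trailing comments and blank lines, then sort.
skiplist=$(find "$d/workflows/fstests/expunges/5.15.y" -name '*.txt' \
    -exec sed -e 's/[[:space:]]*#.*$//' -e 's/[[:space:]]*$//' {} + \
    | sed '/^$/d' | sort -u)
printf '%s\n' "$skiplist"
rm -rf "$d"
```

A list in this shape maps directly onto existing tooling: fstests' ./check can take an exclude file, so any runner, not just kdevops, could consume the shared tree this way.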
* Re: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)
  2022-06-22 21:52 ` [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1) Leah Rumancik
@ 2022-06-23 21:40   ` Luis Chamberlain
  0 siblings, 0 replies; 17+ messages in thread

From: Luis Chamberlain @ 2022-06-23 21:40 UTC (permalink / raw)
To: Leah Rumancik
Cc: Amir Goldstein, Josef Bacik, Chuck Lever, chandanrmail,
    Sweet Tea Dorminy, Pankaj Raghav, linux-xfs, fstests

On Wed, Jun 22, 2022 at 02:52:18PM -0700, Leah Rumancik wrote:
> On Tue, Jun 21, 2022 at 05:07:10PM -0700, Luis Chamberlain wrote:
> > On Thu, Jun 16, 2022 at 11:27:41AM -0700, Leah Rumancik wrote:
> > > https://gist.github.com/lrumancik/5a9d85d2637f878220224578e173fc23.
> >
> > The coverage for XFS is using profiles which seem to be inspired by
> > ext4's different mkfs configurations.
>
> The configs I am using for the backports testing were developed with
> Darrick's help.

Sorry for the noise then.

> If you guys agree on a different set of configs, I'd be happy to
> update my configs moving forward.

Indeed, it would be great to unify on target test configs at the very
least.

  Luis

^ permalink raw reply	[flat|nested] 17+ messages in thread
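For readers following the "configs" discussion above: fstests drives each configuration from a section in its config file, and that is what kdevops' test-section names refer to. A minimal sketch is below; the section names xfs_reflink_normapbt and xfs_bigblock come from the thread itself, but the device paths and MKFS_OPTIONS values are illustrative guesses, not copied from kdevops' real xfs.config (see the URL quoted earlier for the actual file).

```ini
[default]
FSTYP=xfs
TEST_DEV=/dev/loop16          # hypothetical loopback test device
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/loop5 /dev/loop6 /dev/loop7"
SCRATCH_MNT=/media/scratch

[xfs_reflink]
MKFS_OPTIONS='-f -m reflink=1,rmapbt=1'

[xfs_reflink_normapbt]
MKFS_OPTIONS='-f -m reflink=1,rmapbt=0'

[xfs_bigblock]
# 64k blocks; only valid on architectures with a 64k page size
MKFS_OPTIONS='-f -b size=64k'
```

fstests' ./check can then run each section in turn, or a single one via -s <section>, which is why results and expunge lists in this thread are tracked per section.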
end of thread, other threads: [~2022-07-01 23:14 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <20220616182749.1200971-1-leah.rumancik@gmail.com>
2022-06-22 0:07 ` [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1) Luis Chamberlain
2022-06-22 21:44 ` Theodore Ts'o
2022-06-23 5:31 ` Amir Goldstein
2022-06-23 21:39 ` Luis Chamberlain
2022-06-23 21:31 ` Luis Chamberlain
2022-06-24 5:32 ` Theodore Ts'o
2022-06-24 22:54 ` Luis Chamberlain
2022-06-25 2:21 ` Theodore Ts'o
2022-06-25 18:49 ` Luis Chamberlain
2022-06-25 21:14 ` Theodore Ts'o
2022-07-01 23:08 ` Luis Chamberlain
2022-06-25 7:28 ` sharing fstests results (Was: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1)) Amir Goldstein
2022-06-25 19:35 ` Luis Chamberlain
2022-06-25 21:50 ` Theodore Ts'o
2022-07-01 23:13 ` Luis Chamberlain
2022-06-22 21:52 ` [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1) Leah Rumancik
2022-06-23 21:40 ` Luis Chamberlain