Re: [PATCH 2/2] fstests: add configuration option for executing post mkfs commands


From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Dave Chinner <david@fromorbit.com>
Cc: "Darrick J. Wong" <djwong@kernel.org>,
	Anand Jain <anand.jain@oracle.com>,
	fstests@vger.kernel.org, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 2/2] fstests: add configuration option for executing post mkfs commands
Date: Sat, 7 Oct 2023 13:15:10 +1030
Message-ID: <b82e99c6-af98-4896-894b-dde6e43ca7dc@gmx.com>
In-Reply-To: <ZSCGUUtCY5AsmWaO@dread.disaster.area>

On 2023/10/7 08:42, Dave Chinner wrote:
> On Fri, Oct 06, 2023 at 05:16:31PM +1030, Qu Wenruo wrote:
>> However for the whole btrfs/fstests combination, we have several
>> features which can not be easily integrated into fstests.
>>
>> The biggest example is multi-device management.
>>
>> For now, only some btrfs specific test cases are utilizing
>> SCRATCH_DEV_POOL to cover multi-device functionality (including all the
>> RAID and seed support).
>> This means way less coverage for seed and btrfs RAID, all generic group
>> would not utilize btrfs RAID/seed functionality at all.
>
> IOWs, you are saying that the btrfs device setup code in fstests is
> functionally deficient.

It always needs the test case itself to use the pool and choose the mkfs
profiles in order to properly enable different profiles.

For a seed device, the test case needs to enable the seed feature, then
add a new device to allow the seed to sprout.

Thus none of the generic group tests can utilize them.
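
As a rough sketch of that seed sprout sequence: the helper names
(run, seed_sprout), the DRY_RUN knob and the device paths below are
placeholders for illustration, not existing fstests code. By default it
only prints the commands instead of running them.

```shell
#!/bin/bash
# DRY_RUN is a hypothetical knob: 1 (default) only prints the commands,
# 0 really executes them (needs root and two spare block devices).
DRY_RUN=${DRY_RUN:-1}

run() {
	if [ "$DRY_RUN" = 1 ]; then
		echo "$*"
	else
		"$@"
	fi
}

# seed_sprout <seed_dev> <sprout_dev> <mnt>   (hypothetical helper)
seed_sprout() {
	local seed="$1" sprout="$2" mnt="$3"

	run mkfs.btrfs -f "$seed"
	run btrfstune -S 1 "$seed"             # set the seeding flag
	run mount "$seed" "$mnt"               # a seed fs mounts read-only
	run btrfs device add "$sprout" "$mnt"  # adding a device creates the sprout
	run mount -o remount,rw "$mnt"         # the sprout is now writable
}

# Placeholder device names; replace them before using DRY_RUN=0.
seed_sprout /dev/sdX /dev/sdY /mnt/scratch
```

None of these steps happen inside a plain "_scratch_mkfs" followed by
"_scratch_mount", which is why a generic test never exercises them.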

>
>>
>> For a better coverage, or for more complex setup (maybe dm-dust for XFS
>> log device?), I am not that convinced if the current plain mkfs options
>> is good enough.
>
> We already know mkfs alone isn't sufficient - that's why we have
> filesystem specific mkfs functions for any filesystem that needs to
> do something more complex than run mkfs....

That is still not enough for the seed sprout case above, or for
utilizing the pool by default.

Sure, you can go with the existing environment variables, but that would
lead to other problems, explained later.

>
> i.e. we already have infrastructure that we use to solve this
> problem - there are example implementations that you can look at to
> follow.
>
>>
>> Thus I'm more interested in exploring the possibility to "out-source"
>> those basic functionality (from mkfs to check) to outside scripts, as
>> we're not that far away to hit the limits of the existing framework. (At
>> least for btrfs)
>
> The whole idea that we set up devices for testing via magic,
> undocumented, private external scripts is antithetical to the
> purpose of fstests. The device model used in fstests is that you tell it
> what configuration you want, and it does all the work to set them up
> that way. This allows tests to override or skip incompatible
> configurations based on known config variables, etc.

Nope, the "private/closed-source" part is only optional.

We would still provide something like this:

mkfs.avail/
|- xfs.sh
|- xfs_external_log.sh
|- btrfs_single.sh
|- btrfs_multi.sh

fsck.avail/
|- xfs.sh
|- btrfs_check_data_csum.sh
|- btrfs.sh

mount.avail/
|- xfs.sh
|- btrfs.sh
|- btrfs_compression.sh

config/
|- mkfs.sh -> ../mkfs.avail/btrfs_single.sh
|- check.sh -> ../fsck.avail/btrfs.sh
|- mount.sh -> ../mount.avail/btrfs_compression.sh

Those basic ones in *.avail/ should still be open-sourced, and managed
by fstests.
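
A minimal sketch of how selecting a setup through that layout could
look. It builds a throwaway copy of the tree under a temporary
directory; the stub script body and the device path are placeholders,
and nothing here exists in fstests today.

```shell
#!/bin/bash
# Build a throwaway copy of the proposed layout so this runs anywhere.
top=$(mktemp -d)
mkdir -p "$top/mkfs.avail" "$top/config"

# Stub script: it only prints the command a real one would run.
cat > "$top/mkfs.avail/btrfs_single.sh" <<'EOF'
#!/bin/bash
echo "mkfs.btrfs -f -d single -m single $1"
EOF
chmod +x "$top/mkfs.avail/btrfs_single.sh"

# Selecting a setup is just (re)pointing the symlink in config/.
ln -sf ../mkfs.avail/btrfs_single.sh "$top/config/mkfs.sh"

# The active setup is discoverable by reading the link target.
active=$(basename "$(readlink "$top/config/mkfs.sh")")
echo "active mkfs setup: $active"

# The framework would then run whatever config/mkfs.sh points at.
"$top/config/mkfs.sh" /dev/sdX
```

Switching to another setup would be one more "ln -sf", and the name of
the link target documents which setup a given run used.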

It's the end users' freedom to open or hide their scripts, but if they
choose to hide them, then the reproducibility problems and the
maintenance burden are all on them.

>
> It also allows -everyone- to test complex configurations without
> needing to share private, external scripts or knowing any of the
> intricate details needed to set up that configuration. External
> scripts are like proprietary code - it only works if you have some
> magic secret sauce that nobody else knows about.

Aren't there more than enough undocumented environment variables already
in common/config?

That's no different from those separate scripts, and I can also argue
that those scripts would have better names than `common/config` or
`common/rc`.

>
> If it's hard to set something up in fstests, then *fix that
> problem*. If you are adding code in environment variables and
> hacking in environment variables to run that code, then the -code
> itself- should be in fstests.

It's not possible unless we're going to update every generic test case
to let it specify whatever special setup it wants to use for btrfs.

As mentioned, for seed devices, we always need to add a device to do the
sprout, and for multi-devices, we need to specify the number of devices
and profiles at least.

Meanwhile generic tests just call "_scratch_mkfs" and "_scratch_mount";
unless we can override them, it's not that simple.

And if we want to override them, then I see no reason not to use
external scripts to override those functions.
That is at least much cleaner than exporting whatever complex
environment variables are needed and writing the parsers for them.
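
A sketch of what such an override could look like, assuming a
hypothetical FSTESTS_CONFIG_DIR variable and stand-in function names
(scratch_mkfs, builtin_scratch_mkfs); the real "_scratch_mkfs" in
common/rc is not reproduced here.

```shell
#!/bin/bash
# FSTESTS_CONFIG_DIR is a hypothetical variable naming the directory
# that holds the selected mkfs.sh/mount.sh/check.sh scripts.
FSTESTS_CONFIG_DIR=${FSTESTS_CONFIG_DIR:-./config}

# Stand-in for the existing per-filesystem mkfs logic.
builtin_scratch_mkfs() {
	echo "builtin mkfs on $1"
}

# Hypothetical wrapper: use the external script when one is selected,
# otherwise fall back to the built-in behavior.
scratch_mkfs() {
	local hook="$FSTESTS_CONFIG_DIR/mkfs.sh"

	if [ -x "$hook" ]; then
		"$hook" "$@"
	else
		builtin_scratch_mkfs "$@"
	fi
}

scratch_mkfs /dev/sdX
```

Test cases keep calling a single function, and a setup that needs no
special handling simply has no mkfs.sh and takes the fallback path.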

>
> Having the code in fstests means that anyone can add
> "BTRFS_SCRATCH_UUID='<uuid>'" to their config file to change uuids
> for the devices being tested. They don't need to know what magic
> command is needed to do this, when it needs to be set, what changes
> elsewhere in fstests they need to watch out for, which tests it
> might conflict with, etc.
>
> Hiding this in some custom script means it can't be easily
> documented,

Nor are those special environment variables.
The "SCRATCH_DEV_POOL" is already not that well documented in
"common/config".

Complexity is unavoidable, but whether we make simple things complex or
keep them simple is something we can choose.

> can't be easily or widely replicated,

If your setup is using some complex LVM/DM setup, and you just share
your config as:

SCRATCH_DEV="/dev/dm-2"
TEST_DEV="/dev/dm-3"

I don't see how it's any different.

In fact, if they share a script like "mkfs.avail/xfs_complex_lvm.sh", it
would be much clearer.

This just shows another point: the existing simplicity is based on the
fact that you already rely on the dm/fs layers to do a lot of the setup
work.

That's already not part of fstests, and all the same problems apply
there too.

> it can't be
> discovered by reading the fstests code, and it isn't obvious to
> -anyone- that it is part of the btrfs test matrix that needs to be
> exercised.

Nor is the dm setup case.

And as I already said, the external scripts can be part of fstests.
But I would also allow end users to hide their scripts for whatever
reason. It would be recommended for them to open-source the scripts (and
get them merged if we see a real, wide benefit) for maintenance and
reproducibility reasons, but that's not mandatory.

>
> IOWs, it's just really bad QA architecture to externalise random
> parts of the test environment configuration.  If the configuration
> needs to be tested, then the infrastructure should support that
> directly and it should be easily discoverable and used by people
> largely unfamiliar with btrfs volume management (i.e. typical distro
> QA environment).

I wouldn't be surprised if "mkfs.avail/btrfs_single.sh" turned out to be
more readable than jumping between "common/config", "common/btrfs",
"common/rc" or whatever other files.

>
>>> I suppose the problem there is that mkfs.btrfs won't itself create a
>>> filesystem with the metadata_uuid field that doesn't match the other
>>> uuid?
>>
>> That's not a big deal, we (at least me) are very open to add this mkfs
>> feature.
>>
>> But there are other limits, like the fsck part.
>>
>> For now, btrfs follows the behavior of other fses, just check the
>> correctness of the metadata, and ignore the correctness of data.
>>
>> But remember btrfs has data checksum by default, thus it can easily
>> verify the data too, and we have the extra switch ("--check-data-csum"
>> option) to enable that for "btrfs check".
>
> Which is yet another argument for the code being in fstests and
> controlled by an environment variable.
>
> This is *exactly* the case for the LARGE_SCRATCH_DEV stuff that ext4
> and XFS support in the mkfs routines. On the XFS side we have
> LARGE_SCRATCH_DEV checks in -both- the XFS mkfs and check/repair
> functions to handle this configuration correctly.

If the LARGE_SCRATCH_DEV feature also implied verifying data checksums
during fsck, I strongly doubt any end user would be happy to fsck a 10TB
fs and wait for hours just after an unexpected power loss.

I can also point to cases like the compression feature: binding a
feature to an mkfs flag or to offline tuning is neither flexible nor
end-user friendly.

Yes, for some cases, paired fs features are good, especially for
fstests, but sometimes it's not.

(Although for the original intention of this patchset, I still believe
we need a "mkfs.btrfs --metadata-uuid" option; that problem by itself is
not worth all the hassle.)

That's why we allow end users to choose if they want to verify data
checksum at fsck time, just as an example.

>
> IOWs, what you want to do is add a config variable for
> BTRFS_SCRATCH_CHECK_DATA, and trigger off that in all btrfs specific
> functions that need to add, modify or check data checksums.

Yes, for this check-data-csum case, it's possible to go with environment
variables.

But more and more of those variables end up undocumented as well, which
is exactly what you are worried about for external scripts.
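
For reference, a minimal sketch of that environment-variable approach.
The variable name follows the suggestion quoted above and the function
name btrfs_check_cmd is hypothetical; only "btrfs check" and its
"--check-data-csum" option are real. The function just prints the
command line it would run.

```shell
#!/bin/bash
# BTRFS_SCRATCH_CHECK_DATA is a hypothetical config variable:
# 1 means "also verify data checksums", anything else means metadata only.
btrfs_check_cmd() {
	local dev="$1"
	local opts=""

	if [ "${BTRFS_SCRATCH_CHECK_DATA:-0}" = 1 ]; then
		opts="--check-data-csum"
	fi
	# ${opts:+ $opts} expands to nothing when opts is empty.
	echo "btrfs check${opts:+ $opts} $dev"
}

# Default: metadata-only check.
btrfs_check_cmd /dev/sdX

# With the variable set, the data checksum pass is added.
BTRFS_SCRATCH_CHECK_DATA=1
btrfs_check_cmd /dev/sdX
```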

>
>> For now we're not going to enable the "--check-data-csum" option nor we
>> have the ability to teach fstests how to change the behavior.
>
> We most certainly do have the ability to do this in fstests, and
> quite easily.
>
> Another example is the USE_EXTERNAL variable that tells XFS and ext4
> that external log devices (and rt devices for XFS) are to be used.
> This has hooks all over mkfs, mount, check, repair, xfs_db, quota
> and fs population functions so that they all specify devices
> appropriately.
>
> That is, this config variable directly modifies the command lines
> used for these operations - it is an even better example of FS
> specific device configuration driven by config variables than
> LARGE_SCRATCH_DEV.  This model will work just fine for stuff like
> the --check-data-csum btrfs specific check option being talked about
> here, and the only thing that needs to change is the btrfs specific
> check/repair functions...

As I have already explained, sometimes end users really want to choose
between checking just several megabytes of metadata and checking
several terabytes of data.

Thus paired, on-disk flags are not always the best solution for
real-world usage.

>
>> Thus I'm taking the chance to explore any way to "out-source" those
>> mkfs/fsck functionality, even this means other fses may not even bother
>> as the current framework just works good enough for them.
>
> And as I said above, that's the wrong model for fstests - it means
> that a typical QA environment is not going to be able to test
> complex things because the people running the tests do not know how
> to write these complex "out-sourced" scripts to configure the test
> environment.

See my "TEST_DEV=/dev/dm-3" vs "mkfs.avail/xfs_lvm_luks.sh" case.

>
> Having all the code in fstests and triggering it via a config
> variable is the right way to do this sort of thing. It works for
> everyone and it's easy to replicate the test environment and
> configurations for reproduction of issues that are found.

As mentioned already, the scripts can be managed by fstests, either as
examples (which users would need to modify a little) or as
guaranteed/recommended test combinations.

>
> If the test environment is dependent on private scripts for
> configuration and reproduction of issues, then how do other people
> reproduce the problems you might find? Yeah, you have to share all
> your scripts for everyone to run, and at that point the code
> actually needs to be in fstests itself because it's proven to be a
> useful test configuration that everyone should be running....

The existing model already depends on the black-box block devices
provided by end users.

>
>> But IIRC, even f2fs is gaining multi-device support, I believe this is
>> not a btrfs specific thing, but a framework limitation.
>
> The scratch dev pool was an easy extension to support multi-device
> btrfs filesystems done in the really early days when there was
> almost zero btrfs specific test coverage in fstests. I'm not
> surprised that it has warts and may not do everything that btrfs
> developers might need these days.
>
> However, we don't need custom hooks to externalise scripts - we
> already have a working model for config driven filesystem specific
> device configuration. I don't see that there is any major common
> infrastructure change needed, most of what I'm hearing is that the
> btrfs specific device configuration needs to catch up with how other
> filesystems have been testing complex device configurations....

External scripts make overriding _scratch_mkfs() and fsck much easier,
and can still be managed by fstests.

The idea of "external" scripts is to make simple things simple: if your
setup/fs doesn't need anything complex, your mount.sh/mkfs.sh/check.sh
would each be just one line for your fs.

Meanwhile if you want to go complex, you have all the freedom to do so,
without making other code complex or bloated.

We can even move a lot of the notrun checks into the special scripts, so
that most test cases only need to care about their workload on a very
basic setup, and the complex setups check for themselves whether they
are really suitable for a given test case.

Sure there would be some complexity in the communication, but I still
believe this would make most test cases/infrastructure simpler.

Thanks,
Qu

>
> Cheers,
>
> Dave.
