From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: "Darrick J. Wong" <djwong@kernel.org>, Qu Wenruo <wqu@suse.com>
Cc: linux-btrfs@vger.kernel.org, fstests@vger.kernel.org
Subject: Re: [PATCH] generic: check if one fs can detect damage at/after fs thaw
Date: Thu, 20 Oct 2022 06:52:36 +0800 [thread overview]
Message-ID: <cb39519d-8e31-5f39-71fe-ebb0a886780e@gmx.com> (raw)
In-Reply-To: <Y1AZl3o+iSqNZgMw@magnolia>
On 2022/10/19 23:36, Darrick J. Wong wrote:
> On Wed, Oct 19, 2022 at 01:29:55PM +0800, Qu Wenruo wrote:
>> [BACKGROUND]
>> There is bug report from btrfs mailing list that, hiberation can allow
>
> "hibernation".
>
>> one to modify the frozen filesystem unexpectedly (using another OS).
>> (https://lore.kernel.org/linux-btrfs/83bf3b4b-7f4c-387a-b286-9251e3991e34@bluemole.com/)
>>
>> Later btrfs adds the check to make sure the fs is not changed
>> unexpectedly, to prevent corruption from happening.
>>
>> [TESTCASE]
>> Here the new test case will create a basic filesystem, fill it with
>> something by using fsstress, then sync the fs, and finally freeze the fs.
>>
>> Then corrupt the whole fs by overwriting the block device with 0xcd
>> (default seed from xfs_io pwrite command).
>>
>> Finally we thaw the fs, and try if we can create a new file.
>>
>> for EXT4, it will detect the corruption at touch time, causing -EUCLEAN.
>
> Heh, yikes. That's pretty scary for ext4 since it still uses buffer
> heads from the block device to read/store metadata and older kernels are
> known to have crashing problems if (say) the feature bits in the primary
> superblock get changed.
>
> I wonder if this should force errors=remount-ro for ext4 since
> errors=continue is dangerous and erorrs=panic will crash the test
> machine.
>
>> For Btrfs, it will detect the corruption at thaw time, marking the
>> fs RO immediately, and later touch will return -EROFS.
>
> What /does/ btrfs check, specifically?
- Read sb without using cache
- The same mount time sanity checks on the superblock
Which already implies an fsid check.
- Extra generation check
To make sure no one has touched out cake.
> Reading this makes me wonder if
> xfs shouldn't re-read its primary super on thaw to check that nobody ran
> us over with a backhoe, though that wouldn't help us in the hibernation
> case. (Or does it? Is userspace/systemd finally smart enough to freeze
> filesystems?)
I doubt if userspace/systemd is that smart, because the error report is
running not-that-old distro.
Especially for hibernation there is really no way for anyone to know if
our cakes are touched.
>
>> For XFS, it will detect the corruption at touch time, return -EUCLEAN.
>> (Without the cache drop, XFS seems to be very happy using the cache info
>> to do the work without any error though.)
>
> Yep.
>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>> tests/generic/702 | 61 +++++++++++++++++++++++++++++++++++++++++++
>> tests/generic/702.out | 2 ++
>> 2 files changed, 63 insertions(+)
>> create mode 100755 tests/generic/702
>> create mode 100644 tests/generic/702.out
>>
>> diff --git a/tests/generic/702 b/tests/generic/702
>> new file mode 100755
>> index 00000000..fc3624e1
>> --- /dev/null
>> +++ b/tests/generic/702
>> @@ -0,0 +1,61 @@
>> +#! /bin/bash
>> +# SPDX-License-Identifier: GPL-2.0
>> +# Copyright (C) 2022 SUSE Linux Products GmbH. All Rights Reserved.
>> +#
>> +# FS QA Test 702
>> +#
>> +# Test if the filesystem can detect the underlying disk has changed at
>> +# thaw time.
>> +#
>> +. ./common/preamble
>> +. ./common/filter
>> +_begin_fstest freeze quick
>> +
>> +# real QA test starts here
>> +
>> +_supported_fs generic
>> +_fixed_by_kernel_commit a05d3c915314 \
>> + "btrfs: check superblock to ensure the fs was not modified at thaw time"
>
> Hmmm, it's not very useful for a test failure on (say) xfs spitting
> out a message about how this "may" get fixed with a btrfs patch. How
> about:
>
> $FSTYP = btrfs && _fixed_by_kernel_commit a05d3c915314 \
> "btrfs: check superbloc..."
That sounds pretty good.
>
>> +
>> +# We will corrupt the device completely, thus should not check it after the test.
>> +_require_scratch_nocheck
>> +_require_freeze
>> +
>> +# Limit the fs to 512M so we won't waste too much time screwing it up later.
>> +_scratch_mkfs_sized $((512 * 1024 * 1024)) >> $seqres.full 2>&1
>> +_scratch_mount
>> +
>> +# Populate the fs with something.
>> +$FSSTRESS_PROG -n 500 -d $SCRATCH_MNT >> $seqres.full
>> +
>> +# Sync to make sure no dirty journal
>> +sync
>> +
>> +# Drop all cache, so later write will need to read from disk, increasing
>> +# the chance of detecting the corruption.
>> +echo 3 > /proc/sys/vm/drop_caches
>> +
>> +$XFS_IO_PROG -x -c "freeze" $SCRATCH_MNT
>> +
>> +# Now screw up the block device
>> +$XFS_IO_PROG -f -c "pwrite 0 512M" -c sync $SCRATCH_DEV >> $seqres.full
>
> directio and a larger buffer size to speed this up? e.g.
>
> $XFS_IO_PROG -d -c 'pwrite -b 1m 0 512M' -c sync $SCRATCH_DEV
I guess no need for directio especially we're doing a sync after the write.
Although larger blocksize may only help a little considering by default
it's already buffered write.
>
>> +
>> +# Thaw the fs, it may or may not report error, we will check it manually later.
>> +$XFS_IO_PROG -x -c "thaw" $SCRATCH_MNT
>
> I'm a little surprised you don't check for btrfs returning an error
> here...?
Great you have asked!
This is the special pitfall related to thaw error handling.
If we return an error for .unfreeze_fs hook, the VFS treats it as we
failed to thaw the fs, and will still consider the fs frozen.
Thus for now, btrfs only output error message into dmesg during thaw,
but always return 0 to workaround it.
We may want a better way for .unfreeze_fs hook to distinguish between
"something really went wrong, but please consider it unfreezed" and
"nope, please keep it frozen".
Thanks,
Qu
>
>> +# If the fs detects something wrong, it should trigger error now.
>> +# We don't use the error message as golden output, as btrfs and ext4 use
>> +# different error number for different reasons.
>> +# (btrfs detects the change immediately at thaw time and mark the fs RO, thus
>> +# touch returns -EROFS, while ext4 detects the change at journal write time,
>> +# returning -EUCLEAN).
>> +touch $SCRATCH_MNT/foobar >>$seqres.full 2>&1
>> +if [ $? -eq 0 ]; then
>> + echo "Failed to detect corrupted fs"
>> +else
>> + echo "Detected corrupted fs (expected)"
>> +fi
>
> But otherwise this test looks reasonable so far.
>
> --D
>
>> +
>> +# success, all done
>> +status=0
>> +exit
>> diff --git a/tests/generic/702.out b/tests/generic/702.out
>> new file mode 100644
>> index 00000000..c29311ff
>> --- /dev/null
>> +++ b/tests/generic/702.out
>> @@ -0,0 +1,2 @@
>> +QA output created by 702
>> +Detected corrupted fs (expected)
>> --
>> 2.38.0
>>
next prev parent reply other threads:[~2022-10-19 22:54 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-19 5:29 [PATCH] generic: check if one fs can detect damage at/after fs thaw Qu Wenruo
2022-10-19 15:36 ` Darrick J. Wong
2022-10-19 22:52 ` Qu Wenruo [this message]
2022-10-20 0:11 ` Darrick J. Wong
2022-10-20 3:02 ` Theodore Ts'o
2022-10-20 3:20 ` Qu Wenruo
2022-10-20 15:00 ` Zorro Lang
2022-10-20 22:59 ` Qu Wenruo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cb39519d-8e31-5f39-71fe-ebb0a886780e@gmx.com \
--to=quwenruo.btrfs@gmx.com \
--cc=djwong@kernel.org \
--cc=fstests@vger.kernel.org \
--cc=linux-btrfs@vger.kernel.org \
--cc=wqu@suse.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox