From: Anand Jain <anand.jain@oracle.com>
To: Qu Wenruo <wqu@suse.com>,
linux-btrfs@vger.kernel.org, fstests@vger.kernel.org
Subject: Re: [PATCH v2] fstests: btrfs: a new test case to verify a use-after-free bug
Date: Mon, 26 Aug 2024 21:55:29 +0800 [thread overview]
Message-ID: <9feb3f6e-b682-4978-9d35-3f5176d96e38@oracle.com> (raw)
In-Reply-To: <20240824103021.264856-1-wqu@suse.com>
On 24/8/24 6:30 pm, Qu Wenruo wrote:
> [BUG]
> There is a use-after-free bug triggered very randomly by btrfs/125.
>
> With KASAN enabled it can be reproduced on certain setups; without
> KASAN it can lead to a crash.
>
> [CAUSE]
> The test case btrfs/125 uses RAID5 for metadata, which has a known
> RMW problem when there is some on-disk corruption.
>
> RMW will use the corrupted contents to generate a new parity, losing the
> final chance to rebuild the contents.
>
> This is specific to metadata: for data we have an extra data checksum,
> while metadata has extra problems, such as a possible deadlock caused
> by the extra metadata read/recovery needed to search the extent tree.
>
> This problem has been known for a while, with no better solution than
> avoiding RAID56 for metadata:
>
>> Metadata
>> Do not use raid5 nor raid6 for metadata. Use raid1 or raid1c3
>> respectively.
>
> Combined with the csum tree corruption above: since RAID5 is stripe
> based, btrfs needs to split its read bios at stripe boundaries, and
> after a split it does a csum tree lookup for the expected checksum.
>
> But if that csum lookup fails, the error path doesn't handle the
> split bios properly, leading to a double free of the original bio
> (the one containing the bio vectors).
>
> [NEW TEST CASE]
> Unlike the original btrfs/125, which reproduces the bug only very
> randomly, introduce a new test case that verifies the specific
> behavior by:
>
> - Create a btrfs with enough csum leaves
>   To bump the csum tree level, use the minimal possible nodesize (4K),
>   and write 32M of data, which needs at least 8 leaves for data checksums
>
> - Find the last csum tree leaf and corrupt it
>
> - Read the data many times until we trigger the bug or exit gracefully
>   On an x86_64 VM with KASAN enabled (a setup which was never able to
>   trigger the btrfs/125 failure), this can trigger the KASAN report in
>   just 4 iterations (the default iteration count is 32).
>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
> Changelog:
> v2:
> - Fix the wrong commit hash
>   The proper fix is not yet merged; the old hash was a placeholder
>   copied from another test case that I forgot to remove.
>
> - Minor wording update
>
> - Add to "dangerous" group
> ---
> tests/btrfs/319 | 84 +++++++++++++++++++++++++++++++++++++++++++++
> tests/btrfs/319.out | 2 ++
> 2 files changed, 86 insertions(+)
> create mode 100755 tests/btrfs/319
> create mode 100644 tests/btrfs/319.out
>
> diff --git a/tests/btrfs/319 b/tests/btrfs/319
> new file mode 100755
> index 00000000..4be2b50b
> --- /dev/null
> +++ b/tests/btrfs/319
> @@ -0,0 +1,84 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (C) 2024 SUSE Linux Products GmbH. All Rights Reserved.
> +#
> +# FS QA Test 319
> +#
> +# Make sure data csum lookup failure will not lead to double bio freeing
> +#
> +. ./common/preamble
> +_begin_fstest auto quick dangerous
> +
> +_require_scratch
> +_fixed_by_kernel_commit xxxxxxxxxxxx \
> + "btrfs: fix a use-after-free bug when hitting errors inside btrfs_submit_chunk()"
> +
> +# The final fs will have a corrupted csum tree, which will never pass fsck
> +_require_scratch_nocheck
> +_require_scratch_dev_pool 2
> +
> +# Use RAID0 for data to get bios split at stripe boundaries.
> +# This is required to trigger the bug.
> +_check_btrfs_raid_type raid0
Did you mean to use _require_btrfs_raid_type(raid0)? Otherwise,
the line has no effect since you're not checking
_check_btrfs_raid_type's return value.
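To illustrate the distinction, here is a simplified sketch, assuming the
usual fstests convention that _require_* helpers skip the test via
_notrun while _check_* helpers only report support via their exit
status (these are stand-ins, not the real fstests helpers):

```shell
# Simplified stand-ins for the fstests helpers, for illustration only.
_notrun() { echo "notrun: $*"; exit 0; }

# _check_* style: only reports support via its exit status.
_check_btrfs_raid_type() {
	[ "$1" = "raid0" ]	# stand-in for the real mkfs probe
}

# _require_* style: acts on that status and skips when unsupported.
_require_btrfs_raid_type() {
	_check_btrfs_raid_type "$1" || _notrun "raid type $1 not supported"
}

_check_btrfs_raid_type raid0		# bare call: status silently discarded
_require_btrfs_raid_type raid0 && echo "raid0 available"
```

A bare `_check_*` call, like in the patch, is a no-op unless something
inspects `$?`.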
The rest looks good.
Thx.
Anand
> +
> +# This test uses 4K sectorsize and 4K nodesize, so that we can easily
> +# create a higher csum tree level.
> +_require_btrfs_support_sectorsize 4096
> +
> +# The bug itself has a race window; run this many times to make sure it
> +# triggers. On an x86_64 VM with KASAN enabled, it is normally triggered
> +# before the 10th run.
> +runtime=32
> +
> +_scratch_pool_mkfs "-d raid0 -m single -n 4k -s 4k" >> $seqres.full 2>&1
> +# This test requires data checksum to trigger the bug.
> +_scratch_mount -o datasum,datacow
> +
> +# For the smallest csum size (CRC32C) it's 4 bytes per 4K sector, so
> +# writing 32M of data needs at least 32K of data checksums, which takes
> +# at least 8 leaves.
> +_pwrite_byte 0xef 0 32m "$SCRATCH_MNT/foobar" > /dev/null
> +sync
> +_scratch_unmount
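The leaf estimate in the comment above can be sanity-checked with
back-of-envelope arithmetic (item and leaf headers are ignored, so
this is a lower bound):

```shell
# Lower-bound estimate of csum leaves for 32M of data: 4K sectors,
# 4-byte CRC32C per sector, 4K nodesize. Headers are ignored.
data_bytes=$((32 * 1024 * 1024))
csum_bytes=$((data_bytes / 4096 * 4))	# 32K of checksums
min_leaves=$((csum_bytes / 4096))	# at least 8 leaves
echo "$csum_bytes $min_leaves"
```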
> +
> +# Search for the last leaf of the csum tree; that will be the target to
> +# destroy.
> +$BTRFS_UTIL_PROG inspect dump-tree -t csum $SCRATCH_DEV >> $seqres.full
> +target_bytenr=$($BTRFS_UTIL_PROG inspect dump-tree -t csum $SCRATCH_DEV | grep "leaf.*flags" | sort | tail -n1 | cut -f2 -d\ )
> +
> +if [ -z "$target_bytenr" ]; then
> +	_fail "unable to locate the last csum tree leaf"
> +fi
> +
> +echo "bytenr of csum tree leaf to corrupt: $target_bytenr" >> $seqres.full
> +
> +# Corrupt that csum tree block.
> +physical=$(_btrfs_get_physical "$target_bytenr" 1)
> +dev=$(_btrfs_get_device_path "$target_bytenr" 1)
> +
> +echo "physical bytenr: $physical" >> $seqres.full
> +echo "physical device: $dev" >> $seqres.full
> +
> +_pwrite_byte 0x00 "$physical" 4 "$dev" > /dev/null
> +
> +for (( i = 0; i < $runtime; i++ )); do
> + echo "=== run $i/$runtime ===" >> $seqres.full
> + _scratch_mount -o ro
> +	# Since the data is on RAID0, read bios will be split at the stripe
> +	# (64K sized) boundary. If the csum lookup fails due to the corrupted
> +	# csum tree, there is a race window that can lead to a double bio free
> +	# (triggering KASAN at least).
> + cat "$SCRATCH_MNT/foobar" &> /dev/null
> + _scratch_unmount
> +
> +	# Manually check dmesg for "BUG", and do not call _check_dmesg(),
> +	# as that would clear the 'check_dmesg' file and skip the final
> +	# check after the test.
> +	# For now just focus on the "BUG:" line from KASAN.
> + if _check_dmesg_for "BUG" ; then
> + _fail "Critical error(s) found in dmesg"
> + fi
> +done
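The stripe-boundary splitting described in the comment above can be
sketched with plain arithmetic (assuming the usual 64K RAID0 stripe
length; the offsets are file-relative and chosen only for illustration):

```shell
# Where a 192K read starting at offset 32K gets cut by 64K stripe
# boundaries; each printed segment would become its own bio after
# splitting. Purely illustrative arithmetic, not btrfs code.
stripe=$((64 * 1024))
start=$((32 * 1024))
end=$((start + 192 * 1024))
pos=$start
while [ "$pos" -lt "$end" ]; do
	next=$(( (pos / stripe + 1) * stripe ))	# next stripe boundary
	[ "$next" -gt "$end" ] && next=$end	# clamp the final segment
	echo "segment: $pos..$next"
	pos=$next
done
```

It is one of these per-segment bios whose error path, on a failed csum
lookup, ends up freeing the original bio twice.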
> +
> +echo "Silence is golden"
> +
> +# success, all done
> +status=0
> +exit
> diff --git a/tests/btrfs/319.out b/tests/btrfs/319.out
> new file mode 100644
> index 00000000..d40c929a
> --- /dev/null
> +++ b/tests/btrfs/319.out
> @@ -0,0 +1,2 @@
> +QA output created by 319
> +Silence is golden
Thread overview: 5+ messages
2024-08-24 10:30 [PATCH v2] fstests: btrfs: a new test case to verify a use-after-free bug Qu Wenruo
2024-08-26 12:14 ` Filipe Manana
2024-08-26 12:45 ` Filipe Manana
2024-08-26 22:15 ` Qu Wenruo
2024-08-26 13:55 ` Anand Jain [this message]