From: Eryu Guan <guaneryu@gmail.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>, bfoster@redhat.com
Cc: linux-xfs@vger.kernel.org, fstests@vger.kernel.org
Subject: Re: [PATCH 1/1] xfs: check for COW overflows in i_delayed_blks
Date: Tue, 25 Jun 2019 10:53:51 +0800 [thread overview]
Message-ID: <20190625025351.GP15846@desktop> (raw)
In-Reply-To: <156089205776.347309.3970978868590991055.stgit@magnolia>
On Tue, Jun 18, 2019 at 02:07:37PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> With the new copy on write functionality it's possible to reserve so
> much COW space for a file that we end up overflowing i_delayed_blks.
> The only user-visible effect of this is to cause totally wrong i_blocks
> output in stat, so check for that.
>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Test looks fine to me and it runs well too.
Brian, really appreciate if you could help review this new version again
as well!
Thanks,
Eryu
> ---
> tests/xfs/907 | 199 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> tests/xfs/907.out | 8 ++
> tests/xfs/group | 1
> 3 files changed, 208 insertions(+)
> create mode 100755 tests/xfs/907
> create mode 100644 tests/xfs/907.out
>
>
> diff --git a/tests/xfs/907 b/tests/xfs/907
> new file mode 100755
> index 00000000..92cd0399
> --- /dev/null
> +++ b/tests/xfs/907
> @@ -0,0 +1,199 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0+
> +# Copyright (c) 2019 Oracle, Inc. All Rights Reserved.
> +#
> +# FS QA Test No. 907
> +#
> +# Try to overflow i_delayed_blks by setting the largest cowextsize hint
> +# possible, creating a sparse file with a single byte every cowextsize bytes,
> +# reflinking it, and retouching every written byte to see if we can create
> +# enough speculative COW reservations to overflow i_delayed_blks.
> +#
> +seq=`basename $0`
> +seqres=$RESULT_DIR/$seq
> +echo "QA output created by $seq"
> +
> +here=`pwd`
> +tmp=/tmp/$$
> +status=1 # failure is the default!
> +trap "_cleanup; exit \$status" 0 1 2 3 7 15
> +
> +_cleanup()
> +{
> + cd /
> + test -n "$loop_mount" && $UMOUNT_PROG $loop_mount > /dev/null 2>&1
> + test -n "$loop_dev" && _destroy_loop_device $loop_dev
> + rm -rf $tmp.*
> +}
> +
> +# get standard environment, filters and checks
> +. ./common/rc
> +. ./common/reflink
> +. ./common/filter
> +
> +# real QA test starts here
> +_supported_os Linux
> +_supported_fs xfs
> +_require_scratch_reflink
> +_require_cp_reflink
> +_require_loop
> +_require_xfs_debug # needed for xfs_bmap -c
> +
> +MAXEXTLEN=2097151 # cowextsize can't be more than MAXEXTLEN
> +
> +echo "Format and mount"
> +_scratch_mkfs > "$seqres.full" 2>&1
> +_scratch_mount
> +
> +# Create a huge sparse filesystem on the scratch device because that's what
> +# we're going to need to guarantee that we have enough blocks to overflow in
> +# the first place. We need to have at least enough free space on that huge fs
> +# to handle one written block every MAXEXTLEN blocks and to reserve 2^32 blocks
> +# in the COW fork. There needs to be sufficient space in the scratch
> +# filesystem to handle a 256M log, all the per-AG metadata, and all the data
> +# written to the test file.
> +#
> +# Worst case, a 64k-block fs needs to be about 300TB. Best case, a 1k block
> +# filesystem needs ~5TB. For the most common 4k case we only need a ~20TB fs.
> +#
> +# In practice, the author observed that the space required on the scratch fs
> +# never exceeded ~800M even for a 300T 6k-block filesystem, so we'll just ask
> +# for about 1.2GB.
> +blksz=$(_get_file_block_size "$SCRATCH_MNT")
> +nr_cows="$(( ((2 ** 32) / MAXEXTLEN) + 100 ))"
> +blks_needed="$(( nr_cows * (1 + MAXEXTLEN) ))"
> +loop_file_sz="$(( ((blksz * blks_needed) * 12 / 10) / 512 * 512 ))"
> +_require_fs_space $SCRATCH_MNT 1234567
> +
> +loop_file=$SCRATCH_MNT/a.img
> +loop_mount=$SCRATCH_MNT/a
> +$XFS_IO_PROG -f -c "truncate $loop_file_sz" $loop_file
> +loop_dev=$(_create_loop_device $loop_file)
> +
> +# Now we have to create the source file. The goal is to overflow a 32-bit
> +# i_delayed_blks, which means that we have to create at least that many delayed
> +# allocation block reservations. Take advantage of the fact that a cowextsize
> +# hint causes creation of large speculative delalloc reservations in the cow
> +# fork to reduce the amount of work we have to do.
> +#
> +# The maximum cowextsize can only be set to MAXEXTLEN fs blocks on a filesystem
> +# whose AGs each have more than MAXEXTLEN * 2 blocks. This we can do easily
> +# with a multi-terabyte filesystem, so start by setting up the hint. Note that
> +# the current fsxattr interface specifies its u32 cowextsize hint in units of
> +# bytes and therefore can't handle MAXEXTLEN * blksz on most filesystems, so we
> +# set it via mkfs because mkfs takes units of fs blocks, not bytes.
> +
> +_mkfs_dev -d cowextsize=$MAXEXTLEN -l size=256m $loop_dev >> $seqres.full
> +mkdir $loop_mount
> +mount $loop_dev $loop_mount
> +
> +echo "Create crazy huge file"
> +huge_file="$loop_mount/a"
> +touch "$huge_file"
> +blksz=$(_get_file_block_size "$loop_mount")
> +extsize_bytes="$(( MAXEXTLEN * blksz ))"
> +
> +# Make sure it actually set a hint.
> +curr_cowextsize_str="$($XFS_IO_PROG -c 'cowextsize' "$huge_file")"
> +echo "$curr_cowextsize_str" >> $seqres.full
> +cowextsize_bytes="$(echo "$curr_cowextsize_str" | sed -e 's/^.\([0-9]*\).*$/\1/g')"
> +test "$cowextsize_bytes" -eq 0 && echo "could not set cowextsize?"
> +
> +# Now we have to seed the file with sparse contents. Remember, the goal is to
> +# create a little more than 2^32 delayed allocation blocks in the COW fork with
> +# as little effort as possible. We know that speculative COW preallocation
> +# will create MAXEXTLEN-length reservations for us, so that means we should
> +# be able to get away with touching a single byte every extsize_bytes. We
> +# do this backwards to avoid having to move EOF.
> +seq $nr_cows -1 0 | while read n; do
> + off="$((n * extsize_bytes))"
> + $XFS_IO_PROG -c "pwrite $off 1" "$huge_file" > /dev/null
> +done
> +
> +echo "Reflink crazy huge file"
> +_cp_reflink "$huge_file" "$huge_file.b"
> +
> +# Now that we've shared all the blocks in the file, we touch them all again
> +# to create speculative COW preallocations.
> +echo "COW crazy huge file"
> +seq $nr_cows -1 0 | while read n; do
> + off="$((n * extsize_bytes))"
> + $XFS_IO_PROG -c "pwrite $off 1" "$huge_file" > /dev/null
> +done
> +
> +# Compare the number of blocks allocated to this file (as reported by stat)
> +# against the number of blocks that are in the COW fork. If either one is
> +# less than 2^32 then we have evidence of an overflow problem.
> +echo "Check crazy huge file"
> +allocated_stat_blocks="$(stat -c %b "$huge_file")"
> +stat_blksz="$(stat -c %B "$huge_file")"
> +allocated_fsblocks=$(( allocated_stat_blocks * stat_blksz / blksz ))
> +
> +# Make sure we got enough COW reservations to overflow a 32-bit counter.
> +
> +# Return the number of delalloc & real blocks given bmap output for a fork of a
> +# file. Output is in units of 512-byte blocks.
> +count_fork_blocks() {
> + $AWK_PROG "
> +{
> + if (\$3 == \"delalloc\") {
> + x += \$4;
> + } else if (\$3 == \"hole\") {
> + ;
> + } else {
> + x += \$6;
> + }
> +}
> +END {
> + print(x);
> +}
> +"
> +}
> +
> +# Count the number of blocks allocated to a file based on the xfs_bmap output.
> +# Output is in units of filesystem blocks.
> +count_file_fork_blocks() {
> + local tag="$1"
> + local file="$2"
> + local args="$3"
> +
> + $XFS_IO_PROG -c "bmap $args -l -p -v" "$huge_file" > $tmp.extents
> + echo "$tag fork map" >> $seqres.full
> + cat $tmp.extents >> $seqres.full
> + local sectors="$(count_fork_blocks < $tmp.extents)"
> + echo "$(( sectors / (blksz / 512) ))"
> +}
> +
> +cowblocks=$(count_file_fork_blocks cow "$huge_file" "-c")
> +attrblocks=$(count_file_fork_blocks attr "$huge_file" "-a")
> +datablocks=$(count_file_fork_blocks data "$huge_file" "")
> +
> +# Did we create more than 2^32 blocks in the cow fork?
> +# Make sure the test actually set us up for the overflow.
> +echo "datablocks is $datablocks" >> $seqres.full
> +echo "attrblocks is $attrblocks" >> $seqres.full
> +echo "cowblocks is $cowblocks" >> $seqres.full
> +test "$cowblocks" -lt $((2 ** 32)) && \
> + echo "cowblocks (${cowblocks}) should be more than 2^32!"
> +
> +# Does stat's block allocation count exceed 2^32?
> +# This is how we detect the incore delalloc count overflow.
> +echo "stat blocks is $allocated_fsblocks" >> $seqres.full
> +test "$allocated_fsblocks" -lt $((2 ** 32)) && \
> + echo "stat blocks (${allocated_fsblocks}) should be more than 2^32!"
> +
> +# Finally, does st_blocks match what we computed from the forks?
> +# Sanity check the values computed from the forks.
> +expected_allocated_fsblocks=$((datablocks + cowblocks + attrblocks))
> +echo "expected stat blocks is $expected_allocated_fsblocks" >> $seqres.full
> +
> +_within_tolerance "st_blocks" $allocated_fsblocks $expected_allocated_fsblocks 2% -v
> +
> +echo "Test done"
> +# Quick check the large sparse fs, but skip xfs_db because it doesn't scale
> +# well on a multi-terabyte filesystem.
> +LARGE_SCRATCH_DEV=yes _check_xfs_filesystem $loop_dev none none
> +
> +# success, all done
> +status=0
> +exit
> diff --git a/tests/xfs/907.out b/tests/xfs/907.out
> new file mode 100644
> index 00000000..cc07d659
> --- /dev/null
> +++ b/tests/xfs/907.out
> @@ -0,0 +1,8 @@
> +QA output created by 907
> +Format and mount
> +Create crazy huge file
> +Reflink crazy huge file
> +COW crazy huge file
> +Check crazy huge file
> +st_blocks is in range
> +Test done
> diff --git a/tests/xfs/group b/tests/xfs/group
> index ffe4ae12..e528c559 100644
> --- a/tests/xfs/group
> +++ b/tests/xfs/group
> @@ -504,3 +504,4 @@
> 504 auto quick mkfs label
> 505 auto quick spaceman
> 506 auto quick health
> +907 clone
>
next prev parent reply other threads:[~2019-06-25 2:53 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-06-18 21:07 [PATCH 0/1] xfs: test overflow of delalloc block counters Darrick J. Wong
2019-06-18 21:07 ` [PATCH 1/1] xfs: check for COW overflows in i_delayed_blks Darrick J. Wong
2019-06-25 2:53 ` Eryu Guan [this message]
-- strict thread matches above, loose matches on Subject: below --
2019-06-04 21:17 [PATCH 0/1] xfs: test overflow of delalloc block counters Darrick J. Wong
2019-06-04 21:17 ` [PATCH 1/1] xfs: check for COW overflows in i_delayed_blks Darrick J. Wong
2019-06-06 19:14 ` Brian Foster
2019-06-06 20:02 ` Darrick J. Wong
2019-06-07 11:24 ` Brian Foster
2019-06-07 15:53 ` Darrick J. Wong
2019-06-07 16:23 ` Brian Foster
2019-06-07 18:19 ` Darrick J. Wong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190625025351.GP15846@desktop \
--to=guaneryu@gmail.com \
--cc=bfoster@redhat.com \
--cc=darrick.wong@oracle.com \
--cc=fstests@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.