From: Eryu Guan <guaneryu@gmail.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>, bfoster@redhat.com
Cc: linux-xfs@vger.kernel.org, fstests@vger.kernel.org
Subject: Re: [PATCH 1/1] xfs: check for COW overflows in i_delayed_blks
Date: Tue, 25 Jun 2019 10:53:51 +0800 [thread overview]
Message-ID: <20190625025351.GP15846@desktop> (raw)
In-Reply-To: <156089205776.347309.3970978868590991055.stgit@magnolia>
On Tue, Jun 18, 2019 at 02:07:37PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> With the new copy on write functionality it's possible to reserve so
> much COW space for a file that we end up overflowing i_delayed_blks.
> The only user-visible effect of this is to cause totally wrong i_blocks
> output in stat, so check for that.
>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Test looks fine to me and it runs well too.
Brian, really appreciate if you could help review this new version again
as well!
Thanks,
Eryu
> ---
> tests/xfs/907 | 199 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> tests/xfs/907.out | 8 ++
> tests/xfs/group | 1
> 3 files changed, 208 insertions(+)
> create mode 100755 tests/xfs/907
> create mode 100644 tests/xfs/907.out
>
>
> diff --git a/tests/xfs/907 b/tests/xfs/907
> new file mode 100755
> index 00000000..92cd0399
> --- /dev/null
> +++ b/tests/xfs/907
> @@ -0,0 +1,199 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0+
> +# Copyright (c) 2019 Oracle, Inc. All Rights Reserved.
> +#
> +# FS QA Test No. 907
> +#
> +# Try to overflow i_delayed_blks by setting the largest cowextsize hint
> +# possible, creating a sparse file with a single byte every cowextsize bytes,
> +# reflinking it, and retouching every written byte to see if we can create
> +# enough speculative COW reservations to overflow i_delayed_blks.
> +#
> +seq=`basename $0`
> +seqres=$RESULT_DIR/$seq
> +echo "QA output created by $seq"
> +
> +here=`pwd`
> +tmp=/tmp/$$
> +status=1 # failure is the default!
> +trap "_cleanup; exit \$status" 0 1 2 3 7 15
> +
> +_cleanup()
> +{
> + cd /
> + test -n "$loop_mount" && $UMOUNT_PROG $loop_mount > /dev/null 2>&1
> + test -n "$loop_dev" && _destroy_loop_device $loop_dev
> + rm -rf $tmp.*
> +}
> +
> +# get standard environment, filters and checks
> +. ./common/rc
> +. ./common/reflink
> +. ./common/filter
> +
> +# real QA test starts here
> +_supported_os Linux
> +_supported_fs xfs
> +_require_scratch_reflink
> +_require_cp_reflink
> +_require_loop
> +_require_xfs_debug # needed for xfs_bmap -c
> +
> +MAXEXTLEN=2097151 # cowextsize can't be more than MAXEXTLEN
> +
> +echo "Format and mount"
> +_scratch_mkfs > "$seqres.full" 2>&1
> +_scratch_mount
> +
> +# Create a huge sparse filesystem on the scratch device because that's what
> +# we're going to need to guarantee that we have enough blocks to overflow in
> +# the first place. We need to have at least enough free space on that huge fs
> +# to handle one written block every MAXEXTLEN blocks and to reserve 2^32 blocks
> +# in the COW fork. There needs to be sufficient space in the scratch
> +# filesystem to handle a 256M log, all the per-AG metadata, and all the data
> +# written to the test file.
> +#
> +# Worst case, a 64k-block fs needs to be about 300TB. Best case, a 1k block
> +# filesystem needs ~5TB. For the most common 4k case we only need a ~20TB fs.
> +#
> +# In practice, the author observed that the space required on the scratch fs
> +# never exceeded ~800M even for a 300T 6k-block filesystem, so we'll just ask
> +# for about 1.2GB.
> +blksz=$(_get_file_block_size "$SCRATCH_MNT")
> +nr_cows="$(( ((2 ** 32) / MAXEXTLEN) + 100 ))"
> +blks_needed="$(( nr_cows * (1 + MAXEXTLEN) ))"
> +loop_file_sz="$(( ((blksz * blks_needed) * 12 / 10) / 512 * 512 ))"
> +_require_fs_space $SCRATCH_MNT 1234567
> +
> +loop_file=$SCRATCH_MNT/a.img
> +loop_mount=$SCRATCH_MNT/a
> +$XFS_IO_PROG -f -c "truncate $loop_file_sz" $loop_file
> +loop_dev=$(_create_loop_device $loop_file)
> +
> +# Now we have to create the source file. The goal is to overflow a 32-bit
> +# i_delayed_blks, which means that we have to create at least that many delayed
> +# allocation block reservations. Take advantage of the fact that a cowextsize
> +# hint causes creation of large speculative delalloc reservations in the cow
> +# fork to reduce the amount of work we have to do.
> +#
> +# The maximum cowextsize can only be set to MAXEXTLEN fs blocks on a filesystem
> +# whose AGs each have more than MAXEXTLEN * 2 blocks. This we can do easily
> +# with a multi-terabyte filesystem, so start by setting up the hint. Note that
> +# the current fsxattr interface specifies its u32 cowextsize hint in units of
> +# bytes and therefore can't handle MAXEXTLEN * blksz on most filesystems, so we
> +# set it via mkfs because mkfs takes units of fs blocks, not bytes.
> +
> +_mkfs_dev -d cowextsize=$MAXEXTLEN -l size=256m $loop_dev >> $seqres.full
> +mkdir $loop_mount
> +mount $loop_dev $loop_mount
> +
> +echo "Create crazy huge file"
> +huge_file="$loop_mount/a"
> +touch "$huge_file"
> +blksz=$(_get_file_block_size "$loop_mount")
> +extsize_bytes="$(( MAXEXTLEN * blksz ))"
> +
> +# Make sure it actually set a hint.
> +curr_cowextsize_str="$($XFS_IO_PROG -c 'cowextsize' "$huge_file")"
> +echo "$curr_cowextsize_str" >> $seqres.full
> +cowextsize_bytes="$(echo "$curr_cowextsize_str" | sed -e 's/^.\([0-9]*\).*$/\1/g')"
> +test "$cowextsize_bytes" -eq 0 && echo "could not set cowextsize?"
> +
> +# Now we have to seed the file with sparse contents. Remember, the goal is to
> +# create a little more than 2^32 delayed allocation blocks in the COW fork with
> +# as little effort as possible. We know that speculative COW preallocation
> +# will create MAXEXTLEN-length reservations for us, so that means we should
> +# be able to get away with touching a single byte every extsize_bytes. We
> +# do this backwards to avoid having to move EOF.
> +seq $nr_cows -1 0 | while read n; do
> + off="$((n * extsize_bytes))"
> + $XFS_IO_PROG -c "pwrite $off 1" "$huge_file" > /dev/null
> +done
> +
> +echo "Reflink crazy huge file"
> +_cp_reflink "$huge_file" "$huge_file.b"
> +
> +# Now that we've shared all the blocks in the file, we touch them all again
> +# to create speculative COW preallocations.
> +echo "COW crazy huge file"
> +seq $nr_cows -1 0 | while read n; do
> + off="$((n * extsize_bytes))"
> + $XFS_IO_PROG -c "pwrite $off 1" "$huge_file" > /dev/null
> +done
> +
> +# Compare the number of blocks allocated to this file (as reported by stat)
> +# against the number of blocks that are in the COW fork. If either one is
> +# less than 2^32 then we have evidence of an overflow problem.
> +echo "Check crazy huge file"
> +allocated_stat_blocks="$(stat -c %b "$huge_file")"
> +stat_blksz="$(stat -c %B "$huge_file")"
> +allocated_fsblocks=$(( allocated_stat_blocks * stat_blksz / blksz ))
> +
> +# Make sure we got enough COW reservations to overflow a 32-bit counter.
> +
> +# Return the number of delalloc & real blocks given bmap output for a fork of a
> +# file. Output is in units of 512-byte blocks.
> +count_fork_blocks() {
> + $AWK_PROG "
> +{
> + if (\$3 == \"delalloc\") {
> + x += \$4;
> + } else if (\$3 == \"hole\") {
> + ;
> + } else {
> + x += \$6;
> + }
> +}
> +END {
> + print(x);
> +}
> +"
> +}
> +
> +# Count the number of blocks allocated to a file based on the xfs_bmap output.
> +# Output is in units of filesystem blocks.
> +count_file_fork_blocks() {
> + local tag="$1"
> + local file="$2"
> + local args="$3"
> +
> + $XFS_IO_PROG -c "bmap $args -l -p -v" "$huge_file" > $tmp.extents
> + echo "$tag fork map" >> $seqres.full
> + cat $tmp.extents >> $seqres.full
> + local sectors="$(count_fork_blocks < $tmp.extents)"
> + echo "$(( sectors / (blksz / 512) ))"
> +}
> +
> +cowblocks=$(count_file_fork_blocks cow "$huge_file" "-c")
> +attrblocks=$(count_file_fork_blocks attr "$huge_file" "-a")
> +datablocks=$(count_file_fork_blocks data "$huge_file" "")
> +
> +# Did we create more than 2^32 blocks in the cow fork?
> +# Make sure the test actually set us up for the overflow.
> +echo "datablocks is $datablocks" >> $seqres.full
> +echo "attrblocks is $attrblocks" >> $seqres.full
> +echo "cowblocks is $cowblocks" >> $seqres.full
> +test "$cowblocks" -lt $((2 ** 32)) && \
> + echo "cowblocks (${cowblocks}) should be more than 2^32!"
> +
> +# Does stat's block allocation count exceed 2^32?
> +# This is how we detect the incore delalloc count overflow.
> +echo "stat blocks is $allocated_fsblocks" >> $seqres.full
> +test "$allocated_fsblocks" -lt $((2 ** 32)) && \
> + echo "stat blocks (${allocated_fsblocks}) should be more than 2^32!"
> +
> +# Finally, does st_blocks match what we computed from the forks?
> +# Sanity check the values computed from the forks.
> +expected_allocated_fsblocks=$((datablocks + cowblocks + attrblocks))
> +echo "expected stat blocks is $expected_allocated_fsblocks" >> $seqres.full
> +
> +_within_tolerance "st_blocks" $allocated_fsblocks $expected_allocated_fsblocks 2% -v
> +
> +echo "Test done"
> +# Quick check the large sparse fs, but skip xfs_db because it doesn't scale
> +# well on a multi-terabyte filesystem.
> +LARGE_SCRATCH_DEV=yes _check_xfs_filesystem $loop_dev none none
> +
> +# success, all done
> +status=0
> +exit
> diff --git a/tests/xfs/907.out b/tests/xfs/907.out
> new file mode 100644
> index 00000000..cc07d659
> --- /dev/null
> +++ b/tests/xfs/907.out
> @@ -0,0 +1,8 @@
> +QA output created by 907
> +Format and mount
> +Create crazy huge file
> +Reflink crazy huge file
> +COW crazy huge file
> +Check crazy huge file
> +st_blocks is in range
> +Test done
> diff --git a/tests/xfs/group b/tests/xfs/group
> index ffe4ae12..e528c559 100644
> --- a/tests/xfs/group
> +++ b/tests/xfs/group
> @@ -504,3 +504,4 @@
> 504 auto quick mkfs label
> 505 auto quick spaceman
> 506 auto quick health
> +907 clone
>
next prev parent reply other threads:[~2019-06-25 2:53 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-06-18 21:07 [PATCH 0/1] xfs: test overflow of delalloc block counters Darrick J. Wong
2019-06-18 21:07 ` [PATCH 1/1] xfs: check for COW overflows in i_delayed_blks Darrick J. Wong
2019-06-25 2:53 ` Eryu Guan [this message]
-- strict thread matches above, loose matches on Subject: below --
2019-06-04 21:17 [PATCH 0/1] xfs: test overflow of delalloc block counters Darrick J. Wong
2019-06-04 21:17 ` [PATCH 1/1] xfs: check for COW overflows in i_delayed_blks Darrick J. Wong
2019-06-06 19:14 ` Brian Foster
2019-06-06 20:02 ` Darrick J. Wong
2019-06-07 11:24 ` Brian Foster
2019-06-07 15:53 ` Darrick J. Wong
2019-06-07 16:23 ` Brian Foster
2019-06-07 18:19 ` Darrick J. Wong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190625025351.GP15846@desktop \
--to=guaneryu@gmail.com \
--cc=bfoster@redhat.com \
--cc=darrick.wong@oracle.com \
--cc=fstests@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).