Re: [PATCH v2] shared/298: Wire btrfs support in get_free_sectors

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Eryu Guan <guaneryu@gmail.com>
To: Nikolay Borisov <nborisov@suse.com>
Cc: dsterb@suse.cz, fstests@vger.kernel.org, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v2] shared/298: Wire btrfs support in get_free_sectors
Date: Fri, 22 Feb 2019 01:10:51 +0800	[thread overview]
Message-ID: <20190221171051.GC2713@desktop> (raw)
In-Reply-To: <20190208114404.13505-1-nborisov@suse.com>

On Fri, Feb 08, 2019 at 01:44:04PM +0200, Nikolay Borisov wrote:
> Add support for btrfs in shared/298. Achieve this by introducing 2
> new awk scripts that parse relevant btrfs structures and print holes.
> Additionally modify the test to create larger - 3gb filesystem in the
> case of btrfs. This is needed so that distinct block groups are used
> for data and metadata.
> 
> Signed-off-by: Nikolay Borisov <nborisov@suse.com>

Sorry for the late review.. I find that parsing btrfs extent and dev
tree is very btrfs-specific, it'd be great if btrfs folks could help
review the two awk scripts as well!

> ---
> 
> V2: 
>  * Changed the way args are passed to mkfs.btrfs to preserve existing 
>  options, yet override data/metadata profile settings
> 
>  parse-dev-tree.awk    |  47 +++++++++++++++++++
>  parse-extent-tree.awk | 125 ++++++++++++++++++++++++++++++++++++++++++++++++++

I'd prefer placing these files in src dir, instead of dumping them in
top dir directly.

>  tests/shared/298      |  36 +++++++++++++--
>  3 files changed, 205 insertions(+), 3 deletions(-)
>  create mode 100755 parse-dev-tree.awk
>  create mode 100755 parse-extent-tree.awk
> 
> diff --git a/parse-dev-tree.awk b/parse-dev-tree.awk
> new file mode 100755
> index 000000000000..52f9c0aadc25
> --- /dev/null
> +++ b/parse-dev-tree.awk
> @@ -0,0 +1,47 @@
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2019 Nikolay Borisov, SUSE LLC.  All Rights Reserved.
> +#
> +# Parses btrfs' device tree for holes, required parameters passed on command

I find this description not very useful, would you please describe the
expected output and format as well?

> +# line:
> +#     * spb - how many bytes per sector, this is used so that the numbers 
           ^^^ This is misleading, in shared/298 spb represents "sector
	   per block", but here it's really sector size.

> +# 		returned by the script are in sectors.
> +#	  * devsize - size of the device in bytes, used to output the final 

This line is not aligned with above line, it contains leading tab.

> +# 		hole (if any)
> +
> +function get_extent_size(line,  tmp) {

Would you please document the expected intput and output in comment as
well? So it's easier to review.

Also, is the 'tmp' argument really needed?

> +	split(line, tmp)
> +	return tmp[6]
> +}
> +
> +function get_extent_offset(line,  tmp) {

Same here.

> +	split(line, tmp)
> +	gsub(/\)/,"", tmp[6])
> +	return tmp[6]
> +}
> +
> +BEGIN {
> +	dev_extent_match="^.item [0-9]* key \\([0-9]* DEV_EXTENT [0-9]*\\).*"
> +	dev_extent_len_match="^\\s*chunk_objectid [0-9]* chunk_offset [0-9]* length [0-9]*$"
> +}
> +
> +{
> +	if (match($0,dev_extent_match)) {
> +		extent_offset = get_extent_offset($0)		
> +		if (prev_extent_end) {
> +			hole_size = extent_offset - prev_extent_end
> +			if (hole_size > 0) {
> +				print prev_extent_end / spb, int((extent_offset - 1) / spb)
> +			}
> +		} 
> +	} else if (match($0, dev_extent_len_match)) {
> +		extent_size = get_extent_size($0)
> +		prev_extent_end = extent_offset + extent_size
> +	}
> +}
> +
> +END {
> +	if (prev_extent_end) {
> +		print prev_extent_end / spb, int((devsize - 1) / spb)
> +	}
> +}
> +
> diff --git a/parse-extent-tree.awk b/parse-extent-tree.awk
> new file mode 100755
> index 000000000000..01c61254cfef
> --- /dev/null
> +++ b/parse-extent-tree.awk
> @@ -0,0 +1,125 @@
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2019 Nikolay Borisov, SUSE LLC.  All Rights Reserved.
> +#
> +# Parses btrfs' extent tree for holes, required parameters passed on command

Same here, please provide more details.

> +# line:
> +#     * spb - how many bytes per sector, this is used so that the numbers 

And replace 'spb' with a more proper variable name.

> +# 		returned by the script are in sectors.
> +#	  * nodesize - size of metadata extents, used for internal calculations

Indention issue too.

> +
> +function get_extent_size(line,  tmp) {
> +	if (line ~ data_match || line ~ bg_match) {
> +		split(line, tmp)
> +		gsub(/\)/,"", tmp[6])
> +		return tmp[6]
> +	} else if (line ~ metadata_match) {
> +		return nodesize
> +	}
> +}
> +
> +function get_extent_offset(line,  tmp) {
> +	split(line, tmp)
> +	gsub(/\(/,"",tmp[4])
> +	return tmp[4]
> +}
> +
> +function print_array(         base_offset, bg_line)

Document the expected input and output of these functions too.

And why there're so many spaces before 'base_offset' argument?

> +{
> +	if (match(lines[0], bg_match)) {
> +		#we don't have an extent at the beginning of of blockgroup, so we

Add a space after '#' for comments.

> +		#have a hole between blockgroup offset and first extent offset
> +		bg_line = lines[0]
> +		prev_size=0
> +		prev_offset=get_extent_offset(bg_line)
> +		delete lines[0]
> +	} else {
> +		#we have an extent at the beginning of block group, so initialize 
> +		#the prev_* vars correctly
> +		bg_line = lines[1]
> +		prev_size = get_extent_size(lines[0])
> +		prev_offset = get_extent_offset(lines[0])
> +		delete lines[1]
> +		delete lines[0]
> +	}
> +
> +	bg_offset=get_extent_offset(bg_line)
> +	bgend=bg_offset + get_extent_size(bg_line)
> +
> +	for (i in lines) {
> +			cur_size = get_extent_size(lines[i])
> +			cur_offset = get_extent_offset(lines[i])
> +			if (cur_offset  != prev_offset + prev_size)
> +				print int((prev_size + prev_offset) / spb), int((cur_offset-1) / spb)
> +			prev_size = cur_size
> +			prev_offset = cur_offset
> +	}
> +
> +	print int((prev_size + prev_offset) / spb), int((bgend-1) / spb)
> +	total_printed++
> +	delete lines
> +}
> +
> +BEGIN {
> +	loi_match="^.item [0-9]* key \\([0-9]* (BLOCK_GROUP_ITEM|METADATA_ITEM|EXTENT_ITEM) [0-9]*\\).*"
> +	metadata_match="^.item [0-9]* key \\([0-9]* METADATA_ITEM [0-9]*\\).*"
> +	data_match="^.item [0-9]* key \\([0-9]* EXTENT_ITEM [0-9]*\\).*"
> +	bg_match="^.item [0-9]* key \\([0-9]* BLOCK_GROUP_ITEM [0-9]*\\).*"
> +	node_match="^node.*$"
> +	leaf_match="^leaf [0-9]* flags"
> +	line_count=0
> +	total_printed=0
> +	skip_lines=0
> +}
> +
> +{
> +	#skip lines not belonging to a leaf
> +	if (match($0,node_match)) {
> +		skip_lines=1
> +	} else if (match($0, leaf_match)) {
> +		skip_lines=0
> +	}
> +
> +	if (!match($0,loi_match) || skip_lines == 1) next;
> +
> +	#we have a line of interest, we need to parse it. First check if there is
> +	#anything in the array
> +	if (line_count==0) {
> +		lines[line_count++]=$0;
> +	} else {
> +		prev_line=lines[line_count-1]
> +		split(prev_line, prev_line_fields)
> +		prev_objectid=prev_line_fields[4]
> +		objectid=$4
> +
> +		if (objectid == prev_objectid && match($0, bg_match)) {
> +			if (total_printed>0) {
> +				#We are adding a BG after we have added its first extent
> +				#previously, consider this a record ending event and just print
> +				#the array
> +
> +				delete lines[line_count-1]
> +				print_array()
> +				#we now start a new array with current and previous lines
> +				line_count=0
> +				lines[line_count++]=prev_line
> +				lines[line_count++]=$0
> +			} else {
> +				#first 2 added lines are EXTENT and BG that match, in this case
> +				#just add them
> +				lines[line_count++]=$0
> +				
> +			}
> +		} else if (match($0, bg_match)) {
> +			#ordinary end of record
> +			print_array()
> +			line_count=0
> +			lines[line_count++]=$0
> +		} else {
> +			lines[line_count++]=$0
> +		}
> +	}
> +}
> +
> +END {
> +	print_array()
> +}
> diff --git a/tests/shared/298 b/tests/shared/298
> index e7b7b233de45..8c053aa34adb 100755
> --- a/tests/shared/298
> +++ b/tests/shared/298
> @@ -15,14 +15,24 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
>  
>  . ./common/rc
>  
> -_supported_fs ext4 xfs
> +_supported_fs ext4 xfs btrfs
>  _supported_os Linux
>  _require_test
>  _require_loop
>  _require_fstrim
>  _require_xfs_io_command "fiemap"
> -_require_fs_space $TEST_DIR 307200
> +if [ "$FSTYP" = "btrfs" ]; then
> +	# 3g for btrfs to have distinct bgs
> +	_require_fs_space $TEST_DIR 3145728
> +	fssize=3000
> +else
> +	_require_fs_space $TEST_DIR 307200
> +	fssize=300
> +fi
> +
>  [ "$FSTYP" = "ext4" ] && _require_dumpe2fs
> +[ "$FSTYP" = "btrfs" ] && _require_btrfs_command inspect-internal dump-super
> +[ "$FSTYP" = "btrfs" ] && _require_btrfs_command inspect-internal dump-tree
>  
>  _cleanup()
>  {
> @@ -61,6 +71,25 @@ get_free_sectors()
>  		 $AWK_PROG -v spb=$sectors_per_block -v agsize=$agsize \
>  		'{ print spb * ($1 * agsize + $2), spb * ($1 * agsize + $2 + $3) - 1 }'
>  	;;
> +	btrfs)
> +		local device_size=$($BTRFS_UTIL_PROG filesystem show --raw $loop_mnt 2>&1 \
> +			| sed -n "s/^.*size \([0-9]*\).*$/\1/p")
> +
> +		local nodesize=$($BTRFS_UTIL_PROG inspect-internal dump-super $img_file  \
> +			| sed -n 's/nodesize\s*\(.*\)/\1/p')
> +
> +		# Get holes within block groups
> +		local btrfs_free=$($BTRFS_UTIL_PROG inspect-internal dump-tree -t extent $img_file \
> +			| $AWK_PROG -v spb=512 -v nodesize=$nodesize -f parse-extent-tree.awk)
> +
> +		btrfs_free+="\n"
> +
> +		# Get holes within unallocated space on disk
> +		btrfs_free+=$($BTRFS_UTIL_PROG inspect-internal dump-tree -t dev $img_file \
> +			| $AWK_PROG -v spb=512 -v devsize=$device_size -f parse-dev-tree.awk)
> +
> +		echo -e "$btrfs_free"

I don't think the "$btrfs_free" variable is needed, we could just let
awk print the result out.

Thanks,
Eryu

> +	;;
>  	esac
>  }
>  
> @@ -105,7 +134,7 @@ here=`pwd`
>  tmp=`mktemp -d`
>  
>  img_file=$TEST_DIR/$$.fs
> -dd if=/dev/zero of=$img_file bs=1M count=300 &> /dev/null
> +dd if=/dev/zero of=$img_file bs=1M count=$fssize &> /dev/null
>  
>  loop_dev=$(_create_loop_device $img_file)
>  loop_mnt=$tmp/loop_mnt
> @@ -118,6 +147,7 @@ merged_sectors="$tmp/merged_free_sectors"
>  mkdir $loop_mnt
>  
>  [ "$FSTYP" = "xfs" ] && MKFS_OPTIONS="-f $MKFS_OPTIONS"
> +[ "$FSTYP" = "btrfs" ] && MKFS_OPTIONS="$MKFS_OPTIONS -f -dsingle -msingle"
>  
>  _mkfs_dev $loop_dev
>  _mount $loop_dev $loop_mnt
> -- 
> 2.7.4
>

next prev parent reply	other threads:[~2019-02-21 17:10 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-08 11:44 [PATCH v2] shared/298: Wire btrfs support in get_free_sectors Nikolay Borisov
2019-02-21  8:19 ` Nikolay Borisov
2019-02-21 17:10 ` Eryu Guan [this message]
2019-02-22  6:23   ` Nikolay Borisov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190221171051.GC2713@desktop \
    --to=guaneryu@gmail.com \
    --cc=dsterb@suse.cz \
    --cc=fstests@vger.kernel.org \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=nborisov@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.