linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Goffredo Baroncelli <kreijack@inwind.it>
To: Ashish Samant <ashish.samant@oracle.com>
Cc: linux-btrfs@vger.kernel.org, bo.li.liu@oracle.com
Subject: Re: [PATCH] Btrfs-progs: add check-only option for balance
Date: Fri, 10 Jun 2016 19:57:15 +0200	[thread overview]
Message-ID: <9dbb261c-3bfd-16e1-3ed6-c12ed9d59601@inwind.it> (raw)
In-Reply-To: <1465508801-14163-1-git-send-email-ashish.samant@oracle.com>

Hi all,

On 2016-06-09 23:46, Ashish Samant wrote:
> From: Liu Bo <bo.li.lu@oracle.com>
> 
> This aims to decide whether a balance can reduce the number of
> data block groups and if it can, this shows the '-dvrange' block
> group's objectid.
> 
> With this, you can run
> 'btrfs balance start -c mnt' or 'btrfs balance start --check-only mnt'
> 
>  --------------------------------------------------------------
> $ btrfs balance start -c /mnt/btrfs
> Checking data block groups...
> block_group        12582912 (len     8388608 used      786432)
> block_group      1103101952 (len  1073741824 used   536870912)
> block_group      2176843776 (len  1073741824 used  1073741824)
> total bgs 3 total_free 544473088 min_used bg 12582912 has (min_used 786432 free 7602176)
> run 'btrfs balance start -dvrange=12582912..12582913 your_mnt'
> 
> $ btrfs balance start -dvrange=12582912..12582913 /mnt/btrfs
> Done, had to relocate 1 out of 5 chunks
> 
> $ btrfs balance start -c /mnt/btrfs
> Checking data block groups...
> block_group      1103101952 (len  1073741824 used   537395200)
> block_group      2176843776 (len  1073741824 used  1073741824)
> total bgs 2 total_free 536346624 min_used bg 1103101952 has (min_used 537395200 free 536346624)
>  --------------------------------------------------------------
> 
> So you now know how to babysit your btrfs in a smart way.

I think that it is an excellent tool. However I have some suggestions, most of these are from an user interface POV:

1) this should be a real command; it doesn't make sense at all that this command is a "sub command" of "btrfs bal start". I have two suggestion about that:
a) we could add a new sub-command to the "balance" family. Something like "btrfs bal analisys", where we could put some suggestions for a good balance
b) we could add a new sub-command to the "inspect" family. We could also add some feature like showing other block_gruop (system and metadata), and print their profile: i.e.

  # btrfs inspect block-group-analisys /
  Type           Mode              Start         Len        Used
  Data           single      83806388224     1.00GiB   945.64MiB
  Data           single      84880130048     1.00GiB   890.60MiB
  Data           single      85953871872     1.00GiB   818.18MiB
  Data           single      87027613696     1.00GiB   835.58MiB
  Data           single      88101355520     1.00GiB  1023.91MiB
  System         single      89175097344    32.00MiB    16.00KiB
  Metadata       single      89208651776     1.00GiB   614.88MiB
  [...]

further options could be added like showing only the most empty chunks, sorting by the Used value, filtering by type and/or profile....

2) From a readability POV, I suggest to use the pretty_size() function to display a more readable "len" and "used".
3) For the same reason, I suggest to switch to a "tabular" format, like my example: it doesn't make sense to write for every line "block_group/len/used"...
4) when the usual balance command fails because ENOSPACE, we could suggest to use this new command....

more notes below

BR
G.Baroncelli

> 
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> Signed-off-by: Ashish Samant <ashish.samant@oracle.com>
> ---
>  cmds-balance.c |  127 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 files changed, 126 insertions(+), 1 deletions(-)
> 
> diff --git a/cmds-balance.c b/cmds-balance.c
> index 8f3bf5b..e2aab6c 100644
> --- a/cmds-balance.c
> +++ b/cmds-balance.c
> @@ -493,6 +493,116 @@ out:
>  	return ret;
>  }
>  
> +/* return 0 if balance can remove a data block group, otherwise return 1 */
> +static int search_data_bgs(const char *path)
> +{
> +	struct btrfs_ioctl_search_args args;
> +	struct btrfs_ioctl_search_key *sk;
> +	struct btrfs_ioctl_search_header *header;
> +	struct btrfs_block_group_item *bg;
> +	unsigned long off = 0;
> +	DIR *dirstream = NULL;
> +	int e;
> +	int fd;
> +	int i;
> +	u64 total_free = 0;
> +	u64 min_used = (u64)-1;
> +	u64 free_of_min_used = 0;
> +	u64 bg_of_min_used = 0;
> +	u64 flags;
> +	u64 used;
> +	int ret = 0;
> +	int nr_data_bgs = 0;
> +
> +	fd = btrfs_open_dir(path, &dirstream, 1);
> +	if (fd < 0)
> +		return 1;
> +
> +	memset(&args, 0, sizeof(args));
> +	sk = &args.key;
> +
> +	sk->tree_id = BTRFS_EXTENT_TREE_OBJECTID;
> +	sk->min_objectid = sk->min_offset = sk->min_transid = 0;
> +	sk->max_objectid = sk->max_offset = sk->max_transid = (u64)-1;
> +	sk->max_type = sk->min_type = BTRFS_BLOCK_GROUP_ITEM_KEY;
> +	sk->nr_items = 65536;
> +
> +	while (1) {
> +		ret = ioctl(fd, BTRFS_IOC_TREE_SEARCH, &args);
> +		e = errno;
> +		if (ret < 0) {
> +			fprintf(stderr, "ret %d error '%s'\n", ret,
> +				strerror(e));
> +			return ret;
> +		}
> +		/*
> +		 * it should not happen.
> +		 */
> +		if (sk->nr_items == 0)
> +			break;
> +
> +		off = 0;
> +		for (i = 0; i < sk->nr_items; i++) {
> +			header = (struct btrfs_ioctl_search_header *)(args.buf
> +								      + off);
> +
> +			off += sizeof(*header);
> +			if (header->type == BTRFS_BLOCK_GROUP_ITEM_KEY) {
> +				bg = (struct btrfs_block_group_item *)
> +					(args.buf + off);
> +				flags = btrfs_block_group_flags(bg);
> +				if (flags & BTRFS_BLOCK_GROUP_DATA) {
> +					nr_data_bgs++;
> +					used = btrfs_block_group_used(bg);
> +					printf(
> +					"block_group %15llu (len %11llu used %11llu)\n",
> +							header->objectid,
> +							header->offset, used);

                                                        ^^^^^^^^^^^^^^^^^^^^^
Use pretty_size() around "header->offset" and "used"...

> +					total_free += header->offset - used;
> +					if (min_used >= used) {

I think that it would be better to track the chunk with the max space free that we can reallocate: the chunk from which we should start must be selected from the ones that have the properties:
	used <= total_free
and this chunk must have
	max(size - used)
This to free the max space with one shot
	
	

> +						min_used = used;
> +						free_of_min_used =
> +							header->offset - used;
> +						bg_of_min_used =
> +							header->objectid;
> +					}
> +				}
> +			}
> +
> +			off += header->len;
> +			sk->min_objectid = header->objectid;
> +			sk->min_type = header->type;
> +			sk->min_offset = header->offset;
> +		}
> +		sk->nr_items = 65536;
> +
> +		if (sk->min_objectid < sk->max_objectid)
> +			sk->min_objectid += 1;
> +		else
> +			break;
> +	}
> +
> +	if (nr_data_bgs <= 1) {
> +		printf("Data block groups in fs = %d, no need to do balance.\n",
> +				nr_data_bgs);
> +		return 1;
> +	}
> +
> +	printf("total bgs %d total_free %llu min_used bg %llu has (min_used %llu free %llu)\n",
> +			nr_data_bgs, total_free, bg_of_min_used, min_used,
> +			free_of_min_used);
> +	if (total_free - free_of_min_used > min_used) {
> +		printf("run 'btrfs balance start -dvrange=%llu..%llu <mountpoint>'\n",
> +				bg_of_min_used, bg_of_min_used + 1);
> +		ret = 0;
> +	} else {
> +		printf("Please don't balance data block groups, it won't work.\n");

We should add the reason because the balance will not fail

"WARNING: don't balance data block groups; it won't work, because there is no enough space free".


> +		ret = 1;
> +	}
> +
> +	return ret;
> +}
> +
>  static const char * const cmd_balance_start_usage[] = {
>  	"btrfs balance start [options] <path>",
>  	"Balance chunks across the devices",
> @@ -508,6 +618,7 @@ static const char * const cmd_balance_start_usage[] = {
>  	"-m[filters]    act on metadata chunks",
>  	"-s[filters]    act on system chunks (only under -f)",
>  	"-v             be verbose",
> +	"-c             only check if balance would make sense, not doing real job",
>  	"-f             force reducing of metadata integrity",
>  	"--full-balance do not print warning and do not delay start",
>  	NULL
> @@ -521,6 +632,7 @@ static int cmd_balance_start(int argc, char **argv)
>  	int force = 0;
>  	int verbose = 0;
>  	unsigned start_flags = 0;
> +	int check_data_bgs = 0;
>  	int i;
>  
>  	memset(&args, 0, sizeof(args));
> @@ -534,12 +646,14 @@ static int cmd_balance_start(int argc, char **argv)
>  			{ "system", optional_argument, NULL, 's' },
>  			{ "force", no_argument, NULL, 'f' },
>  			{ "verbose", no_argument, NULL, 'v' },
> +			{ "check-only", no_argument, NULL, 'c' },
>  			{ "full-balance", no_argument, NULL,
>  				GETOPT_VAL_FULL_BALANCE },
>  			{ NULL, 0, NULL, 0 }
>  		};
>  
> -		int opt = getopt_long(argc, argv, "d::s::m::fv", longopts, NULL);
> +		int opt = getopt_long(argc, argv, "d::s::m::fvc", longopts,
> +				      NULL);
>  		if (opt < 0)
>  			break;
>  
> @@ -571,6 +685,9 @@ static int cmd_balance_start(int argc, char **argv)
>  		case 'v':
>  			verbose = 1;
>  			break;
> +		case 'c':
> +			check_data_bgs = 1;
> +			break;
>  		case GETOPT_VAL_FULL_BALANCE:
>  			start_flags |= BALANCE_START_NOWARN;
>  			break;
> @@ -582,6 +699,14 @@ static int cmd_balance_start(int argc, char **argv)
>  	if (check_argc_exact(argc - optind, 1))
>  		usage(cmd_balance_start_usage);
>  
> +	if (check_data_bgs) {
> +		if (verbose)
> +			dump_ioctl_balance_args(&args);
> +
> +		printf("Checking data block groups...\n");
> +		return search_data_bgs(argv[optind]);
> +	}
> +
>  	/*
>  	 * allow -s only under --force, otherwise do with system chunks
>  	 * the same thing we were ordered to do with meta chunks
> 


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5

  reply	other threads:[~2016-06-10 17:57 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-06-09 21:46 [PATCH] Btrfs-progs: add check-only option for balance Ashish Samant
2016-06-10 17:57 ` Goffredo Baroncelli [this message]
2016-06-10 20:47 ` Hans van Kranenburg
2016-06-12 18:41   ` Goffredo Baroncelli
2016-06-12 18:53     ` Hans van Kranenburg
2016-06-14 18:11       ` Goffredo Baroncelli
2016-06-14 18:16         ` Hugo Mills
2016-06-14 18:55           ` Goffredo Baroncelli
2016-06-14 18:21   ` Ashish Samant
  -- strict thread matches above, loose matches on Subject: below --
2016-01-14 23:12 Liu Bo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9dbb261c-3bfd-16e1-3ed6-c12ed9d59601@inwind.it \
    --to=kreijack@inwind.it \
    --cc=ashish.samant@oracle.com \
    --cc=bo.li.liu@oracle.com \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).