Re: [PATCHv3 2/3] diff: introduce --stat-lines to limit the stat lines

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Michael J Gruber <git@drmicha.warpmail.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCHv3 2/3] diff: introduce --stat-lines to limit the stat lines
Date: Mon, 30 May 2011 08:40:48 +0200	[thread overview]
Message-ID: <4DE33BF0.7060607@drmicha.warpmail.net> (raw)
In-Reply-To: <7v39jzqvny.fsf@alter.siamese.dyndns.org>

Junio C Hamano venit, vidit, dixit 28.05.2011 06:46:
> Michael J Gruber <git@drmicha.warpmail.net> writes:
> 
>> diff --git a/diff.c b/diff.c
>> index 4541939..6a6cbca 100644
>> --- a/diff.c
>> +++ b/diff.c
>> @@ -1259,6 +1259,7 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options)
>>  
>>  	width = options->stat_width ? options->stat_width : 80;
>>  	name_width = options->stat_name_width ? options->stat_name_width : 50;
>> +	count = options->stat_count ? options->stat_count : data->nr;
> 
> It is a tad hard to follow the logic of "count" here. It could be the
> ceiling the user gave you, or we may not be constrained at all if the full
> diff is smaller.  But then
> 
>> @@ -1275,11 +1276,12 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options)
>>  	add_c = diff_get_color_opt(options, DIFF_FILE_NEW);
>>  	del_c = diff_get_color_opt(options, DIFF_FILE_OLD);
>>  
>> -	for (i = 0; i < data->nr; i++) {
>> +	for (i = 0; (i < count) && (i < data->nr); i++) {
> 
> Here we compare "i" which does not have much to do with how many we
> actually have to show (as we may be skipping in the loop) with "count" and
> also with data->nr.  In the loop, you compensate for the skipping
> (i.e. increment of "i" that should not affect the outcome) by incrementing
> "count", so (i < count) part will correctly count towards the limit the
> user gave. It may be a correct math, but it is tricky and hard to read.
> Especially if we started with the options->stat_count that is larger than
> the data->nr, of course "i < count" will always hold true with or without
> skipping, which also is correct but even trickier.

Maybe I was trying too hard to be efficient. We could have count be
stat_count if non-zero or data->nr and limit all loops by the double
conditional (i < count) && (i < data->nr), resp. use an additional
counter like shown or skipped and compare with that.

Given the short discussion for the initial patch, I really did not
expect how much it would still take... (My bad, of course.)
> 
>>  		struct diffstat_file *file = data->files[i];
>>  		uintmax_t change = file->added + file->deleted;
>>  		if (!data->files[i]->is_renamed &&
>>  			 (change == 0)) {
>> +			count++; /* not shown == room for one more */
>>  			continue;
>>  		}
>>  		fill_print_name(file);
>> @@ -1292,6 +1294,7 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options)
>>  		if (max_change < change)
>>  			max_change = change;
>>  	}
>> +	count = i; /* min(count, data->nr) */
> 
> This value is mechanically min(count, data->nr) but it is confusing to
> explain what it really means.  It is the number of elements you have to
> look at in data->files[] array, still skipping irrelevant entries in the
> loop, until you show options->stat_count number of entries.

As you explain further down, we may have to look at even more in order
to be able to say "..." or not.

> 
>> @@ -1306,7 +1309,7 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options)
>>  	else
>>  		width = max_change;
>>  
>> -	for (i = 0; i < data->nr; i++) {
>> +	for (i = 0; i < count; i++) {
> 
> Hence this loop is correct. You ignore everything beyond count in the
> array, as their lengths are immaterial to the outcome.
> 

I think I really should not worry too much about efficiency here and
more about simple code.

>> @@ -1373,6 +1376,19 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options)
>>  		show_graph(options->file, '-', del, del_c, reset);
>>  		fprintf(options->file, "\n");
>>  	}
>> +	if (count < data->nr)
>> +		fprintf(options->file, "%s ...\n", line_prefix);
> 
> At this point, count is the sum of the number of items you are going to
> show and the number of items you are going to skip. If that is smaller
> than the total in the original array, there are items that you are not
> showing nor skipping. But you terminated the first loop early before
> reaching the end of data->file[] array, so you do not know enough to say
> "there _are_ more to show but I am not showing", I think.

Sigh, you're right. We would have to look until we find the first
non-show non-skipped in order to say that.

> 
> If
> 
>  - data->nr is 3;
>  - the data->file[0] is to be shown but other entries in data->file[] are
>    not to be shown; and
>  - opt->stat_count is 1
> 
> then your first loop fills data->file[0]->name, increments i to 1, the
> condition (i < count) no longer holds, and stops.  At this point count is
> 1, data->nr is 3, but data->file[1] and [2] are not to be shown anyway, so
> this "there are more but I am not showing them" is not quite right, is it?
> 
>> +	for (i = count; i < data->nr; i++) {
>> +		uintmax_t added = data->files[i]->added;
>> +		uintmax_t deleted = data->files[i]->deleted;
>> +		if (!data->files[i]->is_renamed &&
>> +			 (added + deleted == 0)) {
>> +			total_files--;
>> +			continue;
>> +		}
>> +		adds += added;
>> +		dels += deleted;
>> +	}
> 
> If this loop runs "adds += added" even once, then you have "there are more
> but I am not showing" situation, and if all the entries beyond count are
> skipped, then you are not limiting the output.
> 
> By the way, in this round, you are trying to show only the first N items,
> and if there are more to show, then showing "..." after them. I kind of
> liked the idea that came up during the discussion in earlier rounds to
> show N-1 then "..."  then the last one item, but I do not feel too
> strongly about it.

I thought fmt-merge-msg was the example to follow. It does exactly what
I'm trying here, i.e. show count items (if there are count or more), and
show "..." if there are more.

>  diff.c                     |    6 ++++--
>  t/t4049-diff-stat-count.sh |   25 +++++++++++++++++++++++++
>  2 files changed, 29 insertions(+), 2 deletions(-)
> 
> diff --git a/diff.c b/diff.c
> index 82789e3..542ff42 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -1247,6 +1247,7 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options)
>  	int width, name_width, count;
>  	const char *reset, *add_c, *del_c;
>  	const char *line_prefix = "";
> +	int extra_shown = 0;
>  	struct strbuf *msg = NULL;
>  
>  	if (data->nr == 0)
> @@ -1376,8 +1377,6 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options)
>  		show_graph(options->file, '-', del, del_c, reset);
>  		fprintf(options->file, "\n");
>  	}
> -	if (count < data->nr)
> -		fprintf(options->file, "%s ...\n", line_prefix);
>  	for (i = count; i < data->nr; i++) {
>  		uintmax_t added = data->files[i]->added;
>  		uintmax_t deleted = data->files[i]->deleted;
> @@ -1388,6 +1387,9 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options)
>  		}
>  		adds += added;
>  		dels += deleted;
> +		if (!extra_shown)
> +			fprintf(options->file, "%s ...\n", line_prefix);
> +		extra_shown = 1;
>  	}
>  	fprintf(options->file, "%s", line_prefix);
>  	fprintf(options->file,
> diff --git a/t/t4049-diff-stat-count.sh b/t/t4049-diff-stat-count.sh
> new file mode 100755
> index 0000000..641e70d
> --- /dev/null
> +++ b/t/t4049-diff-stat-count.sh
> @@ -0,0 +1,25 @@
> +#!/bin/sh
> +# Copyright (c) 2011, Google Inc.
> +
> +test_description='diff --stat-count'
> +. ./test-lib.sh
> +
> +test_expect_success setup '
> +	>a &&
> +	>b &&
> +	>c &&
> +	>d &&
> +	git add a b c d &&
> +	chmod +x c d &&
> +	echo a >a &&
> +	echo b >b &&
> +	cat >expect <<-\EOF
> +	 a |    1 +
> +	 b |    1 +
> +	 2 files changed, 2 insertions(+), 0 deletions(-)
> +	EOF
> +	git diff --stat --stat-count=2 >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_done

That works, thanks also for the test.

Should I squash this in or go for a clearer use of "count"? (Also, I may
have to take into account your notes about the workflow.)

Michael

next prev parent reply	other threads:[~2011-05-30  6:40 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-04-29 14:57 [PATCH 1/2] diff: introduce --stat-lines to limit the stat lines Michael J Gruber
2011-04-29 14:57 ` [PATCH 2/2] diff-options.txt: describe --stat-{width,name-width,lines} Michael J Gruber
2011-04-29 15:12 ` [PATCH 1/2] diff: introduce --stat-lines to limit the stat lines Michael J Gruber
2011-04-29 16:15 ` Junio C Hamano
2011-05-01  8:27   ` Michael J Gruber
2011-05-01 18:33     ` Junio C Hamano
2011-05-03 10:46       ` [PATCHv2 " Michael J Gruber
2011-05-03 10:46         ` [PATCHv2 2/2] diff-options.txt: describe --stat-{width,name-width,count} Michael J Gruber
2011-05-03 18:47         ` [PATCHv2 1/2] diff: introduce --stat-lines to limit the stat lines Junio C Hamano
2011-05-04  7:16           ` Michael J Gruber
2011-05-27 12:36             ` [PATCHv3 0/3] diff.c: --stat-count=<n> Michael J Gruber
2011-05-27 12:36               ` [PATCHv3 1/3] diff.c: omit hidden entries from namelen calculation with --stat Michael J Gruber
2011-05-27 17:43                 ` Junio C Hamano
2011-05-27 20:19                   ` Michael J Gruber
2011-05-27 22:45                     ` Junio C Hamano
2011-05-27 12:36               ` [PATCHv3 2/3] diff: introduce --stat-lines to limit the stat lines Michael J Gruber
2011-05-28  4:46                 ` Junio C Hamano
2011-05-30  6:40                   ` Michael J Gruber [this message]
2011-05-30  7:36                     ` Junio C Hamano
2011-05-27 12:36               ` [PATCHv3 3/3] diff-options.txt: describe --stat-{width,name-width,count} Michael J Gruber
2011-05-04  4:37         ` [PATCHv2 1/2] diff: introduce --stat-lines to limit the stat lines Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4DE33BF0.7060607@drmicha.warpmail.net \
    --to=git@drmicha.warpmail.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.