From: Michael J Gruber <git@drmicha.warpmail.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCHv3 2/3] diff: introduce --stat-lines to limit the stat lines
Date: Mon, 30 May 2011 08:40:48 +0200 [thread overview]
Message-ID: <4DE33BF0.7060607@drmicha.warpmail.net> (raw)
In-Reply-To: <7v39jzqvny.fsf@alter.siamese.dyndns.org>
Junio C Hamano venit, vidit, dixit 28.05.2011 06:46:
> Michael J Gruber <git@drmicha.warpmail.net> writes:
>
>> diff --git a/diff.c b/diff.c
>> index 4541939..6a6cbca 100644
>> --- a/diff.c
>> +++ b/diff.c
>> @@ -1259,6 +1259,7 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options)
>>
>> width = options->stat_width ? options->stat_width : 80;
>> name_width = options->stat_name_width ? options->stat_name_width : 50;
>> + count = options->stat_count ? options->stat_count : data->nr;
>
> It is a tad hard to follow the logic of "count" here. It could be the
> ceiling the user gave you, or we may not be constrained at all if the full
> diff is smaller. But then
>
>> @@ -1275,11 +1276,12 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options)
>> add_c = diff_get_color_opt(options, DIFF_FILE_NEW);
>> del_c = diff_get_color_opt(options, DIFF_FILE_OLD);
>>
>> - for (i = 0; i < data->nr; i++) {
>> + for (i = 0; (i < count) && (i < data->nr); i++) {
>
> Here we compare "i" which does not have much to do with how many we
> actually have to show (as we may be skipping in the loop) with "count" and
> also with data->nr. In the loop, you compensate for the skipping
> (i.e. increment of "i" that should not affect the outcome) by incrementing
> "count", so (i < count) part will correctly count towards the limit the
> user gave. It may be a correct math, but it is tricky and hard to read.
> Especially if we started with the options->stat_count that is larger than
> the data->nr, of course "i < count" will always hold true with or without
> skipping, which also is correct but even trickier.
Maybe I was trying too hard to be efficient. We could have count be
stat_count if non-zero or data->nr and limit all loops by the double
conditional (i < count) && (i < data->nr), resp. use an additional
counter like shown or skipped and compare with that.
Given the short discussion for the initial patch, I really did not
expect how much it would still take... (My bad, of course.)
>
>> struct diffstat_file *file = data->files[i];
>> uintmax_t change = file->added + file->deleted;
>> if (!data->files[i]->is_renamed &&
>> (change == 0)) {
>> + count++; /* not shown == room for one more */
>> continue;
>> }
>> fill_print_name(file);
>> @@ -1292,6 +1294,7 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options)
>> if (max_change < change)
>> max_change = change;
>> }
>> + count = i; /* min(count, data->nr) */
>
> This value is mechanically min(count, data->nr) but it is confusing to
> explain what it really means. It is the number of elements you have to
> look at in data->files[] array, still skipping irrelevant entries in the
> loop, until you show options->stat_count number of entries.
As you explain further down, we may have to look at even more in order
to be able to say "..." or not.
>
>> @@ -1306,7 +1309,7 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options)
>> else
>> width = max_change;
>>
>> - for (i = 0; i < data->nr; i++) {
>> + for (i = 0; i < count; i++) {
>
> Hence this loop is correct. You ignore everything beyond count in the
> array, as their lengths are immaterial to the outcome.
>
I think I really should not worry too much about efficiency here and
more about simple code.
>> @@ -1373,6 +1376,19 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options)
>> show_graph(options->file, '-', del, del_c, reset);
>> fprintf(options->file, "\n");
>> }
>> + if (count < data->nr)
>> + fprintf(options->file, "%s ...\n", line_prefix);
>
> At this point, count is the sum of the number of items you are going to
> show and the number of items you are going to skip. If that is smaller
> than the total in the original array, there are items that you are not
> showing nor skipping. But you terminated the first loop early before
> reaching the end of data->file[] array, so you do not know enough to say
> "there _are_ more to show but I am not showing", I think.
Sigh, you're right. We would have to look until we find the first
non-show non-skipped in order to say that.
>
> If
>
> - data->nr is 3;
> - the data->file[0] is to be shown but other entries in data->file[] are
> not to be shown; and
> - opt->stat_count is 1
>
> then your first loop fills data->file[0]->name, increments i to 1, the
> condition (i < count) no longer holds, and stops. At this point count is
> 1, data->nr is 3, but data->file[1] and [2] are not to be shown anyway, so
> this "there are more but I am not showing them" is not quite right, is it?
>
>> + for (i = count; i < data->nr; i++) {
>> + uintmax_t added = data->files[i]->added;
>> + uintmax_t deleted = data->files[i]->deleted;
>> + if (!data->files[i]->is_renamed &&
>> + (added + deleted == 0)) {
>> + total_files--;
>> + continue;
>> + }
>> + adds += added;
>> + dels += deleted;
>> + }
>
> If this loop runs "adds += added" even once, then you have "there are more
> but I am not showing" situation, and if all the entries beyond count are
> skipped, then you are not limiting the output.
>
> By the way, in this round, you are trying to show only the first N items,
> and if there are more to show, then showing "..." after them. I kind of
> liked the idea that came up during the discussion in earlier rounds to
> show N-1 then "..." then the last one item, but I do not feel too
> strongly about it.
I thought fmt-merge-msg was the example to follow. It does exactly what
I'm trying here, i.e. show count items (if there are count or more), and
show "..." if there are more.
> diff.c | 6 ++++--
> t/t4049-diff-stat-count.sh | 25 +++++++++++++++++++++++++
> 2 files changed, 29 insertions(+), 2 deletions(-)
>
> diff --git a/diff.c b/diff.c
> index 82789e3..542ff42 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -1247,6 +1247,7 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options)
> int width, name_width, count;
> const char *reset, *add_c, *del_c;
> const char *line_prefix = "";
> + int extra_shown = 0;
> struct strbuf *msg = NULL;
>
> if (data->nr == 0)
> @@ -1376,8 +1377,6 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options)
> show_graph(options->file, '-', del, del_c, reset);
> fprintf(options->file, "\n");
> }
> - if (count < data->nr)
> - fprintf(options->file, "%s ...\n", line_prefix);
> for (i = count; i < data->nr; i++) {
> uintmax_t added = data->files[i]->added;
> uintmax_t deleted = data->files[i]->deleted;
> @@ -1388,6 +1387,9 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options)
> }
> adds += added;
> dels += deleted;
> + if (!extra_shown)
> + fprintf(options->file, "%s ...\n", line_prefix);
> + extra_shown = 1;
> }
> fprintf(options->file, "%s", line_prefix);
> fprintf(options->file,
> diff --git a/t/t4049-diff-stat-count.sh b/t/t4049-diff-stat-count.sh
> new file mode 100755
> index 0000000..641e70d
> --- /dev/null
> +++ b/t/t4049-diff-stat-count.sh
> @@ -0,0 +1,25 @@
> +#!/bin/sh
> +# Copyright (c) 2011, Google Inc.
> +
> +test_description='diff --stat-count'
> +. ./test-lib.sh
> +
> +test_expect_success setup '
> + >a &&
> + >b &&
> + >c &&
> + >d &&
> + git add a b c d &&
> + chmod +x c d &&
> + echo a >a &&
> + echo b >b &&
> + cat >expect <<-\EOF
> + a | 1 +
> + b | 1 +
> + 2 files changed, 2 insertions(+), 0 deletions(-)
> + EOF
> + git diff --stat --stat-count=2 >actual &&
> + test_cmp expect actual
> +'
> +
> +test_done
That works, thanks also for the test.
Should I squash this in or go for a clearer use of "count"? (Also, I may
have to take into account your notes about the workflow.)
Michael
next prev parent reply other threads:[~2011-05-30 6:40 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-04-29 14:57 [PATCH 1/2] diff: introduce --stat-lines to limit the stat lines Michael J Gruber
2011-04-29 14:57 ` [PATCH 2/2] diff-options.txt: describe --stat-{width,name-width,lines} Michael J Gruber
2011-04-29 15:12 ` [PATCH 1/2] diff: introduce --stat-lines to limit the stat lines Michael J Gruber
2011-04-29 16:15 ` Junio C Hamano
2011-05-01 8:27 ` Michael J Gruber
2011-05-01 18:33 ` Junio C Hamano
2011-05-03 10:46 ` [PATCHv2 " Michael J Gruber
2011-05-03 10:46 ` [PATCHv2 2/2] diff-options.txt: describe --stat-{width,name-width,count} Michael J Gruber
2011-05-03 18:47 ` [PATCHv2 1/2] diff: introduce --stat-lines to limit the stat lines Junio C Hamano
2011-05-04 7:16 ` Michael J Gruber
2011-05-27 12:36 ` [PATCHv3 0/3] diff.c: --stat-count=<n> Michael J Gruber
2011-05-27 12:36 ` [PATCHv3 1/3] diff.c: omit hidden entries from namelen calculation with --stat Michael J Gruber
2011-05-27 17:43 ` Junio C Hamano
2011-05-27 20:19 ` Michael J Gruber
2011-05-27 22:45 ` Junio C Hamano
2011-05-27 12:36 ` [PATCHv3 2/3] diff: introduce --stat-lines to limit the stat lines Michael J Gruber
2011-05-28 4:46 ` Junio C Hamano
2011-05-30 6:40 ` Michael J Gruber [this message]
2011-05-30 7:36 ` Junio C Hamano
2011-05-27 12:36 ` [PATCHv3 3/3] diff-options.txt: describe --stat-{width,name-width,count} Michael J Gruber
2011-05-04 4:37 ` [PATCHv2 1/2] diff: introduce --stat-lines to limit the stat lines Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4DE33BF0.7060607@drmicha.warpmail.net \
--to=git@drmicha.warpmail.net \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).