git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] diff: Fixes shortstat number of files
@ 2012-11-21 21:26 Antoine Pelisse
  2012-11-26  3:28 ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Antoine Pelisse @ 2012-11-21 21:26 UTC (permalink / raw)
  To: git; +Cc: Antoine Pelisse

There is a discrepancy between the last line of `git diff --stat`
and `git diff --shortstat` in case of a merge.
The unmerged files are actually counted twice, thus doubling the
value of "file changed".

In fact, while stat decrements number of files when seeing an unmerged
file, shortstat doesn't.

Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
---
 diff.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/diff.c b/diff.c
index e89a201..5c6bcbd 100644
--- a/diff.c
+++ b/diff.c
@@ -1704,9 +1704,8 @@ static void show_shortstats(struct diffstat_t *data, struct diff_options *option
 		int added = data->files[i]->added;
 		int deleted= data->files[i]->deleted;
 
-		if (data->files[i]->is_unmerged)
-			continue;
-		if (!data->files[i]->is_renamed && (added + deleted == 0)) {
+		if (data->files[i]->is_unmerged ||
+		  (!data->files[i]->is_renamed && (added + deleted == 0))) {
 			total_files--;
 		} else if (!data->files[i]->is_binary) { /* don't count bytes */
 			adds += added;
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH] diff: Fixes shortstat number of files
@ 2012-11-23  7:33 Antoine Pelisse
  0 siblings, 0 replies; 6+ messages in thread
From: Antoine Pelisse @ 2012-11-23  7:33 UTC (permalink / raw)
  To: git; +Cc: Antoine Pelisse

There is a discrepancy between the last line of `git diff --stat`
and `git diff --shortstat` in case of a merge.
The unmerged files are actually counted twice, thus doubling the
value of "file changed".

In fact, while stat decrements number of files when seeing an unmerged
file, shortstat doesn't.

Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
---
 diff.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/diff.c b/diff.c
index e89a201..5c6bcbd 100644
--- a/diff.c
+++ b/diff.c
@@ -1704,9 +1704,8 @@ static void show_shortstats(struct diffstat_t *data, struct diff_options *option
 		int added = data->files[i]->added;
 		int deleted= data->files[i]->deleted;
 
-		if (data->files[i]->is_unmerged)
-			continue;
-		if (!data->files[i]->is_renamed && (added + deleted == 0)) {
+		if (data->files[i]->is_unmerged ||
+		  (!data->files[i]->is_renamed && (added + deleted == 0))) {
 			total_files--;
 		} else if (!data->files[i]->is_binary) { /* don't count bytes */
 			adds += added;
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH] diff: Fixes shortstat number of files
@ 2012-11-23 21:27 Antoine Pelisse
  0 siblings, 0 replies; 6+ messages in thread
From: Antoine Pelisse @ 2012-11-23 21:27 UTC (permalink / raw)
  To: git, gitster; +Cc: Antoine Pelisse

There is a discrepancy between the last line of `git diff --stat`
and `git diff --shortstat` in case of a merge.
The unmerged files are actually counted twice, thus doubling the
value of "file changed".

In fact, while stat decrements number of files when seeing an unmerged
file, shortstat doesn't.

Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
---
 diff.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/diff.c b/diff.c
index e89a201..5c6bcbd 100644
--- a/diff.c
+++ b/diff.c
@@ -1704,9 +1704,8 @@ static void show_shortstats(struct diffstat_t *data, struct diff_options *option
 		int added = data->files[i]->added;
 		int deleted= data->files[i]->deleted;
 
-		if (data->files[i]->is_unmerged)
-			continue;
-		if (!data->files[i]->is_renamed && (added + deleted == 0)) {
+		if (data->files[i]->is_unmerged ||
+		  (!data->files[i]->is_renamed && (added + deleted == 0))) {
 			total_files--;
 		} else if (!data->files[i]->is_binary) { /* don't count bytes */
 			adds += added;
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] diff: Fixes shortstat number of files
  2012-11-21 21:26 [PATCH] diff: Fixes shortstat number of files Antoine Pelisse
@ 2012-11-26  3:28 ` Junio C Hamano
  2012-11-26  9:10   ` Antoine Pelisse
  0 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2012-11-26  3:28 UTC (permalink / raw)
  To: Antoine Pelisse; +Cc: git

Antoine Pelisse <apelisse@gmail.com> writes:

> Subject: Re: [PATCH] diff: Fixes shortstat number of files

Please replace "Fixes" with "Fix at least (because our log messages
are written as if a patch is giving an order to the codebase, iow,
in imperative mood), but we would prefer to see a concrete
description on what is fixed, when we can.  And in this case, I
think we can, perhaps:

    diff: do not count unmerged paths twice in --shortstat/--numstat

or something.

> There is a discrepancy between the last line of `git diff --stat`
> and `git diff --shortstat` in case of a merge.
> The unmerged files are actually counted twice, thus doubling the
> value of "file changed".

I think the current 'master' and upward is broken with respect to
this; I am consistently getting two entries for unmerged paths
across --stat, --shortstat and --numstat options (iow, not just
shortstat and numstat but the '--stat' seems to be broken as well).

Thanks.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] diff: Fixes shortstat number of files
  2012-11-26  3:28 ` Junio C Hamano
@ 2012-11-26  9:10   ` Antoine Pelisse
  2012-11-27 19:14     ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Antoine Pelisse @ 2012-11-26  9:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Linus Torvalds

Indeed stat seems to be broken on master by commit 74faaa16 from Linus Torvalds

There are three separated issues here:
 - unmerged files are marked as "interesting" in stat and probably
shouldn't, with some patch like this:

        data->is_interesting = p->status != 0;

        if (!one || !two) {
                data->is_unmerged = 1;
+               data->is_interesting = 0;
                return;
        }

By the way, I don't get the point of this code then:

        else if (data->files[i]->is_unmerged) {
            fprintf(options->file, "%s", line_prefix);
            show_name(options->file, prefix, name, len);
            fprintf(options->file, " Unmerged\n");
            continue;
        }

and

        if (file->is_unmerged) {
            /* "Unmerged" is 8 characters */
            bin_width = bin_width < 8 ? 8 : bin_width;
            continue;
        }

Are we ever supposed to print that ? I feel like it could be removed.

 - Unmerged files are not filtered out in shortstat, thus counted
twice (addressed by the patch)
 - no file has ever been filtered out of numstat, and probably should
the way it's done in stat. That is with something like this:

        if (!data->files[i]->is_interesting &&
             (added + deleted == 0)) {
            continue;
        }


Cheers,
Antoine Pelisse


---------- Forwarded message ----------
From: Junio C Hamano <gitster@pobox.com>
Date: Mon, Nov 26, 2012 at 4:28 AM
Subject: Re: [PATCH] diff: Fixes shortstat number of files
To: Antoine Pelisse <apelisse@gmail.com>
Cc: git@vger.kernel.org


Antoine Pelisse <apelisse@gmail.com> writes:

> Subject: Re: [PATCH] diff: Fixes shortstat number of files

Please replace "Fixes" with "Fix at least (because our log messages
are written as if a patch is giving an order to the codebase, iow,
in imperative mood), but we would prefer to see a concrete
description on what is fixed, when we can.  And in this case, I
think we can, perhaps:

    diff: do not count unmerged paths twice in --shortstat/--numstat

or something.

> There is a discrepancy between the last line of `git diff --stat`
> and `git diff --shortstat` in case of a merge.
> The unmerged files are actually counted twice, thus doubling the
> value of "file changed".

I think the current 'master' and upward is broken with respect to
this; I am consistently getting two entries for unmerged paths
across --stat, --shortstat and --numstat options (iow, not just
shortstat and numstat but the '--stat' seems to be broken as well).

Thanks.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] diff: Fixes shortstat number of files
  2012-11-26  9:10   ` Antoine Pelisse
@ 2012-11-27 19:14     ` Junio C Hamano
  0 siblings, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2012-11-27 19:14 UTC (permalink / raw)
  To: Antoine Pelisse; +Cc: git, Linus Torvalds

Antoine Pelisse <apelisse@gmail.com> writes:

> Indeed stat seems to be broken on master by commit 74faaa16 from Linus Torvalds
>
> There are three separated issues here:
>  - unmerged files are marked as "interesting" in stat and probably
> shouldn't, with some patch like this:
>
>         data->is_interesting = p->status != 0;
>
>         if (!one || !two) {
>                 data->is_unmerged = 1;
> +               data->is_interesting = 0;
>                 return;
>         }
>
> By the way, I don't get the point of this code then:
>
>         else if (data->files[i]->is_unmerged) {
>             fprintf(options->file, "%s", line_prefix);
>             show_name(options->file, prefix, name, len);
>             fprintf(options->file, " Unmerged\n");
>             continue;
>         }
>
> and
>
>         if (file->is_unmerged) {
>             /* "Unmerged" is 8 characters */
>             bin_width = bin_width < 8 ? 8 : bin_width;
>             continue;
>         }
>
> Are we ever supposed to print that ? I feel like it could be removed.

Yes, we have been showing two entries in --stat output:

 file | Unmerged
 file | 4 ++++

and that is not going to change.

There are a few problems in diff.c around --stat area, partially
caused by Linus's patch (like s/is_rename/is_interesting/ change
started ignoring unmerged entries in the diff queue and made the
existing loop not go into the codepath we see above), and largely
caused by the earlier change that introduced when --stat-count was
added (the second loop that decrements total_files does so only for
the paths within the "count" horizon determined by the first loop;
total_files must be adjusted for _all_ uninteresting and unchanged
filepairs and exclude unmerged entries).

Also the initialization of data->is_interesting is wrong.  These
days, p->status is never zero.

I'll send out a fix later today.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2012-11-27 19:15 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-11-21 21:26 [PATCH] diff: Fixes shortstat number of files Antoine Pelisse
2012-11-26  3:28 ` Junio C Hamano
2012-11-26  9:10   ` Antoine Pelisse
2012-11-27 19:14     ` Junio C Hamano
  -- strict thread matches above, loose matches on Subject: below --
2012-11-23  7:33 Antoine Pelisse
2012-11-23 21:27 Antoine Pelisse

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).