git.vger.kernel.org archive mirror
* BUG? in --dirstat when rearranging lines in a file
@ 2011-04-07 13:49 Johan Herland
  2011-04-07 14:56 ` Linus Torvalds
From: Johan Herland @ 2011-04-07 13:49 UTC
  To: git; +Cc: Junio C Hamano, Linus Torvalds

Hi,

(CCed the two main authors of --dirstat and diffcore-delta.c)

Consider the following sequence of commands:

$ git init
$ mkdir dir
$ echo -e "foo\nbar" > dir/file
$ git add dir
$ git commit -m "first"
$ # Rearrange lines in dir/file
$ echo -e "bar\nfoo" > dir/file
$ git diff
diff --git a/dir/file b/dir/file
index 3bd1f0e..1289765 100644
--- a/dir/file
+++ b/dir/file
@@ -1,2 +1,2 @@
-foo
 bar
+foo
$ git diff --stat
 dir/file |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
$ git diff --dirstat
$ # WTF!?

"git diff" and "git diff --stat" generate the expected output, but
"git diff --dirstat" unexpectedly generates no output at all. I've
traced this down through show_dirstat() to diffcore_count_changes(),
which processes the pre-image and post-image to accumulate two counts:

- src_copied (#lines, or 64-byte chunks, copied from the pre-image to
  the post-image)

- literal_added (#lines/chunks added in the post-image)

When the diff consists only of rearranged lines (as in the above
example), the line-based hashing and subsequent sorting in
diffcore-delta.c end up hiding the fact that lines have been moved
around, and the resulting --dirstat reports fewer changes than expected.
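
As a rough illustration of the effect (a sketch only, not git's actual
code): the hypothetical count_changes helper below compares two files
as sorted bags of lines using sort and comm. It assumes plain line
counts and ignores the 64-byte chunking and the hashing that
diffcore-delta.c performs; the names count_changes, pre and post are
made up for this example.

```shell
#!/bin/sh
# Hypothetical helper, not part of git: approximates the two counters
# by treating each file as an unordered bag of lines.
count_changes() {
    src_sorted=$(mktemp) || return 1
    dst_sorted=$(mktemp) || return 1
    # Sorting discards the original line order.
    LC_ALL=C sort "$1" > "$src_sorted"
    LC_ALL=C sort "$2" > "$dst_sorted"
    # Lines found in both images: the analogue of src_copied.
    copied=$(LC_ALL=C comm -12 "$src_sorted" "$dst_sorted" | wc -l | tr -d ' ')
    # Lines found only in the post-image: the analogue of literal_added.
    added=$(LC_ALL=C comm -13 "$src_sorted" "$dst_sorted" | wc -l | tr -d ' ')
    rm -f "$src_sorted" "$dst_sorted"
    echo "$copied $added"
}

# The example from this message: the same two lines, swapped.
work=$(mktemp -d) || exit 1
printf 'foo\nbar\n' > "$work/pre"
printf 'bar\nfoo\n' > "$work/post"
count_changes "$work/pre" "$work/post"   # prints "2 0"
rm -rf "$work"
```

Both lines count as copied and none as added, so an order-insensitive
comparison like this sees no change at all, which is consistent with
the empty --dirstat output above.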

Is this a bug or a feature? :)


(This issue was originally found by a colleague at $dayjob who wrote a 
script (using --dirstat) to produce a summary of the areas of the 
source tree touched by a given commit)


Have fun! :)

...Johan

-- 
Johan Herland, <johan@herland.net>
www.herland.net

Thread overview: 91+ messages
2011-04-07 13:49 BUG? in --dirstat when rearranging lines in a file Johan Herland
2011-04-07 14:56 ` Linus Torvalds
2011-04-07 22:43   ` Junio C Hamano
2011-04-07 22:59     ` Linus Torvalds
2011-04-08 14:46   ` Johan Herland
2011-04-08 14:48     ` [PATCH 1/3] --dirstat: Document shortcomings compared to --stat or regular diff Johan Herland
2011-04-08 19:50       ` Junio C Hamano
2011-04-08 14:50     ` [PATCH 2/3] --dirstat-by-file: Make it faster and more correct Johan Herland
2011-04-08 14:55     ` [RFC/PATCH 3/3] Teach --dirstat to not completely ignore rearranged lines Johan Herland
2011-04-08 15:04     ` BUG? in --dirstat when rearranging lines in a file Linus Torvalds
2011-04-08 19:56       ` Junio C Hamano
2011-04-10 22:48         ` [PATCHv2 0/3] --dirstat fixes Johan Herland
2011-04-10 22:48           ` [PATCHv2 1/3] --dirstat: Describe non-obvious differences relative to --stat or regular diff Johan Herland
2011-04-10 22:48           ` [PATCHv2 2/3] --dirstat-by-file: Make it faster and more correct Johan Herland
2011-04-11 18:14             ` Junio C Hamano
2011-04-10 22:48           ` [PATCHv2 3/3] Teach --dirstat to not completely ignore rearranged lines within a file Johan Herland
2011-04-11 21:38             ` Junio C Hamano
2011-04-11 21:56               ` Johan Herland
2011-04-11 22:08                 ` Junio C Hamano
2011-04-12  9:22                   ` Johan Herland
2011-04-12  9:24                     ` [PATCH 4/3] --dirstat: In case of renames, use target filename instead of source filename Johan Herland
2011-04-12 14:59                       ` Linus Torvalds
2011-04-12  9:26                     ` [RFC/PATCH 5/3] Alternative --dirstat implementation, based on diffstat analysis Johan Herland
2011-04-12 14:46                       ` Linus Torvalds
2011-04-12 15:08                         ` Linus Torvalds
2011-04-12 22:03                           ` Johan Herland
2011-04-12 22:12                             ` Linus Torvalds
2011-04-12 22:22                             ` Junio C Hamano
2011-04-26  0:01                         ` [PATCH 0/6] --dirstat fixes, part 2 Johan Herland
2011-04-26  0:01                           ` [PATCH 1/6] Add several testcases for --dirstat and friends Johan Herland
2011-04-26  0:01                           ` [PATCH 2/6] Make --dirstat=0 output directories that contribute < 0.1% of changes Johan Herland
2011-04-26  0:01                           ` [PATCH 3/6] Refactor --dirstat parsing; deprecate --cumulative and --dirstat-by-file Johan Herland
2011-04-26 16:36                             ` Junio C Hamano
2011-04-27  2:02                               ` Johan Herland
2011-04-27  4:53                                 ` Junio C Hamano
2011-04-27 20:51                                 ` Junio C Hamano
2011-04-27 21:01                                   ` Junio C Hamano
2011-04-26  0:01                           ` [PATCH 4/6] Add config variable for specifying default --dirstat behavior Johan Herland
2011-04-26 16:43                             ` Junio C Hamano
2011-04-27  2:02                               ` Johan Herland
2011-04-26  0:01                           ` [PATCH 5/6] Use floating point for --dirstat percentages Johan Herland
2011-04-26 16:52                             ` Junio C Hamano
2011-04-27  2:02                               ` Johan Herland
2011-04-27  4:42                                 ` Junio C Hamano
2011-04-27  4:53                                   ` Linus Torvalds
2011-04-27  5:20                                     ` Junio C Hamano
2011-04-26  0:01                           ` [PATCH 6/6] New --dirstat=lines mode, doing dirstat analysis based on diffstat Johan Herland
2011-04-26 16:59                             ` Junio C Hamano
2011-04-27  2:02                               ` Johan Herland
2011-04-26  0:15                           ` [PATCH 0/6] --dirstat fixes, part 2 Linus Torvalds
2011-04-27  2:12                           ` [PATCHv2 " Johan Herland
2011-04-27  2:12                             ` [PATCHv2 1/6] Add several testcases for --dirstat and friends Johan Herland
2011-04-27  2:12                             ` [PATCHv2 2/6] Make --dirstat=0 output directories that contribute < 0.1% of changes Johan Herland
2011-04-27  2:12                             ` [PATCHv2 3/6] Refactor --dirstat parsing; deprecate --cumulative and --dirstat-by-file Johan Herland
2011-04-27  2:12                             ` [PATCHv2 4/6] Add config variable for specifying default --dirstat behavior Johan Herland
2011-04-27  2:12                             ` [PATCHv2 5/6] Use floating point for --dirstat percentages Johan Herland
2011-04-27  2:45                               ` Linus Torvalds
2011-04-27  2:12                             ` [PATCHv2 6/6] New --dirstat=lines mode, doing dirstat analysis based on diffstat Johan Herland
2011-04-27  8:24                             ` [PATCHv3 0/6] --dirstat fixes, part 2 Johan Herland
2011-04-27  8:24                               ` [PATCHv3 1/6] Add several testcases for --dirstat and friends Johan Herland
2011-04-27  8:24                               ` [PATCHv3 2/6] Make --dirstat=0 output directories that contribute < 0.1% of changes Johan Herland
2011-04-27  8:24                               ` [PATCHv3 3/6] Refactor --dirstat parsing; deprecate --cumulative and --dirstat-by-file Johan Herland
2011-04-27  8:24                               ` [PATCHv3 4/6] Add config variable for specifying default --dirstat behavior Johan Herland
2011-04-27  8:24                               ` [PATCHv3 5/6] Allow specifying --dirstat cut-off percentage as a floating point number Johan Herland
2011-04-27  8:37                                 ` Linus Torvalds
2011-04-27 10:29                                   ` [PATCHv4 " Johan Herland
2011-04-27  8:24                               ` [PATCHv3 6/6] New --dirstat=lines mode, doing dirstat analysis based on diffstat Johan Herland
2011-04-28  1:17                               ` [PATCHv5 0/7] --dirstat fixes, part 2 Johan Herland
2011-04-28  1:17                                 ` [PATCHv5 1/7] Add several testcases for --dirstat and friends Johan Herland
2011-04-28  1:17                                 ` [PATCHv5 2/7] Make --dirstat=0 output directories that contribute < 0.1% of changes Johan Herland
2011-04-28  1:17                                 ` [PATCHv5 3/7] Refactor --dirstat parsing; deprecate --cumulative and --dirstat-by-file Johan Herland
2011-04-28  1:17                                 ` [PATCHv5 4/7] Add config variable for specifying default --dirstat behavior Johan Herland
2011-04-28  1:17                                 ` [PATCHv5 5/7] Allow specifying --dirstat cut-off percentage as a floating point number Johan Herland
2011-04-28  1:17                                 ` [PATCHv5 6/7] New --dirstat=lines mode, doing dirstat analysis based on diffstat Johan Herland
2011-04-28  1:17                                 ` [PATCHv5 7/7] Improve error handling when parsing dirstat parameters Johan Herland
2011-04-28 18:41                                   ` Junio C Hamano
2011-04-28 19:20                                     ` Junio C Hamano
2011-04-28 23:16                                       ` Johan Herland
2011-04-28 23:13                                     ` Johan Herland
2011-04-29  4:06                                       ` Junio C Hamano
2011-04-29  9:36                                         ` [PATCHv6 0/8] --dirstat fixes, part 2 Johan Herland
2011-04-29  9:36                                           ` [PATCHv6 1/8] Add several testcases for --dirstat and friends Johan Herland
2011-04-29  9:36                                           ` [PATCHv6 2/8] Make --dirstat=0 output directories that contribute < 0.1% of changes Johan Herland
2011-04-29  9:36                                           ` [PATCHv6 3/8] Refactor --dirstat parsing; deprecate --cumulative and --dirstat-by-file Johan Herland
2011-04-29  9:36                                           ` [PATCHv6 4/8] Add config variable for specifying default --dirstat behavior Johan Herland
2011-04-29  9:36                                           ` [PATCHv6 5/8] Allow specifying --dirstat cut-off percentage as a floating point number Johan Herland
2011-04-29  9:36                                           ` [PATCHv6 6/8] New --dirstat=lines mode, doing dirstat analysis based on diffstat Johan Herland
2011-04-29  9:36                                           ` [PATCHv6 7/8] Improve error handling when parsing dirstat parameters Johan Herland
2011-04-29  9:36                                           ` [PATCHv6 8/8] Mark dirstat error messages for translation Johan Herland
2011-04-12 18:34                       ` [RFC/PATCH 5/3] Alternative --dirstat implementation, based on diffstat analysis Junio C Hamano
2011-04-10 23:17           ` [PATCHv2 0/3] --dirstat fixes Linus Torvalds
