* A bug in git 1.6.5.2 with git log --stat: shows a negative number as a size @ 2010-04-16 13:59 Heikki Orsila 2010-04-16 15:02 ` Tomas Carnecky 0 siblings, 1 reply; 5+ messages in thread From: Heikki Orsila @ 2010-04-16 13:59 UTC (permalink / raw) To: git I'm running git version 1.6.5.2. git log --stat shows a negative diffstat size for two files that are each 2049MiB in size. Steps to reproduce: $ for f in 0 1 ; do dd bs=$((1024*1024)) if=/dev/zero of=$f count=2049 ; done $ git add 0 1 $ git commit -m "test commit" $ git log --stat commit 6afe3d3c889daa92bd79956c4bb733eb5cb408dc Author: Heikki Orsila <heikki.orsila@iki.fi> Date: 2010-04-16 16:54:52 +0300 test commit 0 | Bin 0 -> -2146435072 bytes 1 | Bin 0 -> -2146435072 bytes 2 files changed, 0 insertions(+), 0 deletions(-) -- Heikki Orsila heikki.orsila@iki.fi http://www.iki.fi/shd ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: A bug in git 1.6.5.2 with git log --stat: shows a negative number as a size 2010-04-16 13:59 A bug in git 1.6.5.2 with git log --stat: shows a negative number as a size Heikki Orsila @ 2010-04-16 15:02 ` Tomas Carnecky 2010-04-17 10:25 ` [PATCH] diff: use 64-bit integers for diffstat calculations Jeff King 0 siblings, 1 reply; 5+ messages in thread From: Tomas Carnecky @ 2010-04-16 15:02 UTC (permalink / raw) To: Heikki Orsila; +Cc: git On 4/16/10 3:59 PM, Heikki Orsila wrote: > I'm running git version 1.6.5.2. git log --stat shows a negative > diffstat size for two files that are each 2049MiB in size. > > Steps to reproduce: > > $ for f in 0 1 ; do dd bs=$((1024*1024)) if=/dev/zero of=$f count=2049 ; done > $ git add 0 1 > $ git commit -m "test commit" > $ git log --stat > commit 6afe3d3c889daa92bd79956c4bb733eb5cb408dc > Author: Heikki Orsila<heikki.orsila@iki.fi> > Date: 2010-04-16 16:54:52 +0300 > > test commit > > 0 | Bin 0 -> -2146435072 bytes > 1 | Bin 0 -> -2146435072 bytes > 2 files changed, 0 insertions(+), 0 deletions(-) Yep, bug is also in the latest version (1.7.0.5). The code uses 'int' instead of something big enough to hold the size of your files. http://git.kernel.org/?p=git/git.git;a=blob;f=diff.c;h=a1bf1e9cb37104cda8168c5118769ce5bbcfcbb2;hb=HEAD#l1105 and a couple lines below (1124) you see that the stat is printed out. tom ^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH] diff: use 64-bit integers for diffstat calculations 2010-04-16 15:02 ` Tomas Carnecky @ 2010-04-17 10:25 ` Jeff King 2010-04-17 17:00 ` Junio C Hamano 0 siblings, 1 reply; 5+ messages in thread From: Jeff King @ 2010-04-17 10:25 UTC (permalink / raw) To: Tomas Carnecky; +Cc: Junio C Hamano, Heikki Orsila, git On Fri, Apr 16, 2010 at 05:02:01PM +0200, Tomas Carnecky wrote: > >commit 6afe3d3c889daa92bd79956c4bb733eb5cb408dc > >Author: Heikki Orsila<heikki.orsila@iki.fi> > >Date: 2010-04-16 16:54:52 +0300 > > > > test commit > > > > 0 | Bin 0 -> -2146435072 bytes > > 1 | Bin 0 -> -2146435072 bytes > > 2 files changed, 0 insertions(+), 0 deletions(-) > > Yep, bug is also in the latest version (1.7.0.5). The code uses 'int' > instead of something big enough to hold the size of your files. Yuck, we use "unsigned int" for the actual storage, and then convert to a regular "int" in some other places. I think we should just do this: -- >8 -- Subject: [PATCH] diff: use 64-bit integers for diffstat calculations The diffstat "added" and "changed" fields generally store line counts; however, for binary files, they store file sizes. Since we store and print these values as ints, a diffstat on a file larger than 2G can show a negative size. Instead, let's explicitly use 64-bit integers. Signed-off-by: Jeff King <peff@peff.net> --- diff.c | 21 ++++++++++++--------- 1 files changed, 12 insertions(+), 9 deletions(-) diff --git a/diff.c b/diff.c index 7effdac..54933cc 100644 --- a/diff.c +++ b/diff.c @@ -936,7 +936,7 @@ struct diffstat_t { unsigned is_unmerged:1; unsigned is_binary:1; unsigned is_renamed:1; - unsigned int added, deleted; + uint64_t added, deleted; } **files; }; @@ -1028,7 +1028,7 @@ static void fill_print_name(struct diffstat_file *file) static void show_stats(struct diffstat_t *data, struct diff_options *options) { int i, len, add, del, adds = 0, dels = 0; - int max_change = 0, max_len = 0; + uint64_t max_change = 0, max_len = 0; int total_files = data->nr; int width, name_width; const char *reset, *set, *add_c, *del_c; @@ -1057,7 +1057,7 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options) for (i = 0; i < data->nr; i++) { struct diffstat_file *file = data->files[i]; - int change = file->added + file->deleted; + uint64_t change = file->added + file->deleted; fill_print_name(file); len = strlen(file->print_name); if (max_len < len) @@ -1085,8 +1085,8 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options) for (i = 0; i < data->nr; i++) { const char *prefix = ""; char *name = data->files[i]->print_name; - int added = data->files[i]->added; - int deleted = data->files[i]->deleted; + uint64_t added = data->files[i]->added; + uint64_t deleted = data->files[i]->deleted; int name_len; /* @@ -1107,9 +1107,11 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options) if (data->files[i]->is_binary) { show_name(options->file, prefix, name, len); fprintf(options->file, " Bin "); - fprintf(options->file, "%s%d%s", del_c, deleted, reset); + fprintf(options->file, "%s%"PRIu64"%s", + del_c, deleted, reset); fprintf(options->file, " -> "); - fprintf(options->file, "%s%d%s", add_c, added, reset); + fprintf(options->file, "%s%"PRIu64"%s", + add_c, added, reset); fprintf(options->file, " bytes"); fprintf(options->file, "\n"); continue; @@ -1138,7 +1140,7 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options) del = scale_linear(del, width, max_change); } show_name(options->file, prefix, name, len); - fprintf(options->file, "%5d%s", added + deleted, + fprintf(options->file, "%5"PRIu64"%s", added + deleted, added + deleted ? " " : ""); show_graph(options->file, '+', add, add_c, reset); show_graph(options->file, '-', del, del_c, reset); @@ -1188,7 +1190,8 @@ static void show_numstat(struct diffstat_t *data, struct diff_options *options) fprintf(options->file, "-\t-\t"); else fprintf(options->file, - "%d\t%d\t", file->added, file->deleted); + "%"PRIu64"\t%"PRIu64"\t", + file->added, file->deleted); if (options->line_termination) { fill_print_name(file); if (!file->is_renamed) -- 1.7.1.rc1.277.g2c4c9 ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] diff: use 64-bit integers for diffstat calculations 2010-04-17 10:25 ` [PATCH] diff: use 64-bit integers for diffstat calculations Jeff King @ 2010-04-17 17:00 ` Junio C Hamano 2010-04-17 17:41 ` Jeff King 0 siblings, 1 reply; 5+ messages in thread From: Junio C Hamano @ 2010-04-17 17:00 UTC (permalink / raw) To: Jeff King; +Cc: Tomas Carnecky, Heikki Orsila, git Jeff King <peff@peff.net> writes: > Yuck, we use "unsigned int" for the actual storage, and then convert to > a regular "int" in some other places. I think we should just do this: > > -- >8 -- > Subject: [PATCH] diff: use 64-bit integers for diffstat calculations > > The diffstat "added" and "changed" fields generally store > line counts; however, for binary files, they store file > sizes. Since we store and print these values as ints, a > diffstat on a file larger than 2G can show a negative size. > Instead, let's explicitly use 64-bit integers. > > Signed-off-by: Jeff King <peff@peff.net> > --- Yes, but we would probably be better off using using uintmax_t for things like this if the quantity a variable represents is not closely tied to external file format (e.g. the offset field of pack idx file), nor the code is only for a particular platform (e.g. compat/win32mmap.c), don't you think? That is the impression I am getting on the discipline expressed in the current codebase, from browsing the output from "git grep uint64_t". ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] diff: use 64-bit integers for diffstat calculations 2010-04-17 17:00 ` Junio C Hamano @ 2010-04-17 17:41 ` Jeff King 0 siblings, 0 replies; 5+ messages in thread From: Jeff King @ 2010-04-17 17:41 UTC (permalink / raw) To: Junio C Hamano; +Cc: Tomas Carnecky, Heikki Orsila, git On Sat, Apr 17, 2010 at 10:00:44AM -0700, Junio C Hamano wrote: > > The diffstat "added" and "changed" fields generally store > > line counts; however, for binary files, they store file > > sizes. Since we store and print these values as ints, a > > diffstat on a file larger than 2G can show a negative size. > > Instead, let's explicitly use 64-bit integers. > > Yes, but we would probably be better off using using uintmax_t for things > like this if the quantity a variable represents is not closely tied to > external file format (e.g. the offset field of pack idx file), nor the > code is only for a particular platform (e.g. compat/win32mmap.c), don't > you think? But 640K^W 64-bits should be large enough for anyone! :) But yes, you're right. Updated patch below. -- >8 -- Subject: [PATCH] diff: use large integers for diffstat calculations The diffstat "added" and "changed" fields generally store line counts; however, for binary files, they store file sizes. Since we store and print these values as ints, a diffstat on a file larger than 2G can show a negative size. Instead, let's use uintmax_t, which should be at least 64 bits on modern platforms. Signed-off-by: Jeff King <peff@peff.net> --- diff.c | 21 ++++++++++++--------- 1 files changed, 12 insertions(+), 9 deletions(-) diff --git a/diff.c b/diff.c index 7effdac..fbdbd8d 100644 --- a/diff.c +++ b/diff.c @@ -936,7 +936,7 @@ struct diffstat_t { unsigned is_unmerged:1; unsigned is_binary:1; unsigned is_renamed:1; - unsigned int added, deleted; + uintmax_t added, deleted; } **files; }; @@ -1028,7 +1028,7 @@ static void fill_print_name(struct diffstat_file *file) static void show_stats(struct diffstat_t *data, struct diff_options *options) { int i, len, add, del, adds = 0, dels = 0; - int max_change = 0, max_len = 0; + uintmax_t max_change = 0, max_len = 0; int total_files = data->nr; int width, name_width; const char *reset, *set, *add_c, *del_c; @@ -1057,7 +1057,7 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options) for (i = 0; i < data->nr; i++) { struct diffstat_file *file = data->files[i]; - int change = file->added + file->deleted; + uintmax_t change = file->added + file->deleted; fill_print_name(file); len = strlen(file->print_name); if (max_len < len) @@ -1085,8 +1085,8 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options) for (i = 0; i < data->nr; i++) { const char *prefix = ""; char *name = data->files[i]->print_name; - int added = data->files[i]->added; - int deleted = data->files[i]->deleted; + uintmax_t added = data->files[i]->added; + uintmax_t deleted = data->files[i]->deleted; int name_len; /* @@ -1107,9 +1107,11 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options) if (data->files[i]->is_binary) { show_name(options->file, prefix, name, len); fprintf(options->file, " Bin "); - fprintf(options->file, "%s%d%s", del_c, deleted, reset); + fprintf(options->file, "%s%"PRIuMAX"%s", + del_c, deleted, reset); fprintf(options->file, " -> "); - fprintf(options->file, "%s%d%s", add_c, added, reset); + fprintf(options->file, "%s%"PRIuMAX"%s", + add_c, added, reset); fprintf(options->file, " bytes"); fprintf(options->file, "\n"); continue; @@ -1138,7 +1140,7 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options) del = scale_linear(del, width, max_change); } show_name(options->file, prefix, name, len); - fprintf(options->file, "%5d%s", added + deleted, + fprintf(options->file, "%5"PRIuMAX"%s", added + deleted, added + deleted ? " " : ""); show_graph(options->file, '+', add, add_c, reset); show_graph(options->file, '-', del, del_c, reset); @@ -1188,7 +1190,8 @@ static void show_numstat(struct diffstat_t *data, struct diff_options *options) fprintf(options->file, "-\t-\t"); else fprintf(options->file, - "%d\t%d\t", file->added, file->deleted); + "%"PRIuMAX"\t%"PRIuMAX"\t", + file->added, file->deleted); if (options->line_termination) { fill_print_name(file); if (!file->is_renamed) -- 1.7.1.rc1.277.g2c4c9 ^ permalink raw reply related [flat|nested] 5+ messages in thread
end of thread, other threads:[~2010-04-17 17:41 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2010-04-16 13:59 A bug in git 1.6.5.2 with git log --stat: shows a negative number as a size Heikki Orsila 2010-04-16 15:02 ` Tomas Carnecky 2010-04-17 10:25 ` [PATCH] diff: use 64-bit integers for diffstat calculations Jeff King 2010-04-17 17:00 ` Junio C Hamano 2010-04-17 17:41 ` Jeff King
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).