From: Jeff King <peff@peff.net>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>
Subject: [PATCH] decimal_width: avoid integer overflow
Date: Thu, 5 Feb 2015 03:14:19 -0500
Message-ID: <20150205081419.GA7666@peff.net>
The decimal_width function originally appeared in blame.c as
"lineno_width", and was designed for calculating the
print-width of small-ish integer values (line numbers in
text files). In ec7ff5b, it was made into a reusable
function, and in dc801e7, we started using it to align
diffstats.
Binary files in a diffstat show byte counts rather than line
numbers, meaning they can be quite large (e.g., consider
adding or removing a 2GB file). decimal_width is not up to
the challenge for two reasons:
1. It takes the value as an "int", whereas large files may
easily surpass this. The value may be truncated, in
which case we will produce an incorrect value.
2. It counts "up" by repeatedly multiplying another
integer by 10 until it surpasses the value. This can
cause an infinite loop when the value is close to the
largest representable integer.
For example, consider using a 32-bit signed integer,
and a value of 2,140,000,000 (just shy of 2^31-1).
We will count up and eventually see that 1,000,000,000
is smaller than our value. The next step would be to
multiply by 10 and see that 10,000,000,000 is too
large, ending the loop. But we can't represent that
value, and we have signed overflow.
This is technically undefined behavior, but a common
behavior is to lose the high bits, in which case our
iterator will certainly be less than the number. So
we'll keep multiplying, overflow again, and so on.
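Both failure modes can be reproduced in a small model. This is only a
sketch, not code from git: old_width_model() and to_signed32() are
hypothetical names, the counter is kept in a uint32_t so that the
wraparound is well defined (real signed overflow is undefined), and an
iteration cap is assumed so the demonstration terminates.

```c
#include <stdint.h>

/* Reinterpret 32 bits as two's-complement signed, without relying on
 * implementation-defined conversions. */
static int32_t to_signed32(uint32_t u)
{
	if (u <= INT32_MAX)
		return (int32_t)u;
	return -(int32_t)(UINT32_MAX - u) - 1;
}

/*
 * Model of the old "count up" loop with a 32-bit counter that loses
 * its high bits on overflow. Returns the computed width, or -1 if the
 * loop is still running after max_iter steps (the cap is an addition
 * for this sketch; the old code had none).
 */
static int old_width_model(int32_t number, int max_iter)
{
	uint32_t i = 10;
	int width = 1;

	while (to_signed32(i) <= number) {
		if (width > max_iter)
			return -1;
		i *= 10u;	/* wraps modulo 2^32 */
		width++;
	}
	return width;
}
```

Under this model, old_width_model(2140000000, 1000) gives up and
returns -1, because the wrapped counter never exceeds the value. A
3,000,000,000-byte count truncated to 32 bits, i.e.
to_signed32((uint32_t)3000000000u), is negative and yields a width of 1.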
This patch changes the argument to a uintmax_t (the same
type we use to store the diffstat information for binary
files), and counts "down" by repeatedly dividing our value
by 10.
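Counting down cannot overflow, because dividing only ever shrinks the
value. As a rough sanity check, the new loop can be compared against
the number of characters snprintf() would print; printed_width() below
is a hypothetical helper used only for this comparison and is not part
of the patch.

```c
#include <stdint.h>
#include <stdio.h>

/* Same loop as in the patch: divide until a single digit is left. */
static int decimal_width(uintmax_t number)
{
	int width;

	for (width = 1; number >= 10; width++)
		number /= 10;
	return width;
}

/* Hypothetical cross-check: characters "%ju" would produce. */
static int printed_width(uintmax_t number)
{
	return snprintf(NULL, 0, "%ju", number);
}
```

The two functions should agree for every input, including 0 and
UINTMAX_MAX, which the old loop could not handle.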
Signed-off-by: Jeff King <peff@peff.net>
---
Note that besides taking a larger type, we also switch to an unsigned
type. I don't think a signed value makes any sense here (do we include
the "-" or not?), and certainly would not have behaved correctly with
the old code. I did a quick look over all the callers, and they all look
to be conceptually unsigned.
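One consequence of the unsigned parameter, sketched here under the
assumption that some caller did pass a negative int: the argument is
converted to uintmax_t before the call, so -1 becomes UINTMAX_MAX and
the result is the width of that very large number. width_of_int() is a
hypothetical wrapper used only to make the implicit conversion visible.

```c
#include <stdint.h>

static int decimal_width(uintmax_t number)
{
	int width;

	for (width = 1; number >= 10; width++)
		number /= 10;
	return width;
}

/* Hypothetical caller holding a signed value. */
static int width_of_int(int n)
{
	/* n is implicitly converted to uintmax_t; negatives wrap around */
	return decimal_width(n);
}
```

Since uintmax_t is at least 64 bits wide, width_of_int(-1) is at least
20, which is why it matters that the callers are conceptually unsigned.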
cache.h | 2 +-
pager.c | 8 ++++----
2 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/cache.h b/cache.h
index f704af5..04951dd 100644
--- a/cache.h
+++ b/cache.h
@@ -1498,7 +1498,7 @@ extern const char *pager_program;
extern int pager_in_use(void);
extern int pager_use_color;
extern int term_columns(void);
-extern int decimal_width(int);
+extern int decimal_width(uintmax_t);
extern int check_pager_config(const char *cmd);
extern const char *editor_program;
diff --git a/pager.c b/pager.c
index f6e8c33..98b2682 100644
--- a/pager.c
+++ b/pager.c
@@ -133,12 +133,12 @@ int term_columns(void)
/*
* How many columns do we need to show this number in decimal?
*/
-int decimal_width(int number)
+int decimal_width(uintmax_t number)
{
- int i, width;
+ int width;
- for (width = 1, i = 10; i <= number; width++)
- i *= 10;
+ for (width = 1; number >= 10; width++)
+ number /= 10;
return width;
}
--
2.3.0.rc1.287.g761fd19