From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, Martin Koegler <martin.koegler@chello.at>
Subject: Re: [PATCH] zlib.c: use size_t for size
Date: Fri, 12 Oct 2018 22:38:45 -0400
Message-ID: <20181013023845.GA15595@sigill.intra.peff.net>
In-Reply-To: <xmqqsh1bbq36.fsf@gitster-ct.c.googlers.com>
On Fri, Oct 12, 2018 at 04:07:25PM +0900, Junio C Hamano wrote:
> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> index e6316d294d..b9ca04eb8a 100644
> --- a/builtin/pack-objects.c
> +++ b/builtin/pack-objects.c
> @@ -266,15 +266,15 @@ static void copy_pack_data(struct hashfile *f,
>  		struct packed_git *p,
>  		struct pack_window **w_curs,
>  		off_t offset,
> -		off_t len)
> +		size_t len)
>  {
>  	unsigned char *in;
> -	unsigned long avail;
> +	size_t avail;

I know there were a lot of comments about "maybe this off_t switch is
not good". Let me say something a bit stronger: I think this part of the
change is strictly worse.
copy_pack_data() looks like this right now:

  static void copy_pack_data(struct hashfile *f,
  		struct packed_git *p,
  		struct pack_window **w_curs,
  		off_t offset,
  		off_t len)
  {
  	unsigned char *in;
  	unsigned long avail;

  	while (len) {
  		in = use_pack(p, w_curs, offset, &avail);
  		if (avail > len)
  			avail = (unsigned long)len;
  		hashwrite(f, in, avail);
  		offset += avail;
  		len -= avail;
  	}
  }

So right now let's imagine that off_t is 64-bit, and "unsigned long" is
32-bit (e.g., 32-bit system, or an IL32P64 model like Windows). We'll
repeatedly ask use_pack() for a window, and it will tell us how many
bytes we have in "avail". So even as a 32-bit value, that just means
we'll process chunks smaller than 4GB, and this is correct (or at least
this part of it -- hold on). But we can still process the whole "len"
given by the off_t eventually.
But by switching away from off_t in the function interface, we risk
truncation before we even enter the loop. Because the switch is to
size_t, it actually works on an IL32P64 system (size_t is 64-bit
there), but it introduces a bug on a true 32-bit system. If your
off_t really is 64-bit (and it generally is, because we #define
_FILE_OFFSET_BITS), the length will be truncated modulo 2^32 at the call.
Nor will most compilers warn about this without -Wconversion. You can try it
with this on Linux:

  #define _FILE_OFFSET_BITS 64
  #include <unistd.h>

  void foo(size_t x);
  void bar(off_t x);

  void bar(off_t x)
  {
  	foo(x);
  }

That compiles fine with "gcc -c -m32 -Wall -Werror -Wextra" for me.
Adding "-Wconversion" catches it, but our code base is not close to
compiling with that warning enabled.
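
To make the truncation concrete, here is a minimal sketch of what the
callee would see. It is not code from git: size32_t and
pass_as_size32() are hypothetical names, and uint32_t stands in for a
32-bit size_t so the result is the same on any host.

```c
#include <stdint.h>

/* Hypothetical stand-in for size_t on a true 32-bit system. */
typedef uint32_t size32_t;

/*
 * Hypothetical helper: mimics passing a 64-bit off_t value to a
 * parameter declared as a 32-bit size_t. Converting to an unsigned
 * type is well defined in C and reduces the value modulo 2^32.
 */
static size32_t pass_as_size32(int64_t len)
{
	return (size32_t)len;
}
```

With a length of 5GB + 123 bytes, i.e. (INT64_C(5) << 30) + 123, the
helper returns 1073741947 (1GB + 123), so the callee would silently
handle only a fraction of the data.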
So I don't think this hunk fixes any problems, and it actually
introduces one.
I do in general support moving to size_t over "unsigned long". Switching
avail to size_t makes sense here. It's just the off_t part that is
funny.
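
As a rough sketch of that shape (an illustration, not the actual
patch): the total length stays in off_t and only the per-chunk count
is a size_t. fake_window_avail() and count_chunks() are hypothetical
names; the former stands in for use_pack() and just pretends every
window holds "window" bytes.

```c
#define _FILE_OFFSET_BITS 64
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical stand-in for use_pack(): every window has `window` bytes. */
static size_t fake_window_avail(off_t offset, size_t window)
{
	(void)offset;
	return window;
}

/*
 * Walks `len` bytes in window-sized chunks and returns how many
 * chunks were needed. `len` is never narrowed as a whole; only the
 * clamped per-chunk value is stored in a size_t.
 */
static unsigned long count_chunks(off_t offset, off_t len, size_t window)
{
	unsigned long chunks = 0;

	assert(window > 0); /* a zero-sized window would never make progress */

	while (len > 0) {
		size_t avail = fake_window_avail(offset, window);

		/* Compare in a type wide enough for both operands. */
		if ((uintmax_t)avail > (uintmax_t)len)
			avail = (size_t)len; /* len < avail here, so it fits */

		/* The real function would call hashwrite(f, in, avail) here. */
		offset += (off_t)avail;
		len -= (off_t)avail;
		chunks++;
	}
	return chunks;
}
```

For example, a 5GB length with 1GB windows takes five iterations, and
the count is the same whether size_t is 32 or 64 bits wide, as long as
off_t is 64-bit.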
-Peff