From: "Torsten Bögershausen" <tboegi@web.de>
To: Derrick Stolee <stolee@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH/RFC v1 1/1] Use size_t instead of unsigned long
Date: Mon, 19 Nov 2018 17:33:23 +0100 [thread overview]
Message-ID: <20181119163323.GA15913@tor.lan> (raw)
In-Reply-To: <c09938cf-7631-ef89-d8fc-d952f9b121c8@gmail.com>
On Sun, Nov 18, 2018 at 03:18:52PM -0500, Derrick Stolee wrote:
> On 11/17/2018 10:11 AM, tboegi@web.de wrote:
> >From: Torsten Bögershausen <tboegi@web.de>
> >
> >Currently Git users can not commit files >4Gib under 64 bit Windows,
> >where "long" is 32 bit but size_t is 64 bit.
> >Improve the code base in small steps, as small as possible.
> >What started with a small patch to replace "unsigned long" with size_t
> >in one file (convert.c) ended up with a change in many files.
> >
> >Signed-off-by: Torsten Bögershausen <tboegi@web.de>
> >---
> >
> >This needs to go on top of pu, to cover all the good stuff
> > cooking here.
>
> Better to work on top of 'master', as the work in 'pu' will be rewritten
> several times, probably.
>
> >I have started this series on November 1st, since that 2 or 3 rebases
> > had been done to catch up, and now it is on pu from November 15.
> >
> >I couldn't find a reason why changing "unsigned ling"
> > into "size_t" may break anything, any thoughts, please ?
>
> IIRC, the blocker for why we haven't done this already is that "size_t",
> "timestamp_t" and "off_t" are all 64-bit types that give more implied
> meaning, so we should swap types specifically to these as we want. One
> example series does a specific, small change [1].
>
> If we wanted to do a single swap that removes 'unsigned long' with an
> unambiguously 64-bit type, I would recommend "uint64_t". Later replacements
> to size_t, off_t, and timestamp_t could happen on a case-by-case basis for
> type clarity.
>
> It may benefit reviewers to split this change into multiple patches,
> starting at the lowest levels of the call stack (so higher 'unsigned long's
> can up-cast to the 64-bit types when calling a function) and focusing the
> changes to one or two files at a time (X.c and X.h, preferrably).
>
> Since you are talking about the benefits for Git for Windows to accept 4GB
> files, I wonder if we can add a test that verifies such a file will work. If
> you have such a test, then I could help verify that the test fails before
> the change and succeeds afterward.
>
> Finally, it may be good to add a coccinelle script that replaces 'unsigned
> long' with 'uint64_t' so we can easily fix any new introductions that happen
> in the future.
The plan was never to change "unsigned long" to a 64 bit value in general.
The usage of "unsigned long" (instead of size_t) was (and is) still good
for 32bit systems, where both are 32 bit. (at least all system I am aware of).
For 64 bit systems like Linux or Mac OS it is the same, both are 64 bit.
The only problematic system is Win64, where "unsigned long" is 32 bit,
and therefore we must use size_t to address data in memory.
This is not to be confused with off_t, which is used for "data on disk"
(and nothing else) or timestamp_t which is used for timestamps (and nothing else).
I haven't followed the "coccinelle script" development at all, if someone
makes a patch do replace "unsigned long" with size_t, that could replace
my whole patch. (Some of them may be downgraded to "unsigned int" ?)
However, as we need to let tb/print-size-t-with-uintmax-format make it to
master, otherwise we are not able to print the variables in a portable way.
> Thanks! I do think we should make this change, but we must be careful. It
> may be disruptive to topics in flight.
>
> -Stolee
>
> [1] https://public-inbox.org/git/20181112084031.11769-1-carenas@gmail.com/
>
next prev parent reply other threads:[~2018-11-19 18:04 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-11-17 15:11 [PATCH/RFC v1 1/1] Use size_t instead of unsigned long tboegi
2018-11-18 20:18 ` Derrick Stolee
2018-11-18 23:40 ` Junio C Hamano
2018-11-19 5:33 ` Torsten Bögershausen
2018-11-19 18:15 ` René Scharfe
2018-11-19 16:33 ` Torsten Bögershausen [this message]
2018-11-20 1:36 ` Junio C Hamano
2018-11-20 5:04 ` [PATCH v2 1/1] Use size_t instead of 'unsigned long' for data in memory tboegi
2018-11-21 11:55 ` Derrick Stolee
2019-01-16 21:46 ` Thomas Braun
2019-01-19 17:06 ` Torsten Bögershausen
2019-01-22 14:25 ` Thomas Braun
2019-04-13 15:18 ` [PATCH v3 " tboegi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20181119163323.GA15913@tor.lan \
--to=tboegi@web.de \
--cc=git@vger.kernel.org \
--cc=stolee@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).