Git development

Git development
 help / color / mirror / Atom feed

* Re: [PATCH 4/6] SubmittingPatches: document Based-on-patch-by trailer
From: Kristoffer Haugsbakk @ 2026-06-16 20:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <xmqqse6tnho1.fsf@gitster.g>

On Thu, Jun 11, 2026, at 18:52, Junio C Hamano wrote:
> kristofferhaugsbakk@fastmail.com writes:
>
>> +. `Based-on-patch-by:` can be used when someone else authored parts of
>> +  the patch that you are submitting. This might be relevant if someone
>> +  sent a patch to the mailing list without a commit message or a
>> +  `Signed-off-by:` and you have picked it up.
>
> Hmph, this seems to encourage pick up material that come outside of
> the usual DCO process, which should not be the intention of this
> document.

Oh, I have misread the room on this subject. It would be better to drop
the mention of signoff here.

>
> Unless the changes are trivial enough to not be copyrightable, it
> may be better to say "... if someone submitted a preliminary patch or
> a detailed code snippet with their sign-off", plus encourage asking
> the original author to sign-off if it initially came without, or
> something like that?

Okay, since they provided something concrete to copy (cf. Helped-by
where they did not provide precise changes in patch form, according to
the below context), it’s best to mention that signoff is relevant here.

>
>>  . `Helped-by:` is used to credit someone who suggested ideas for
>>    changes without providing the precise changes in patch form.
>>  . `Mentored-by:` is used to credit someone with helping develop a

^ permalink raw reply

* Re: [PATCH 2/6] SubmittingPatches: discuss non-ident trailers
From: Kristoffer Haugsbakk @ 2026-06-16 20:02 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git
In-Reply-To: <aivvE6gVMGWhRbCB@pks.im>

On Fri, Jun 12, 2026, at 13:35, Patrick Steinhardt wrote:
> On Thu, Jun 11, 2026 at 12:22:45AM +0200,
> kristofferhaugsbakk@fastmail.com wrote:
>> diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
>> index 0b12badf86d..51c308a89a8 100644
>> --- a/Documentation/SubmittingPatches
>> +++ b/Documentation/SubmittingPatches
>> @@ -474,7 +474,10 @@ These are the common trailers in use:
>>
>>  While you can also create your own trailer if the situation warrants it, we
>>  encourage you to instead use one of the common trailers in this project
>> -highlighted above.
>> +highlighted above. A trailer that credits someone might be more likely
>> +to be accepted since these are the most common ones. But another kind of
>> +trailer might be relevant, for example to link to an issue tracker
>> +belonging to a downstream project that is affected by a bug in Git.
>
> Hm, I wonder whether this is a bit too vague to really be helpful for a
> newcomer. Instead of alluding to such trailers, wouldn't it be
> preferable if we added those as actual examples to the list of known
> trailers and then tell folks that they can invent their own ones if
> there is a good reason to do so?

Honestly there are so few non-ident trailers that I don’t think they can
be listed as common trailers:

1. The Git project doesn’t need them (e.g. no bug tracker)
2. They seem mostly for use by other projects (bug trackers again)

With this list:

    git log --format='%(trailers:only,keyonly)' | sort | uniq

If you filter out the ident-looking ones:

    grep -v --extended-regexp -- '-[Bb]y$'

There are few left. And some can be discarded:

• Change-Id
• Message-ID
• Fixes (pointing to a commit)

So to address your point:

1. Maybe this is so niche that it is not worth mentioning; or
2. Maybe give a concrete example like `Closes: <bug link>`?

^ permalink raw reply

* Re: SHA-1/SHA-256 interoperability work is functional
From: Junio C Hamano @ 2026-06-16 20:01 UTC (permalink / raw)
  To: brian m. carlson; +Cc: git
In-Reply-To: <ajCWBG9RHBrm8jMZ@fruit.crustytoothpaste.net>

"brian m. carlson" <sandals@crustytoothpaste.net> writes:

> I'm pleased to announce that I have Git fully passing the testsuite and
> CI in interoperability mode, both with SHA-256 and SHA-1 as the main
> algorithm.  While this is very exciting, the work is not ready to send
> to the list and is effectively a draft, since there is still cleanup
> and efficiency work to be done.

Great to hear about a great milestone.

> Features which are currently unsupported (and which may or may not be
> supported in the future):
>
> * Filtered bundles are unsupported because there is currently no way to provide
> 	a mapping.
> * Multi-pack index cannot be used as the sole pack index format because it does
> 	not yet provide mappings.
> * Pack index v1 and v2 cannot be used because they do not provide object
> 	mappings.  Git automatically uses pack index v3 instead when necessary, which
> 	does handle mappings.

> * Packfile URIs are not supported because the protocol-provided packfile is not
> 	complete and its objects cannot be mapped.

Not that I specifically care about packfile URI, but this one is
curious.  How would regular "fetch" and "push" traffic work under
the new world order?  Presumably we will keep one characteristic of
the protocol, that the packdata stream is the only thing that is
given to the other side and no object names are given, because the
receiving end would not want to blindly trust the object name the
sending end _claims_ to have sent and instead recomputes the object
name out of the packed objects in the data stream ("if we rehash
and recompute the object names from the datastream, the other side
cannot lie to us" IIRC was a security measure).

For a regular "fetch" and "push" to work, we would need to recompute
the native object names and also somehow compute the compatibility
object names if we are in interoperability mode, no?

If we download *.pack files from a packfile URI, wouldn't it be the
same story?

> * Large object promisors cannot be used if the server does not actually have
> 	the entire history, since the server must have a complete history in order to
> 	provide object mappings.

Again, this one worries me a bit, but perhaps I am not reading it
correctly.  Does this mean that the server side says "this is the
data for object whose name is X in the SHA-1 world, which translates
to X256 in the SHA-256 world", the receiving end blindly trusts
without having a way to verify?

> There is new documentation in `Documentation/gitformat-hash.adoc` that
> outlines the requirements for using the protocol.  The protocol
> restrictions described there are hard technical limitations that cannot
> be avoided; I've intentionally made things as featureful as they can be.
> This imposes real restrictions on using protocol interoperability with many
> projects, including Git and Linux[0].  Interested parties may wish to look
> at t1017 to see what's tested vis-à-vis the protocol and
> interoperability.
> ...
> If you're interested in testing or perusing the work, you may get it
> from the `sha256-interop` branch of https://github.com/bk2204/git.git.
> Please note that it may be rebased, rewound, or otherwise folded,
> spindled, or mutilated at any time.

Sounds exciting.

Thanks.

^ permalink raw reply

* Re: Assisted-by tag
From: Kristoffer Haugsbakk @ 2026-06-16 19:53 UTC (permalink / raw)
  To: Marius Spix, git
In-Reply-To: <20260616212553.31ddea83@rockhopper>

On Tue, Jun 16, 2026, at 21:25, Marius Spix wrote:
> as the Linux kernel requires the new Assisted-by tag for AI-assisted
> commits, I was researching how git handles such tags. Thereby I
> observed the following behaviour:
>
> git commit --signoff
> * adds an empty line before the Signed-off-by tag
> * ignores the Signed-off-by tag by checking for an empty commit message

That a bare message which is just `Signed-off-by` is considered an
“empty” commit message seems like a historical quirk. It checks
specifically for that tag/trailer.

>
> git commit --trailer "\nAssisted-by: OpenAI"
> * does not add an empty line (the "\n" is not converted to a newline)
> * does not ignore the tag by checking for empty commit message

This is just a regular trailer. I don’t know why you have a `\n`.

    git commit --trailer "Assisted-by: OpenAI"

Any number of these will populate the trailer block.

> Since there will be more and more AI-assisted commits in projects like
> the Linux kernel in the future, this should be taken in account.
>
> When merging or squashing commits, that tag should also be
> automatically applied to the new commit message to make it clear that
> the commit is tainted by AI.

That `Signed-off-by` has dedicated options and logic is historical
baggage at this point.

A 2013 [patch] to add `git commit --fixes` because the Linux Kernel uses
`Fixes` was rejected because the Git project considers tags/trailers
project-specific. Instead git-interpret-trailers(1) was born which
eventually provided the code base for `git commit --trailer` and similar
options.

  [patch]: https://lore.kernel.org/all/20131027013402.GA7146@leaf/

Handling how trailers are added is also deemed project-specific. There
are only the configurations and options that git-interpret-trailers(1)
provides. And handling that commits are rewritten (combined and so on)
in such a way that trailer-taint sticks is beyond any discussion I’ve
seen on the subject.

^ permalink raw reply

* Assisted-by tag
From: Marius Spix @ 2026-06-16 19:25 UTC (permalink / raw)
  To: git

Hi there,

as the Linux kernel requires the new Assisted-by tag for AI-assisted
commits, I was researching how git handles such tags. Thereby I
observed the following behaviour:

git commit --signoff
* adds an empty line before the Signed-off-by tag
* ignores the Signed-off-by tag by checking for an empty commit message

git commit --trailer "\nAssisted-by: OpenAI"
* does not add an empty line (the "\n" is not converted to a newline)
* does not ignore the tag by checking for empty commit message

Since there will be more and more AI-assisted commits in projects like
the Linux kernel in the future, this should be taken in account.

When merging or squashing commits, that tag should also be
automatically applied to the new commit message to make it clear that
the commit is tainted by AI.

Your opinion?

Best regards

Marius

^ permalink raw reply

* Re: [PATCH v14 4/6] branch: add --prune-merged <branch>
From: Harald Nordgren @ 2026-06-16 19:15 UTC (permalink / raw)
  To: Phillip Wood
  Cc: Harald Nordgren via GitGitGadget, git, Kristoffer Haugsbakk,
	Johannes Sixt
In-Reply-To: <78b6dfdd-df61-4c44-96eb-b527cb26243c@gmail.com>

> > diff --git a/t/t3200-branch.sh b/t/t3200-branch.sh
> > index 4e7deddc04..27ea1319bb 100755
> > --- a/t/t3200-branch.sh
> > +++ b/t/t3200-branch.sh
> > @@ -1809,4 +1809,205 @@ test_expect_success '--forked requires a value' '
> >       test_grep "requires a value" err
> >   '
> >
> > +test_expect_success '--prune-merged: setup' '
> > +     test_create_repo pm-upstream &&
>
> The rest of this test would be easier to read if we did
>
>         (
>                 cd pm-upstream &&
>                 ...
>         )
>
> rather than prefixing every command with "-C pm-upstream"

I feel like the discussion to nest or not to nest has come up many
times in other topics as well. I don't feel strongly about either way,
but I just want to flag that if I change it now, another reviewer
might ask me to change it back later.

Should the rules be to nest inside of setup functions (and helpers?)
but not inside the actual tests?

> > +     test_commit -C pm-upstream base &&
> > +     git -C pm-upstream checkout -b next &&
> > +     test_commit -C pm-upstream one-commit &&
> > +     test_commit -C pm-upstream two-commit &&
> > +     git -C pm-upstream branch one HEAD~ &&
> > +     git -C pm-upstream branch two HEAD &&
> > +     git -C pm-upstream branch wip main &&
> > +     git -C pm-upstream checkout main &&
> > +     test_create_repo pm-fork
> > +'
> > +
> > +test_expect_success '--prune-merged deletes branches integrated into upstream' '
> > +     test_when_finished "rm -rf pm-merged" &&
> > +     git clone pm-upstream pm-merged &&
> > +     git -C pm-merged remote add fork ../pm-fork &&
> > +     test_config -C pm-merged remote.pushDefault fork &&
> > +     test_config -C pm-merged push.default current &&
>
> So we clone upstream and add fork as the default push remote. I find the
> pm- prefixes rather distracting. It would be clearer to me if we just
> called the repositories "upstream", "fork" and "repo"

Good point.

> > +     test_must_fail git -C pm-local rev-parse --verify refs/heads/one
> > +'
> > +
> > +test_expect_success '--prune-merged warns instead of erroring on un-integrated commits' '
> > +     test_when_finished "rm -rf pm-unmerged" &&
> > +     git clone pm-upstream pm-unmerged &&
> > +     git -C pm-unmerged remote add fork ../pm-fork &&
> > +     test_config -C pm-unmerged remote.pushDefault fork &&
> > +     test_config -C pm-unmerged push.default current &&
> > +     git -C pm-unmerged checkout -b wip origin/wip &&
> > +     git -C pm-unmerged branch --set-upstream-to=origin/next wip &&
> > +     test_commit -C pm-unmerged local-only &&
> > +     git -C pm-unmerged checkout - &&
> > +
> > +     git -C pm-unmerged branch --prune-merged "origin/*" 2>err &&
> > +     test_grep "not fully merged" err &&
> > +     test_grep ! "If you are sure you want to delete it" err &&
>
> I'm always suspicious of test_grep when we know what the output should
> look like - it might be better to use test_cmp. This test does not check
> that we also delete branches that are merged when we see one that isn't.
>
> I'm going to stop here - the tests I've read seem to me to be too much
> like unit tests checking one aspect of the implementation in isolation
> rather than checking that the whole feature works as expected.

I'll respond to the rest here: Excellent points regarding the testing
aboce, I will take a look at doing this.


Harald

^ permalink raw reply

* Re: [PATCH v14 6/6] branch: add --dry-run for --prune-merged
From: Harald Nordgren @ 2026-06-16 18:28 UTC (permalink / raw)
  To: Phillip Wood
  Cc: Harald Nordgren via GitGitGadget, git, Kristoffer Haugsbakk,
	Johannes Sixt
In-Reply-To: <7b43a0f1-32a0-40f0-8c82-d2ee78809cc2@gmail.com>

> > With --dry-run, --prune-merged prints the local branches it would
> > delete, one "Would delete branch <name>" line each, and exits
> > without touching any ref. The same filtering applies, so the output
> > is exactly the set that the real run would delete.
>
> I can see this being very useful.

Great to hear and thanks for taking the time to review this! Much appreciated!

> >   static int prune_merged_branches(int argc, const char **argv,
> > -                              int quiet)
> > +                              int quiet, int dry_run)
>
> Let's not start adding multiple boolean augments - use a flags argument
> like we do for delete_branches() - if you get feedback on one patch you
> should think about whether it applies later in the series as well. The
> rest of the implementation looks good.

I'm trying to generalize all feedback, but sometimes I miss things.
Thanks for pointing it out!


Harald

^ permalink raw reply

* Re: [GSoC Patch v5 2/4] rev-parse: use append_formatted_path() for path formatting
From: Phillip Wood @ 2026-06-16 18:26 UTC (permalink / raw)
  To: K Jayatheerth
  Cc: git, jltobler, lucasseikioshiro, gitster, phillip.wood, sandals,
	kumarayushjha123, a3205153416
In-Reply-To: <CA+rGoLfhhRNrSReeJ1grhy+2K3BSrikTCNgGpCaGqc4fFp3Lfg@mail.gmail.com>



On 16/06/2026 18:04, K Jayatheerth wrote:
> Hi Phillip,
> Thanks for taking a look!
> 
>> On 16/06/2026 05:49, K Jayatheerth wrote:
>>> The path-formatting logic in builtin/rev-parse.c is tightly coupled
>>> to that command and writes directly to stdout, making it impossible
>>> for other builtins to reuse.
>>>
>>> Extract the core algorithm into append_formatted_path() in path.c
>>> and expose a path_format enum in path.h so that any builtin can
>>> format paths consistently without duplicating logic.
>>
>> Sorry I haven't had time to look at this series recently, it is looking
>> much nicer now that we have a single enum. It would be helpful to
>> explain why we need PATH_FORMAT_DEFAULT that acts exactly like
>> PATH_FORMAT_UNMODIFIED. Looking at the next patch it seems this is still
>> a wart in the api due to rev-parse wanting needing to distinguish the
>> unmodified case from the default case.
> t);
>>> +
>>>    # ifdef USE_THE_REPOSITORY_VARIABLE
>>>    #  include "strbuf.h"
>>>    #  include "repository.h"
>>
> 
> 
>>>    int cmd_rev_parse(int argc,
>>> @@ -717,7 +661,7 @@ int cmd_rev_parse(int argc,
>>>        const char *name = NULL;
>>>        struct strbuf buf = STRBUF_INIT;
>>>        int seen_end_of_options = 0;
>>> -     enum format_type format = FORMAT_DEFAULT;
>>> +     enum path_format arg_path_format = PATH_FORMAT_DEFAULT;
>>
>> This is the source of the api wart I referred to in the previous patch.
>> Could we keep the existing enums and convert them into the appropriate
>> PATH_FORMAT_* flag in print_path() above? I think we already have the
>> logic to do that in the existing code. That would mean that other users
>> of append_formatted_path() don't have to worry about the extra flag.
>>
> 
> That is a much more elegant solution than the current one.
> 
> For v6, I will clean this up by keeping the fallback logic
> localized within builtin/rev-parse.c and removing
> PATH_FORMAT_DEFAULT entirely from enum path_format in path.h.
> 
> Instead, I'll re-introduce a small local enum (e.g., enum
> rev_parse_format) inside rev-parse.c to handle the
> command-line parsing state (tracking whether the user
> explicitly provided a flag or if we are still in a
> neutral/default state).

I think it is probably simplest to keep the existing enums and modify 
print_path() to convert them to the appropriate PATH_FORMAT_*. That way 
we can keep the option parsing code as is.

Thanks

Phillip

> As you said, most of the logic is already present. In
> print_path(), we will check that local tracking enum. If it’s
> set to the local default, we can map it directly to the
> path-specific def_format before invoking append_formatted_path().
> This ensures other users of the function don't have to worry
> about the extra flag.
> 
> I will send out the v6 series with these fixes shortly.
> 
> Regards,
> - K Jayatheerth


^ permalink raw reply

* [PATCH] sequencer: Skip copying notes for commits that disappear during rebase
From: Uwe Kleine-König @ 2026-06-16 17:40 UTC (permalink / raw)
  To: git; +Cc: Phillip Wood

When a commit disappears during rebase because the patch content is
already there (but not by the same patch in which case the commit would
be skipped) the notes of that disappearing commit should not be copied
to the unrelated commit that happens to be HEAD.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
---
Hello,

after also my 2nd bug report[1] didn't motivate anyone to come up with a
fix, I invested the time to work out one according to Phillip Wood's
suggestion.

IMHO it's not pretty, but it works for me.

Note that Phillip also suggested to integrete the test into
t3400-rebase.sh . IMHO it doesn't matter much if this is considered a
rebase test or a notes test. I kept it where I have it because I'm lazy
and failed to understand the git history created in that test.

Best regards
Uwe

[1] https://lore.kernel.org/git/20260612143952.3281115-2-u.kleine-koenig@baylibre.com


 sequencer.c             | 20 ++++++++++----------
 t/meson.build           |  1 +
 t/t3322-notes-rebase.sh | 37 +++++++++++++++++++++++++++++++++++++
 3 files changed, 48 insertions(+), 10 deletions(-)
 create mode 100755 t/t3322-notes-rebase.sh

diff --git a/sequencer.c b/sequencer.c
index 57855b0066ac..da2185a37c5d 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2263,7 +2263,7 @@ static const char *reflog_message(struct replay_opts *opts,
 static int do_pick_commit(struct repository *r,
 			  struct todo_item *item,
 			  struct replay_opts *opts,
-			  int final_fixup, int *check_todo)
+			  int final_fixup, int *check_todo, int *dropped_commit)
 {
 	struct replay_ctx *ctx = opts->ctx;
 	unsigned int flags = should_edit(opts) ? EDIT_MSG : 0;
@@ -2273,7 +2273,7 @@ static int do_pick_commit(struct repository *r,
 	const char *base_label, *next_label, *reflog_action;
 	char *author = NULL;
 	struct commit_message msg = { NULL, NULL, NULL, NULL };
-	int res, unborn = 0, reword = 0, allow, drop_commit;
+	int res, unborn = 0, reword = 0, allow;
 	enum todo_command command = item->command;
 	struct commit *commit = item->commit;
 
@@ -2492,7 +2492,7 @@ static int do_pick_commit(struct repository *r,
 		goto leave;
 	}
 
-	drop_commit = 0;
+	*dropped_commit = 0;
 	allow = allow_empty(r, opts, commit);
 	if (allow < 0) {
 		res = allow;
@@ -2500,7 +2500,7 @@ static int do_pick_commit(struct repository *r,
 	} else if (allow == 1) {
 		flags |= ALLOW_EMPTY;
 	} else if (allow == 2) {
-		drop_commit = 1;
+		*dropped_commit = 1;
 		refs_delete_ref(get_main_ref_store(r), "", "CHERRY_PICK_HEAD",
 				NULL, REF_NO_DEREF);
 		unlink(git_path_merge_msg(r));
@@ -2510,7 +2510,7 @@ static int do_pick_commit(struct repository *r,
 			_("dropping %s %s -- patch contents already upstream\n"),
 			oid_to_hex(&commit->object.oid), msg.subject);
 	} /* else allow == 0 and there's nothing special to do */
-	if (!opts->no_commit && !drop_commit) {
+	if (!opts->no_commit && !*dropped_commit) {
 		if (author || command == TODO_REVERT || (flags & AMEND_MSG))
 			res = do_commit(r, msg_file, author, reflog_action,
 					opts, flags,
@@ -4943,12 +4943,12 @@ static int pick_one_commit(struct repository *r,
 			   struct replay_opts *opts,
 			   int *check_todo, int* reschedule)
 {
-	int res;
+	int res, dropped_commit;
 	struct todo_item *item = todo_list->items + todo_list->current;
 	const char *arg = todo_item_get_arg(todo_list, item);
 
 	res = do_pick_commit(r, item, opts, is_final_fixup(todo_list),
-			     check_todo);
+			     check_todo, &dropped_commit);
 	if (is_rebase_i(opts) && res < 0) {
 		/* Reschedule */
 		*reschedule = 1;
@@ -4965,7 +4965,7 @@ static int pick_one_commit(struct repository *r,
 		return error_with_patch(r, commit,
 					arg, item->arg_len, opts, res, !res);
 	}
-	if (is_rebase_i(opts) && !res)
+	if (is_rebase_i(opts) && !res && !dropped_commit)
 		record_in_rewritten(&item->commit->object.oid,
 				    peek_command(todo_list, 1));
 	if (res && is_fixup(item->command)) {
@@ -5523,14 +5523,14 @@ static int single_pick(struct repository *r,
 		       struct commit *cmit,
 		       struct replay_opts *opts)
 {
-	int check_todo;
+	int check_todo, dummy;
 	struct todo_item item;
 
 	item.command = opts->action == REPLAY_PICK ?
 			TODO_PICK : TODO_REVERT;
 	item.commit = cmit;
 
-	return do_pick_commit(r, &item, opts, 0, &check_todo);
+	return do_pick_commit(r, &item, opts, 0, &check_todo, &dummy);
 }
 
 int sequencer_pick_revisions(struct repository *r,
diff --git a/t/meson.build b/t/meson.build
index c5832fee0535..6927bd9c794f 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -358,6 +358,7 @@ integration_tests = [
   't3311-notes-merge-fanout.sh',
   't3320-notes-merge-worktrees.sh',
   't3321-notes-stripspace.sh',
+  't3322-notes-rebase.sh',
   't3400-rebase.sh',
   't3401-rebase-and-am-rename.sh',
   't3402-rebase-merge.sh',
diff --git a/t/t3322-notes-rebase.sh b/t/t3322-notes-rebase.sh
new file mode 100755
index 000000000000..0eddde7f9961
--- /dev/null
+++ b/t/t3322-notes-rebase.sh
@@ -0,0 +1,37 @@
+#!/bin/sh
+
+test_description='Test notes on rebase'
+
+. ./test-lib.sh
+
+test_expect_success setup '
+	git init &&
+	git config notes.rewriteRef refs/notes/commits &&
+	git version > version &&
+	echo A > A &&
+	git add A &&
+	git commit -m A &&
+	git branch branch &&
+	echo B > B &&
+	git add B &&
+	git commit -m B &&
+	git notes add -m "This is B" @ &&
+	echo C > C &&
+	git add C &&
+	git commit -m C &&
+	git checkout branch &&
+	echo B > B &&
+	echo D > D &&
+	git add B D &&
+	git commit -m BD
+'
+
+test_expect_success 'rebase B + C on top of BD' '
+	git rebase @ master
+'
+
+test_expect_success 'assert there is no note on BD' '
+	if git notes list branch >/tmp/lalaa; then return 1; fi
+'
+
+test_done

base-commit: 3e65291872de10c3f0bf05ea8c24187e7a71ebf0
-- 
2.47.3


^ permalink raw reply related

* Re: [PATCH] rebase: mention --abort alongside --continue
From: Junio C Hamano @ 2026-06-16 17:33 UTC (permalink / raw)
  To: Phillip Wood; +Cc: Harald Nordgren via GitGitGadget, git, Harald Nordgren
In-Reply-To: <89d72342-5aa1-4dcf-951b-d0c791f91738@gmail.com>

Phillip Wood <phillip.wood123@gmail.com> writes:

> Hi Harald
>
> On 15/06/2026 20:19, Harald Nordgren via GitGitGadget wrote:
>> From: Harald Nordgren <haraldnordgren@gmail.com>
>> 
>> The warning shown when an "exec" step fails and the "git status"
>> advice while splitting or editing a commit pointed users at "git
>> rebase --continue" but not "--abort". Mention it in both, matching
>> the conflict case.
>
> I'm not sure that the "failed exec" and "conflicts" cases are equivalent 
> though. If you have some nasty conflict that you don't want to resolve 
> then aborting and trying another approach such is incrementally rebasing 
> is the only option. If an exec command fails then it likely means that a 
> test has failed or some something similar which is minor inconvenience 
> which needs fixing before continuing - it seems very unlikely that the 
> user would want to abort the rebase.

It is very true that users who know what they are doing and got into
such conflicts are opted to go into such a situation tnat it is
unlikely that they would appreciate a choice to abort.

But given that for any system, everybody starts as a newbie, it may
be assuring to always give "here is a way out" option when they get
in a nasty confusing situation.  Discouraging the way to use the
tool that can lead to confusing situation by guiding them with BCP
workflows would help, but they always get into pitfall.

The patch adds new message into the existing message to suggest how
to move forward, but as a training wheel option, it may not be a bad
thing to offer "--abort" as an extra hint, separate from the
existing warning() message.

^ permalink raw reply

* Re: [GSoC Patch v5 2/4] rev-parse: use append_formatted_path() for path formatting
From: K Jayatheerth @ 2026-06-16 17:04 UTC (permalink / raw)
  To: Phillip Wood
  Cc: git, jltobler, lucasseikioshiro, gitster, phillip.wood, sandals,
	kumarayushjha123, a3205153416
In-Reply-To: <0077b1ae-3c85-4b34-a0ac-766395157c4f@gmail.com>

Hi Phillip,
Thanks for taking a look!

> On 16/06/2026 05:49, K Jayatheerth wrote:
> > The path-formatting logic in builtin/rev-parse.c is tightly coupled
> > to that command and writes directly to stdout, making it impossible
> > for other builtins to reuse.
> >
> > Extract the core algorithm into append_formatted_path() in path.c
> > and expose a path_format enum in path.h so that any builtin can
> > format paths consistently without duplicating logic.
>
> Sorry I haven't had time to look at this series recently, it is looking
> much nicer now that we have a single enum. It would be helpful to
> explain why we need PATH_FORMAT_DEFAULT that acts exactly like
> PATH_FORMAT_UNMODIFIED. Looking at the next patch it seems this is still
> a wart in the api due to rev-parse wanting needing to distinguish the
> unmodified case from the default case.
t);
> > +
> >   # ifdef USE_THE_REPOSITORY_VARIABLE
> >   #  include "strbuf.h"
> >   #  include "repository.h"
>

> >   int cmd_rev_parse(int argc,
> > @@ -717,7 +661,7 @@ int cmd_rev_parse(int argc,
> >       const char *name = NULL;
> >       struct strbuf buf = STRBUF_INIT;
> >       int seen_end_of_options = 0;
> > -     enum format_type format = FORMAT_DEFAULT;
> > +     enum path_format arg_path_format = PATH_FORMAT_DEFAULT;
>
> This is the source of the api wart I referred to in the previous patch.
> Could we keep the existing enums and convert them into the appropriate
> PATH_FORMAT_* flag in print_path() above? I think we already have the
> logic to do that in the existing code. That would mean that other users
> of append_formatted_path() don't have to worry about the extra flag.
>

That is a much more elegant solution than the current one.

For v6, I will clean this up by keeping the fallback logic
localized within builtin/rev-parse.c and removing
PATH_FORMAT_DEFAULT entirely from enum path_format in path.h.

Instead, I'll re-introduce a small local enum (e.g., enum
rev_parse_format) inside rev-parse.c to handle the
command-line parsing state (tracking whether the user
explicitly provided a flag or if we are still in a
neutral/default state).

As you said, most of the logic is already present. In
print_path(), we will check that local tracking enum. If it’s
set to the local default, we can map it directly to the
path-specific def_format before invoking append_formatted_path().
This ensures other users of the function don't have to worry
about the extra flag.

I will send out the v6 series with these fixes shortly.

Regards,
- K Jayatheerth

^ permalink raw reply

* Re: [PATCH v5 2/2] graph: indent visual root in graph
From: Junio C Hamano @ 2026-06-16 16:36 UTC (permalink / raw)
  To: Pablo Sabater
  Cc: Taylor Blau, git, ayu.chandekar, chandrapratap3519,
	christian.couder, jltobler, karthik.188, peff, phillip.wood,
	siddharthasthana31
In-Reply-To: <CAN5EUNR-o_sLzeWuy7M9UMFHBKxSuytNd=4p2svtFuv40E8vZg@mail.gmail.com>

Pablo Sabater <pabloosabaterr@gmail.com> writes:

> Back to the test:
>
>   * 41_octopus
>   | * 43_B
>   |  \
>   |   * 43_A
>   | * 42_B
>   | * 42_A
>   * 41_B
>   * 41_A
>
> 43_A is rendered on the second column (first column is active by the
> 41_* branch) and gets indented to the third one. With commit-graph it
> would be on the first and get indented to the second, making it the
> same as more general tests above in "t4218", it is an edge case but
> shows that indentation works correctly independently where the visual
> root is.

Sounds good.

^ permalink raw reply

* Re: [PATCH v2 0/6] Support hashing objects larger than 4GB on Windows
From: Junio C Hamano @ 2026-06-16 16:09 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Philip Oakley, Patrick Steinhardt, Johannes Schindelin
In-Reply-To: <pull.2138.v2.git.1781621398.gitgitgadget@gmail.com>

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> Changes since v1:
>
>  * Rebased to current master to resolve the conflicts with
>    ps/odb-source-loose

Very much appreciated.

>  * Dropped the !LONG_IS_64BIT prereq from the added/touched tests, as it is
>    now no longer needed

Good thing to do and see that the code works as well as it could,
whether a long is 32-bit or 64-bit ;-).

> Philip Oakley (6):
>   hash-object: demonstrate a >4GB/LLP64 problem
>   object-file.c: use size_t for header lengths
>   hash algorithms: use size_t for section lengths
>   hash-object --stdin: verify that it works with >4GB/LLP64
>   hash-object: add another >4GB/LLP64 test case
>   hash-object: add a >4GB/LLP64 test case using filtered input
>
>  object-file.c          | 14 +++++++-------
>  object-file.h          |  6 +++---
>  odb/source-files.c     |  2 +-
>  odb/source-inmemory.c  |  2 +-
>  odb/source-loose.c     |  4 ++--
>  odb/source.h           |  2 +-
>  sha1dc_git.c           |  3 +--
>  sha1dc_git.h           |  2 +-
>  t/t1007-hash-object.sh | 39 +++++++++++++++++++++++++++++++++++++++
>  9 files changed, 56 insertions(+), 18 deletions(-)

Will queue.  Thanks.

>
>
> base-commit: 700432b2ba22603a0bcb71475c9c333d17c9b0d1
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2138%2Fdscho%2FPhilipOakley%2Fhashliteral_t-v2
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2138/dscho/PhilipOakley/hashliteral_t-v2
> Pull-Request: https://github.com/gitgitgadget/git/pull/2138
>
> Range-diff vs v1:
>
>  1:  84e1cd0aa0 = 1:  9c01bac407 hash-object: demonstrate a >4GB/LLP64 problem
>  2:  809d83e46f ! 2:  aa5859c14f object-file.c: use size_t for header lengths
>      @@ Commit message

By the way, how is range-diff driven via GGG?  After applying these
patches on the same base commit, my "git range-diff v1...v2" invocation
punts on matching step 2 and I do not get a comparison like this
unless I give --creation-factor=<large number> from the command line.



^ permalink raw reply

* Re: [PATCH v3] read_gitfile(): simplify NOT_A_REPO error message
From: Junio C Hamano @ 2026-06-16 15:48 UTC (permalink / raw)
  To: Jeff King; +Cc: Patrick Steinhardt, git, Tian Yuchen
In-Reply-To: <20260616144554.GA2305974@coredump.intra.peff.net>

Jeff King <peff@peff.net> writes:

> On Tue, Jun 16, 2026 at 07:25:02AM -0700, Junio C Hamano wrote:
>
>> >     +@@ t/t7450-bad-git-dotfiles.sh: test_expect_success 'git dirs of sibling submodules must not be nested' '
>> >     + test_expect_success 'submodule git dir nesting detection must work with parallel cloning' '
>> >     + 	test_must_fail git clone --recurse-submodules --jobs=2 nested clone_parallel 2>err &&
>> >     + 	cat err &&
>> >     +-	grep -E "(already exists|is inside git dir|not a git repository)" err &&
>> >     ++	grep -E "(already exists|is inside git dir|does not point to a valid repository)" err &&
>> 
>> A few things.
>> 
>>  * Will we be happy to see only one of these possibilities, or do we
>>    expect to see these once for each kind?
>
> I imagine it is only one. This all comes from 9cf8547320 (clone: prevent
> clashing git dirs when cloning submodule in parallel, 2024-01-28), and
> it is expecting the nested path to cause a failure. Which failure I
> guess depends on the racy ordering. If we create the inner one first,
> then we probably get "already exists", and if the outer one, then "is
> inside git dir". I don't know exactly what sequence yields the
> NOT_A_REPO message.
>
> But none of that is changing in this patch, just what the user-visible
> text is for the NOT_A_REPO case.
>
> I did briefly wonder if we might see "not a git repository" from a
> _different_ code path, and need to catch it along with the new message.
> But running successfully with --stress implies that we never see the old
> one anymore.

I see.  Thanks.

>
>>  * a recently started in-flight topic tries to catch bare "grep" and
>>    fails until you write test_grep X-<.
>
> Yeah. This will create a merge conflict for you, but hopefully the
> resolution should be obvious. I don't think it makes sense to fix here,
> as it's orthogonal to the purpose of the patch.

Yup.  I agree that, given that others in the same script will be
updated by that other topic to conflict with this change, it would
not make sense to do the same changes here.

Thanks.

^ permalink raw reply

* Re: [PATCH 4/4] builtin/refs: add "rename" subcommand
From: Junio C Hamano @ 2026-06-16 14:53 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git
In-Reply-To: <20260616-pks-refs-writing-subcommands-v1-4-9f5219b6109d@pks.im>

Patrick Steinhardt <ps@pks.im> writes:

> +static int cmd_refs_rename(int argc, const char **argv, const char *prefix,
> +			   struct repository *repo)
> +{
> +	static char const * const refs_rename_usage[] = {
> +		REFS_RENAME_USAGE,
> +		NULL
> +	};
> +	const char *message = NULL;
> +	struct option opts[] = {
> +		OPT_STRING(0, "message", &message, N_("reason"),
> +			   N_("reason of the update")),
> +		OPT_END(),
> +	};
> +	const char *oldref, *newref;
> +
> +	argc = parse_options(argc, argv, prefix, opts, refs_rename_usage, 0);
> +	if (argc != 2)
> +		usage(_("rename requires old and new reference name"));
> +	if (message && !*message)
> +		die(_("refusing to perform update with empty message"));
> +
> +	oldref = argv[0];
> +	newref = argv[1];
> +
> +	if (check_refname_format(oldref, 0))
> +		die(_("invalid ref format: %s"), oldref);
> +	if (check_refname_format(newref, 0))
> +		die(_("invalid ref format: %s"), newref);

Do we want to quote the value?  What other subcommands do in "git refs"?

> +	if (!refs_ref_exists(get_main_ref_store(repo), oldref))
> +		die(_("reference does not exist: '%s'"), oldref);
> +	if (refs_ref_exists(get_main_ref_store(repo), newref))
> +		die(_("reference already exists: '%s'"), newref);
> +
> +	return refs_rename_ref(get_main_ref_store(repo), oldref, newref, message);
> +}

I suspect that my version shared the same issue, but doesn't
refs_rename_ref() return -1 for failure, which we may want to turn
to positive 1 before returning?

This is a tangent but git.c:handle_builtin() that calls
git.c:run_builtin() may want to do the "negative return? flip the
polarity" conversion to make this worry go away.  I dunno what such
a change would break, though.

If we rename a ref that does not have a reflog, would it leave the
ref under the new name without reflog, or would we get a reflog with
a single entry that marks the fact the old ref was renamed into the
new ref?  Should that be controlled via --create-reflog option?

^ permalink raw reply

* Re: [PATCH 3/4] builtin/refs: add "update" subcommand
From: Junio C Hamano @ 2026-06-16 14:52 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git
In-Reply-To: <20260616-pks-refs-writing-subcommands-v1-3-9f5219b6109d@pks.im>

Patrick Steinhardt <ps@pks.im> writes:

>  git refs delete [--message=<reason>] [--no-deref] <ref> [<oldvalue>]
> +git refs update [--message=<reason>] [--no-deref] [--create-reflog] <ref> <new-value> [<old-value>]

"<old-value> vs <new-value>" is good, we should update "delete" to
use "<old-value>" to match.


>  DESCRIPTION
>  -----------
> @@ -58,6 +59,12 @@ delete::
>  	reference is only deleted after verifying that it currently contains
>  	`<oldvalue>`.
>  
> +update::
> +	Update the given reference to point at `<new-value>`. This subcommand
> +	mirrors `git update-ref` (see linkgit:git-update-ref[1]). When
> +	`<old-value>` is given, the reference is only updated after verifying
> +	that it currently contains `<old-value>`.

As to the lack of "create", among the two potential changes, I have
a slight preference for adding "create" and failing "update" that
does not refer to an existing ref.  If we go that route, the
"--create-reflog" option should move to "create", as "update" will
never be used to create a new ref.


> +	if (repo_get_oid_with_flags(repo, argv[1], &newoid,
> +				    GET_OID_SKIP_AMBIGUITY_CHECK))
> +		die(_("invalid new object ID: %s"), argv[1]);
> +	if (argc == 3 &&
> +	    repo_get_oid_with_flags(repo, argv[2], &oldoid,
> +				    GET_OID_SKIP_AMBIGUITY_CHECK))
> +		die(_("invalid old object ID: %s"), argv[2]);

On the "delete" side, these messages quote the object name, i.e.,

			die(_("invalid old object ID: '%s'"), argv[1]);

We should be consistent.

^ permalink raw reply

* [PATCH v2 6/6] hash-object: add a >4GB/LLP64 test case using filtered input
From: Philip Oakley via GitGitGadget @ 2026-06-16 14:49 UTC (permalink / raw)
  To: git; +Cc: Philip Oakley, Patrick Steinhardt, Johannes Schindelin,
	Philip Oakley
In-Reply-To: <pull.2138.v2.git.1781621398.gitgitgadget@gmail.com>

From: Philip Oakley <philipoakley@iee.email>

To verify that the `clean` side of the `clean`/`smudge` filter code is
correct with regards to LLP64 (read: to ensure that `size_t` is used
instead of `unsigned long`), here is a test case using a trivial filter,
specifically _not_ writing anything to the object store to limit the
scope of the test case.

As in previous commits, the `big` file from previous test cases is
reused if available, to save setup time, otherwise re-generated.

Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t1007-hash-object.sh | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh
index f96c29ce68..4bc82dd968 100755
--- a/t/t1007-hash-object.sh
+++ b/t/t1007-hash-object.sh
@@ -285,4 +285,16 @@ test_expect_success EXPENSIVE,SIZE_T_IS_64BIT \
 	test_cmp expect actual
 '
 
+# This clean filter does nothing, other than excercising the interface.
+# We ensure that cleaning doesn't mangle large files on 64-bit Windows.
+test_expect_success EXPENSIVE,SIZE_T_IS_64BIT \
+		'hash filtered files over 4GB correctly' '
+	{ test -f big || test-tool genzeros $((5*1024*1024*1024)) >big; } &&
+	test_oid large5GB >expect &&
+	test_config filter.null-filter.clean "cat" &&
+	echo "big filter=null-filter" >.gitattributes &&
+	git hash-object -- big >actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
gitgitgadget

^ permalink raw reply related

* [PATCH v2 5/6] hash-object: add another >4GB/LLP64 test case
From: Philip Oakley via GitGitGadget @ 2026-06-16 14:49 UTC (permalink / raw)
  To: git; +Cc: Philip Oakley, Patrick Steinhardt, Johannes Schindelin,
	Philip Oakley
In-Reply-To: <pull.2138.v2.git.1781621398.gitgitgadget@gmail.com>

From: Philip Oakley <philipoakley@iee.email>

To complement the `--stdin` and `--literally` test cases that verify
that we can hash files larger than 4GB on 64-bit platforms using the
LLP64 data model, here is a test case that exercises `hash-object`
_without_ any options.

Just as before, we use the `big` file from the previous test case if it
exists to save on setup time, otherwise generate it.

Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t1007-hash-object.sh | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh
index bcae3fc54c..f96c29ce68 100755
--- a/t/t1007-hash-object.sh
+++ b/t/t1007-hash-object.sh
@@ -277,4 +277,12 @@ test_expect_success EXPENSIVE,SIZE_T_IS_64BIT \
 	test_cmp expect actual
 '
 
+test_expect_success EXPENSIVE,SIZE_T_IS_64BIT \
+		'files over 4GB hash correctly' '
+	{ test -f big || test-tool genzeros $((5*1024*1024*1024)) >big; } &&
+	test_oid large5GB >expect &&
+	git hash-object -- big >actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH v2 4/6] hash-object --stdin: verify that it works with >4GB/LLP64
From: Philip Oakley via GitGitGadget @ 2026-06-16 14:49 UTC (permalink / raw)
  To: git; +Cc: Philip Oakley, Patrick Steinhardt, Johannes Schindelin,
	Philip Oakley
In-Reply-To: <pull.2138.v2.git.1781621398.gitgitgadget@gmail.com>

From: Philip Oakley <philipoakley@iee.email>

Just like the `hash-object --literally` code path, the `--stdin` code
path also needs to use `size_t` instead of `unsigned long` to represent
memory sizes, otherwise it would cause problems on platforms using the
LLP64 data model (such as Windows).

To limit the scope of the test case, the object is explicitly not
written to the object store, nor are any filters applied.

The `big` file from the previous test case is reused to save setup time;
To avoid relying on that side effect, it is generated if it does not
exist (e.g. when running via `sh t1007-*.sh --long --run=1,41`).

Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t1007-hash-object.sh | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh
index f028a1cbcc..bcae3fc54c 100755
--- a/t/t1007-hash-object.sh
+++ b/t/t1007-hash-object.sh
@@ -269,4 +269,12 @@ test_expect_success EXPENSIVE,SIZE_T_IS_64BIT \
 	test_cmp expect actual
 '

+test_expect_success EXPENSIVE,SIZE_T_IS_64BIT \
+		'files over 4GB hash correctly via --stdin' '
+	{ test -f big || test-tool genzeros $((5*1024*1024*1024)) >big; } &&
+	test_oid large5GB >expect &&
+	git hash-object --stdin <big >actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
gitgitgadget

^ permalink raw reply related

* [PATCH v2 3/6] hash algorithms: use size_t for section lengths
From: Philip Oakley via GitGitGadget @ 2026-06-16 14:49 UTC (permalink / raw)
  To: git; +Cc: Philip Oakley, Patrick Steinhardt, Johannes Schindelin,
	Philip Oakley
In-Reply-To: <pull.2138.v2.git.1781621398.gitgitgadget@gmail.com>

From: Philip Oakley <philipoakley@iee.email>

Continue walking the code path for the >4GB `hash-object --literally`
test to the hash algorithm step for LLP64 systems.

This patch lets the SHA1DC code use `size_t`, making it compatible with
LLP64 data models (as used e.g. by Windows).

The interested reader of this patch will note that we adjust the
signature of the `git_SHA1DCUpdate()` function without updating _any_
call site. This certainly puzzled at least one reviewer already, so here
is an explanation:

This function is never called directly, but always via the macro
`platform_SHA1_Update`, which is usually called via the macro
`git_SHA1_Update`. However, we never call `git_SHA1_Update()` directly
in `struct git_hash_algo`. Instead, we call `git_hash_sha1_update()`,
which is defined thusly:

    static void git_hash_sha1_update(git_hash_ctx *ctx,
                                     const void *data, size_t len)
    {
        git_SHA1_Update(&ctx->sha1, data, len);
    }

i.e. it contains an implicit downcast from `size_t` to `unsigned long`
(before this here patch). With this patch, there is no downcast anymore.

With this patch, finally, the t1007-hash-object.sh "files over 4GB hash
literally" test case is fixed.

Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 object-file.c          | 4 ++--
 sha1dc_git.c           | 3 +--
 sha1dc_git.h           | 2 +-
 t/t1007-hash-object.sh | 2 +-
 4 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/object-file.c b/object-file.c
index dccbe0fc3e..0056c369ce 100644
--- a/object-file.c
+++ b/object-file.c
@@ -316,7 +316,7 @@ int parse_loose_header(const char *hdr, struct object_info *oi)
 }
 
 static void hash_object_body(const struct git_hash_algo *algo, struct git_hash_ctx *c,
-			     const void *buf, unsigned long len,
+			     const void *buf, size_t len,
 			     struct object_id *oid,
 			     char *hdr, size_t *hdrlen)
 {
@@ -336,7 +336,7 @@ void write_object_file_prepare(const struct git_hash_algo *algo,
 	/* Generate the header */
 	*hdrlen = format_object_header(hdr, *hdrlen, type, len);
 
-	/* Sha1.. */
+	/* Hash (function pointers) computation */
 	hash_object_body(algo, &c, buf, len, oid, hdr, hdrlen);
 }
 
diff --git a/sha1dc_git.c b/sha1dc_git.c
index 9b675a046e..fe58d7962a 100644
--- a/sha1dc_git.c
+++ b/sha1dc_git.c
@@ -27,10 +27,9 @@ void git_SHA1DCFinal(unsigned char hash[20], SHA1_CTX *ctx)
 /*
  * Same as SHA1DCUpdate, but adjust types to match git's usual interface.
  */
-void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *vdata, unsigned long len)
+void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *vdata, size_t len)
 {
 	const char *data = vdata;
-	/* We expect an unsigned long, but sha1dc only takes an int */
 	while (len > INT_MAX) {
 		SHA1DCUpdate(ctx, data, INT_MAX);
 		data += INT_MAX;
diff --git a/sha1dc_git.h b/sha1dc_git.h
index f6f880cabe..0bcf1aa84b 100644
--- a/sha1dc_git.h
+++ b/sha1dc_git.h
@@ -15,7 +15,7 @@ void git_SHA1DCInit(SHA1_CTX *);
 #endif
 
 void git_SHA1DCFinal(unsigned char [20], SHA1_CTX *);
-void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *data, unsigned long len);
+void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *data, size_t len);
 
 #define platform_SHA_IS_SHA1DC /* used by "test-tool sha1-is-sha1dc" */
 
diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh
index 7867fd1dbf..f028a1cbcc 100755
--- a/t/t1007-hash-object.sh
+++ b/t/t1007-hash-object.sh
@@ -261,7 +261,7 @@ test_expect_success '--stdin outside of repository (uses default hash)' '
 	test_cmp expect actual
 '
 
-test_expect_failure EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
+test_expect_success EXPENSIVE,SIZE_T_IS_64BIT \
 		'files over 4GB hash literally' '
 	test-tool genzeros $((5*1024*1024*1024)) >big &&
 	test_oid large5GB >expect &&
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH v2 2/6] object-file.c: use size_t for header lengths
From: Philip Oakley via GitGitGadget @ 2026-06-16 14:49 UTC (permalink / raw)
  To: git; +Cc: Philip Oakley, Patrick Steinhardt, Johannes Schindelin,
	Philip Oakley
In-Reply-To: <pull.2138.v2.git.1781621398.gitgitgadget@gmail.com>

From: Philip Oakley <philipoakley@iee.email>

Continue walking the code path for the >4GB `hash-object --literally`
test. The `hash_object_file_literally()` function internally uses both
`hash_object_file()` and `write_object_file_prepare()`. Both function
signatures use `unsigned long` rather than `size_t` for the mem buffer
sizes. Use `size_t` instead, for LLP64 compatibility.

While at it, convert those function's object's header buffer length to
`size_t` for consistency. The value is already upcast to `uintmax_t` for
print format compatibility.

Note: The hash-object test still does not pass. A subsequent commit
continues to walk the call tree's lower level hash functions to identify
further fixes.

Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 object-file.c         | 10 +++++-----
 object-file.h         |  6 +++---
 odb/source-files.c    |  2 +-
 odb/source-inmemory.c |  2 +-
 odb/source-loose.c    |  4 ++--
 odb/source.h          |  2 +-
 6 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/object-file.c b/object-file.c
index 9afa842da2..dccbe0fc3e 100644
--- a/object-file.c
+++ b/object-file.c
@@ -318,7 +318,7 @@ int parse_loose_header(const char *hdr, struct object_info *oi)
 static void hash_object_body(const struct git_hash_algo *algo, struct git_hash_ctx *c,
 			     const void *buf, unsigned long len,
 			     struct object_id *oid,
-			     char *hdr, int *hdrlen)
+			     char *hdr, size_t *hdrlen)
 {
 	algo->init_fn(c);
 	git_hash_update(c, hdr, *hdrlen);
@@ -327,9 +327,9 @@ static void hash_object_body(const struct git_hash_algo *algo, struct git_hash_c
 }
 
 void write_object_file_prepare(const struct git_hash_algo *algo,
-			       const void *buf, unsigned long len,
+			       const void *buf, size_t len,
 			       enum object_type type, struct object_id *oid,
-			       char *hdr, int *hdrlen)
+			       char *hdr, size_t *hdrlen)
 {
 	struct git_hash_ctx c;
 
@@ -472,11 +472,11 @@ out:
 }
 
 void hash_object_file(const struct git_hash_algo *algo, const void *buf,
-		      unsigned long len, enum object_type type,
+		      size_t len, enum object_type type,
 		      struct object_id *oid)
 {
 	char hdr[MAX_HEADER_LEN];
-	int hdrlen = sizeof(hdr);
+	size_t hdrlen = sizeof(hdr);
 
 	write_object_file_prepare(algo, buf, len, type, oid, hdr, &hdrlen);
 }
diff --git a/object-file.h b/object-file.h
index 528c4e6e69..4c87cd160b 100644
--- a/object-file.h
+++ b/object-file.h
@@ -131,12 +131,12 @@ int finalize_object_file_flags(struct repository *repo,
 			       enum finalize_object_file_flags flags);
 
 void hash_object_file(const struct git_hash_algo *algo, const void *buf,
-		      unsigned long len, enum object_type type,
+		      size_t len, enum object_type type,
 		      struct object_id *oid);
 void write_object_file_prepare(const struct git_hash_algo *algo,
-			       const void *buf, unsigned long len,
+			       const void *buf, size_t len,
 			       enum object_type type, struct object_id *oid,
-			       char *hdr, int *hdrlen);
+			       char *hdr, size_t *hdrlen);
 int write_loose_object(struct odb_source_loose *loose,
 		       const struct object_id *oid, char *hdr,
 		       int hdrlen, const void *buf, unsigned long len,
diff --git a/odb/source-files.c b/odb/source-files.c
index 5bdd042922..3b1261eba9 100644
--- a/odb/source-files.c
+++ b/odb/source-files.c
@@ -159,7 +159,7 @@ static int odb_source_files_freshen_object(struct odb_source *source,
 }
 
 static int odb_source_files_write_object(struct odb_source *source,
-					 const void *buf, unsigned long len,
+					 const void *buf, size_t len,
 					 enum object_type type,
 					 struct object_id *oid,
 					 struct object_id *compat_oid,
diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index e004566d76..7f1b6f4636 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -227,7 +227,7 @@ static int odb_source_inmemory_count_objects(struct odb_source *source,
 }
 
 static int odb_source_inmemory_write_object(struct odb_source *source,
-					    const void *buf, unsigned long len,
+					    const void *buf, size_t len,
 					    enum object_type type,
 					    struct object_id *oid,
 					    struct object_id *compat_oid UNUSED,
diff --git a/odb/source-loose.c b/odb/source-loose.c
index 7d7ea2fb84..4cfa5b23fc 100644
--- a/odb/source-loose.c
+++ b/odb/source-loose.c
@@ -591,7 +591,7 @@ static int odb_source_loose_freshen_object(struct odb_source *source,
 }
 
 static int odb_source_loose_write_object(struct odb_source *source,
-					 const void *buf, unsigned long len,
+					 const void *buf, size_t len,
 					 enum object_type type, struct object_id *oid,
 					 struct object_id *compat_oid_in,
 					 enum odb_write_object_flags flags)
@@ -601,7 +601,7 @@ static int odb_source_loose_write_object(struct odb_source *source,
 	const struct git_hash_algo *compat = source->odb->repo->compat_hash_algo;
 	struct object_id compat_oid;
 	char hdr[MAX_HEADER_LEN];
-	int hdrlen = sizeof(hdr);
+	size_t hdrlen = sizeof(hdr);
 
 	/* Generate compat_oid */
 	if (compat) {
diff --git a/odb/source.h b/odb/source.h
index 2192a101b8..1c65a05e2c 100644
--- a/odb/source.h
+++ b/odb/source.h
@@ -199,7 +199,7 @@ struct odb_source {
 	 * return 0 on success, a negative error code otherwise.
 	 */
 	int (*write_object)(struct odb_source *source,
-			    const void *buf, unsigned long len,
+			    const void *buf, size_t len,
 			    enum object_type type,
 			    struct object_id *oid,
 			    struct object_id *compat_oid,
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH v2 1/6] hash-object: demonstrate a >4GB/LLP64 problem
From: Philip Oakley via GitGitGadget @ 2026-06-16 14:49 UTC (permalink / raw)
  To: git; +Cc: Philip Oakley, Patrick Steinhardt, Johannes Schindelin,
	Philip Oakley
In-Reply-To: <pull.2138.v2.git.1781621398.gitgitgadget@gmail.com>

From: Philip Oakley <philipoakley@iee.email>

On LLP64 systems, such as Windows, the size of `long`, `int`, etc. is
only 32 bits (for backward compatibility). Git's use of `unsigned long`
for file memory sizes in many places, rather than size_t, limits the
handling of large files on LLP64 systems (commonly given as `>4GB`).

Provide a minimum test for handling a >4GB file. The `hash-object`
command, with the  `--literally` and without `-w` option avoids
writing the object, either loose or packed. This avoids the code paths
hitting the `bigFileThreshold` config test code, the zlib code, and the
pack code.

Subsequent patches will walk the test's call chain, converting types to
`size_t` (which is larger in LLP64 data models) where appropriate.

Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t1007-hash-object.sh | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh
index de076293b6..7867fd1dbf 100755
--- a/t/t1007-hash-object.sh
+++ b/t/t1007-hash-object.sh
@@ -49,6 +49,9 @@ test_expect_success 'setup' '

 	example sha1:ddd3f836d3e3fbb7ae289aa9ae83536f76956399
 	example sha256:b44fe1fe65589848253737db859bd490453510719d7424daab03daf0767b85ae
+
+	large5GB sha1:0be2be10a4c8764f32c4bf372a98edc731a4b204
+	large5GB sha256:dc18ca621300c8d3cfa505a275641ebab00de189859e022a975056882d313e64
 	EOF
 '

@@ -258,4 +261,12 @@ test_expect_success '--stdin outside of repository (uses default hash)' '
 	test_cmp expect actual
 '

+test_expect_failure EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
+		'files over 4GB hash literally' '
+	test-tool genzeros $((5*1024*1024*1024)) >big &&
+	test_oid large5GB >expect &&
+	git hash-object --stdin --literally <big >actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
gitgitgadget

^ permalink raw reply related

* [PATCH v2 0/6] Support hashing objects larger than 4GB on Windows
From: Johannes Schindelin via GitGitGadget @ 2026-06-16 14:49 UTC (permalink / raw)
  To: git; +Cc: Philip Oakley, Patrick Steinhardt, Johannes Schindelin
In-Reply-To: <pull.2138.git.1780593313.gitgitgadget@gmail.com>

Philip Oakley has contributed these patches ~4.5 years ago, and they have
been carried in Git for Windows ever since.

Now that there are already other patch series flying around that try to
address various aspects about >4GB objects (which aren't handled well by Git
until it stops forcing unsigned long to do size_t's job), it seems a good
time to upstream these patches, too, at long last.

Changes since v1:

 * Rebased to current master to resolve the conflicts with
   ps/odb-source-loose
 * Dropped the !LONG_IS_64BIT prereq from the added/touched tests, as it is
   now no longer needed

Philip Oakley (6):
  hash-object: demonstrate a >4GB/LLP64 problem
  object-file.c: use size_t for header lengths
  hash algorithms: use size_t for section lengths
  hash-object --stdin: verify that it works with >4GB/LLP64
  hash-object: add another >4GB/LLP64 test case
  hash-object: add a >4GB/LLP64 test case using filtered input

 object-file.c          | 14 +++++++-------
 object-file.h          |  6 +++---
 odb/source-files.c     |  2 +-
 odb/source-inmemory.c  |  2 +-
 odb/source-loose.c     |  4 ++--
 odb/source.h           |  2 +-
 sha1dc_git.c           |  3 +--
 sha1dc_git.h           |  2 +-
 t/t1007-hash-object.sh | 39 +++++++++++++++++++++++++++++++++++++++
 9 files changed, 56 insertions(+), 18 deletions(-)


base-commit: 700432b2ba22603a0bcb71475c9c333d17c9b0d1
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2138%2Fdscho%2FPhilipOakley%2Fhashliteral_t-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2138/dscho/PhilipOakley/hashliteral_t-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/2138

Range-diff vs v1:

 1:  84e1cd0aa0 = 1:  9c01bac407 hash-object: demonstrate a >4GB/LLP64 problem
 2:  809d83e46f ! 2:  aa5859c14f object-file.c: use size_t for header lengths
     @@ Commit message
          Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
      
       ## object-file.c ##
     -@@ object-file.c: int odb_source_loose_read_object_info(struct odb_source *source,
     +@@ object-file.c: int parse_loose_header(const char *hdr, struct object_info *oi)
       static void hash_object_body(const struct git_hash_algo *algo, struct git_hash_ctx *c,
       			     const void *buf, unsigned long len,
       			     struct object_id *oid,
     @@ object-file.c: int odb_source_loose_read_object_info(struct odb_source *source,
      @@ object-file.c: static void hash_object_body(const struct git_hash_algo *algo, struct git_hash_c
       }
       
     - static void write_object_file_prepare(const struct git_hash_algo *algo,
     --				      const void *buf, unsigned long len,
     -+				      const void *buf, size_t len,
     - 				      enum object_type type, struct object_id *oid,
     --				      char *hdr, int *hdrlen)
     -+				      char *hdr, size_t *hdrlen)
     + void write_object_file_prepare(const struct git_hash_algo *algo,
     +-			       const void *buf, unsigned long len,
     ++			       const void *buf, size_t len,
     + 			       enum object_type type, struct object_id *oid,
     +-			       char *hdr, int *hdrlen)
     ++			       char *hdr, size_t *hdrlen)
       {
       	struct git_hash_ctx c;
       
     @@ object-file.c: out:
       
       	write_object_file_prepare(algo, buf, len, type, oid, hdr, &hdrlen);
       }
     -@@ object-file.c: cleanup:
     +
     + ## object-file.h ##
     +@@ object-file.h: int finalize_object_file_flags(struct repository *repo,
     + 			       enum finalize_object_file_flags flags);
     + 
     + void hash_object_file(const struct git_hash_algo *algo, const void *buf,
     +-		      unsigned long len, enum object_type type,
     ++		      size_t len, enum object_type type,
     + 		      struct object_id *oid);
     + void write_object_file_prepare(const struct git_hash_algo *algo,
     +-			       const void *buf, unsigned long len,
     ++			       const void *buf, size_t len,
     + 			       enum object_type type, struct object_id *oid,
     +-			       char *hdr, int *hdrlen);
     ++			       char *hdr, size_t *hdrlen);
     + int write_loose_object(struct odb_source_loose *loose,
     + 		       const struct object_id *oid, char *hdr,
     + 		       int hdrlen, const void *buf, unsigned long len,
     +
     + ## odb/source-files.c ##
     +@@ odb/source-files.c: static int odb_source_files_freshen_object(struct odb_source *source,
     + }
     + 
     + static int odb_source_files_write_object(struct odb_source *source,
     +-					 const void *buf, unsigned long len,
     ++					 const void *buf, size_t len,
     + 					 enum object_type type,
     + 					 struct object_id *oid,
     + 					 struct object_id *compat_oid,
     +
     + ## odb/source-inmemory.c ##
     +@@ odb/source-inmemory.c: static int odb_source_inmemory_count_objects(struct odb_source *source,
       }
       
     - int odb_source_loose_write_object(struct odb_source *source,
     --				  const void *buf, unsigned long len,
     -+				  const void *buf, size_t len,
     - 				  enum object_type type, struct object_id *oid,
     - 				  struct object_id *compat_oid_in,
     - 				  enum odb_write_object_flags flags)
     -@@ object-file.c: int odb_source_loose_write_object(struct odb_source *source,
     + static int odb_source_inmemory_write_object(struct odb_source *source,
     +-					    const void *buf, unsigned long len,
     ++					    const void *buf, size_t len,
     + 					    enum object_type type,
     + 					    struct object_id *oid,
     + 					    struct object_id *compat_oid UNUSED,
     +
     + ## odb/source-loose.c ##
     +@@ odb/source-loose.c: static int odb_source_loose_freshen_object(struct odb_source *source,
     + }
     + 
     + static int odb_source_loose_write_object(struct odb_source *source,
     +-					 const void *buf, unsigned long len,
     ++					 const void *buf, size_t len,
     + 					 enum object_type type, struct object_id *oid,
     + 					 struct object_id *compat_oid_in,
     + 					 enum odb_write_object_flags flags)
     +@@ odb/source-loose.c: static int odb_source_loose_write_object(struct odb_source *source,
       	const struct git_hash_algo *compat = source->odb->repo->compat_hash_algo;
       	struct object_id compat_oid;
       	char hdr[MAX_HEADER_LEN];
     @@ object-file.c: int odb_source_loose_write_object(struct odb_source *source,
       	/* Generate compat_oid */
       	if (compat) {
      
     - ## object-file.h ##
     -@@ object-file.h: int odb_source_loose_freshen_object(struct odb_source *source,
     - 				    const struct object_id *oid);
     - 
     - int odb_source_loose_write_object(struct odb_source *source,
     --				  const void *buf, unsigned long len,
     -+				  const void *buf, size_t len,
     - 				  enum object_type type, struct object_id *oid,
     - 				  struct object_id *compat_oid_in,
     - 				  enum odb_write_object_flags flags);
     -@@ object-file.h: int finalize_object_file_flags(struct repository *repo,
     - 			       enum finalize_object_file_flags flags);
     - 
     - void hash_object_file(const struct git_hash_algo *algo, const void *buf,
     --		      unsigned long len, enum object_type type,
     -+		      size_t len, enum object_type type,
     - 		      struct object_id *oid);
     - 
     - /* Helper to check and "touch" a file */
     + ## odb/source.h ##
     +@@ odb/source.h: struct odb_source {
     + 	 * return 0 on success, a negative error code otherwise.
     + 	 */
     + 	int (*write_object)(struct odb_source *source,
     +-			    const void *buf, unsigned long len,
     ++			    const void *buf, size_t len,
     + 			    enum object_type type,
     + 			    struct object_id *oid,
     + 			    struct object_id *compat_oid,
 3:  253d6f8004 ! 3:  b401eb490f hash algorithms: use size_t for section lengths
     @@ Commit message
          Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
      
       ## object-file.c ##
     -@@ object-file.c: int odb_source_loose_read_object_info(struct odb_source *source,
     +@@ object-file.c: int parse_loose_header(const char *hdr, struct object_info *oi)
       }
       
       static void hash_object_body(const struct git_hash_algo *algo, struct git_hash_ctx *c,
     @@ object-file.c: int odb_source_loose_read_object_info(struct odb_source *source,
       			     struct object_id *oid,
       			     char *hdr, size_t *hdrlen)
       {
     -@@ object-file.c: static void write_object_file_prepare(const struct git_hash_algo *algo,
     +@@ object-file.c: void write_object_file_prepare(const struct git_hash_algo *algo,
       	/* Generate the header */
       	*hdrlen = format_object_header(hdr, *hdrlen, type, len);
       
     @@ t/t1007-hash-object.sh: test_expect_success '--stdin outside of repository (uses
       '
       
      -test_expect_failure EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
     -+test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
     ++test_expect_success EXPENSIVE,SIZE_T_IS_64BIT \
       		'files over 4GB hash literally' '
       	test-tool genzeros $((5*1024*1024*1024)) >big &&
       	test_oid large5GB >expect &&
 4:  ba629a3f03 ! 4:  411727336a hash-object --stdin: verify that it works with >4GB/LLP64
     @@ Commit message
          Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
      
       ## t/t1007-hash-object.sh ##
     -@@ t/t1007-hash-object.sh: test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
     +@@ t/t1007-hash-object.sh: test_expect_success EXPENSIVE,SIZE_T_IS_64BIT \
       	test_cmp expect actual
       '
       
     -+test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
     ++test_expect_success EXPENSIVE,SIZE_T_IS_64BIT \
      +		'files over 4GB hash correctly via --stdin' '
      +	{ test -f big || test-tool genzeros $((5*1024*1024*1024)) >big; } &&
      +	test_oid large5GB >expect &&
 5:  f48d570bba ! 5:  e6bb4e6228 hash-object: add another >4GB/LLP64 test case
     @@ Commit message
          Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
      
       ## t/t1007-hash-object.sh ##
     -@@ t/t1007-hash-object.sh: test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
     +@@ t/t1007-hash-object.sh: test_expect_success EXPENSIVE,SIZE_T_IS_64BIT \
       	test_cmp expect actual
       '
       
     -+test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
     ++test_expect_success EXPENSIVE,SIZE_T_IS_64BIT \
      +		'files over 4GB hash correctly' '
      +	{ test -f big || test-tool genzeros $((5*1024*1024*1024)) >big; } &&
      +	test_oid large5GB >expect &&
 6:  8a6beeb16d ! 6:  568807ac34 hash-object: add a >4GB/LLP64 test case using filtered input
     @@ Commit message
          Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
      
       ## t/t1007-hash-object.sh ##
     -@@ t/t1007-hash-object.sh: test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
     +@@ t/t1007-hash-object.sh: test_expect_success EXPENSIVE,SIZE_T_IS_64BIT \
       	test_cmp expect actual
       '
       
      +# This clean filter does nothing, other than excercising the interface.
      +# We ensure that cleaning doesn't mangle large files on 64-bit Windows.
     -+test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
     ++test_expect_success EXPENSIVE,SIZE_T_IS_64BIT \
      +		'hash filtered files over 4GB correctly' '
      +	{ test -f big || test-tool genzeros $((5*1024*1024*1024)) >big; } &&
      +	test_oid large5GB >expect &&

-- 
gitgitgadget

^ permalink raw reply

* Re: [PATCH 3/6] hash algorithms: use size_t for section lengths
From: Johannes Schindelin @ 2026-06-16 14:48 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Philip Oakley via GitGitGadget, git, Philip Oakley
In-Reply-To: <ai-5VmawU2MRiAHQ@pks.im>

Hi Patrick,

On Tue, 16 Jun 2026, Patrick Steinhardt wrote:

> On Thu, Jun 04, 2026 at 05:15:09PM +0000, Philip Oakley via GitGitGadget wrote:
> > diff --git a/object-file.c b/object-file.c
> > index 1f5f9daf24..c648cecd80 100644
> > --- a/object-file.c
> > +++ b/object-file.c
> > @@ -581,7 +581,7 @@ static void write_object_file_prepare(const struct git_hash_algo *algo,
> >  	/* Generate the header */
> >  	*hdrlen = format_object_header(hdr, *hdrlen, type, len);
> >  
> > -	/* Sha1.. */
> > +	/* Hash (function pointers) computation */
> >  	hash_object_body(algo, &c, buf, len, oid, hdr, hdrlen);
> >  }
> >  
> 
> Thanks for updating this comment while at it :)

It wasn't my idea, it was Claude Opus'. I would have left it as-is, but
then decided that it's actually a good change and not worth splitting out
into a separate commit.

> > diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh
> > index 7867fd1dbf..10382a815e 100755
> > --- a/t/t1007-hash-object.sh
> > +++ b/t/t1007-hash-object.sh
> > @@ -261,7 +261,7 @@ test_expect_success '--stdin outside of repository (uses default hash)' '
> >  	test_cmp expect actual
> >  '
> >  
> > -test_expect_failure EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
> > +test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
> >  		'files over 4GB hash literally' '
> >  	test-tool genzeros $((5*1024*1024*1024)) >big &&
> >  	test_oid large5GB >expect &&
> 
> Previously we required `!LONG_IS_64BIT`, because the test would have
> succeeded on platforms where it is 64 bit wide. But now that this test
> works on all platforms I rather wonder whether we should completely drop
> that prerequisite here, as we expect it to pass regardless of whether or
> not long is 64 bit now.

Good point!

Thank you for the review,
Johannes

^ permalink raw reply

* Re: [PATCH 5/6] hash-object: add another >4GB/LLP64 test case
From: Johannes Schindelin @ 2026-06-16 14:48 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Philip Oakley via GitGitGadget, git, Philip Oakley
In-Reply-To: <ai-5ZvDgc4smGfGc@pks.im>

Hi Patrick,

On Tue, 16 Jun 2026, Patrick Steinhardt wrote:

> On Thu, Jun 04, 2026 at 05:15:11PM +0000, Philip Oakley via GitGitGadget wrote:
> > diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh
> > index 59efee3aff..f2722380ee 100755
> > --- a/t/t1007-hash-object.sh
> > +++ b/t/t1007-hash-object.sh
> > @@ -277,4 +277,12 @@ test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
> >  	test_cmp expect actual
> >  '
> >  
> > +test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
> > +		'files over 4GB hash correctly' '
> > +	{ test -f big || test-tool genzeros $((5*1024*1024*1024)) >big; } &&
> > +	test_oid large5GB >expect &&
> > +	git hash-object -- big >actual &&
> > +	test_cmp expect actual
> > +'
> 
> Same comment here.

[Comment was the suggestion to drop the !LONG_IS_64BIT prerequisite]

Same comment here. [My reply: Good point!]

> Nit: I feel like we could've easily introduced all of these tests in the
> first commit.

Sure, but I actually liked the structuring by Philip when I accepted the
patches into Git for Windows, and I still do: The commits all have
slightly different concerns, and I love that the cognitive load is
lightened somewhat by keeping those concerns in separate patches with
separate commit messages.

Ciao,
Johannes

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox