* [PATCH v2 6/6] hash-object: add a >4GB/LLP64 test case using filtered input
From: Philip Oakley via GitGitGadget @ 2026-06-16 14:49 UTC (permalink / raw)
To: git; +Cc: Philip Oakley, Patrick Steinhardt, Johannes Schindelin,
Philip Oakley
In-Reply-To: <pull.2138.v2.git.1781621398.gitgitgadget@gmail.com>
From: Philip Oakley <philipoakley@iee.email>
To verify that the `clean` side of the `clean`/`smudge` filter code is
correct with regards to LLP64 (read: to ensure that `size_t` is used
instead of `unsigned long`), here is a test case using a trivial filter,
specifically _not_ writing anything to the object store to limit the
scope of the test case.
As in previous commits, the `big` file from previous test cases is
reused if available, to save setup time, otherwise re-generated.
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
t/t1007-hash-object.sh | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh
index f96c29ce68..4bc82dd968 100755
--- a/t/t1007-hash-object.sh
+++ b/t/t1007-hash-object.sh
@@ -285,4 +285,16 @@ test_expect_success EXPENSIVE,SIZE_T_IS_64BIT \
test_cmp expect actual
'
+# This clean filter does nothing, other than excercising the interface.
+# We ensure that cleaning doesn't mangle large files on 64-bit Windows.
+test_expect_success EXPENSIVE,SIZE_T_IS_64BIT \
+ 'hash filtered files over 4GB correctly' '
+ { test -f big || test-tool genzeros $((5*1024*1024*1024)) >big; } &&
+ test_oid large5GB >expect &&
+ test_config filter.null-filter.clean "cat" &&
+ echo "big filter=null-filter" >.gitattributes &&
+ git hash-object -- big >actual &&
+ test_cmp expect actual
+'
+
test_done
--
gitgitgadget
^ permalink raw reply related
* [PATCH v2 5/6] hash-object: add another >4GB/LLP64 test case
From: Philip Oakley via GitGitGadget @ 2026-06-16 14:49 UTC (permalink / raw)
To: git; +Cc: Philip Oakley, Patrick Steinhardt, Johannes Schindelin,
Philip Oakley
In-Reply-To: <pull.2138.v2.git.1781621398.gitgitgadget@gmail.com>
From: Philip Oakley <philipoakley@iee.email>
To complement the `--stdin` and `--literally` test cases that verify
that we can hash files larger than 4GB on 64-bit platforms using the
LLP64 data model, here is a test case that exercises `hash-object`
_without_ any options.
Just as before, we use the `big` file from the previous test case if it
exists to save on setup time, otherwise generate it.
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
t/t1007-hash-object.sh | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh
index bcae3fc54c..f96c29ce68 100755
--- a/t/t1007-hash-object.sh
+++ b/t/t1007-hash-object.sh
@@ -277,4 +277,12 @@ test_expect_success EXPENSIVE,SIZE_T_IS_64BIT \
test_cmp expect actual
'
+test_expect_success EXPENSIVE,SIZE_T_IS_64BIT \
+ 'files over 4GB hash correctly' '
+ { test -f big || test-tool genzeros $((5*1024*1024*1024)) >big; } &&
+ test_oid large5GB >expect &&
+ git hash-object -- big >actual &&
+ test_cmp expect actual
+'
+
test_done
--
gitgitgadget
^ permalink raw reply related
* [PATCH v2 4/6] hash-object --stdin: verify that it works with >4GB/LLP64
From: Philip Oakley via GitGitGadget @ 2026-06-16 14:49 UTC (permalink / raw)
To: git; +Cc: Philip Oakley, Patrick Steinhardt, Johannes Schindelin,
Philip Oakley
In-Reply-To: <pull.2138.v2.git.1781621398.gitgitgadget@gmail.com>
From: Philip Oakley <philipoakley@iee.email>
Just like the `hash-object --literally` code path, the `--stdin` code
path also needs to use `size_t` instead of `unsigned long` to represent
memory sizes, otherwise it would cause problems on platforms using the
LLP64 data model (such as Windows).
To limit the scope of the test case, the object is explicitly not
written to the object store, nor are any filters applied.
The `big` file from the previous test case is reused to save setup time;
To avoid relying on that side effect, it is generated if it does not
exist (e.g. when running via `sh t1007-*.sh --long --run=1,41`).
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
t/t1007-hash-object.sh | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh
index f028a1cbcc..bcae3fc54c 100755
--- a/t/t1007-hash-object.sh
+++ b/t/t1007-hash-object.sh
@@ -269,4 +269,12 @@ test_expect_success EXPENSIVE,SIZE_T_IS_64BIT \
test_cmp expect actual
'
+test_expect_success EXPENSIVE,SIZE_T_IS_64BIT \
+ 'files over 4GB hash correctly via --stdin' '
+ { test -f big || test-tool genzeros $((5*1024*1024*1024)) >big; } &&
+ test_oid large5GB >expect &&
+ git hash-object --stdin <big >actual &&
+ test_cmp expect actual
+'
+
test_done
--
gitgitgadget
^ permalink raw reply related
* [PATCH v2 3/6] hash algorithms: use size_t for section lengths
From: Philip Oakley via GitGitGadget @ 2026-06-16 14:49 UTC (permalink / raw)
To: git; +Cc: Philip Oakley, Patrick Steinhardt, Johannes Schindelin,
Philip Oakley
In-Reply-To: <pull.2138.v2.git.1781621398.gitgitgadget@gmail.com>
From: Philip Oakley <philipoakley@iee.email>
Continue walking the code path for the >4GB `hash-object --literally`
test to the hash algorithm step for LLP64 systems.
This patch lets the SHA1DC code use `size_t`, making it compatible with
LLP64 data models (as used e.g. by Windows).
The interested reader of this patch will note that we adjust the
signature of the `git_SHA1DCUpdate()` function without updating _any_
call site. This certainly puzzled at least one reviewer already, so here
is an explanation:
This function is never called directly, but always via the macro
`platform_SHA1_Update`, which is usually called via the macro
`git_SHA1_Update`. However, we never call `git_SHA1_Update()` directly
in `struct git_hash_algo`. Instead, we call `git_hash_sha1_update()`,
which is defined thusly:
static void git_hash_sha1_update(git_hash_ctx *ctx,
const void *data, size_t len)
{
git_SHA1_Update(&ctx->sha1, data, len);
}
i.e. it contains an implicit downcast from `size_t` to `unsigned long`
(before this here patch). With this patch, there is no downcast anymore.
With this patch, finally, the t1007-hash-object.sh "files over 4GB hash
literally" test case is fixed.
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
object-file.c | 4 ++--
sha1dc_git.c | 3 +--
sha1dc_git.h | 2 +-
t/t1007-hash-object.sh | 2 +-
4 files changed, 5 insertions(+), 6 deletions(-)
diff --git a/object-file.c b/object-file.c
index dccbe0fc3e..0056c369ce 100644
--- a/object-file.c
+++ b/object-file.c
@@ -316,7 +316,7 @@ int parse_loose_header(const char *hdr, struct object_info *oi)
}
static void hash_object_body(const struct git_hash_algo *algo, struct git_hash_ctx *c,
- const void *buf, unsigned long len,
+ const void *buf, size_t len,
struct object_id *oid,
char *hdr, size_t *hdrlen)
{
@@ -336,7 +336,7 @@ void write_object_file_prepare(const struct git_hash_algo *algo,
/* Generate the header */
*hdrlen = format_object_header(hdr, *hdrlen, type, len);
- /* Sha1.. */
+ /* Hash (function pointers) computation */
hash_object_body(algo, &c, buf, len, oid, hdr, hdrlen);
}
diff --git a/sha1dc_git.c b/sha1dc_git.c
index 9b675a046e..fe58d7962a 100644
--- a/sha1dc_git.c
+++ b/sha1dc_git.c
@@ -27,10 +27,9 @@ void git_SHA1DCFinal(unsigned char hash[20], SHA1_CTX *ctx)
/*
* Same as SHA1DCUpdate, but adjust types to match git's usual interface.
*/
-void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *vdata, unsigned long len)
+void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *vdata, size_t len)
{
const char *data = vdata;
- /* We expect an unsigned long, but sha1dc only takes an int */
while (len > INT_MAX) {
SHA1DCUpdate(ctx, data, INT_MAX);
data += INT_MAX;
diff --git a/sha1dc_git.h b/sha1dc_git.h
index f6f880cabe..0bcf1aa84b 100644
--- a/sha1dc_git.h
+++ b/sha1dc_git.h
@@ -15,7 +15,7 @@ void git_SHA1DCInit(SHA1_CTX *);
#endif
void git_SHA1DCFinal(unsigned char [20], SHA1_CTX *);
-void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *data, unsigned long len);
+void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *data, size_t len);
#define platform_SHA_IS_SHA1DC /* used by "test-tool sha1-is-sha1dc" */
diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh
index 7867fd1dbf..f028a1cbcc 100755
--- a/t/t1007-hash-object.sh
+++ b/t/t1007-hash-object.sh
@@ -261,7 +261,7 @@ test_expect_success '--stdin outside of repository (uses default hash)' '
test_cmp expect actual
'
-test_expect_failure EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
+test_expect_success EXPENSIVE,SIZE_T_IS_64BIT \
'files over 4GB hash literally' '
test-tool genzeros $((5*1024*1024*1024)) >big &&
test_oid large5GB >expect &&
--
gitgitgadget
^ permalink raw reply related
* [PATCH v2 2/6] object-file.c: use size_t for header lengths
From: Philip Oakley via GitGitGadget @ 2026-06-16 14:49 UTC (permalink / raw)
To: git; +Cc: Philip Oakley, Patrick Steinhardt, Johannes Schindelin,
Philip Oakley
In-Reply-To: <pull.2138.v2.git.1781621398.gitgitgadget@gmail.com>
From: Philip Oakley <philipoakley@iee.email>
Continue walking the code path for the >4GB `hash-object --literally`
test. The `hash_object_file_literally()` function internally uses both
`hash_object_file()` and `write_object_file_prepare()`. Both function
signatures use `unsigned long` rather than `size_t` for the mem buffer
sizes. Use `size_t` instead, for LLP64 compatibility.
While at it, convert those function's object's header buffer length to
`size_t` for consistency. The value is already upcast to `uintmax_t` for
print format compatibility.
Note: The hash-object test still does not pass. A subsequent commit
continues to walk the call tree's lower level hash functions to identify
further fixes.
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
object-file.c | 10 +++++-----
object-file.h | 6 +++---
odb/source-files.c | 2 +-
odb/source-inmemory.c | 2 +-
odb/source-loose.c | 4 ++--
odb/source.h | 2 +-
6 files changed, 13 insertions(+), 13 deletions(-)
diff --git a/object-file.c b/object-file.c
index 9afa842da2..dccbe0fc3e 100644
--- a/object-file.c
+++ b/object-file.c
@@ -318,7 +318,7 @@ int parse_loose_header(const char *hdr, struct object_info *oi)
static void hash_object_body(const struct git_hash_algo *algo, struct git_hash_ctx *c,
const void *buf, unsigned long len,
struct object_id *oid,
- char *hdr, int *hdrlen)
+ char *hdr, size_t *hdrlen)
{
algo->init_fn(c);
git_hash_update(c, hdr, *hdrlen);
@@ -327,9 +327,9 @@ static void hash_object_body(const struct git_hash_algo *algo, struct git_hash_c
}
void write_object_file_prepare(const struct git_hash_algo *algo,
- const void *buf, unsigned long len,
+ const void *buf, size_t len,
enum object_type type, struct object_id *oid,
- char *hdr, int *hdrlen)
+ char *hdr, size_t *hdrlen)
{
struct git_hash_ctx c;
@@ -472,11 +472,11 @@ out:
}
void hash_object_file(const struct git_hash_algo *algo, const void *buf,
- unsigned long len, enum object_type type,
+ size_t len, enum object_type type,
struct object_id *oid)
{
char hdr[MAX_HEADER_LEN];
- int hdrlen = sizeof(hdr);
+ size_t hdrlen = sizeof(hdr);
write_object_file_prepare(algo, buf, len, type, oid, hdr, &hdrlen);
}
diff --git a/object-file.h b/object-file.h
index 528c4e6e69..4c87cd160b 100644
--- a/object-file.h
+++ b/object-file.h
@@ -131,12 +131,12 @@ int finalize_object_file_flags(struct repository *repo,
enum finalize_object_file_flags flags);
void hash_object_file(const struct git_hash_algo *algo, const void *buf,
- unsigned long len, enum object_type type,
+ size_t len, enum object_type type,
struct object_id *oid);
void write_object_file_prepare(const struct git_hash_algo *algo,
- const void *buf, unsigned long len,
+ const void *buf, size_t len,
enum object_type type, struct object_id *oid,
- char *hdr, int *hdrlen);
+ char *hdr, size_t *hdrlen);
int write_loose_object(struct odb_source_loose *loose,
const struct object_id *oid, char *hdr,
int hdrlen, const void *buf, unsigned long len,
diff --git a/odb/source-files.c b/odb/source-files.c
index 5bdd042922..3b1261eba9 100644
--- a/odb/source-files.c
+++ b/odb/source-files.c
@@ -159,7 +159,7 @@ static int odb_source_files_freshen_object(struct odb_source *source,
}
static int odb_source_files_write_object(struct odb_source *source,
- const void *buf, unsigned long len,
+ const void *buf, size_t len,
enum object_type type,
struct object_id *oid,
struct object_id *compat_oid,
diff --git a/odb/source-inmemory.c b/odb/source-inmemory.c
index e004566d76..7f1b6f4636 100644
--- a/odb/source-inmemory.c
+++ b/odb/source-inmemory.c
@@ -227,7 +227,7 @@ static int odb_source_inmemory_count_objects(struct odb_source *source,
}
static int odb_source_inmemory_write_object(struct odb_source *source,
- const void *buf, unsigned long len,
+ const void *buf, size_t len,
enum object_type type,
struct object_id *oid,
struct object_id *compat_oid UNUSED,
diff --git a/odb/source-loose.c b/odb/source-loose.c
index 7d7ea2fb84..4cfa5b23fc 100644
--- a/odb/source-loose.c
+++ b/odb/source-loose.c
@@ -591,7 +591,7 @@ static int odb_source_loose_freshen_object(struct odb_source *source,
}
static int odb_source_loose_write_object(struct odb_source *source,
- const void *buf, unsigned long len,
+ const void *buf, size_t len,
enum object_type type, struct object_id *oid,
struct object_id *compat_oid_in,
enum odb_write_object_flags flags)
@@ -601,7 +601,7 @@ static int odb_source_loose_write_object(struct odb_source *source,
const struct git_hash_algo *compat = source->odb->repo->compat_hash_algo;
struct object_id compat_oid;
char hdr[MAX_HEADER_LEN];
- int hdrlen = sizeof(hdr);
+ size_t hdrlen = sizeof(hdr);
/* Generate compat_oid */
if (compat) {
diff --git a/odb/source.h b/odb/source.h
index 2192a101b8..1c65a05e2c 100644
--- a/odb/source.h
+++ b/odb/source.h
@@ -199,7 +199,7 @@ struct odb_source {
* return 0 on success, a negative error code otherwise.
*/
int (*write_object)(struct odb_source *source,
- const void *buf, unsigned long len,
+ const void *buf, size_t len,
enum object_type type,
struct object_id *oid,
struct object_id *compat_oid,
--
gitgitgadget
^ permalink raw reply related
* [PATCH v2 1/6] hash-object: demonstrate a >4GB/LLP64 problem
From: Philip Oakley via GitGitGadget @ 2026-06-16 14:49 UTC (permalink / raw)
To: git; +Cc: Philip Oakley, Patrick Steinhardt, Johannes Schindelin,
Philip Oakley
In-Reply-To: <pull.2138.v2.git.1781621398.gitgitgadget@gmail.com>
From: Philip Oakley <philipoakley@iee.email>
On LLP64 systems, such as Windows, the size of `long`, `int`, etc. is
only 32 bits (for backward compatibility). Git's use of `unsigned long`
for file memory sizes in many places, rather than size_t, limits the
handling of large files on LLP64 systems (commonly given as `>4GB`).
Provide a minimum test for handling a >4GB file. The `hash-object`
command, with the `--literally` and without `-w` option avoids
writing the object, either loose or packed. This avoids the code paths
hitting the `bigFileThreshold` config test code, the zlib code, and the
pack code.
Subsequent patches will walk the test's call chain, converting types to
`size_t` (which is larger in LLP64 data models) where appropriate.
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
t/t1007-hash-object.sh | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh
index de076293b6..7867fd1dbf 100755
--- a/t/t1007-hash-object.sh
+++ b/t/t1007-hash-object.sh
@@ -49,6 +49,9 @@ test_expect_success 'setup' '
example sha1:ddd3f836d3e3fbb7ae289aa9ae83536f76956399
example sha256:b44fe1fe65589848253737db859bd490453510719d7424daab03daf0767b85ae
+
+ large5GB sha1:0be2be10a4c8764f32c4bf372a98edc731a4b204
+ large5GB sha256:dc18ca621300c8d3cfa505a275641ebab00de189859e022a975056882d313e64
EOF
'
@@ -258,4 +261,12 @@ test_expect_success '--stdin outside of repository (uses default hash)' '
test_cmp expect actual
'
+test_expect_failure EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
+ 'files over 4GB hash literally' '
+ test-tool genzeros $((5*1024*1024*1024)) >big &&
+ test_oid large5GB >expect &&
+ git hash-object --stdin --literally <big >actual &&
+ test_cmp expect actual
+'
+
test_done
--
gitgitgadget
^ permalink raw reply related
* [PATCH v2 0/6] Support hashing objects larger than 4GB on Windows
From: Johannes Schindelin via GitGitGadget @ 2026-06-16 14:49 UTC (permalink / raw)
To: git; +Cc: Philip Oakley, Patrick Steinhardt, Johannes Schindelin
In-Reply-To: <pull.2138.git.1780593313.gitgitgadget@gmail.com>
Philip Oakley has contributed these patches ~4.5 years ago, and they have
been carried in Git for Windows ever since.
Now that there are already other patch series flying around that try to
address various aspects about >4GB objects (which aren't handled well by Git
until it stops forcing unsigned long to do size_t's job), it seems a good
time to upstream these patches, too, at long last.
Changes since v1:
* Rebased to current master to resolve the conflicts with
ps/odb-source-loose
* Dropped the !LONG_IS_64BIT prereq from the added/touched tests, as it is
now no longer needed
Philip Oakley (6):
hash-object: demonstrate a >4GB/LLP64 problem
object-file.c: use size_t for header lengths
hash algorithms: use size_t for section lengths
hash-object --stdin: verify that it works with >4GB/LLP64
hash-object: add another >4GB/LLP64 test case
hash-object: add a >4GB/LLP64 test case using filtered input
object-file.c | 14 +++++++-------
object-file.h | 6 +++---
odb/source-files.c | 2 +-
odb/source-inmemory.c | 2 +-
odb/source-loose.c | 4 ++--
odb/source.h | 2 +-
sha1dc_git.c | 3 +--
sha1dc_git.h | 2 +-
t/t1007-hash-object.sh | 39 +++++++++++++++++++++++++++++++++++++++
9 files changed, 56 insertions(+), 18 deletions(-)
base-commit: 700432b2ba22603a0bcb71475c9c333d17c9b0d1
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2138%2Fdscho%2FPhilipOakley%2Fhashliteral_t-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2138/dscho/PhilipOakley/hashliteral_t-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/2138
Range-diff vs v1:
1: 84e1cd0aa0 = 1: 9c01bac407 hash-object: demonstrate a >4GB/LLP64 problem
2: 809d83e46f ! 2: aa5859c14f object-file.c: use size_t for header lengths
@@ Commit message
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
## object-file.c ##
-@@ object-file.c: int odb_source_loose_read_object_info(struct odb_source *source,
+@@ object-file.c: int parse_loose_header(const char *hdr, struct object_info *oi)
static void hash_object_body(const struct git_hash_algo *algo, struct git_hash_ctx *c,
const void *buf, unsigned long len,
struct object_id *oid,
@@ object-file.c: int odb_source_loose_read_object_info(struct odb_source *source,
@@ object-file.c: static void hash_object_body(const struct git_hash_algo *algo, struct git_hash_c
}
- static void write_object_file_prepare(const struct git_hash_algo *algo,
-- const void *buf, unsigned long len,
-+ const void *buf, size_t len,
- enum object_type type, struct object_id *oid,
-- char *hdr, int *hdrlen)
-+ char *hdr, size_t *hdrlen)
+ void write_object_file_prepare(const struct git_hash_algo *algo,
+- const void *buf, unsigned long len,
++ const void *buf, size_t len,
+ enum object_type type, struct object_id *oid,
+- char *hdr, int *hdrlen)
++ char *hdr, size_t *hdrlen)
{
struct git_hash_ctx c;
@@ object-file.c: out:
write_object_file_prepare(algo, buf, len, type, oid, hdr, &hdrlen);
}
-@@ object-file.c: cleanup:
+
+ ## object-file.h ##
+@@ object-file.h: int finalize_object_file_flags(struct repository *repo,
+ enum finalize_object_file_flags flags);
+
+ void hash_object_file(const struct git_hash_algo *algo, const void *buf,
+- unsigned long len, enum object_type type,
++ size_t len, enum object_type type,
+ struct object_id *oid);
+ void write_object_file_prepare(const struct git_hash_algo *algo,
+- const void *buf, unsigned long len,
++ const void *buf, size_t len,
+ enum object_type type, struct object_id *oid,
+- char *hdr, int *hdrlen);
++ char *hdr, size_t *hdrlen);
+ int write_loose_object(struct odb_source_loose *loose,
+ const struct object_id *oid, char *hdr,
+ int hdrlen, const void *buf, unsigned long len,
+
+ ## odb/source-files.c ##
+@@ odb/source-files.c: static int odb_source_files_freshen_object(struct odb_source *source,
+ }
+
+ static int odb_source_files_write_object(struct odb_source *source,
+- const void *buf, unsigned long len,
++ const void *buf, size_t len,
+ enum object_type type,
+ struct object_id *oid,
+ struct object_id *compat_oid,
+
+ ## odb/source-inmemory.c ##
+@@ odb/source-inmemory.c: static int odb_source_inmemory_count_objects(struct odb_source *source,
}
- int odb_source_loose_write_object(struct odb_source *source,
-- const void *buf, unsigned long len,
-+ const void *buf, size_t len,
- enum object_type type, struct object_id *oid,
- struct object_id *compat_oid_in,
- enum odb_write_object_flags flags)
-@@ object-file.c: int odb_source_loose_write_object(struct odb_source *source,
+ static int odb_source_inmemory_write_object(struct odb_source *source,
+- const void *buf, unsigned long len,
++ const void *buf, size_t len,
+ enum object_type type,
+ struct object_id *oid,
+ struct object_id *compat_oid UNUSED,
+
+ ## odb/source-loose.c ##
+@@ odb/source-loose.c: static int odb_source_loose_freshen_object(struct odb_source *source,
+ }
+
+ static int odb_source_loose_write_object(struct odb_source *source,
+- const void *buf, unsigned long len,
++ const void *buf, size_t len,
+ enum object_type type, struct object_id *oid,
+ struct object_id *compat_oid_in,
+ enum odb_write_object_flags flags)
+@@ odb/source-loose.c: static int odb_source_loose_write_object(struct odb_source *source,
const struct git_hash_algo *compat = source->odb->repo->compat_hash_algo;
struct object_id compat_oid;
char hdr[MAX_HEADER_LEN];
@@ object-file.c: int odb_source_loose_write_object(struct odb_source *source,
/* Generate compat_oid */
if (compat) {
- ## object-file.h ##
-@@ object-file.h: int odb_source_loose_freshen_object(struct odb_source *source,
- const struct object_id *oid);
-
- int odb_source_loose_write_object(struct odb_source *source,
-- const void *buf, unsigned long len,
-+ const void *buf, size_t len,
- enum object_type type, struct object_id *oid,
- struct object_id *compat_oid_in,
- enum odb_write_object_flags flags);
-@@ object-file.h: int finalize_object_file_flags(struct repository *repo,
- enum finalize_object_file_flags flags);
-
- void hash_object_file(const struct git_hash_algo *algo, const void *buf,
-- unsigned long len, enum object_type type,
-+ size_t len, enum object_type type,
- struct object_id *oid);
-
- /* Helper to check and "touch" a file */
+ ## odb/source.h ##
+@@ odb/source.h: struct odb_source {
+ * return 0 on success, a negative error code otherwise.
+ */
+ int (*write_object)(struct odb_source *source,
+- const void *buf, unsigned long len,
++ const void *buf, size_t len,
+ enum object_type type,
+ struct object_id *oid,
+ struct object_id *compat_oid,
3: 253d6f8004 ! 3: b401eb490f hash algorithms: use size_t for section lengths
@@ Commit message
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
## object-file.c ##
-@@ object-file.c: int odb_source_loose_read_object_info(struct odb_source *source,
+@@ object-file.c: int parse_loose_header(const char *hdr, struct object_info *oi)
}
static void hash_object_body(const struct git_hash_algo *algo, struct git_hash_ctx *c,
@@ object-file.c: int odb_source_loose_read_object_info(struct odb_source *source,
struct object_id *oid,
char *hdr, size_t *hdrlen)
{
-@@ object-file.c: static void write_object_file_prepare(const struct git_hash_algo *algo,
+@@ object-file.c: void write_object_file_prepare(const struct git_hash_algo *algo,
/* Generate the header */
*hdrlen = format_object_header(hdr, *hdrlen, type, len);
@@ t/t1007-hash-object.sh: test_expect_success '--stdin outside of repository (uses
'
-test_expect_failure EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
-+test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
++test_expect_success EXPENSIVE,SIZE_T_IS_64BIT \
'files over 4GB hash literally' '
test-tool genzeros $((5*1024*1024*1024)) >big &&
test_oid large5GB >expect &&
4: ba629a3f03 ! 4: 411727336a hash-object --stdin: verify that it works with >4GB/LLP64
@@ Commit message
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
## t/t1007-hash-object.sh ##
-@@ t/t1007-hash-object.sh: test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
+@@ t/t1007-hash-object.sh: test_expect_success EXPENSIVE,SIZE_T_IS_64BIT \
test_cmp expect actual
'
-+test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
++test_expect_success EXPENSIVE,SIZE_T_IS_64BIT \
+ 'files over 4GB hash correctly via --stdin' '
+ { test -f big || test-tool genzeros $((5*1024*1024*1024)) >big; } &&
+ test_oid large5GB >expect &&
5: f48d570bba ! 5: e6bb4e6228 hash-object: add another >4GB/LLP64 test case
@@ Commit message
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
## t/t1007-hash-object.sh ##
-@@ t/t1007-hash-object.sh: test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
+@@ t/t1007-hash-object.sh: test_expect_success EXPENSIVE,SIZE_T_IS_64BIT \
test_cmp expect actual
'
-+test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
++test_expect_success EXPENSIVE,SIZE_T_IS_64BIT \
+ 'files over 4GB hash correctly' '
+ { test -f big || test-tool genzeros $((5*1024*1024*1024)) >big; } &&
+ test_oid large5GB >expect &&
6: 8a6beeb16d ! 6: 568807ac34 hash-object: add a >4GB/LLP64 test case using filtered input
@@ Commit message
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
## t/t1007-hash-object.sh ##
-@@ t/t1007-hash-object.sh: test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
+@@ t/t1007-hash-object.sh: test_expect_success EXPENSIVE,SIZE_T_IS_64BIT \
test_cmp expect actual
'
+# This clean filter does nothing, other than excercising the interface.
+# We ensure that cleaning doesn't mangle large files on 64-bit Windows.
-+test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
++test_expect_success EXPENSIVE,SIZE_T_IS_64BIT \
+ 'hash filtered files over 4GB correctly' '
+ { test -f big || test-tool genzeros $((5*1024*1024*1024)) >big; } &&
+ test_oid large5GB >expect &&
--
gitgitgadget
^ permalink raw reply
* Re: [PATCH 3/6] hash algorithms: use size_t for section lengths
From: Johannes Schindelin @ 2026-06-16 14:48 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: Philip Oakley via GitGitGadget, git, Philip Oakley
In-Reply-To: <ai-5VmawU2MRiAHQ@pks.im>
Hi Patrick,
On Tue, 16 Jun 2026, Patrick Steinhardt wrote:
> On Thu, Jun 04, 2026 at 05:15:09PM +0000, Philip Oakley via GitGitGadget wrote:
> > diff --git a/object-file.c b/object-file.c
> > index 1f5f9daf24..c648cecd80 100644
> > --- a/object-file.c
> > +++ b/object-file.c
> > @@ -581,7 +581,7 @@ static void write_object_file_prepare(const struct git_hash_algo *algo,
> > /* Generate the header */
> > *hdrlen = format_object_header(hdr, *hdrlen, type, len);
> >
> > - /* Sha1.. */
> > + /* Hash (function pointers) computation */
> > hash_object_body(algo, &c, buf, len, oid, hdr, hdrlen);
> > }
> >
>
> Thanks for updating this comment while at it :)
It wasn't my idea, it was Claude Opus'. I would have left it as-is, but
then decided that it's actually a good change and not worth splitting out
into a separate commit.
> > diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh
> > index 7867fd1dbf..10382a815e 100755
> > --- a/t/t1007-hash-object.sh
> > +++ b/t/t1007-hash-object.sh
> > @@ -261,7 +261,7 @@ test_expect_success '--stdin outside of repository (uses default hash)' '
> > test_cmp expect actual
> > '
> >
> > -test_expect_failure EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
> > +test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
> > 'files over 4GB hash literally' '
> > test-tool genzeros $((5*1024*1024*1024)) >big &&
> > test_oid large5GB >expect &&
>
> Previously we required `!LONG_IS_64BIT`, because the test would have
> succeeded on platforms where it is 64 bit wide. But now that this test
> works on all platforms I rather wonder whether we should completely drop
> that prerequisite here, as we expect it to pass regardless of whether or
> not long is 64 bit now.
Good point!
Thank you for the review,
Johannes
^ permalink raw reply
* Re: [PATCH 5/6] hash-object: add another >4GB/LLP64 test case
From: Johannes Schindelin @ 2026-06-16 14:48 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: Philip Oakley via GitGitGadget, git, Philip Oakley
In-Reply-To: <ai-5ZvDgc4smGfGc@pks.im>
Hi Patrick,
On Tue, 16 Jun 2026, Patrick Steinhardt wrote:
> On Thu, Jun 04, 2026 at 05:15:11PM +0000, Philip Oakley via GitGitGadget wrote:
> > diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh
> > index 59efee3aff..f2722380ee 100755
> > --- a/t/t1007-hash-object.sh
> > +++ b/t/t1007-hash-object.sh
> > @@ -277,4 +277,12 @@ test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
> > test_cmp expect actual
> > '
> >
> > +test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
> > + 'files over 4GB hash correctly' '
> > + { test -f big || test-tool genzeros $((5*1024*1024*1024)) >big; } &&
> > + test_oid large5GB >expect &&
> > + git hash-object -- big >actual &&
> > + test_cmp expect actual
> > +'
>
> Same comment here.
[Comment was the suggestion to drop the !LONG_IS_64BIT prerequisite]
Same comment here. [My reply: Good point!]
> Nit: I feel like we could've easily introduced all of these tests in the
> first commit.
Sure, but I actually liked the structuring by Philip when I accepted the
patches into Git for Windows, and I still do: The commits all have
slightly different concerns, and I love that the cognitive load is
lightened somewhat by keeping those concerns in separate patches with
separate commit messages.
Ciao,
Johannes
^ permalink raw reply
* Re: [PATCH 2/6] object-file.c: use size_t for header lengths
From: Johannes Schindelin @ 2026-06-16 14:48 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: Philip Oakley via GitGitGadget, git, Philip Oakley
In-Reply-To: <ai-5XO9gsc_HdMFX@pks.im>
Hi Patrick,
On Mon, 15 Jun 2026, Patrick Steinhardt wrote:
> On Thu, Jun 04, 2026 at 05:15:08PM +0000, Philip Oakley via GitGitGadget wrote:
> > From: Philip Oakley <philipoakley@iee.email>
> >
> > Continue walking the code path for the >4GB `hash-object --literally`
> > test. The `hash_object_file_literally()` function internally uses both
> > `hash_object_file()` and `write_object_file_prepare()`. Both function
> > signatures use `unsigned long` rather than `size_t` for the mem buffer
> > sizes. Use `size_t` instead, for LLP64 compatibility.
> >
> > While at it, convert those function's object's header buffer length to
> > `size_t` for consistency. The value is already upcast to `uintmax_t` for
> > print format compatibility.
>
> One thing I was wondering is whether we should rather migrate to a size
> that is consistent across different platforms. We could e.g. `typedef
> uint64_t objsize_t` and then use that going forward.
No, the point of `size_t` is to represent what the current platform can
handle in-memory. That cannot (and should not) be consolidated.
> I guess the question though is whether that'd buy us anything. In other
> words, are there any platforms that we care about where `size_t` is only
> 32 bit wide? And would such platforms even be able to handle such large
> objects?
There are ways to handle objects larger than 4GB on 32-bit platforms, via
streaming. In those cases, what you need is `off_t`, not `size_t`.
Obviously, there is a large class of problems with such setups. For
example, you can forget about efficiently reconstructing a large Git
object from a delta chain. If you cannot do that in-memory, trying to work
around that limitation merely opens up the user for a world of pain.
Ciao,
Johannes
^ permalink raw reply
* Re: [PATCH v3] read_gitfile(): simplify NOT_A_REPO error message
From: Jeff King @ 2026-06-16 14:45 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Patrick Steinhardt, git, Tian Yuchen
In-Reply-To: <xmqq7bnya7gh.fsf@gitster.g>
On Tue, Jun 16, 2026 at 07:25:02AM -0700, Junio C Hamano wrote:
> > +@@ t/t7450-bad-git-dotfiles.sh: test_expect_success 'git dirs of sibling submodules must not be nested' '
> > + test_expect_success 'submodule git dir nesting detection must work with parallel cloning' '
> > + test_must_fail git clone --recurse-submodules --jobs=2 nested clone_parallel 2>err &&
> > + cat err &&
> > +- grep -E "(already exists|is inside git dir|not a git repository)" err &&
> > ++ grep -E "(already exists|is inside git dir|does not point to a valid repository)" err &&
>
> A few things.
>
> * Will we be happy to see only one of these possibilities, or do we
> expect to see these once for each kind?
I imagine it is only one. This all comes from 9cf8547320 (clone: prevent
clashing git dirs when cloning submodule in parallel, 2024-01-28), and
it is expecting the nested path to cause a failure. Which failure I
guess depends on the racy ordering. If we create the inner one first,
then we probably get "already exists", and if the outer one, then "is
inside git dir". I don't know exactly what sequence yields the
NOT_A_REPO message.
But none of that is changing in this patch, just what the user-visible
text is for the NOT_A_REPO case.
I did briefly wonder if we might see "not a git repository" from a
_different_ code path, and need to catch it along with the new message.
But running successfully with --stress implies that we never see the old
one anymore.
> * a recently started in-flight topic tries to catch bare "grep" and
> fails until you write test_grep X-<.
Yeah. This will create a merge conflict for you, but hopefully the
resolution should be obvious. I don't think it makes sense to fix here,
as it's orthogonal to the purpose of the patch.
-Peff
^ permalink raw reply
* Re: [PATCH v3] read_gitfile(): simplify NOT_A_REPO error message
From: Junio C Hamano @ 2026-06-16 14:25 UTC (permalink / raw)
To: Jeff King; +Cc: Patrick Steinhardt, git, Tian Yuchen
In-Reply-To: <20260616123516.GA2301231@coredump.intra.peff.net>
Jeff King <peff@peff.net> writes:
> On Tue, Jun 16, 2026 at 07:19:20AM -0400, Jeff King wrote:
>
>> Here it is.
Thanks.
> +@@ t/t7450-bad-git-dotfiles.sh: test_expect_success 'git dirs of sibling submodules must not be nested' '
> + test_expect_success 'submodule git dir nesting detection must work with parallel cloning' '
> + test_must_fail git clone --recurse-submodules --jobs=2 nested clone_parallel 2>err &&
> + cat err &&
> +- grep -E "(already exists|is inside git dir|not a git repository)" err &&
> ++ grep -E "(already exists|is inside git dir|does not point to a valid repository)" err &&
A few things.
* Will we be happy to see only one of these possibilities, or do we
expect to see these once for each kind?
* a recently started in-flight topic tries to catch bare "grep" and
fails until you write test_grep X-<.
> We also racily trigger this in t7450. During parallel cloning we might
> see one of several errors, including this one. And so we must update
> that message, too (you can otherwise find the failure pretty quickly by
> running t7450 with --stress).
^ permalink raw reply
* Re: [PATCH] Add a test about broken notes handling on rebase
From: Phillip Wood @ 2026-06-16 13:12 UTC (permalink / raw)
To: Uwe Kleine-König, git
In-Reply-To: <20260612143952.3281115-2-u.kleine-koenig@baylibre.com>
On 12/06/2026 15:39, Uwe Kleine-König wrote:
> When a commit disappears during rebase because the patch content is
> already there (but not by the same patch in which case the commit would
> be skipped) the notes of that disappearing commit still survives and is
> added to the (rebased) parent of the disappearing commit.
>
> So with the commit graph
>
> A -- B -- C
> `
> `-BD
>
> where BD includes the changes done in B, when rebasing C on top of BD,
> the note for B should disappear and not be added to BD.
>
> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
> ---
> Hello,
>
> this is a behaviour of git that really bothers me when working on big
> patch series. I use notes to track the Message-Id of the patches when I
> send them out. Then when rebasing to a newer upstream version, the
> tracking gets confused because the Message-Id notes end up on commits
> that were not sent out yet (or I got two Message-Ids in them).
>
> I reported that already back in 2023[1],
That thread includes a suggestion on how to fix it if anyone reading
this is interesting in working on it.
> but obviously not in a way that
> resulted in a fix. So I'm trying again with a patch that adds a failing
> test.
I'm not sure carrying this test makes it any more likely that it will be
fixed, though your mail might get someone interested in fixing it. Don't
we already have some relevant tests t3400 rather than adding a whole new
file for a single test?
Thanks
Phillip
> Best regards
> Uwe
>
> [1] https://lore.kernel.org/git/20230530092155.3zbb5uxa7eisdzxb@pengutronix.de/
>
> t/meson.build | 1 +
> t/t3322-notes-rebase.sh | 35 +++++++++++++++++++++++++++++++++++
> 2 files changed, 36 insertions(+)
> create mode 100644 t/t3322-notes-rebase.sh
>
> diff --git a/t/meson.build b/t/meson.build
> index c5832fee0535..6927bd9c794f 100644
> --- a/t/meson.build
> +++ b/t/meson.build
> @@ -358,6 +358,7 @@ integration_tests = [
> 't3311-notes-merge-fanout.sh',
> 't3320-notes-merge-worktrees.sh',
> 't3321-notes-stripspace.sh',
> + 't3322-notes-rebase.sh',
> 't3400-rebase.sh',
> 't3401-rebase-and-am-rename.sh',
> 't3402-rebase-merge.sh',
> diff --git a/t/t3322-notes-rebase.sh b/t/t3322-notes-rebase.sh
> new file mode 100644
> index 000000000000..64c40a523b50
> --- /dev/null
> +++ b/t/t3322-notes-rebase.sh
> @@ -0,0 +1,35 @@
> +#!/bin/sh
> +
> +test_description='Test notes on rebase'
> +
> +. ./test-lib.sh
> +
> +test_expect_success setup '
> + git init &&
> + echo A > A &&
> + git add A &&
> + git commit -m A &&
> + git branch branch &&
> + echo B > B &&
> + git add B &&
> + git commit -m B &&
> + git notes add -m "This is B" @ &&
> + echo C > C &&
> + git add C &&
> + git commit -m C &&
> + git checkout branch &&
> + echo B > B &&
> + echo D > D &&
> + git add B D &&
> + git commit -m BD
> +'
> +
> +test_expect_success 'rebase B + C on top of BD' '
> + git rebase @ master
> +'
> +
> +test_expect_failure 'assert there is no note on BD' '
> + git notes show branch
> +'
> +
> +test_done
>
> base-commit: 3e65291872de10c3f0bf05ea8c24187e7a71ebf0
^ permalink raw reply
* Re: [GSoC Patch v5 2/4] rev-parse: use append_formatted_path() for path formatting
From: Phillip Wood @ 2026-06-16 13:08 UTC (permalink / raw)
To: K Jayatheerth, git
Cc: jltobler, lucasseikioshiro, gitster, phillip.wood, sandals,
kumarayushjha123, a3205153416
In-Reply-To: <20260616044953.184806-3-jayatheerthkulkarni2005@gmail.com>
On 16/06/2026 05:49, K Jayatheerth wrote:
> -static void print_path(const char *path, const char *prefix, enum format_type format, enum default_type def)
> +static void print_path(const char *path, const char *prefix,
> + enum path_format arg_path_format, enum path_format def_format)
> {
> - char *cwd = NULL;
> - /*
> - * We don't ever produce a relative path if prefix is NULL, so set the
> - * prefix to the current directory so that we can produce a relative
> - * path whenever possible. If we're using RELATIVE_IF_SHARED mode, then
> - * we want an absolute path unless the two share a common prefix, so don't
> - * set it in that case, since doing so causes a relative path to always
> - * be produced if possible.
> - */
> - if (!prefix && (format != FORMAT_DEFAULT || def != DEFAULT_RELATIVE_IF_SHARED))
> - prefix = cwd = xgetcwd();
> - if (format == FORMAT_DEFAULT && def == DEFAULT_UNMODIFIED) {
> - puts(path);
> - } else if (format == FORMAT_RELATIVE ||
> - (format == FORMAT_DEFAULT && def == DEFAULT_RELATIVE)) {
> - /*
> - * In order for relative_path to work as expected, we need to
> - * make sure that both paths are absolute paths. If we don't,
> - * we can end up with an unexpected absolute path that the user
> - * didn't want.
> - */
> - struct strbuf buf = STRBUF_INIT, realbuf = STRBUF_INIT, prefixbuf = STRBUF_INIT;
> - if (!is_absolute_path(path)) {
> - strbuf_realpath_forgiving(&realbuf, path, 1);
> - path = realbuf.buf;
> - }
> - if (!is_absolute_path(prefix)) {
> - strbuf_realpath_forgiving(&prefixbuf, prefix, 1);
> - prefix = prefixbuf.buf;
> - }
> - puts(relative_path(path, prefix, &buf));
> - strbuf_release(&buf);
> - strbuf_release(&realbuf);
> - strbuf_release(&prefixbuf);
> - } else if (format == FORMAT_DEFAULT && def == DEFAULT_RELATIVE_IF_SHARED) {
> - struct strbuf buf = STRBUF_INIT;
> - puts(relative_path(path, prefix, &buf));
> - strbuf_release(&buf);
> - } else {
> - struct strbuf buf = STRBUF_INIT;
> - strbuf_realpath_forgiving(&buf, path, 1);
> - puts(buf.buf);
> - strbuf_release(&buf);
> - }
> - free(cwd);
> + struct strbuf sb = STRBUF_INIT;
> + /* If the user didn't explicitly specify a format, fallback to the path-specific default. */
> + enum path_format fmt = (arg_path_format != PATH_FORMAT_DEFAULT) ? arg_path_format : def_format;
> +
> + append_formatted_path(&sb, path, prefix, fmt);
> + puts(sb.buf);
> +
> + strbuf_release(&sb);
> }
>
> int cmd_rev_parse(int argc,
> @@ -717,7 +661,7 @@ int cmd_rev_parse(int argc,
> const char *name = NULL;
> struct strbuf buf = STRBUF_INIT;
> int seen_end_of_options = 0;
> - enum format_type format = FORMAT_DEFAULT;
> + enum path_format arg_path_format = PATH_FORMAT_DEFAULT;
This is the source of the api wart I referred to in the previous patch.
Could we keep the existing enums and convert them into the appropriate
PATH_FORMAT_* flag in print_path() above? I think we already have the
logic to do that in the existing code. That would mean that other users
of append_formatted_path() don't have to worry about the extra flag.
Thanks
Phillip
>
> show_usage_if_asked(argc, argv, builtin_rev_parse_usage);
>
> @@ -797,8 +741,8 @@ int cmd_rev_parse(int argc,
> die(_("--git-path requires an argument"));
> print_path(repo_git_path_replace(the_repository, &buf,
> "%s", argv[i + 1]), prefix,
> - format,
> - DEFAULT_RELATIVE_IF_SHARED);
> + arg_path_format,
> + PATH_FORMAT_RELATIVE_IF_SHARED);
> i++;
> continue;
> }
> @@ -820,9 +764,9 @@ int cmd_rev_parse(int argc,
> if (!arg)
> die(_("--path-format requires an argument"));
> if (!strcmp(arg, "absolute")) {
> - format = FORMAT_CANONICAL;
> + arg_path_format = PATH_FORMAT_CANONICAL;
> } else if (!strcmp(arg, "relative")) {
> - format = FORMAT_RELATIVE;
> + arg_path_format = PATH_FORMAT_RELATIVE;
> } else {
> die(_("unknown argument to --path-format: %s"), arg);
> }
> @@ -985,7 +929,7 @@ int cmd_rev_parse(int argc,
> if (!strcmp(arg, "--show-toplevel")) {
> const char *work_tree = repo_get_work_tree(the_repository);
> if (work_tree)
> - print_path(work_tree, prefix, format, DEFAULT_UNMODIFIED);
> + print_path(work_tree, prefix, arg_path_format, PATH_FORMAT_UNMODIFIED);
> else
> die(_("this operation must be run in a work tree"));
> continue;
> @@ -993,7 +937,7 @@ int cmd_rev_parse(int argc,
> if (!strcmp(arg, "--show-superproject-working-tree")) {
> struct strbuf superproject = STRBUF_INIT;
> if (get_superproject_working_tree(&superproject))
> - print_path(superproject.buf, prefix, format, DEFAULT_UNMODIFIED);
> + print_path(superproject.buf, prefix, arg_path_format, PATH_FORMAT_UNMODIFIED);
> strbuf_release(&superproject);
> continue;
> }
> @@ -1028,18 +972,18 @@ int cmd_rev_parse(int argc,
> const char *gitdir = getenv(GIT_DIR_ENVIRONMENT);
> char *cwd;
> int len;
> - enum format_type wanted = format;
> + enum path_format wanted = arg_path_format;
> if (arg[2] == 'g') { /* --git-dir */
> if (gitdir) {
> - print_path(gitdir, prefix, format, DEFAULT_UNMODIFIED);
> + print_path(gitdir, prefix, arg_path_format, PATH_FORMAT_UNMODIFIED);
> continue;
> }
> if (!prefix) {
> - print_path(".git", prefix, format, DEFAULT_UNMODIFIED);
> + print_path(".git", prefix, arg_path_format, PATH_FORMAT_UNMODIFIED);
> continue;
> }
> } else { /* --absolute-git-dir */
> - wanted = FORMAT_CANONICAL;
> + wanted = PATH_FORMAT_CANONICAL;
> if (!gitdir && !prefix)
> gitdir = ".git";
> if (gitdir) {
> @@ -1055,11 +999,11 @@ int cmd_rev_parse(int argc,
> strbuf_reset(&buf);
> strbuf_addf(&buf, "%s%s.git", cwd, len && cwd[len-1] != '/' ? "/" : "");
> free(cwd);
> - print_path(buf.buf, prefix, wanted, DEFAULT_CANONICAL);
> + print_path(buf.buf, prefix, wanted, PATH_FORMAT_CANONICAL);
> continue;
> }
> if (!strcmp(arg, "--git-common-dir")) {
> - print_path(repo_get_common_dir(the_repository), prefix, format, DEFAULT_RELATIVE_IF_SHARED);
> + print_path(repo_get_common_dir(the_repository), prefix, arg_path_format, PATH_FORMAT_RELATIVE_IF_SHARED);
> continue;
> }
> if (!strcmp(arg, "--is-inside-git-dir")) {
> @@ -1089,7 +1033,7 @@ int cmd_rev_parse(int argc,
> if (the_repository->index->split_index) {
> const struct object_id *oid = &the_repository->index->split_index->base_oid;
> const char *path = repo_git_path_replace(the_repository, &buf, "sharedindex.%s", oid_to_hex(oid));
> - print_path(path, prefix, format, DEFAULT_RELATIVE);
> + print_path(path, prefix, arg_path_format, PATH_FORMAT_RELATIVE);
> }
> continue;
> }
^ permalink raw reply
* Re: [GSoC Patch v5 1/4] path: introduce append_formatted_path() for shared path formatting
From: Phillip Wood @ 2026-06-16 13:08 UTC (permalink / raw)
To: K Jayatheerth, git
Cc: jltobler, lucasseikioshiro, gitster, phillip.wood, sandals,
kumarayushjha123, a3205153416
In-Reply-To: <20260616044953.184806-2-jayatheerthkulkarni2005@gmail.com>
On 16/06/2026 05:49, K Jayatheerth wrote:
> The path-formatting logic in builtin/rev-parse.c is tightly coupled
> to that command and writes directly to stdout, making it impossible
> for other builtins to reuse.
>
> Extract the core algorithm into append_formatted_path() in path.c
> and expose a path_format enum in path.h so that any builtin can
> format paths consistently without duplicating logic.
Sorry I haven't had time to look at this series recently, it is looking
much nicer now that we have a single enum. It would be helpful to
explain why we need PATH_FORMAT_DEFAULT that acts exactly like
PATH_FORMAT_UNMODIFIED. Looking at the next patch it seems this is still
a wart in the api due to rev-parse wanting needing to distinguish the
unmodified case from the default case.
Thanks
Phillip
> Mentored-by: Justin Tobler <jltobler@gmail.com>
> Mentored-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
> Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com>
> ---
> path.c | 70 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> path.h | 36 ++++++++++++++++++++++++++++++
> 2 files changed, 106 insertions(+)
>
> diff --git a/path.c b/path.c
> index d7e17bf174..5e83e3e4f6 100644
> --- a/path.c
> +++ b/path.c
> @@ -1579,6 +1579,76 @@ char *xdg_cache_home(const char *filename)
> return NULL;
> }
>
> +void append_formatted_path(struct strbuf *dest, const char *path,
> + const char *prefix, enum path_format format)
> +{
> + switch (format) {
> + case PATH_FORMAT_DEFAULT:
> + case PATH_FORMAT_UNMODIFIED:
> + strbuf_addstr(dest, path);
> + break;
> +
> + case PATH_FORMAT_RELATIVE: {
> + struct strbuf relative_buf = STRBUF_INIT;
> + struct strbuf real_path = STRBUF_INIT;
> + struct strbuf real_prefix = STRBUF_INIT;
> + char *cwd = NULL;
> +
> + /*
> + * We don't ever produce a relative path if prefix is NULL,
> + * so set the prefix to the current directory so that we can
> + * produce a relative path whenever possible.
> + */
> + if (!prefix)
> + prefix = cwd = xgetcwd();
> +
> + if (!is_absolute_path(path)) {
> + strbuf_realpath_forgiving(&real_path, path, 1);
> + path = real_path.buf;
> + }
> + if (!is_absolute_path(prefix)) {
> + strbuf_realpath_forgiving(&real_prefix, prefix, 1);
> + prefix = real_prefix.buf;
> + }
> +
> + strbuf_addstr(dest, relative_path(path, prefix, &relative_buf));
> +
> + strbuf_release(&relative_buf);
> + strbuf_release(&real_path);
> + strbuf_release(&real_prefix);
> + free(cwd);
> + break;
> + }
> +
> + case PATH_FORMAT_RELATIVE_IF_SHARED: {
> + struct strbuf relative_buf = STRBUF_INIT;
> +
> + /*
> + * If we're using RELATIVE_IF_SHARED mode, then we want an
> + * absolute path unless the two share a common prefix, so don't
> + * default the prefix to the current working directory. Doing so
> + * would cause a relative path to always be produced if possible.
> + */
> + strbuf_addstr(dest, relative_path(path, prefix, &relative_buf));
> + strbuf_release(&relative_buf);
> + break;
> + }
> +
> + case PATH_FORMAT_CANONICAL: {
> + struct strbuf canonical_buf = STRBUF_INIT;
> +
> + strbuf_realpath_forgiving(&canonical_buf, path, 1);
> + strbuf_addbuf(dest, &canonical_buf);
> +
> + strbuf_release(&canonical_buf);
> + break;
> + }
> +
> + default:
> + BUG("unknown path_format value %d", format);
> + }
> +}
> +
> REPO_GIT_PATH_FUNC(squash_msg, "SQUASH_MSG")
> REPO_GIT_PATH_FUNC(merge_msg, "MERGE_MSG")
> REPO_GIT_PATH_FUNC(merge_rr, "MERGE_RR")
> diff --git a/path.h b/path.h
> index 0434ba5e07..6aca53b100 100644
> --- a/path.h
> +++ b/path.h
> @@ -262,6 +262,42 @@ enum scld_error safe_create_leading_directories_no_share(char *path);
> int safe_create_file_with_leading_directories(struct repository *repo,
> const char *path);
>
> +/**
> + * The formatting strategy to apply when writing a path into a buffer.
> + */
> +enum path_format {
> + /*
> + * Represents the default formatting behavior. Treated as
> + * PATH_FORMAT_UNMODIFIED by append_formatted_path().
> + */
> + PATH_FORMAT_DEFAULT,
> +
> + /* Output the path exactly as-is without any modifications. */
> + PATH_FORMAT_UNMODIFIED,
> +
> + /* Output a path relative to the provided directory prefix. */
> + PATH_FORMAT_RELATIVE,
> +
> + /* Output a relative path only if the path shares a root with the prefix. */
> + PATH_FORMAT_RELATIVE_IF_SHARED,
> +
> + /* Output a fully resolved, absolute canonical path. */
> + PATH_FORMAT_CANONICAL
> +};
> +
> +/**
> + * Format a path according to the specified formatting strategy and append
> + * the result to the given strbuf.
> + *
> + * `dest` : The string buffer to append the formatted path to.
> + * `path` : The path string that needs to be formatted.
> + * `prefix` : The directory prefix to calculate relative offsets against.
> + * Pass NULL to default to the current working directory where applicable.
> + * `format` : The formatting behavior rule to execute.
> + */
> +void append_formatted_path(struct strbuf *dest, const char *path,
> + const char *prefix, enum path_format format);
> +
> # ifdef USE_THE_REPOSITORY_VARIABLE
> # include "strbuf.h"
> # include "repository.h"
^ permalink raw reply
* Re: [PATCH v5 2/2] graph: indent visual root in graph
From: Pablo Sabater @ 2026-06-16 13:06 UTC (permalink / raw)
To: Junio C Hamano
Cc: Taylor Blau, git, ayu.chandekar, chandrapratap3519,
christian.couder, jltobler, karthik.188, peff, phillip.wood,
siddharthasthana31
In-Reply-To: <xmqqzf0vbyj8.fsf@gitster.g>
El lun, 15 jun 2026 a las 17:42, Junio C Hamano (<gitster@pobox.com>) escribió:
>
> Pablo Sabater <pabloosabaterr@gmail.com> writes:
>
> > It does not make it unpredictable but it makes it not output what I
> > wanted to test, what I wanted to test is having an active column at
> > the same time that visual roots in different cases were being rendered
> > on another column.
>
> Oh, use of commit-graph changes the traversal order, which would
> affect how the graph is drawn, and there is no way to ensure that we
> traverse in the same way with or without commit-graph? That's
> inconvenient. But even without commit-graph, do we guarantee the
> same traversal order forever? I doubt it. So I suspect that it is
> a brittle workaround to disable commit-graph in the longer term.
Hi!
About the traversal order, aren't all the graph tests dependent on the
traversal order? If it changed they would all need to be updated
because the tests are hardcoded expects of the graph.
I guess it might be more brittle than other graph tests specially
because it also depends on removing files, I tried using "git config
core.commitGraph false" or "--date-order" but I still get different
results and removing the files fixed it. If someone knows a better way
of doing it I'm happy to change it.
>
> As long as the graph engine shows correct graph no matter what order
> the commits come out of the revision traversal engine, we won't hurt
> end-users, but we need our tests to be reproducible, so that is a
> bit unfortunate.
>
> Anyway, stepping back a bit,
>
> > However having GIT_TEST_COMMIT_GRAPH in the last
> > text for example changes from:
> >
> > * 41_octopus
> > | * 43_B
> > | \
> > | * 43_A
> > | * 42_B
> > | * 42_A
> > * 41_B
> > * 41_A
>
> Does the "vertically aligned * on 2nd and later columns do not mean
> any parent-child relationship" rule no longer apply in this version?
> IOW, does the above graph show that
>
> - 41_A is a parent of 41_B, which is a parent of 41_octopus
> - 42_A is a parent of 42_B, and
> - 43_A is a parent of 43_B but is not related to 42_B
Yes, this means that all commits vertically adjacent are related,
those who are not related and can cause that ambiguity get indented
(43_A).
>
> ? Who are the parents of 41_octopus? It has no relationship with
> 42_B and 43_B, and unlike what its name suggests, it has only 41_b
> as its parent (probably with history simplification that makes only
> these commits shown)?
On this test we are using "--first-parent" which excludes all the
parents but the first one, but later we force its excluded parents to
be shown.
We exclude 42_* and 43_* branches and then force them to appear as
unrelated branches.
>
> > to:
> >
> > * 41_octopus
> > * 41_B
> > \
> > * 41_A
> > * 43_B
> > \
> > * 43_A
> > * 42_B
> > * 42_A
>
> And this graph shows the same inter-commit relationship. So both
> are correctly showing what we want to express, but they show the
> same information differently, making test_cmp unhappy?
Yes they show the same information. On the second graph every commit
is on the first column (or second if they get indented) but on the
first graph we have:
*
| * <- visual root on second column
^
`----- first column remains active
If you tested v3 with this case you would see that it assumes that
visual roots only happen to be rendered on the first column, therefore
failing to correctly indent those visual roots on the second column,
which this test proves that they can appear on other columns.
Back to the test:
* 41_octopus
| * 43_B
| \
| * 43_A
| * 42_B
| * 42_A
* 41_B
* 41_A
43_A is rendered on the second column (first column is active by the
41_* branch) and gets indented to the third one. With commit-graph it
would be on the first and get indented to the second, making it the
same as more general tests above in "t4218", it is an edge case but
shows that indentation works correctly independently where the visual
root is.
>
> Thanks.
Thanks,
Pablo
^ permalink raw reply
* [PATCH v3] read_gitfile(): simplify NOT_A_REPO error message
From: Jeff King @ 2026-06-16 12:35 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Patrick Steinhardt, git, Tian Yuchen
In-Reply-To: <20260616111919.GC687438@coredump.intra.peff.net>
On Tue, Jun 16, 2026 at 07:19:20AM -0400, Jeff King wrote:
> Here it is.
>
> -- >8 --
> Subject: read_gitfile(): simplify NOT_A_REPO error message
<sigh> This triggered a failure in CI after passing tests locally.
Turned out to be a race in t7450.
Here's an update, with range-diff.
1: daf7f99511 ! 1: 67d42141e9 read_gitfile(): simplify NOT_A_REPO error message
@@ Commit message
We'll tweak the test to match the new error, but there's no need to beef
it up further, since we're not showing the pointed-to path at all.
+ We also racily trigger this in t7450. During parallel cloning we might
+ see one of several errors, including this one. And so we must update
+ that message, too (you can otherwise find the failure pretty quickly by
+ running t7450 with --stress).
+
Signed-off-by: Jeff King <peff@peff.net>
## setup.c ##
@@ t/t0002-gitfile.sh: test_expect_success 'bad setup: invalid .git file format' '
'
test_expect_success 'final setup + check rev-parse --git-dir' '
+
+ ## t/t7450-bad-git-dotfiles.sh ##
+@@ t/t7450-bad-git-dotfiles.sh: test_expect_success 'git dirs of sibling submodules must not be nested' '
+ test_expect_success 'submodule git dir nesting detection must work with parallel cloning' '
+ test_must_fail git clone --recurse-submodules --jobs=2 nested clone_parallel 2>err &&
+ cat err &&
+- grep -E "(already exists|is inside git dir|not a git repository)" err &&
++ grep -E "(already exists|is inside git dir|does not point to a valid repository)" err &&
+ {
+ test_path_is_missing .git/modules/hippo/HEAD ||
+ test_path_is_missing .git/modules/hippo/hooks/HEAD
-- >8 --
Subject: [PATCH] read_gitfile(): simplify NOT_A_REPO error message
If a .git file is well-formed but points to a directory that is not
itself a valid repository, then we say:
fatal: not a git repository: <pointed-to-repo>
without mentioning the .git file that pointed us there in the first
place. Doing so could better help the user understand the source of the
problem.
In theory the most helpful thing we could do is mention both paths,
like:
gitfile '<gitfile>' points to invalid repository: <pointed-to-repo>
But there's another catch: when we generate the error, we don't always
know the pointed-to repository! This leads to a potential segfault.
The message comes from read_gitfile_error_die(). Originally we only
called that function from inside read_gitfile_gently(), passing in both
the gitfile path and the pointed-to path. But that changed in 1dd27bfbfd
(setup: improve error diagnosis for invalid .git files, 2026-03-04).
Since then, the caller in setup_git_directory_gently(), even if it wants
to die on error, always passes in the "return_error_code" flag, asking
the function to instead return a numeric error code. And then it calls
read_gitfile_error_die() itself, passing NULL for the pointed-to path.
If we get the READ_GITFILE_ERR_NOT_A_REPO code, we form a message using
that NULL pointer, and either segfault or get garbage like "not a git
repository: (null)", depending on the platform.
We could fix this by having the function pass out both the numeric error
code and the pointed-to path. But that creates a new headache: we have
to allocate that string on the heap and pass ownership back to the
caller. So now every caller has to be aware of it (and either free the
result, or signal that they are not interested by using an extra
parameter).
Instead, let's just drop the pointed-to path from the error message
entirely, and mention only the gitfile. This fixes the NULL dereference
without introducing any more complexity. The user-facing error message
is not as detailed as it could be, but is better than the original.
Since it mentions the gitfile, a user investigating the situation can
look there to find the pointed-to path (whereas you could not go the
other way from the original message).
There's an existing test in t0002 which triggers this case, but we
didn't notice the problem because it checks only that we said "not a
repository", and not the full string. So if we print "(null)" it is
happy. It will probably crash on some non-glibc platforms, but nobody
seems to have reported it yet (the breakage is recent-ish as of v2.54).
I'm also somewhat surprised that building with ASan/UBSan doesn't catch
this, but it doesn't seem to (and I found an open issue with somebody
asking for NULL printf checks to be implemented in the sanitizers).
We'll tweak the test to match the new error, but there's no need to beef
it up further, since we're not showing the pointed-to path at all.
We also racily trigger this in t7450. During parallel cloning we might
see one of several errors, including this one. And so we must update
that message, too (you can otherwise find the failure pretty quickly by
running t7450 with --stress).
Signed-off-by: Jeff King <peff@peff.net>
---
setup.c | 9 +++++----
setup.h | 2 +-
submodule.c | 2 +-
t/t0002-gitfile.sh | 2 +-
t/t7450-bad-git-dotfiles.sh | 2 +-
5 files changed, 9 insertions(+), 8 deletions(-)
diff --git a/setup.c b/setup.c
index b4652651df..b1d9249d91 100644
--- a/setup.c
+++ b/setup.c
@@ -917,7 +917,7 @@ int verify_repository_format(const struct repository_format *format,
return 0;
}
-void read_gitfile_error_die(int error_code, const char *path, const char *dir)
+void read_gitfile_error_die(int error_code, const char *path)
{
switch (error_code) {
case READ_GITFILE_ERR_NOT_A_FILE:
@@ -937,7 +937,8 @@ void read_gitfile_error_die(int error_code, const char *path, const char *dir)
case READ_GITFILE_ERR_NO_PATH:
die(_("no path in gitfile: %s"), path);
case READ_GITFILE_ERR_NOT_A_REPO:
- die(_("not a git repository: %s"), dir);
+ die(_("gitfile does not point to a valid repository: %s"),
+ path);
default:
BUG("unknown error code");
}
@@ -1028,7 +1029,7 @@ const char *read_gitfile_gently(const char *path, int *return_error_code)
if (return_error_code)
*return_error_code = error_code;
else if (error_code)
- read_gitfile_error_die(error_code, path, dir);
+ read_gitfile_error_die(error_code, path);
free(buf);
return error_code ? NULL : path;
@@ -1629,7 +1630,7 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
return GIT_DIR_INVALID_GITFILE;
default:
if (die_on_error)
- read_gitfile_error_die(error_code, dir->buf, NULL);
+ read_gitfile_error_die(error_code, dir->buf);
else
return GIT_DIR_INVALID_GITFILE;
}
diff --git a/setup.h b/setup.h
index 705d1d6ff7..df8c93687a 100644
--- a/setup.h
+++ b/setup.h
@@ -38,7 +38,7 @@ int is_nonbare_repository_dir(struct strbuf *path);
#define READ_GITFILE_ERR_TOO_LARGE 8
#define READ_GITFILE_ERR_MISSING 9
#define READ_GITFILE_ERR_IS_A_DIR 10
-void read_gitfile_error_die(int error_code, const char *path, const char *dir);
+void read_gitfile_error_die(int error_code, const char *path);
const char *read_gitfile_gently(const char *path, int *return_error_code);
#define read_gitfile(path) read_gitfile_gently((path), NULL)
const char *resolve_gitdir_gently(const char *suspect, int *return_error_code);
diff --git a/submodule.c b/submodule.c
index fd91201a92..93d0361072 100644
--- a/submodule.c
+++ b/submodule.c
@@ -2579,7 +2579,7 @@ void absorb_git_dir_into_superproject(const char *path,
if (err_code != READ_GITFILE_ERR_NOT_A_REPO)
/* We don't know what broke here. */
- read_gitfile_error_die(err_code, path, NULL);
+ read_gitfile_error_die(err_code, path);
/*
* Maybe populated, but no git directory was found?
diff --git a/t/t0002-gitfile.sh b/t/t0002-gitfile.sh
index dfbcdddbcc..6356e9ec72 100755
--- a/t/t0002-gitfile.sh
+++ b/t/t0002-gitfile.sh
@@ -27,7 +27,7 @@ test_expect_success 'bad setup: invalid .git file format' '
test_expect_success 'bad setup: invalid .git file path' '
echo "gitdir: $REAL.not" >.git &&
test_must_fail git rev-parse 2>.err &&
- test_grep "not a git repository" .err
+ test_grep "gitfile does not point to a valid repository" .err
'
test_expect_success 'final setup + check rev-parse --git-dir' '
diff --git a/t/t7450-bad-git-dotfiles.sh b/t/t7450-bad-git-dotfiles.sh
index 8cc86522b2..69a17a9d13 100755
--- a/t/t7450-bad-git-dotfiles.sh
+++ b/t/t7450-bad-git-dotfiles.sh
@@ -350,7 +350,7 @@ test_expect_success 'git dirs of sibling submodules must not be nested' '
test_expect_success 'submodule git dir nesting detection must work with parallel cloning' '
test_must_fail git clone --recurse-submodules --jobs=2 nested clone_parallel 2>err &&
cat err &&
- grep -E "(already exists|is inside git dir|not a git repository)" err &&
+ grep -E "(already exists|is inside git dir|does not point to a valid repository)" err &&
{
test_path_is_missing .git/modules/hippo/HEAD ||
test_path_is_missing .git/modules/hippo/hooks/HEAD
--
2.55.0.rc0.346.g4c7eff6ddc
^ permalink raw reply related
* [PATCH v2] read_gitfile(): simplify NOT_A_REPO error message
From: Jeff King @ 2026-06-16 11:19 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Patrick Steinhardt, git, Tian Yuchen
In-Reply-To: <xmqqeciezh0w.fsf@gitster.g>
On Wed, Jun 10, 2026 at 06:01:03AM -0700, Junio C Hamano wrote:
> > I have to agree that the patch is somewhat gross, and I myself don't
> > really see much of an issue to move to an error message like the above
> > if it ends up simplifying the logic.
>
> So we are in agreement among three of us that simplifying the code
> to lose error message with dubious value would be a good way
> forward.
>
> Peff, can we have a formal [v2] then?
Here it is.
-- >8 --
Subject: read_gitfile(): simplify NOT_A_REPO error message
If a .git file is well-formed but points to a directory that is not
itself a valid repository, then we say:
fatal: not a git repository: <pointed-to-repo>
without mentioning the .git file that pointed us there in the first
place. Doing so could better help the user understand the source of the
problem.
In theory the most helpful thing we could do is mention both paths,
like:
gitfile '<gitfile>' points to invalid repository: <pointed-to-repo>
But there's another catch: when we generate the error, we don't always
know the pointed-to repository! This leads to a potential segfault.
The message comes from read_gitfile_error_die(). Originally we only
called that function from inside read_gitfile_gently(), passing in both
the gitfile path and the pointed-to path. But that changed in 1dd27bfbfd
(setup: improve error diagnosis for invalid .git files, 2026-03-04).
Since then, the caller in setup_git_directory_gently(), even if it wants
to die on error, always passes in the "return_error_code" flag, asking
the function to instead return a numeric error code. And then it calls
read_gitfile_error_die() itself, passing NULL for the pointed-to path.
If we get the READ_GITFILE_ERR_NOT_A_REPO code, we form a message using
that NULL pointer, and either segfault or get garbage like "not a git
repository: (null)", depending on the platform.
We could fix this by having the function pass out both the numeric error
code and the pointed-to path. But that creates a new headache: we have
to allocate that string on the heap and pass ownership back to the
caller. So now every caller has to be aware of it (and either free the
result, or signal that they are not interested by using an extra
parameter).
Instead, let's just drop the pointed-to path from the error message
entirely, and mention only the gitfile. This fixes the NULL dereference
without introducing any more complexity. The user-facing error message
is not as detailed as it could be, but is better than the original.
Since it mentions the gitfile, a user investigating the situation can
look there to find the pointed-to path (whereas you could not go the
other way from the original message).
There's an existing test in t0002 which triggers this case, but we
didn't notice the problem because it checks only that we said "not a
repository", and not the full string. So if we print "(null)" it is
happy. It will probably crash on some non-glibc platforms, but nobody
seems to have reported it yet (the breakage is recent-ish as of v2.54).
I'm also somewhat surprised that building with ASan/UBSan doesn't catch
this, but it doesn't seem to (and I found an open issue with somebody
asking for NULL printf checks to be implemented in the sanitizers).
We'll tweak the test to match the new error, but there's no need to beef
it up further, since we're not showing the pointed-to path at all.
Signed-off-by: Jeff King <peff@peff.net>
---
setup.c | 9 +++++----
setup.h | 2 +-
submodule.c | 2 +-
t/t0002-gitfile.sh | 2 +-
4 files changed, 8 insertions(+), 7 deletions(-)
diff --git a/setup.c b/setup.c
index b4652651df..b1d9249d91 100644
--- a/setup.c
+++ b/setup.c
@@ -917,7 +917,7 @@ int verify_repository_format(const struct repository_format *format,
return 0;
}
-void read_gitfile_error_die(int error_code, const char *path, const char *dir)
+void read_gitfile_error_die(int error_code, const char *path)
{
switch (error_code) {
case READ_GITFILE_ERR_NOT_A_FILE:
@@ -937,7 +937,8 @@ void read_gitfile_error_die(int error_code, const char *path, const char *dir)
case READ_GITFILE_ERR_NO_PATH:
die(_("no path in gitfile: %s"), path);
case READ_GITFILE_ERR_NOT_A_REPO:
- die(_("not a git repository: %s"), dir);
+ die(_("gitfile does not point to a valid repository: %s"),
+ path);
default:
BUG("unknown error code");
}
@@ -1028,7 +1029,7 @@ const char *read_gitfile_gently(const char *path, int *return_error_code)
if (return_error_code)
*return_error_code = error_code;
else if (error_code)
- read_gitfile_error_die(error_code, path, dir);
+ read_gitfile_error_die(error_code, path);
free(buf);
return error_code ? NULL : path;
@@ -1629,7 +1630,7 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
return GIT_DIR_INVALID_GITFILE;
default:
if (die_on_error)
- read_gitfile_error_die(error_code, dir->buf, NULL);
+ read_gitfile_error_die(error_code, dir->buf);
else
return GIT_DIR_INVALID_GITFILE;
}
diff --git a/setup.h b/setup.h
index 705d1d6ff7..df8c93687a 100644
--- a/setup.h
+++ b/setup.h
@@ -38,7 +38,7 @@ int is_nonbare_repository_dir(struct strbuf *path);
#define READ_GITFILE_ERR_TOO_LARGE 8
#define READ_GITFILE_ERR_MISSING 9
#define READ_GITFILE_ERR_IS_A_DIR 10
-void read_gitfile_error_die(int error_code, const char *path, const char *dir);
+void read_gitfile_error_die(int error_code, const char *path);
const char *read_gitfile_gently(const char *path, int *return_error_code);
#define read_gitfile(path) read_gitfile_gently((path), NULL)
const char *resolve_gitdir_gently(const char *suspect, int *return_error_code);
diff --git a/submodule.c b/submodule.c
index fd91201a92..93d0361072 100644
--- a/submodule.c
+++ b/submodule.c
@@ -2579,7 +2579,7 @@ void absorb_git_dir_into_superproject(const char *path,
if (err_code != READ_GITFILE_ERR_NOT_A_REPO)
/* We don't know what broke here. */
- read_gitfile_error_die(err_code, path, NULL);
+ read_gitfile_error_die(err_code, path);
/*
* Maybe populated, but no git directory was found?
diff --git a/t/t0002-gitfile.sh b/t/t0002-gitfile.sh
index dfbcdddbcc..6356e9ec72 100755
--- a/t/t0002-gitfile.sh
+++ b/t/t0002-gitfile.sh
@@ -27,7 +27,7 @@ test_expect_success 'bad setup: invalid .git file format' '
test_expect_success 'bad setup: invalid .git file path' '
echo "gitdir: $REAL.not" >.git &&
test_must_fail git rev-parse 2>.err &&
- test_grep "not a git repository" .err
+ test_grep "gitfile does not point to a valid repository" .err
'
test_expect_success 'final setup + check rev-parse --git-dir' '
--
2.55.0.rc0.342.g45e27e83e2
^ permalink raw reply related
* Re: [PATCH 3/4] builtin/refs: add "update" subcommand
From: Junio C Hamano @ 2026-06-16 11:17 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
In-Reply-To: <20260616-pks-refs-writing-subcommands-v1-3-9f5219b6109d@pks.im>
Patrick Steinhardt <ps@pks.im> writes:
> Add a new "update" subcommand which mirrors `git update-ref <refname>
> <oldoid> <newoid>`. This follows the same reasoning as the preceding
> commit.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> Documentation/git-refs.adoc | 7 ++
> builtin/refs.c | 50 +++++++++++++
> t/meson.build | 1 +
> t/t1465-refs-update.sh | 179 ++++++++++++++++++++++++++++++++++++++++++++
> 4 files changed, 237 insertions(+)
I do not offhand know (and I am still away by 2 hours from the time
I wake up and start functioning) if update-ref shares the same
issue, but with "delete, update, rename" combo, lack of "create"
feels a bit annoying. Wouldn't we want to offer an option to users
who want to ensure that the refs they create are truly new and they
are not overwriting a ref somebody has created? Either (1) drop
"delete" and take a special value (e.g. "") as <newvalue> to signal
deletion and make the same special value used as <oldvalue> signals
creation, or (2) add "create" and insist that "update" takes only an
existing ref, would make the annoyance go away, I guess.
> +test_expect_success 'update creates a new reference' '
> + test_when_finished "rm -rf repo" &&
> + setup_repo repo &&
> + (
> + cd repo &&
> + A=$(git rev-parse A) &&
> + git refs update refs/heads/foo $A &&
> + test_ref_matches refs/heads/foo "$A"
> + )
> +'
Here we cannot test (and I strongly suspect that "git refs update"
and "git update-ref" lack ability to do so) a case where a creation
is attempted on an existing ref and fails.
^ permalink raw reply
* Re: [PATCH] cat-file: speed up default format
From: Jeff King @ 2026-06-16 11:15 UTC (permalink / raw)
To: René Scharfe; +Cc: Git List
In-Reply-To: <df933ffa-1be2-4401-a4ac-9d72c9c4cdcc@web.de>
On Mon, Jun 15, 2026 at 11:53:10PM +0200, René Scharfe wrote:
> > } else if (is_atom("rest", atom, len)) {
> > - if (data->mark_query)
> > - data->split_on_whitespace = 1;
> > - else if (data->rest)
>
> This removes support for rest being NULL, breaking t1006.381.
Yup. I did say "only lightly tested". ;)
The fix is obviously just:
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 9cc7ec7a6f..370ca6d771 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -363,7 +363,8 @@ static void objectsize_disk_add(struct format_item *item UNUSED,
static void rest_add(struct format_item *item UNUSED,
struct strbuf *sb, struct expand_data *data)
{
- strbuf_addstr(sb, data->rest);
+ if (data->rest)
+ strbuf_addstr(sb, data->rest);
}
static void deltabase_add(struct format_item *item UNUSED,
I think perhaps this error shows that the mark_query thing in the
existing code obfuscates the logic a bit.
-Peff
^ permalink raw reply related
* Re: [PATCH] cat-file: speed up default format
From: Jeff King @ 2026-06-16 11:12 UTC (permalink / raw)
To: René Scharfe; +Cc: Git List
In-Reply-To: <10a33614-837f-4166-aa30-6de28b052692@web.de>
On Mon, Jun 15, 2026 at 11:53:07PM +0200, René Scharfe wrote:
> > IMHO that is probably not worth it for a custom parsing system just for
> > cat-file. But if we were to finally unify ref-filter and cat-file (and
> > even --pretty=format) then it would probably worth doing this kind of
> > pre-parsing.
> It could be worth it for cat-file alone if we find the right balance, as
> it already does do a separate parsing step, but that is awkward with its
> mark_query checks all over the place and remembers only object property
> requirements and no other format string details.
Yeah, getting rid of the mark_query pass was a nice side effect of
having a true parse step.
> Making the opcodes small should be beneficial. We need only a handful
> of them, so a byte each should suffice. We can use a strbuf for that.
>
> We can also store literal characters in there. An opcode plus with a
> payload char incurs an overhead of 50%, which sounds high, but at least
> the default format only has two of them and it's much better than
> storing pointer plus size for an overhead of more than 90% in case of a
> single char.
True, and it's a size win if the literal portions tend to be small
(fewer than 15 bytes). You do lose out on the ability to strbuf_add()
them in one go, though. So lots more strbuf_grow() checks, etc. If you
really wanted to get fancy, you could follow the opcode with a length
represented as a variable-sized integer, followed by the literal bytes.
I'm not sure that Git's formatting code needs to squeeze out quite that
much performance, though.
> That gets us closer to native speed, at least on an Apple M1:
>
> Benchmark 1: ./git_fp cat-file --batch-all-objects --batch-check='%(objectname)-%(objecttype)-%(objectsize)'
> Time (mean ± σ): 992.7 ms ± 3.2 ms [User: 967.5 ms, System: 23.8 ms]
> Range (min … max): 990.1 ms … 1000.7 ms 10 runs
>
> Benchmark 2: ./git_switch cat-file --batch-all-objects --batch-check='%(objectname)-%(objecttype)-%(objectsize)'
> Time (mean ± σ): 991.8 ms ± 1.6 ms [User: 967.0 ms, System: 23.3 ms]
> Range (min … max): 989.3 ms … 994.4 ms 10 runs
>
> Benchmark 3: ./git cat-file --batch-all-objects --batch-check='%(objectname)-%(objecttype)-%(objectsize)'
> Time (mean ± σ): 985.8 ms ± 2.9 ms [User: 960.5 ms, System: 23.6 ms]
> Range (min … max): 982.9 ms … 993.0 ms 10 runs
>
> Benchmark 4: ./git cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype) %(objectsize)'
> Time (mean ± σ): 982.1 ms ± 3.2 ms [User: 956.7 ms, System: 23.6 ms]
> Range (min … max): 979.2 ms … 989.2 ms 10 runs
OK, so we managed another 1%. But I'm skeptical that this linear opcode
technique is where we want to go in the long run, if we're ever going to
unify formatters.
One, for more advanced features like %(if) we'd want to support some
notion of hierarchy and recursion. We have to speculatively format the
inner part and see if it is empty.
Though I guess that is possible with a linearized set of opcodes. If you
turn "%(if)%(HEAD)%(then)*%(end)" into:
FMT_IF
FMT_HEAD
FMT_THEN
FMT_LITERAL
*
FMT_END
then I guess you just start a sub-execution of the opcodes after FMT_IF
and tell it to stop when it sees FMT_THEN. It does mean that the opcodes
themselves need to control the program counter, rather than the executor
blindly walking along the opcodes and asking them to put stuff in the
output. Whereas I think if the parser builds a tree of structs then this
falls out pretty naturally.
The second thing is that many of the ref-filter atoms have options, and
those options have to be represented in the opcodes. That works
naturally if each opcode gets its own struct (either with a big union,
or true polymorphism). But representing "%(describe:match=versions/v*)"
in opcodes sounds gross. Now you need opcodes to represent the options
(and maybe "no more options"), and some way of encoding arbitrary input
for those option arguments.
-Peff
^ permalink raw reply
* Re: [PATCH v3 0/4] Add support for an external command for fetching notes
From: Siddh Raman Pant @ 2026-06-16 10:59 UTC (permalink / raw)
To: git@vger.kernel.org
Cc: oswald.buddenhagen@gmx.de, gitster@pobox.com,
code@khaugsbakk.name, j6t@kdbg.org, peff@peff.net, ps@pks.im,
sandals@crustytoothpaste.net, newren@gmail.com
In-Reply-To: <cover.1779532562.git.siddh.raman.pant@oracle.com>
[-- Attachment #1: Type: text/plain, Size: 3274 bytes --]
Ping...
Thread link:
https://lore.kernel.org/git/cover.1779532562.git.siddh.raman.pant@oracle.com/
Thanks,
Siddh
On Sat, May 23 2026 at 16:08:08 +0530, Siddh Raman Pant wrote:
> v2: https://lore.kernel.org/git/cover.1779464886.git.siddh.raman.pant@oracle.com/
> v1: https://lore.kernel.org/git/cover.1779207350.git.siddh.raman.pant@oracle.com/
>
> <...insert text from v1 cover here...>
>
> Changes since v2:
> - Removed stale help text talking about force-killing helper process.
>
> Changes since v1:
> - Removed Documentation commit and sent as a standalone patch.
> - Removed finish_command_with_timeout addition (and thus sleep_nanosec).
> - Squashed the external notes command code, doc, and test commits.
> - Removed horizontal separators from note-external.c.
> - Removed global variables from translation unit and instead store config in
> a dedicated new struct member in struct display_notes_opt.
> - Reworded the main commit to have better explanation of the motivation.
>
> Siddh Raman Pant (4):
> notes: convert raw arg in format_display_notes() to bool
> wrapper: add support for timeout and deadline in read helpers
> t3301: cover generic displayed notes behavior
> notes: support an external command to display notes
>
> Documentation/config/notes.adoc | 59 +++
> Documentation/git-format-patch.adoc | 11 +-
> Documentation/git-range-diff.adoc | 6 +
> Documentation/pretty-options.adoc | 9 +
> Makefile | 2 +
> builtin/log.c | 17 +-
> builtin/name-rev.c | 9 +-
> builtin/range-diff.c | 2 +
> contrib/completion/git-completion.bash | 4 +-
> log-tree.c | 10 +-
> meson.build | 1 +
> notes-external.c | 414 ++++++++++++++++++
> notes-external.h | 53 +++
> notes.c | 266 +++++++++---
> notes.h | 33 +-
> revision.c | 36 +-
> strbuf.c | 26 +-
> strbuf.h | 4 +
> t/helper/meson.build | 1 +
> t/helper/test-external-notes | 64 +++
> t/helper/test-notes-external-config-reset.c | 24 ++
> t/helper/test-tool.c | 1 +
> t/helper/test-tool.h | 1 +
> t/lib-notes.sh | 19 +
> t/t3206-range-diff.sh | 68 +++
> t/t3301-notes.sh | 448 ++++++++++++++++++++
> t/t6120-describe.sh | 17 +
> wrapper.c | 139 +++++-
> wrapper.h | 23 +
> 29 files changed, 1691 insertions(+), 76 deletions(-)
> create mode 100644 notes-external.c
> create mode 100644 notes-external.h
> create mode 100755 t/helper/test-external-notes
> create mode 100644 t/helper/test-notes-external-config-reset.c
> create mode 100644 t/lib-notes.sh
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply
* Re: [PATCH v2 0/2] rebase: add --squash to fold a range into its first commit
From: Phillip Wood @ 2026-06-16 10:10 UTC (permalink / raw)
To: Harald Nordgren via GitGitGadget, git
Cc: Harald Nordgren, D. Ben Knoble, Patrick Steinhardt,
Junio C Hamano
In-Reply-To: <pull.2337.v2.git.git.1781512625.gitgitgadget@gmail.com>
Hi Harald
On 15/06/2026 09:37, Harald Nordgren via GitGitGadget wrote:
> Rename to rebase --squash.
Please include the original cover letter as well so people who have not
read the previous version know what the series is about.
I agree that git-history might be a better place for this but I'm not
opposed to adding a --squash option to git-rebase. However I do think we
need to think about the implementation - picking consecutive commits
when we want to squash them together is inefficient and if we're
changing the base risks the user having to stop to resolve conflicts
multiple times. Regardless of whether this ends up in git-rebase or
git-history I think the implementation should cherry-pick the whole
commit range by doing the equivalent of
git merge-tree --merge-base $(git merge-base HEAD @{upstream}) \
@{upstream} HEAD
We should also let the user edit the commit message to reflect the
changes that are being squashed in. We should think about what support
we want for "amend!" commits that replace the commit message when rebasing.
Thanks
Phillip
> Harald Nordgren (2):
> t3415: remove prepare-commit-msg hook after use
> rebase: add --squash to fold a range
>
> Documentation/git-rebase.adoc | 11 ++++
> builtin/rebase.c | 16 ++++-
> sequencer.c | 24 ++++++-
> sequencer.h | 2 +-
> t/t3415-rebase-autosquash.sh | 118 ++++++++++++++++++++++++++++++++++
> 5 files changed, 166 insertions(+), 5 deletions(-)
>
>
> base-commit: ea97ad8d017de0c9037451a78008a0fd60abea0c
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2337%2FHaraldNordgren%2Frebase-fixup-fold-v2
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2337/HaraldNordgren/rebase-fixup-fold-v2
> Pull-Request: https://github.com/git/git/pull/2337
>
> Range-diff vs v1:
>
> 1: c55b9cd6f7 = 1: c55b9cd6f7 t3415: remove prepare-commit-msg hook after use
> 2: bd1bc62aa8 ! 2: 22d4276ff5 rebase: add --fixup-all to fold a range
> @@ Metadata
> Author: Harald Nordgren <haraldnordgren@gmail.com>
>
> ## Commit message ##
> - rebase: add --fixup-all to fold a range
> + rebase: add --squash to fold a range
>
> Folding a series of commits into one required either an interactive
> rebase where each commit after the first was hand-edited to "fixup", or
> a "git reset --soft" to the merge base followed by "git commit --amend".
>
> - Add "git rebase --autosquash --fixup-all [<upstream>]" to do this
> - directly. It keeps the first commit in the range as a "pick" and turns
> - every later commit into a "fixup", so the whole range collapses into a
> - single commit that reuses the first commit's message. With no <upstream>
> - argument the range is "@{upstream}..HEAD", folding all unpushed commits
> - into one.
> + Add "git rebase --squash [<upstream>]" to do this directly. It keeps
> + the first commit in the range as a "pick" and turns every later commit
> + into a "fixup", so the whole range collapses into a single commit that
> + reuses the first commit's message. With no <upstream> argument the range
> + is "@{upstream}..HEAD", folding all unpushed commits into one.
>
> - Fold the commits in their original order, so that any fixup!/squash!
> - commits already present in the range are folded in as well. Allow the
> - flag only together with --autosquash, and reject --rebase-merges since a
> - merge commit cannot be folded into another commit.
> + The option implies the merge backend, so it works on its own without
> + --autosquash. Fold the commits in their original order, so that any
> + fixup!/squash! commits already present in the range are folded in as
> + well. Reject --rebase-merges since a merge commit cannot be folded into
> + another commit.
>
> + Inspired-by: Sergey Chernov <serega.morph@gmail.com>
> Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
>
> ## Documentation/git-rebase.adoc ##
> @@ Documentation/git-rebase.adoc: option can be used to override that setting.
> +
> See also INCOMPATIBLE OPTIONS below.
>
> -+--fixup-all::
> -+ Valid only when used with `--autosquash`. Keep the first commit in
> -+ the range as a `pick` and change every later commit to a `fixup`, so
> -+ the whole range is folded into a single commit that reuses the first
> -+ commit's message. With no `<upstream>` argument this folds all commits
> -+ since `@{upstream}` into one. The commits are folded in their original
> -+ order, so any `fixup!`/`squash!` commits already in the range are folded
> -+ in as well. Cannot be combined with `--rebase-merges`, as a merge
> -+ commit cannot be folded into another commit.
> ++--squash::
> ++ Keep the first commit in the range as a `pick` and change every later
> ++ commit to a `fixup`, so the whole range is folded into a single commit
> ++ that reuses the first commit's message. With no `<upstream>` argument
> ++ this folds all commits since `@{upstream}` into one. The commits are
> ++ folded in their original order, so any `fixup!`/`squash!` commits
> ++ already in the range are folded in as well. Cannot be combined with
> ++ `--rebase-merges`, as a merge commit cannot be folded into another
> ++ commit.
> +
> --autostash::
> --no-autostash::
> @@ Documentation/git-rebase.adoc: are incompatible with the following options:
> * --strategy
> * --strategy-option
> * --autosquash
> -+ * --fixup-all
> ++ * --squash
> * --rebase-merges
> * --interactive
> * --exec
> @@ builtin/rebase.c: struct rebase_options {
> int allow_rerere_autoupdate;
> int keep_empty;
> int autosquash;
> -+ int fixup_all;
> ++ int squash;
> char *gpg_sign_opt;
> int autostash;
> int committer_date_is_author_date;
> @@ builtin/rebase.c: static int do_interactive_rebase(struct rebase_options *opts,
> shortrevisions, opts->onto_name, opts->onto,
> &opts->orig_head->object.oid, &opts->exec,
> - opts->autosquash, opts->update_refs, &todo_list);
> -+ opts->autosquash, opts->fixup_all, opts->update_refs,
> ++ opts->autosquash, opts->squash, opts->update_refs,
> + &todo_list);
>
> cleanup:
> @@ builtin/rebase.c: int cmd_rebase(int argc,
> OPT_BOOL(0, "autosquash", &options.autosquash,
> N_("move commits that begin with "
> "squash!/fixup! under -i")),
> -+ OPT_BOOL(0, "fixup-all", &options.fixup_all,
> ++ OPT_BOOL(0, "squash", &options.squash,
> + N_("fold all commits in the range into the first one")),
> OPT_BOOL(0, "update-refs", &options.update_refs,
> N_("update branches that point to commits "
> "that are being rebased")),
> +@@ builtin/rebase.c: int cmd_rebase(int argc,
> + if ((options.flags & REBASE_INTERACTIVE_EXPLICIT) ||
> + (options.action != ACTION_NONE) ||
> + (options.exec.nr > 0) ||
> +- options.autosquash == 1) {
> ++ options.autosquash == 1 ||
> ++ options.squash) {
> + allow_preemptive_ff = 0;
> + }
> + if (options.committer_date_is_author_date || options.ignore_date)
> @@ builtin/rebase.c: int cmd_rebase(int argc,
> options.rebase_merges = (options.rebase_merges >= 0) ? options.rebase_merges :
> ((options.config_rebase_merges >= 0) ? options.config_rebase_merges : 0);
>
> -+ if (options.fixup_all && options.autosquash != 1)
> -+ die(_("--fixup-all requires --autosquash"));
> -+
> -+ if (options.fixup_all && options.rebase_merges)
> ++ if (options.squash && options.rebase_merges)
> + die(_("options '%s' and '%s' cannot be used together"),
> -+ "--fixup-all", "--rebase-merges");
> ++ "--squash", "--rebase-merges");
> ++
> ++ if (options.squash)
> ++ imply_merge(&options, "--squash");
> +
> if (options.autosquash == 1) {
> imply_merge(&options, "--autosquash");
> @@ sequencer.c: static int todo_list_add_update_ref_commands(struct todo_list *todo
> struct commit *onto, const struct object_id *orig_head,
> struct string_list *commands, unsigned autosquash,
> - unsigned update_refs,
> -+ unsigned fixup_all, unsigned update_refs,
> ++ unsigned squash, unsigned update_refs,
> struct todo_list *todo_list)
> {
> char shortonto[GIT_MAX_HEXSZ + 1];
> @@ sequencer.c: int complete_action(struct repository *r, struct replay_opts *opts,
> return -1;
>
> - if (autosquash && todo_list_rearrange_squash(todo_list))
> -+ if (fixup_all)
> ++ if (squash)
> + todo_list_fixup_all_but_first(todo_list);
> + else if (autosquash && todo_list_rearrange_squash(todo_list))
> return -1;
> @@ sequencer.h: int complete_action(struct repository *r, struct replay_opts *opts,
> struct commit *onto, const struct object_id *orig_head,
> struct string_list *commands, unsigned autosquash,
> - unsigned update_refs,
> -+ unsigned fixup_all, unsigned update_refs,
> ++ unsigned squash, unsigned update_refs,
> struct todo_list *todo_list);
> int todo_list_rearrange_squash(struct todo_list *todo_list);
>
> @@ t/t3415-rebase-autosquash.sh: test_expect_success 'pick and fixup respect commit
> test_commit_message HEAD -m "something"
> '
>
> -+test_expect_success '--fixup-all folds the range into the first commit' '
> ++test_expect_success '--squash folds the range into the first commit' '
> + git reset --hard base &&
> + test_commit --no-tag fold1 file_fold a &&
> + test_commit --no-tag fold2 file_fold b &&
> + test_commit --no-tag fold3 file_fold c &&
> -+ git rebase --autosquash --fixup-all HEAD~3 &&
> ++ git rebase --squash HEAD~3 &&
> + test_cmp_rev base HEAD~1 &&
> + test_commit_message HEAD -m "fold1" &&
> + echo c >expect &&
> + test_cmp expect file_fold
> +'
> +
> -+test_expect_success '--fixup-all folds smoothly when a fixup! commit is in the series' '
> ++test_expect_success '--squash folds smoothly when a fixup! commit is in the series' '
> + git reset --hard base &&
> + test_commit --no-tag foldA file_fold a &&
> + test_commit --no-tag foldB file_fold b &&
> + git commit --allow-empty --fixup HEAD~1 &&
> -+ git rebase --autosquash --fixup-all HEAD~3 &&
> ++ git rebase --squash HEAD~3 &&
> + test_cmp_rev base HEAD~1 &&
> + test_commit_message HEAD -m "foldA" &&
> + echo b >expect &&
> + test_cmp expect file_fold
> +'
> +
> -+test_expect_success '--fixup-all picks the first commit even if it is a fixup!' '
> ++test_expect_success '--squash picks the first commit even if it is a fixup!' '
> + git reset --hard base &&
> + test_commit --no-tag fixupbase file_fix a &&
> + git commit --allow-empty --fixup HEAD &&
> + test_commit --no-tag fixuptail file_fix b &&
> -+ git rebase --autosquash --fixup-all HEAD~3 &&
> ++ git rebase --squash HEAD~3 &&
> + test_cmp_rev base HEAD~1 &&
> + echo b >expect &&
> + test_cmp expect file_fix
> +'
> +
> -+test_expect_success '--fixup-all with a single commit in range is a no-op' '
> ++test_expect_success '--squash with a single commit in range is a no-op' '
> + git reset --hard base &&
> + test_commit --no-tag solo file_solo a &&
> + git rev-parse HEAD >expect &&
> -+ git rebase --autosquash --fixup-all HEAD~1 &&
> ++ git rebase --squash HEAD~1 &&
> + git rev-parse HEAD >actual &&
> + test_cmp expect actual
> +'
> +
> -+test_expect_success '--fixup-all with an empty range succeeds' '
> ++test_expect_success '--squash with an empty range succeeds' '
> + git reset --hard base &&
> -+ git rebase --autosquash --fixup-all HEAD &&
> ++ git rebase --squash HEAD &&
> + test_cmp_rev base HEAD
> +'
> +
> -+test_expect_success '--fixup-all skips a dropped commit in the range' '
> ++test_expect_success '--squash skips a dropped commit in the range' '
> + git reset --hard base &&
> + test_commit --no-tag fixdrop1 file_drop a &&
> + git commit --allow-empty -m "empty in the middle" &&
> + test_commit --no-tag fixdrop3 file_drop b &&
> -+ git rebase --autosquash --empty=drop --fixup-all HEAD~3 &&
> ++ git rebase --squash --empty=drop HEAD~3 &&
> + test_cmp_rev base HEAD~1 &&
> + test_commit_message HEAD -m "fixdrop1" &&
> + echo b >expect &&
> + test_cmp expect file_drop
> +'
> +
> -+test_expect_success '--fixup-all folds a merge commit in the middle of the range' '
> ++test_expect_success '--squash folds a merge commit in the middle of the range' '
> + git reset --hard base &&
> + test_commit --no-tag mid-first &&
> + git checkout -b mid-side &&
> @@ t/t3415-rebase-autosquash.sh: test_expect_success 'pick and fixup respect commit
> + git checkout - &&
> + git merge --no-ff -m "merge mid-side" mid-side &&
> + test_commit --no-tag mid-last &&
> -+ git rebase --autosquash --fixup-all base &&
> ++ git rebase --squash base &&
> + test_cmp_rev base HEAD~1 &&
> + test_commit_message HEAD -m "mid-first" &&
> + test_path_is_file mid-merged.t
> +'
> +
> -+test_expect_success '--fixup-all keeps the first flattened commit when a merge sorts first' '
> ++test_expect_success '--squash keeps the first flattened commit when a merge sorts first' '
> + git reset --hard base &&
> + git checkout -b head-side &&
> + test_commit --no-tag head-merged &&
> + git checkout - &&
> + git merge --no-ff -m "merge head-side" head-side &&
> + test_commit --no-tag head-last &&
> -+ git rebase --autosquash --fixup-all base &&
> ++ git rebase --squash base &&
> + test_cmp_rev base HEAD~1 &&
> + test_commit_message HEAD -m "head-merged" &&
> + test_path_is_file head-merged.t
> +'
> +
> -+test_expect_success '--fixup-all requires --autosquash' '
> ++test_expect_success '--squash takes precedence over --autosquash' '
> + git reset --hard base &&
> -+ test_must_fail git rebase --fixup-all HEAD~1 2>err &&
> -+ test_grep "fixup-all requires --autosquash" err &&
> -+ test_must_fail git rebase --no-autosquash --fixup-all HEAD~1 2>err &&
> -+ test_grep "fixup-all requires --autosquash" err
> ++ test_commit --no-tag combo-first &&
> ++ test_commit --no-tag combo-mid &&
> ++ git commit --allow-empty --fixup HEAD~1 &&
> ++ test_commit --no-tag combo-last &&
> ++ git rebase --autosquash --squash base &&
> ++ test_cmp_rev base HEAD~1 &&
> ++ test_commit_message HEAD -m "combo-first"
> ++'
> ++
> ++test_expect_success '--squash folds the range with rebase.autosquash set' '
> ++ test_config rebase.autosquash true &&
> ++ git reset --hard base &&
> ++ test_commit --no-tag cfg-first &&
> ++ test_commit --no-tag cfg-last &&
> ++ git rebase --squash base &&
> ++ test_cmp_rev base HEAD~1 &&
> ++ test_commit_message HEAD -m "cfg-first"
> +'
> +
> -+test_expect_success '--fixup-all and --rebase-merges cannot be combined' '
> ++test_expect_success '--squash and --rebase-merges cannot be combined' '
> + git reset --hard base &&
> -+ test_must_fail git rebase --autosquash --rebase-merges \
> -+ --fixup-all HEAD~1 2>err &&
> ++ test_must_fail git rebase --rebase-merges --squash HEAD~1 2>err &&
> + test_grep "cannot be used together" err &&
> + test_path_is_missing .git/rebase-merge
> +'
>
^ permalink raw reply
* Re: [PATCH v14 4/6] branch: add --prune-merged <branch>
From: Phillip Wood @ 2026-06-16 9:59 UTC (permalink / raw)
To: Harald Nordgren via GitGitGadget, git
Cc: Kristoffer Haugsbakk, Johannes Sixt, Phillip Wood,
Harald Nordgren
In-Reply-To: <9924373da0a0598cabe4f08f3bc4200833679171.1780999917.git.gitgitgadget@gmail.com>
Hi Harald
On 09/06/2026 11:11, Harald Nordgren via GitGitGadget wrote:
Carrying on where I left off yesterday ...
> diff --git a/t/t3200-branch.sh b/t/t3200-branch.sh
> index 4e7deddc04..27ea1319bb 100755
> --- a/t/t3200-branch.sh
> +++ b/t/t3200-branch.sh
> @@ -1809,4 +1809,205 @@ test_expect_success '--forked requires a value' '
> test_grep "requires a value" err
> '
>
> +test_expect_success '--prune-merged: setup' '
> + test_create_repo pm-upstream &&
The rest of this test would be easier to read if we did
(
cd pm-upstream &&
...
)
rather than prefixing every command with "-C pm-upstream"
> + test_commit -C pm-upstream base &&
> + git -C pm-upstream checkout -b next &&
> + test_commit -C pm-upstream one-commit &&
> + test_commit -C pm-upstream two-commit &&
> + git -C pm-upstream branch one HEAD~ &&
> + git -C pm-upstream branch two HEAD &&
> + git -C pm-upstream branch wip main &&
> + git -C pm-upstream checkout main &&
> + test_create_repo pm-fork
> +'
> +
> +test_expect_success '--prune-merged deletes branches integrated into upstream' '
> + test_when_finished "rm -rf pm-merged" &&
> + git clone pm-upstream pm-merged &&
> + git -C pm-merged remote add fork ../pm-fork &&
> + test_config -C pm-merged remote.pushDefault fork &&
> + test_config -C pm-merged push.default current &&
So we clone upstream and add fork as the default push remote. I find the
pm- prefixes rather distracting. It would be clearer to me if we just
called the repositories "upstream", "fork" and "repo"
> + git -C pm-merged branch one one-commit &&
> + git -C pm-merged branch --set-upstream-to=origin/next one &&
> + git -C pm-merged branch two two-commit &&
> + git -C pm-merged branch --set-upstream-to=origin/next two &&
Now we set up a couple of local branches with no local commits that
track origin/next which seems a bit odd. Why don't we create local
branches based one origin/next (and origin/main if we're using a
wildcard below) with some local commits like a user would?
> + git -C pm-merged branch --prune-merged "origin/*" &&
Here we delete all the branches whose upstream is on origin - which is
all the branches that we've created so we're not really testing the
safety features.
> + test_must_fail git -C pm-merged rev-parse --verify refs/heads/one &&
> + test_must_fail git -C pm-merged rev-parse --verify refs/heads/two
This verifies that the branches are deleted. It would be a good idea to
check the output of command above to check --prune-merged prints what we
expect it to and that it leaves branches with other upstreams.
> +test_expect_success '--prune-merged accepts a literal upstream' '
> + test_when_finished "rm -rf pm-literal" &&
> + git clone pm-upstream pm-literal &&
let's not litter the test directory with a hundred repositories - just
call it "repo" and add remove it with test_when_finished in each test,
or reuse it so we don't waste time cloning and setting up the config
each time (that would mean not using test_config).
> + git -C pm-literal remote add fork ../pm-fork &&
> + test_config -C pm-literal remote.pushDefault fork &&
> + test_config -C pm-literal push.default current &&
> + git -C pm-literal branch one one-commit &&
> + git -C pm-literal branch --set-upstream-to=origin/next one &&
> +
> + git -C pm-literal branch --prune-merged origin/next &&
> +
> + test_must_fail git -C pm-literal rev-parse --verify refs/heads/one
Again we're not testing that nothing else is deleted.
> +'
> +
> +test_expect_success '--prune-merged unions multiple <branch> arguments' '
> + test_when_finished "rm -rf pm-union" &&
> + git clone pm-upstream pm-union &&
> + git -C pm-union remote add fork ../pm-fork &&
> + test_config -C pm-union remote.pushDefault fork &&
> + test_config -C pm-union push.default current &&
> + git -C pm-union branch one one-commit &&
> + git -C pm-union branch --set-upstream-to=origin/next one &&
> + git -C pm-union branch two base &&
> + git -C pm-union branch --set-upstream-to=origin/main two &&
> + git -C pm-union checkout --detach &&
> +
> + git -C pm-union branch --prune-merged origin/next origin/main &&
This is more interesting - we don't need the test for a single literal
upstream if we're doing this. Again we need to test the safety features.
As these are integration tests you can do that at the same time as
testing that some branches are removed - you don't need so many separate
(expensive) tests.
> + test_must_fail git -C pm-union rev-parse --verify refs/heads/one &&
> + test_must_fail git -C pm-union rev-parse --verify refs/heads/two
> +'
> +
> +test_expect_success '--prune-merged accepts a local upstream' '
> + test_when_finished "rm -rf pm-local" &&
> + git clone pm-upstream pm-local &&
> + git -C pm-local remote add fork ../pm-fork &&
> + test_config -C pm-local remote.pushDefault fork &&
> + test_config -C pm-local push.default current &&
> + git -C pm-local checkout -b trunk &&
> + git -C pm-local branch one one-commit &&
> + git -C pm-local branch --set-upstream-to=trunk one &&
> + git -C pm-local merge --ff-only one-commit &&
> +
> + git -C pm-local branch --prune-merged trunk &&
Can't we test a local upstream at the same time as a remote upstream and
save a test case?
> + test_must_fail git -C pm-local rev-parse --verify refs/heads/one
> +'
> +
> +test_expect_success '--prune-merged warns instead of erroring on un-integrated commits' '
> + test_when_finished "rm -rf pm-unmerged" &&
> + git clone pm-upstream pm-unmerged &&
> + git -C pm-unmerged remote add fork ../pm-fork &&
> + test_config -C pm-unmerged remote.pushDefault fork &&
> + test_config -C pm-unmerged push.default current &&
> + git -C pm-unmerged checkout -b wip origin/wip &&
> + git -C pm-unmerged branch --set-upstream-to=origin/next wip &&
> + test_commit -C pm-unmerged local-only &&
> + git -C pm-unmerged checkout - &&
> +
> + git -C pm-unmerged branch --prune-merged "origin/*" 2>err &&
> + test_grep "not fully merged" err &&
> + test_grep ! "If you are sure you want to delete it" err &&
I'm always suspicious of test_grep when we know what the output should
look like - it might be better to use test_cmp. This test does not check
that we also delete branches that are merged when we see one that isn't.
I'm going to stop here - the tests I've read seem to me to be too much
like unit tests checking one aspect of the implementation in isolation
rather than checking that the whole feature works as expected.
Thanks
Phillip
> + git -C pm-unmerged rev-parse --verify refs/heads/wip
> +'
> +
> +test_expect_success '--prune-merged is silent about not-merged-to-HEAD' '
> + test_when_finished "rm -rf pm-nohead" &&
> + git clone pm-upstream pm-nohead &&
> + git -C pm-nohead remote add fork ../pm-fork &&
> + test_config -C pm-nohead remote.pushDefault fork &&
> + test_config -C pm-nohead push.default current &&
> + git -C pm-nohead branch topic one-commit &&
> + git -C pm-nohead branch --set-upstream-to=origin/next topic &&
> +
> + git -C pm-nohead branch --prune-merged "origin/*" 2>err &&
> +
> + test_grep ! "not yet merged to HEAD" err &&
> + test_must_fail git -C pm-nohead rev-parse --verify refs/heads/topic
> +'
> +
> +test_expect_success '--prune-merged skips branches whose upstream is gone' '
> + test_when_finished "rm -rf pm-upstream-gone" &&
> + git clone pm-upstream pm-upstream-gone &&
> + git -C pm-upstream-gone remote add fork ../pm-fork &&
> + test_config -C pm-upstream-gone remote.pushDefault fork &&
> + test_config -C pm-upstream-gone push.default current &&
> + git -C pm-upstream-gone branch one one-commit &&
> + git -C pm-upstream-gone branch --set-upstream-to=origin/next one &&
> +
> + git -C pm-upstream-gone update-ref -d refs/remotes/origin/next &&
> + git -C pm-upstream-gone branch --prune-merged "origin/*" &&
> +
> + git -C pm-upstream-gone rev-parse --verify refs/heads/one
> +'
> +
> +test_expect_success '--prune-merged never deletes the checked-out branch' '
> + test_when_finished "rm -rf pm-head" &&
> + git clone pm-upstream pm-head &&
> + git -C pm-head remote add fork ../pm-fork &&
> + test_config -C pm-head remote.pushDefault fork &&
> + test_config -C pm-head push.default current &&
> + git -C pm-head checkout -b one one-commit &&
> + git -C pm-head branch --set-upstream-to=origin/next one &&
> +
> + git -C pm-head branch --prune-merged "origin/*" &&
> +
> + git -C pm-head rev-parse --verify refs/heads/one
> +'
> +
> +test_expect_success '--prune-merged spares branches that push back to their upstream' '
> + test_when_finished "rm -rf pm-push-eq" &&
> + git clone pm-upstream pm-push-eq &&
> + git -C pm-push-eq checkout --detach &&
> +
> + git -C pm-push-eq branch --prune-merged "origin/*" &&
> +
> + git -C pm-push-eq rev-parse --verify refs/heads/main
> +'
> +
> +test_expect_success '--prune-merged spares a per-branch pushRemote==upstream remote' '
> + test_when_finished "rm -rf pm-push-branch" &&
> + git clone pm-upstream pm-push-branch &&
> + git -C pm-push-branch remote add fork ../pm-fork &&
> + test_config -C pm-push-branch remote.pushDefault fork &&
> + test_config -C pm-push-branch push.default current &&
> + test_config -C pm-push-branch branch.main.pushRemote origin &&
> + git -C pm-push-branch checkout --detach &&
> +
> + git -C pm-push-branch branch --prune-merged "origin/*" &&
> +
> + git -C pm-push-branch rev-parse --verify refs/heads/main
> +'
> +
> +test_expect_success '--prune-merged prunes when @{push} differs from @{upstream}' '
> + test_when_finished "rm -rf pm-push-diff" &&
> + git clone pm-upstream pm-push-diff &&
> + git -C pm-push-diff remote add fork ../pm-fork &&
> + test_config -C pm-push-diff remote.pushDefault fork &&
> + test_config -C pm-push-diff push.default current &&
> + git -C pm-push-diff branch topic one-commit &&
> + git -C pm-push-diff branch --set-upstream-to=origin/next topic &&
> + git -C pm-push-diff checkout --detach &&
> +
> + git -C pm-push-diff branch --prune-merged "origin/*" &&
> +
> + test_must_fail git -C pm-push-diff rev-parse --verify refs/heads/topic
> +'
> +
> +test_expect_success '--prune-merged requires at least one <branch>' '
> + test_must_fail git -C forked branch --prune-merged 2>err &&
> + test_grep "requires at least one <branch>" err
> +'
> +
> +test_expect_success '--prune-merged takes positional <branch> arguments' '
> + test_when_finished "rm -rf pm-positional" &&
> + git clone pm-upstream pm-positional &&
> + git -C pm-positional remote add fork ../pm-fork &&
> + test_config -C pm-positional remote.pushDefault fork &&
> + test_config -C pm-positional push.default current &&
> + git -C pm-positional branch one one-commit &&
> + git -C pm-positional branch --set-upstream-to=origin/next one &&
> + git -C pm-positional branch two base &&
> + git -C pm-positional branch --set-upstream-to=origin/main two &&
> + git -C pm-positional checkout --detach &&
> +
> + git -C pm-positional branch --prune-merged origin/next origin/main &&
> +
> + test_must_fail git -C pm-positional rev-parse --verify refs/heads/one &&
> + test_must_fail git -C pm-positional rev-parse --verify refs/heads/two
> +'
> +
> test_done
^ permalink raw reply
* Re: [PATCH v14 5/6] branch: add branch.<name>.pruneMerged opt-out
From: Phillip Wood @ 2026-06-16 9:57 UTC (permalink / raw)
To: Harald Nordgren via GitGitGadget, git
Cc: Kristoffer Haugsbakk, Johannes Sixt, Phillip Wood,
Harald Nordgren
In-Reply-To: <d691d5051b35a569dbd3f4a0488030a7d84d72f9.1780999917.git.gitgitgadget@gmail.com>
Hi Harald
On 09/06/2026 11:11, Harald Nordgren via GitGitGadget wrote:
> From: Harald Nordgren <haraldnordgren@gmail.com>
>
> Setting branch.<name>.pruneMerged=false exempts that branch from
> "git branch --prune-merged", which is useful for a topic you want
> to keep developing after an early round of it has been merged
> upstream. Unless --quiet is given, each skip is reported so the
> user knows why their topic was kept.
Sounds good
> @@ -755,6 +757,18 @@ static int prune_merged_branches(int argc, const char **argv,
> if (!push || !strcmp(push, upstream))
> continue;
>
> + strbuf_addf(&key, "branch.%s.prunemerged", short_name);
> + if (!repo_config_get_bool(the_repository, key.buf, &opt_out) &&
> + !opt_out) {
> + if (!quiet)
> + fprintf(stderr,
> + _("Skipping '%s' (branch.%s.pruneMerged is false)\n"),
> + short_name, short_name);
> + strbuf_release(&key);
> + continue;
> + }
> + strbuf_release(&key);
As this is in a loop we don't want to free the buffer on each iteration,
only at the end. You should call strbuf_reset() just before
strbuf_addf() above and then move this call to strbuf_release() out of
the loop.
> +test_expect_success '--prune-merged honours branch.<name>.pruneMerged=false' '
> + test_when_finished "rm -rf pm-optout" &&
> + git clone pm-upstream pm-optout &&
> + git -C pm-optout remote add fork ../pm-fork &&
> + test_config -C pm-optout remote.pushDefault fork &&
> + test_config -C pm-optout push.default current &&
> + git -C pm-optout branch one one-commit &&
> + git -C pm-optout branch --set-upstream-to=origin/next one &&
> + git -C pm-optout branch two two-commit &&
> + git -C pm-optout branch --set-upstream-to=origin/next two &&
> + test_config -C pm-optout branch.one.pruneMerged false &&
> +
> + git -C pm-optout branch --prune-merged "origin/*" 2>err &&
> +
> + git -C pm-optout rev-parse --verify refs/heads/one &&
> + test_must_fail git -C pm-optout rev-parse --verify refs/heads/two &&
> + test_grep "Skipping .one." err
Do we really need a whole new setup to test this - can't we just add a
protected branch to an existing test?
Thanks
Phillip
> +
> +test_expect_success 'branch -d still deletes a pruneMerged=false branch' '
> + test_when_finished "rm -rf pm-optout-d" &&
> + git clone pm-upstream pm-optout-d &&
> + git -C pm-optout-d branch one one-commit &&
> + git -C pm-optout-d branch --set-upstream-to=origin/next one &&
> + test_config -C pm-optout-d branch.one.pruneMerged false &&
> +
> + git -C pm-optout-d branch -d one &&
> + test_must_fail git -C pm-optout-d rev-parse --verify refs/heads/one
> +'
> +
> test_done
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox