Git development
 help / color / mirror / Atom feed
* [PATCH v2 6/7] packfile,delta: drop the `cast_size_t_to_ulong()` wrappers
From: Johannes Schindelin via GitGitGadget @ 2026-06-15 11:52 UTC (permalink / raw)
  To: git
  Cc: Kristofer Karlsson, Patrick Steinhardt, Johannes Schindelin,
	Johannes Schindelin
In-Reply-To: <pull.2137.v2.git.1781524349.gitgitgadget@gmail.com>

From: Johannes Schindelin <johannes.schindelin@gmx.de>

When I started the transition from `unsigned long` to `size_t`, in the
interest of keeping the patches reviewable, I introduced these calls to
prevent data type narrowing from silently failing to handle large object
sizes. I also introduced `*_sz()` variants that would allow most of the
callers to keep using that `unsigned long` that the 90s kindly asked to
be returned.

After the preceding commits, the only places that called the narrow
wrappers either no longer exist or already use the `_sz` form
internally, so the wrappers just narrow values back through
`cast_size_t_to_ulong()` for no reason.

Drop them and rename the `_sz` variants back to the natural names.

Assisted-by: Opus 4.7
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 delta.h       | 14 ++------------
 packfile.c    | 28 ++++++++--------------------
 packfile.h    |  2 +-
 patch-delta.c |  4 ++--
 4 files changed, 13 insertions(+), 35 deletions(-)

diff --git a/delta.h b/delta.h
index bb149dc82b..eb5c6d2fdb 100644
--- a/delta.h
+++ b/delta.h
@@ -86,11 +86,8 @@ void *patch_delta(const void *src_buf, size_t src_size,
  * This must be called twice on the delta data buffer, first to get the
  * expected source buffer size, and again to get the target buffer size.
  */
-/*
- * Size_t variant that doesn't truncate - use for >4GB objects on Windows.
- */
-static inline size_t get_delta_hdr_size_sz(const unsigned char **datap,
-					   const unsigned char *top)
+static inline size_t get_delta_hdr_size(const unsigned char **datap,
+					const unsigned char *top)
 {
 	const unsigned char *data = *datap;
 	size_t cmd, size = 0;
@@ -104,11 +101,4 @@ static inline size_t get_delta_hdr_size_sz(const unsigned char **datap,
 	return size;
 }
 
-static inline unsigned long get_delta_hdr_size(const unsigned char **datap,
-					       const unsigned char *top)
-{
-	size_t size = get_delta_hdr_size_sz(datap, top);
-	return cast_size_t_to_ulong(size);
-}
-
 #endif
diff --git a/packfile.c b/packfile.c
index dab0a9b16d..c174982d10 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1164,11 +1164,12 @@ unsigned long unpack_object_header_buffer(const unsigned char *buf,
 }
 
 /*
- * Size_t variant for >4GB delta results on Windows.
+ * Read a delta object's header at curpos in p (already inflated as needed)
+ * and return the size of the result object (the post-application target).
  */
-static size_t get_size_from_delta_sz(struct packed_git *p,
-				     struct pack_window **w_curs,
-				     off_t curpos)
+size_t get_size_from_delta(struct packed_git *p,
+			   struct pack_window **w_curs,
+			   off_t curpos)
 {
 	const unsigned char *data;
 	unsigned char delta_head[20], *in;
@@ -1215,18 +1216,10 @@ static size_t get_size_from_delta_sz(struct packed_git *p,
 	data = delta_head;
 
 	/* ignore base size */
-	get_delta_hdr_size_sz(&data, delta_head+sizeof(delta_head));
+	get_delta_hdr_size(&data, delta_head+sizeof(delta_head));
 
 	/* Read the result size */
-	return get_delta_hdr_size_sz(&data, delta_head+sizeof(delta_head));
-}
-
-unsigned long get_size_from_delta(struct packed_git *p,
-				  struct pack_window **w_curs,
-				  off_t curpos)
-{
-	size_t size = get_size_from_delta_sz(p, w_curs, curpos);
-	return cast_size_t_to_ulong(size);
+	return get_delta_hdr_size(&data, delta_head+sizeof(delta_head));
 }
 
 int unpack_object_header(struct packed_git *p,
@@ -1634,12 +1627,7 @@ static int packed_object_info_with_index_pos(struct packed_git *p, off_t obj_off
 				ret = -1;
 				goto out;
 			}
-			/*
-			 * Use size_t variant to avoid die() on >4GB deltas.
-			 * oi->sizep is unsigned long, so truncation may occur,
-			 * but streaming code uses its own size_t tracking.
-			 */
-			size = get_size_from_delta_sz(p, &w_curs, tmp_pos);
+			size = get_size_from_delta(p, &w_curs, tmp_pos);
 			if (size == 0) {
 				ret = -1;
 				goto out;
diff --git a/packfile.h b/packfile.h
index 0b5ae3f9fc..bd4494906d 100644
--- a/packfile.h
+++ b/packfile.h
@@ -458,7 +458,7 @@ int is_pack_valid(struct packed_git *);
 void *unpack_entry(struct repository *r, struct packed_git *, off_t,
 		   enum object_type *, size_t *);
 unsigned long unpack_object_header_buffer(const unsigned char *buf, unsigned long len, enum object_type *type, size_t *sizep);
-unsigned long get_size_from_delta(struct packed_git *, struct pack_window **, off_t);
+size_t get_size_from_delta(struct packed_git *, struct pack_window **, off_t);
 int unpack_object_header(struct packed_git *, struct pack_window **, off_t *, size_t *);
 off_t get_delta_base(struct packed_git *p, struct pack_window **w_curs,
 		     off_t *curpos, enum object_type type,
diff --git a/patch-delta.c b/patch-delta.c
index 44cda97994..42199fa956 100644
--- a/patch-delta.c
+++ b/patch-delta.c
@@ -27,12 +27,12 @@ void *patch_delta(const void *src_buf, size_t src_size,
 	top = (const unsigned char *) delta_buf + delta_size;
 
 	/* make sure the orig file size matches what we expect */
-	size = get_delta_hdr_size_sz(&data, top);
+	size = get_delta_hdr_size(&data, top);
 	if (size != src_size)
 		return NULL;
 
 	/* now the result size */
-	size = get_delta_hdr_size_sz(&data, top);
+	size = get_delta_hdr_size(&data, top);
 	dst_buf = xmallocz(size);
 
 	out = dst_buf;
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH v2 5/7] pack-objects: use size_t for in-core object sizes
From: Johannes Schindelin via GitGitGadget @ 2026-06-15 11:52 UTC (permalink / raw)
  To: git
  Cc: Kristofer Karlsson, Patrick Steinhardt, Johannes Schindelin,
	Johannes Schindelin
In-Reply-To: <pull.2137.v2.git.1781524349.gitgitgadget@gmail.com>

From: Johannes Schindelin <johannes.schindelin@gmx.de>

`pack-objects` stores per-entry object sizes in either the 31-bit
`size_` member of the `struct object_entry` or, when the value does not
fit, the `pack->delta_size[]` spill array.  The accessors (`oe_size`,
`oe_delta_size`, `oe_get_size_slow`, `oe_size_*_than`) and the setters
(`oe_set_size`, `oe_set_delta_size`) used `unsigned long` for the spill
type, which on Windows means the spill silently caps at 4 GiB per entry.
That is what made `upload-pack` die with "object too large to read on
this platform" when serving the >4 GiB blob in `t5608` tests 5 and 6
when run with `GIT_TEST_CLONE_2GB`.

Widen them all to `size_t` (including `pack->delta_size`) and drop the
three `cast_size_t_to_ulong()` calls in `check_object()` that guarded
`in_pack_size`.  The two `SET_SIZE(entry, canonical_size)` calls in the
same function stay cast-free as before, since `canonical_size` is still
`unsigned long` until a later commit widens `object_info::sizep`.

Assisted-by: Opus 4.7
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/pack-objects.c | 35 ++++++++++++++++++-----------------
 pack-objects.h         |  2 +-
 2 files changed, 19 insertions(+), 18 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 56d1bb498d..961d547ef2 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -66,8 +66,8 @@ static inline struct object_entry *oe_delta(
 		return &pack->objects[e->delta_idx - 1];
 }
 
-static inline unsigned long oe_delta_size(struct packing_data *pack,
-					  const struct object_entry *e)
+static inline size_t oe_delta_size(struct packing_data *pack,
+				   const struct object_entry *e)
 {
 	if (e->delta_size_valid)
 		return e->delta_size_;
@@ -83,11 +83,11 @@ static inline unsigned long oe_delta_size(struct packing_data *pack,
 	return pack->delta_size[e - pack->objects];
 }
 
-unsigned long oe_get_size_slow(struct packing_data *pack,
-			       const struct object_entry *e);
+size_t oe_get_size_slow(struct packing_data *pack,
+			const struct object_entry *e);
 
-static inline unsigned long oe_size(struct packing_data *pack,
-				    const struct object_entry *e)
+static inline size_t oe_size(struct packing_data *pack,
+			     const struct object_entry *e)
 {
 	if (e->size_valid)
 		return e->size_;
@@ -145,7 +145,7 @@ static inline void oe_set_delta_sibling(struct packing_data *pack,
 
 static inline void oe_set_size(struct packing_data *pack,
 			       struct object_entry *e,
-			       unsigned long size)
+			       size_t size)
 {
 	if (size < pack->oe_size_limit) {
 		e->size_ = size;
@@ -159,7 +159,7 @@ static inline void oe_set_size(struct packing_data *pack,
 
 static inline void oe_set_delta_size(struct packing_data *pack,
 				     struct object_entry *e,
-				     unsigned long size)
+				     size_t size)
 {
 	if (size < pack->oe_delta_size_limit) {
 		e->delta_size_ = size;
@@ -496,7 +496,7 @@ static void copy_pack_data(struct hashfile *f,
 
 static inline int oe_size_greater_than(struct packing_data *pack,
 				       const struct object_entry *lhs,
-				       unsigned long rhs)
+				       size_t rhs)
 {
 	if (lhs->size_valid)
 		return lhs->size_ > rhs;
@@ -2279,7 +2279,7 @@ static void check_object(struct object_entry *entry, uint32_t object_index)
 		default:
 			/* Not a delta hence we've already got all we need. */
 			oe_set_type(entry, entry->in_pack_type);
-			SET_SIZE(entry, cast_size_t_to_ulong(in_pack_size));
+			SET_SIZE(entry, in_pack_size);
 			entry->in_pack_header_size = used;
 			if (oe_type(entry) < OBJ_COMMIT || oe_type(entry) > OBJ_BLOB)
 				goto give_up;
@@ -2333,8 +2333,8 @@ static void check_object(struct object_entry *entry, uint32_t object_index)
 		if (have_base &&
 		    can_reuse_delta(&base_ref, entry, &base_entry)) {
 			oe_set_type(entry, entry->in_pack_type);
-			SET_SIZE(entry, cast_size_t_to_ulong(in_pack_size)); /* delta size */
-			SET_DELTA_SIZE(entry, cast_size_t_to_ulong(in_pack_size));
+			SET_SIZE(entry, in_pack_size); /* delta size */
+			SET_DELTA_SIZE(entry, in_pack_size);
 
 			if (base_entry) {
 				SET_DELTA(entry, base_entry);
@@ -2357,7 +2357,8 @@ static void check_object(struct object_entry *entry, uint32_t object_index)
 			 * object size from the delta header.
 			 */
 			delta_pos = entry->in_pack_offset + entry->in_pack_header_size;
-			canonical_size = get_size_from_delta(p, &w_curs, delta_pos);
+			canonical_size = get_size_from_delta(p, &w_curs,
+							     delta_pos);
 			if (canonical_size == 0)
 				goto give_up;
 			SET_SIZE(entry, canonical_size);
@@ -2713,7 +2714,7 @@ static pthread_mutex_t progress_mutex;
 
 static inline int oe_size_less_than(struct packing_data *pack,
 				    const struct object_entry *lhs,
-				    unsigned long rhs)
+				    size_t rhs)
 {
 	if (lhs->size_valid)
 		return lhs->size_ < rhs;
@@ -2736,8 +2737,8 @@ static inline void oe_set_tree_depth(struct packing_data *pack,
  * reconstruction (so non-deltas are true object sizes, but deltas
  * return the size of the delta data).
  */
-unsigned long oe_get_size_slow(struct packing_data *pack,
-			       const struct object_entry *e)
+size_t oe_get_size_slow(struct packing_data *pack,
+			const struct object_entry *e)
 {
 	struct packed_git *p;
 	struct pack_window *w_curs;
@@ -2771,7 +2772,7 @@ unsigned long oe_get_size_slow(struct packing_data *pack,
 
 	unuse_pack(&w_curs);
 	packing_data_unlock(&to_pack);
-	return cast_size_t_to_ulong(size);
+	return size;
 }
 
 static int try_delta(struct unpacked *trg, struct unpacked *src,
diff --git a/pack-objects.h b/pack-objects.h
index 83299d4732..e97e84ddcb 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -141,7 +141,7 @@ struct packing_data {
 	uint32_t index_size;
 
 	unsigned int *in_pack_pos;
-	unsigned long *delta_size;
+	size_t *delta_size;
 
 	/*
 	 * Only one of these can be non-NULL and they have different
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH v2 4/7] packfile: widen unpack_entry()'s size out-parameter to size_t
From: Johannes Schindelin via GitGitGadget @ 2026-06-15 11:52 UTC (permalink / raw)
  To: git
  Cc: Kristofer Karlsson, Patrick Steinhardt, Johannes Schindelin,
	Johannes Schindelin
In-Reply-To: <pull.2137.v2.git.1781524349.gitgitgadget@gmail.com>

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The topic `js/objects-larger-than-4gb-on-windows` widened the streaming,
index-pack and unpack-objects paths to `size_t` but deliberately stopped
at the in-memory `unpack_entry()` cascade, which still hands back the
unpacked size through `unsigned long *`.  On Windows that boundary
truncates above 4 GiB because that data type is only 32 bits wide on
that platform.

Widen the code path. Except `packed_object_info_with_index_pos()`: It
cannot yet pass `oi->sizep` directly because the field is still
`unsigned long *`; bridge it with a `size_t` temporary that narrows
back, and let a later commit drop the bridge once the field is wide
too. `gfi_unpack_entry()` keeps its narrow signature because fast-import
tracks sizes through `unsigned long` everywhere it crosses subsystem
boundaries, keeping its signature allows the scope of this commit to be
somewhat reasonable, still.

Assisted-by: Opus 4.7
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/fast-import.c |  7 ++++++-
 pack-check.c          |  5 ++---
 packfile.c            | 28 +++++++++++++++++-----------
 packfile.h            |  3 ++-
 4 files changed, 27 insertions(+), 16 deletions(-)

diff --git a/builtin/fast-import.c b/builtin/fast-import.c
index 82bc6dcc00..3dff898c43 100644
--- a/builtin/fast-import.c
+++ b/builtin/fast-import.c
@@ -1239,6 +1239,8 @@ static void *gfi_unpack_entry(
 	unsigned long *sizep)
 {
 	enum object_type type;
+	size_t size_st = 0;
+	void *data;
 	struct packed_git *p = all_packs[oe->pack_id];
 	if (p == pack_data && p->pack_size < (pack_size + the_hash_algo->rawsz)) {
 		/* The object is stored in the packfile we are writing to
@@ -1260,7 +1262,10 @@ static void *gfi_unpack_entry(
 		 */
 		p->pack_size = pack_size + the_hash_algo->rawsz;
 	}
-	return unpack_entry(the_repository, p, oe->idx.offset, &type, sizep);
+	data = unpack_entry(the_repository, p, oe->idx.offset, &type, &size_st);
+	if (sizep)
+		*sizep = cast_size_t_to_ulong(size_st);
+	return data;
 }
 
 static void load_tree(struct tree_entry *root)
diff --git a/pack-check.c b/pack-check.c
index 2792f34d25..5adfb3f272 100644
--- a/pack-check.c
+++ b/pack-check.c
@@ -143,9 +143,8 @@ static int verify_packfile(struct repository *r,
 			data = NULL;
 			data_valid = 0;
 		} else {
-			unsigned long sz;
-			data = unpack_entry(r, p, entries[i].offset, &type, &sz);
-			size = sz;
+			data = unpack_entry(r, p, entries[i].offset, &type,
+					    &size);
 			data_valid = 1;
 		}
 
diff --git a/packfile.c b/packfile.c
index e202f48837..dab0a9b16d 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1454,7 +1454,7 @@ struct delta_base_cache_entry {
 	struct delta_base_cache_key key;
 	struct list_head lru;
 	void *data;
-	unsigned long size;
+	size_t size;
 	enum object_type type;
 };
 
@@ -1525,7 +1525,7 @@ static void detach_delta_base_cache_entry(struct delta_base_cache_entry *ent)
 }
 
 static void *cache_or_unpack_entry(struct repository *r, struct packed_git *p,
-				   off_t base_offset, unsigned long *base_size,
+				   off_t base_offset, size_t *base_size,
 				   enum object_type *type)
 {
 	struct delta_base_cache_entry *ent;
@@ -1558,8 +1558,8 @@ void clear_delta_base_cache(void)
 }
 
 static void add_delta_base_cache(struct packed_git *p, off_t base_offset,
-				 void *base, unsigned long base_size,
-				 unsigned long delta_base_cache_limit,
+				 void *base, size_t base_size,
+				 size_t delta_base_cache_limit,
 				 enum object_type type)
 {
 	struct delta_base_cache_entry *ent;
@@ -1614,10 +1614,13 @@ static int packed_object_info_with_index_pos(struct packed_git *p, off_t obj_off
 	 * a "real" type later if the caller is interested.
 	 */
 	if (oi->contentp) {
-		*oi->contentp = cache_or_unpack_entry(p->repo, p, obj_offset, oi->sizep,
-						      &type);
+		size_t size_st = 0;
+		*oi->contentp = cache_or_unpack_entry(p->repo, p, obj_offset,
+						      &size_st, &type);
 		if (!*oi->contentp)
 			type = OBJ_BAD;
+		else if (oi->sizep)
+			*oi->sizep = cast_size_t_to_ulong(size_st);
 	} else if (oi->sizep || oi->typep || oi->delta_base_oid) {
 		type = unpack_object_header(p, &w_curs, &curpos, &size);
 	}
@@ -1735,7 +1738,7 @@ int packed_object_info(struct packed_git *p, off_t obj_offset,
 static void *unpack_compressed_entry(struct packed_git *p,
 				    struct pack_window **w_curs,
 				    off_t curpos,
-				    unsigned long size)
+				    size_t size)
 {
 	int st;
 	git_zstream stream;
@@ -1790,11 +1793,11 @@ int do_check_packed_object_crc;
 struct unpack_entry_stack_ent {
 	off_t obj_offset;
 	off_t curpos;
-	unsigned long size;
+	size_t size;
 };
 
 void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset,
-		   enum object_type *final_type, unsigned long *final_size)
+		   enum object_type *final_type, size_t *final_size)
 {
 	struct pack_window *w_curs = NULL;
 	off_t curpos = obj_offset;
@@ -1911,7 +1914,7 @@ void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset,
 		void *delta_data;
 		void *base = data;
 		void *external_base = NULL;
-		unsigned long delta_size, base_size = size;
+		size_t delta_size, base_size = size;
 		int i;
 		off_t base_obj_offset = obj_offset;
 
@@ -1928,6 +1931,7 @@ void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset,
 			struct object_id base_oid;
 			if (!(offset_to_pack_pos(p, obj_offset, &pos))) {
 				struct object_info oi = OBJECT_INFO_INIT;
+				unsigned long bsz_ul = 0;
 
 				nth_packed_object_id(&base_oid, p,
 						     pack_pos_to_index(p, pos));
@@ -1938,11 +1942,13 @@ void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset,
 				mark_bad_packed_object(p, &base_oid);
 
 				oi.typep = &type;
-				oi.sizep = &base_size;
+				oi.sizep = &bsz_ul;
 				oi.contentp = &base;
 				if (odb_read_object_info_extended(r->objects, &base_oid,
 								  &oi, 0) < 0)
 					base = NULL;
+				else
+					base_size = bsz_ul;
 
 				external_base = base;
 			}
diff --git a/packfile.h b/packfile.h
index 49d6bdecf6..0b5ae3f9fc 100644
--- a/packfile.h
+++ b/packfile.h
@@ -455,7 +455,8 @@ off_t nth_packed_object_offset(const struct packed_git *, uint32_t n);
 off_t find_pack_entry_one(const struct object_id *oid, struct packed_git *);
 
 int is_pack_valid(struct packed_git *);
-void *unpack_entry(struct repository *r, struct packed_git *, off_t, enum object_type *, unsigned long *);
+void *unpack_entry(struct repository *r, struct packed_git *, off_t,
+		   enum object_type *, size_t *);
 unsigned long unpack_object_header_buffer(const unsigned char *buf, unsigned long len, enum object_type *type, size_t *sizep);
 unsigned long get_size_from_delta(struct packed_git *, struct pack_window **, off_t);
 int unpack_object_header(struct packed_git *, struct pack_window **, off_t *, size_t *);
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH v2 3/7] pack-objects(check_pack_inflate()): use size_t instead of unsigned long
From: Johannes Schindelin via GitGitGadget @ 2026-06-15 11:52 UTC (permalink / raw)
  To: git
  Cc: Kristofer Karlsson, Patrick Steinhardt, Johannes Schindelin,
	Johannes Schindelin
In-Reply-To: <pull.2137.v2.git.1781524349.gitgitgadget@gmail.com>

From: Johannes Schindelin <johannes.schindelin@gmx.de>

`write_reuse_object()` learned to track its packed-object size as
`size_t` in 606c192380 (odb, packfile: use size_t for streaming
object sizes, 2026-05-08), but the comparison sink it feeds,
`check_pack_inflate()`, still takes the expected decompressed size
as `unsigned long`. The call site bridges the mismatch with
`cast_size_t_to_ulong()`, which on Windows turns a >4 GiB object
into an immediate die().

That function only uses `expect` once: as the right-hand side of a
`stream.total_out == expect` equality test against zlib's counter.
zlib's own `total_out` counter is `uLong` and is therefore still
32-bit-bound on Windows. Widening `expect` to `size_t` cannot fix that,
but it is a strict improvement nonetheless: instead of dying outright,
an oversized object now simply makes the equality fail and lets
`write_reuse_object()` fall back to `write_no_reuse_object()`, which
decompresses and re-deflates the content (and which the larger
pack-objects widening series targets separately).

Drop the `cast_size_t_to_ulong()` shim at the call site now that
the receiving parameter speaks the same type as `entry_size`.

Assisted-by: Opus 4.7
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/pack-objects.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 50675481e1..56d1bb498d 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -453,7 +453,7 @@ static int check_pack_inflate(struct packed_git *p,
 		struct pack_window **w_curs,
 		off_t offset,
 		off_t len,
-		unsigned long expect)
+		size_t expect)
 {
 	git_zstream stream;
 	unsigned char fakebuf[4096], *in;
@@ -671,8 +671,7 @@ static off_t write_reuse_object(struct hashfile *f, struct object_entry *entry,
 	datalen -= entry->in_pack_header_size;
 
 	if (!pack_to_stdout && p->index_version == 1 &&
-	    check_pack_inflate(p, &w_curs, offset, datalen,
-			       cast_size_t_to_ulong(entry_size))) {
+	    check_pack_inflate(p, &w_curs, offset, datalen, entry_size)) {
 		error(_("corrupt packed object for %s"),
 		      oid_to_hex(&entry->idx.oid));
 		unuse_pack(&w_curs);
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH v2 2/7] patch-delta: use size_t for sizes
From: Johannes Schindelin via GitGitGadget @ 2026-06-15 11:52 UTC (permalink / raw)
  To: git
  Cc: Kristofer Karlsson, Patrick Steinhardt, Johannes Schindelin,
	Johannes Schindelin
In-Reply-To: <pull.2137.v2.git.1781524349.gitgitgadget@gmail.com>

From: Johannes Schindelin <johannes.schindelin@gmx.de>

`patch_delta()` takes the source and delta sizes by value and writes
back the reconstructed target size through an `unsigned long *`.  That
datatype cannot represent a value that exceeds 4 GiB on systems where
`unsigned long` is 32-bit (notably 64-bit Windows builds), though, even
though the delta encoding itself, the on-disk layout, and the in-memory
buffers happily carry such sizes. A `size_t` companion to
`get_delta_hdr_size()`, `get_delta_hdr_size_sz()`, was introduced in
17fa077596 (delta, packfile: use size_t for delta header sizes,
2026-05-08) precisely so that `patch_delta()` could be widened without
changing the on-the-wire decoding helper's signature.

Widen `patch_delta()`'s three size parameters to `size_t` and switch
its internal use of `get_delta_hdr_size()` to the `_sz` variant.
Then propagate the wider type through the callers.

Assisted-by: Opus 4.7
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 apply.c                  |  2 +-
 builtin/index-pack.c     |  4 ++--
 builtin/unpack-objects.c |  2 +-
 delta.h                  |  6 +++---
 packfile.c               |  4 +---
 patch-delta.c            | 12 ++++++------
 t/helper/test-delta.c    | 10 ++++++----
 7 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/apply.c b/apply.c
index 249248d4f2..3cf544e9a9 100644
--- a/apply.c
+++ b/apply.c
@@ -3232,7 +3232,7 @@ static int apply_binary_fragment(struct apply_state *state,
 				 struct patch *patch)
 {
 	struct fragment *fragment = patch->fragments;
-	unsigned long len;
+	size_t len;
 	void *dst;
 
 	if (!fragment)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index cf0bd8280d..3c4474e681 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -71,7 +71,7 @@ struct base_data {
 	/* Not initialized by make_base(). */
 	struct list_head list;
 	void *data;
-	unsigned long size;
+	size_t size;
 };
 
 /*
@@ -1048,7 +1048,7 @@ static struct base_data *resolve_delta(struct object_entry *delta_obj,
 {
 	void *delta_data, *result_data;
 	struct base_data *result;
-	unsigned long result_size;
+	size_t result_size;
 
 	if (show_stat) {
 		int i = delta_obj - objects;
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index 59e9b8711e..e7a50c493c 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -314,7 +314,7 @@ static void resolve_delta(unsigned nr, enum object_type type,
 			  void *delta, unsigned long delta_size)
 {
 	void *result;
-	unsigned long result_size;
+	size_t result_size;
 
 	result = patch_delta(base, base_size,
 			     delta, delta_size,
diff --git a/delta.h b/delta.h
index fad68cfc45..bb149dc82b 100644
--- a/delta.h
+++ b/delta.h
@@ -75,9 +75,9 @@ diff_delta(const void *src_buf, unsigned long src_bufsize,
  * *trg_bufsize is updated with its size.  On failure a NULL pointer is
  * returned.  The returned buffer must be freed by the caller.
  */
-void *patch_delta(const void *src_buf, unsigned long src_size,
-		  const void *delta_buf, unsigned long delta_size,
-		  unsigned long *dst_size);
+void *patch_delta(const void *src_buf, size_t src_size,
+		  const void *delta_buf, size_t delta_size,
+		  size_t *dst_size);
 
 /* the smallest possible delta size is 4 bytes */
 #define DELTA_SIZE_MIN	4
diff --git a/packfile.c b/packfile.c
index 89366abfe3..e202f48837 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1964,10 +1964,8 @@ void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset,
 			      (uintmax_t)curpos, p->pack_name);
 			data = NULL;
 		} else {
-			unsigned long sz;
 			data = patch_delta(base, base_size, delta_data,
-					   delta_size, &sz);
-			size = sz;
+					   delta_size, &size);
 
 			/*
 			 * We could not apply the delta; warn the user, but
diff --git a/patch-delta.c b/patch-delta.c
index b5c8594db6..44cda97994 100644
--- a/patch-delta.c
+++ b/patch-delta.c
@@ -12,13 +12,13 @@
 #include "git-compat-util.h"
 #include "delta.h"
 
-void *patch_delta(const void *src_buf, unsigned long src_size,
-		  const void *delta_buf, unsigned long delta_size,
-		  unsigned long *dst_size)
+void *patch_delta(const void *src_buf, size_t src_size,
+		  const void *delta_buf, size_t delta_size,
+		  size_t *dst_size)
 {
 	const unsigned char *data, *top;
 	unsigned char *dst_buf, *out, cmd;
-	unsigned long size;
+	size_t size;
 
 	if (delta_size < DELTA_SIZE_MIN)
 		return NULL;
@@ -27,12 +27,12 @@ void *patch_delta(const void *src_buf, unsigned long src_size,
 	top = (const unsigned char *) delta_buf + delta_size;
 
 	/* make sure the orig file size matches what we expect */
-	size = get_delta_hdr_size(&data, top);
+	size = get_delta_hdr_size_sz(&data, top);
 	if (size != src_size)
 		return NULL;
 
 	/* now the result size */
-	size = get_delta_hdr_size(&data, top);
+	size = get_delta_hdr_size_sz(&data, top);
 	dst_buf = xmallocz(size);
 
 	out = dst_buf;
diff --git a/t/helper/test-delta.c b/t/helper/test-delta.c
index 52ea00c937..8223a60229 100644
--- a/t/helper/test-delta.c
+++ b/t/helper/test-delta.c
@@ -21,7 +21,7 @@ int cmd__delta(int argc, const char **argv)
 	int fd;
 	struct strbuf from = STRBUF_INIT, data = STRBUF_INIT;
 	char *out_buf;
-	unsigned long out_size;
+	size_t out_size;
 
 	if (argc != 5 || (strcmp(argv[1], "-d") && strcmp(argv[1], "-p")))
 		usage(usage_str);
@@ -31,11 +31,13 @@ int cmd__delta(int argc, const char **argv)
 	if (strbuf_read_file(&data, argv[3], 0) < 0)
 		die_errno("unable to read '%s'", argv[3]);
 
-	if (argv[1][1] == 'd')
+	if (argv[1][1] == 'd') {
+		unsigned long delta_size;
 		out_buf = diff_delta(from.buf, from.len,
 				     data.buf, data.len,
-				     &out_size, 0);
-	else
+				     &delta_size, 0);
+		out_size = delta_size;
+	} else
 		out_buf = patch_delta(from.buf, from.len,
 				      data.buf, data.len,
 				      &out_size);
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH v2 1/7] compat/msvc: use _chsize_s for ftruncate
From: Johannes Schindelin via GitGitGadget @ 2026-06-15 11:52 UTC (permalink / raw)
  To: git
  Cc: Kristofer Karlsson, Patrick Steinhardt, Johannes Schindelin,
	Johannes Schindelin
In-Reply-To: <pull.2137.v2.git.1781524349.gitgitgadget@gmail.com>

From: Johannes Schindelin <johannes.schindelin@gmx.de>

On Windows, `unsigned long` and `long` are 32 bits even on 64-bit
builds. The MSVC compatibility header has shimmed `ftruncate()` with

	#define ftruncate _chsize

ever since `compat/msvc-posix.h` was introduced. `_chsize()` takes a
32-bit `long` for the new length, which silently truncates files (and
the requested size) to 2 GiB. That is enough to make t7508 test 126
"git add fails gracefully with 4 GiB and 8 GiB files" fail under
MSVC: `test-tool truncate` creates a sparse 4 GiB or 8 GiB file via
the shimmed `ftruncate()`, and the test never gets off the ground.

`_chsize_s()` is the modern replacement, accepts a 64-bit `__int64`
length, and is the only sensible target on Windows. The catch is that
it does not follow the POSIX `-1` + `errno` convention: it returns
`0` on success and an errno value (a small positive integer) on
failure. A plain `#define ftruncate _chsize_s` would therefore
silently break callers that test the return value as `< 0` or against
`-1`, of which there are several: `http.c`, `parallel-checkout.c`,
and `t/helper/test-truncate.c` among them.

Introduce a `static inline` wrapper that calls `_chsize_s()`, copies
its errno return into `errno`, and translates the result to the
familiar `-1` / `0` convention, then point `ftruncate` at the
wrapper. Place the wrapper after `#include "mingw-posix.h"` so the
`off_t` parameter resolves to the already-widened `off64_t` rather
than the 32-bit `_off_t` from `compat/vcbuild/include/unistd.h`.

MinGW is unaffected: its `ftruncate()` already takes `off_t` and
routes through `ftruncate64()` when `_FILE_OFFSET_BITS=64`, which is
the default in our build.

Assisted-by: Opus 4.7
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/msvc-posix.h | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/compat/msvc-posix.h b/compat/msvc-posix.h
index c500b8b4aa..7ce39b8d3f 100644
--- a/compat/msvc-posix.h
+++ b/compat/msvc-posix.h
@@ -16,7 +16,6 @@
 #define __attribute__(x)
 #define strcasecmp   _stricmp
 #define strncasecmp  _strnicmp
-#define ftruncate    _chsize
 #define strtoull     _strtoui64
 #define strtoll      _strtoi64
 
@@ -30,4 +29,27 @@ typedef int sigset_t;
 
 #include "mingw-posix.h"
 
+/*
+ * MSVC's `_chsize()` takes a 32-bit `long` and silently truncates files
+ * to 2 GiB. `_chsize_s()` accepts a 64-bit length but returns 0 on
+ * success or an errno value on failure, rather than the -1/errno
+ * convention POSIX `ftruncate()` callers expect. Wrap it so callers
+ * that test the return value as `< 0` or against `-1` keep working.
+ *
+ * Note: this declaration must follow `#include "mingw-posix.h"` so
+ * `off_t` resolves to `off64_t` and the parameter type matches the
+ * underlying `_chsize_s()` width.
+ */
+static inline int msvc_ftruncate(int fd, off_t length)
+{
+	int err = _chsize_s(fd, length);
+
+	if (err) {
+		errno = err;
+		return -1;
+	}
+	return 0;
+}
+#define ftruncate msvc_ftruncate
+
 #endif /* COMPAT_MSVC_POSIX_H */
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH v2 0/7] More work supporting objects larger than 4GB on Windows
From: Johannes Schindelin via GitGitGadget @ 2026-06-15 11:52 UTC (permalink / raw)
  To: git; +Cc: Kristofer Karlsson, Patrick Steinhardt, Johannes Schindelin
In-Reply-To: <pull.2137.git.1780570272.gitgitgadget@gmail.com>

This patch series tries to address the problems pointed out by the expensive
tests that now run in CI: t5608 and t7508 verify various aspects about
objects larger than 4GB, which Git does not currently handle correctly when
run on a platform where size_t is 64-bit and unsigned long is 32-bit.

Changes vs v1:

 * Rebased onto master, which merged ps/odb-source-loose (with which these
   patches previously conflicted rather badly).
 * Removed superfluous size_t s variables (thanks, Patrick!).

Johannes Schindelin (7):
  compat/msvc: use _chsize_s for ftruncate
  patch-delta: use size_t for sizes
  pack-objects(check_pack_inflate()): use size_t instead of unsigned
    long
  packfile: widen unpack_entry()'s size out-parameter to size_t
  pack-objects: use size_t for in-core object sizes
  packfile,delta: drop the `cast_size_t_to_ulong()` wrappers
  odb: use size_t for object_info.sizep and the size APIs

 apply.c                       |  8 ++--
 archive.c                     |  4 +-
 attr.c                        |  2 +-
 bisect.c                      |  2 +-
 blame.c                       | 15 +++++--
 builtin/cat-file.c            | 61 ++++++++++++++---------------
 builtin/difftool.c            |  2 +-
 builtin/fast-export.c         |  7 +++-
 builtin/fast-import.c         | 29 ++++++++++----
 builtin/fsck.c                |  2 +-
 builtin/grep.c                | 12 +++---
 builtin/index-pack.c          | 10 ++---
 builtin/log.c                 |  2 +-
 builtin/ls-files.c            |  2 +-
 builtin/ls-tree.c             |  4 +-
 builtin/merge-tree.c          |  6 +--
 builtin/mktag.c               |  2 +-
 builtin/notes.c               |  6 +--
 builtin/pack-objects.c        | 73 +++++++++++++++++++++--------------
 builtin/repo.c                |  4 +-
 builtin/tag.c                 |  4 +-
 builtin/unpack-file.c         |  2 +-
 builtin/unpack-objects.c      |  8 ++--
 bundle.c                      |  2 +-
 combine-diff.c                |  4 +-
 commit.c                      | 10 ++---
 compat/msvc-posix.h           | 24 +++++++++++-
 config.c                      |  2 +-
 delta.h                       | 20 +++-------
 diff.c                        |  5 ++-
 dir.c                         |  2 +-
 entry.c                       |  4 +-
 fmt-merge-msg.c               |  4 +-
 fsck.c                        |  2 +-
 grep.c                        |  4 +-
 http-push.c                   |  2 +-
 list-objects-filter.c         |  2 +-
 mailmap.c                     |  2 +-
 match-trees.c                 |  4 +-
 merge-blobs.c                 |  6 +--
 merge-blobs.h                 |  2 +-
 merge-ort.c                   |  2 +-
 notes-cache.c                 |  2 +-
 notes-merge.c                 |  2 +-
 notes.c                       |  8 ++--
 object-file.c                 |  6 +--
 object.c                      |  2 +-
 odb.c                         | 12 +++---
 odb.h                         | 10 ++---
 odb/source-loose.c            | 12 +-----
 odb/streaming.c               | 13 +------
 pack-bitmap.c                 |  4 +-
 pack-check.c                  |  5 +--
 pack-objects.h                |  2 +-
 packfile.c                    | 54 ++++++++++----------------
 packfile.h                    |  5 ++-
 patch-delta.c                 |  8 ++--
 path-walk.c                   |  2 +-
 protocol-caps.c               |  5 ++-
 read-cache.c                  |  6 +--
 ref-filter.c                  |  2 +-
 reflog.c                      |  2 +-
 rerere.c                      |  2 +-
 submodule-config.c            |  2 +-
 t/helper/test-delta.c         | 10 +++--
 t/helper/test-pack-deltas.c   |  3 +-
 t/helper/test-partial-clone.c |  2 +-
 t/unit-tests/u-odb-inmemory.c |  2 +-
 tag.c                         |  4 +-
 tree-walk.c                   | 10 +++--
 tree.c                        |  2 +-
 xdiff-interface.c             |  2 +-
 72 files changed, 300 insertions(+), 271 deletions(-)


base-commit: ea97ad8d017de0c9037451a78008a0fd60abea0c
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2137%2Fdscho%2Fobjects-larger-than-4gb-on-windows-pt2-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2137/dscho/objects-larger-than-4gb-on-windows-pt2-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/2137

Range-diff vs v1:

 1:  de9fc5c455 = 1:  531bca775c compat/msvc: use _chsize_s for ftruncate
 2:  1fd7646ca1 = 2:  66a642c39e patch-delta: use size_t for sizes
 3:  ddb75326cd = 3:  271a5299e3 pack-objects(check_pack_inflate()): use size_t instead of unsigned long
 4:  bdebc36f21 = 4:  5c329535df packfile: widen unpack_entry()'s size out-parameter to size_t
 5:  68750ba2d1 = 5:  01b9209b26 pack-objects: use size_t for in-core object sizes
 6:  460d733fee = 6:  12c142f8ab packfile,delta: drop the `cast_size_t_to_ulong()` wrappers
 7:  f3aeae983a ! 7:  37d030d867 odb: use size_t for object_info.sizep and the size APIs
     @@ builtin/cat-file.c: static int cat_one_file(int opt, const char *exp_type, const
       	struct object_info oi = OBJECT_INFO_INIT;
       	unsigned flags = OBJECT_INFO_LOOKUP_REPLACE;
      @@ builtin/cat-file.c: static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
     - 		if (use_mailmap && (type == OBJ_COMMIT || type == OBJ_TAG)) {
     - 			size_t s = size;
     - 			buf = replace_idents_using_mailmap(buf, &s);
     + 		if (odb_read_object_info_extended(the_repository->objects, &oid, &oi, flags) < 0)
     + 			die("git cat-file: could not get object info");
     + 
     +-		if (use_mailmap && (type == OBJ_COMMIT || type == OBJ_TAG)) {
     +-			size_t s = size;
     +-			buf = replace_idents_using_mailmap(buf, &s);
      -			size = cast_size_t_to_ulong(s);
     -+			size = s;
     - 		}
     +-		}
     ++		if (use_mailmap && (type == OBJ_COMMIT || type == OBJ_TAG))
     ++			buf = replace_idents_using_mailmap(buf, &size);
       
       		printf("%"PRIuMAX"\n", (uintmax_t)size);
     + 		ret = 0;
      @@ builtin/cat-file.c: static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
       		break;
       
     @@ builtin/cat-file.c: static int cat_one_file(int opt, const char *exp_type, const
       
       	case 'p':
      @@ builtin/cat-file.c: static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
     - 		if (use_mailmap) {
     - 			size_t s = size;
     - 			buf = replace_idents_using_mailmap(buf, &s);
     + 		if (!buf)
     + 			die("Cannot read object %s", obj_name);
     + 
     +-		if (use_mailmap) {
     +-			size_t s = size;
     +-			buf = replace_idents_using_mailmap(buf, &s);
      -			size = cast_size_t_to_ulong(s);
     -+			size = s;
     - 		}
     +-		}
     ++		if (use_mailmap)
     ++			buf = replace_idents_using_mailmap(buf, &size);
       
       		/* otherwise just spit out the data */
     + 		break;
      @@ builtin/cat-file.c: static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
     - 		if (use_mailmap) {
     - 			size_t s = size;
     - 			buf = replace_idents_using_mailmap(buf, &s);
     + 		buf = odb_read_object_peeled(the_repository->objects, &oid,
     + 					     exp_type_id, &size, NULL);
     + 
     +-		if (use_mailmap) {
     +-			size_t s = size;
     +-			buf = replace_idents_using_mailmap(buf, &s);
      -			size = cast_size_t_to_ulong(s);
     -+			size = s;
     - 		}
     +-		}
     ++		if (use_mailmap)
     ++			buf = replace_idents_using_mailmap(buf, &size);
       		break;
       	}
     + 	default:
      @@ builtin/cat-file.c: cleanup:
       struct expand_data {
       	struct object_id oid;
     @@ builtin/cat-file.c: static void print_object_or_die(struct batch_options *opt, s
       
       		contents = odb_read_object(the_repository->objects, oid,
      @@ builtin/cat-file.c: static void print_object_or_die(struct batch_options *opt, struct expand_data *d
     - 		if (use_mailmap) {
     - 			size_t s = size;
     - 			contents = replace_idents_using_mailmap(contents, &s);
     + 		if (!contents)
     + 			die("object %s disappeared", oid_to_hex(oid));
     + 
     +-		if (use_mailmap) {
     +-			size_t s = size;
     +-			contents = replace_idents_using_mailmap(contents, &s);
      -			size = cast_size_t_to_ulong(s);
     -+			size = s;
     - 		}
     +-		}
     ++		if (use_mailmap)
     ++			contents = replace_idents_using_mailmap(contents, &size);
       
       		if (type != data->type)
     + 			die("object %s changed type!?", oid_to_hex(oid));
      @@ builtin/cat-file.c: static void batch_object_write(const char *obj_name,
     + 		}
     + 
     + 		if (use_mailmap && (data->type == OBJ_COMMIT || data->type == OBJ_TAG)) {
     +-			size_t s = data->size;
     + 			char *buf = NULL;
     + 
     + 			buf = odb_read_object(the_repository->objects, &data->oid,
     + 					      &data->type, &data->size);
       			if (!buf)
       				die(_("unable to read %s"), oid_to_hex(&data->oid));
     - 			buf = replace_idents_using_mailmap(buf, &s);
     +-			buf = replace_idents_using_mailmap(buf, &s);
      -			data->size = cast_size_t_to_ulong(s);
     -+			data->size = s;
     ++			buf = replace_idents_using_mailmap(buf, &data->size);
       
       			free(buf);
       		}
     @@ builtin/log.c: static int show_blob_object(const struct object_id *oid, struct r
      
       ## builtin/ls-files.c ##
      @@ builtin/ls-files.c: static void expand_objectsize(struct repository *repo, struct strbuf *line,
     - 			      const enum object_type type, unsigned int padded)
     - {
     + 	size_t len;
     + 
       	if (type == OBJ_BLOB) {
      -		unsigned long size;
      +		size_t size;
     @@ builtin/ls-files.c: static void expand_objectsize(struct repository *repo, struc
      
       ## builtin/ls-tree.c ##
      @@ builtin/ls-tree.c: static void expand_objectsize(struct strbuf *line, const struct object_id *oid,
     - 			      const enum object_type type, unsigned int padded)
     - {
     + 	size_t len;
     + 
       	if (type == OBJ_BLOB) {
      -		unsigned long size;
      +		size_t size;
     @@ notes.c: static void format_note(struct notes_tree *t, const struct object_id *o
       	if (!t)
      
       ## object-file.c ##
     -@@ object-file.c: static int parse_loose_header(const char *hdr, struct object_info *oi)
     +@@ object-file.c: int parse_loose_header(const char *hdr, struct object_info *oi)
       	}
       
       	if (oi->sizep)
     @@ object-file.c: static int parse_loose_header(const char *hdr, struct object_info
       
       	/*
       	 * The length must be followed by a zero byte
     -@@ object-file.c: static int read_object_info_from_path(struct odb_source *source,
     - 	void *map = NULL;
     - 	git_zstream stream, *stream_to_end = NULL;
     - 	char hdr[MAX_HEADER_LEN];
     --	unsigned long size_scratch;
     -+	size_t size_scratch;
     - 	enum object_type type_scratch;
     - 	struct stat st;
     - 
      @@ object-file.c: int force_object_loose(struct odb_source *source,
     - {
     + 	struct odb_source_files *files = odb_source_files_downcast(source);
       	const struct git_hash_algo *compat = source->odb->repo->compat_hash_algo;
       	void *buf;
      -	unsigned long len;
     @@ object-file.c: int read_loose_object(struct repository *repo,
       
       	fd = git_open(path);
       	if (fd >= 0)
     -@@ object-file.c: int odb_source_loose_read_object_stream(struct odb_read_stream **out,
     - 	struct object_info oi = OBJECT_INFO_INIT;
     - 	struct odb_loose_read_stream *st;
     - 	unsigned long mapsize;
     --	unsigned long size_ul;
     - 	void *mapped;
     - 
     - 	mapped = odb_source_loose_map_object(source, oid, &mapsize);
     -@@ object-file.c: int odb_source_loose_read_object_stream(struct odb_read_stream **out,
     - 		goto error;
     - 	}
     - 
     --	/*
     --	 * object_info.sizep is unsigned long* (32-bit on Windows), but
     --	 * st->base.size is size_t (64-bit). Use temporary variable.
     --	 * Note: loose objects >4GB would still truncate here, but such
     --	 * large loose objects are uncommon (they'd normally be packed).
     --	 */
     --	oi.sizep = &size_ul;
     -+	oi.sizep = &st->base.size;
     - 	oi.typep = &st->base.type;
     - 
     - 	if (parse_loose_header(st->hdr, &oi) < 0 || st->base.type < 0)
     - 		goto error;
     --	st->base.size = size_ul;
     - 
     - 	st->mapped = mapped;
     - 	st->mapsize = mapsize;
      
       ## object.c ##
      @@ object.c: struct object *parse_object_with_flags(struct repository *r,
     @@ odb.h: int odb_read_object_info_extended(struct object_database *odb,
       enum odb_has_object_flags {
       	/* Retry packed storage after checking packed and loose storage */
      
     + ## odb/source-loose.c ##
     +@@ odb/source-loose.c: static int read_object_info_from_path(struct odb_source_loose *loose,
     + 	void *map = NULL;
     + 	git_zstream stream, *stream_to_end = NULL;
     + 	char hdr[MAX_HEADER_LEN];
     +-	unsigned long size_scratch;
     ++	size_t size_scratch;
     + 	enum object_type type_scratch;
     + 	struct stat st;
     + 
     +@@ odb/source-loose.c: static int odb_source_loose_read_object_stream(struct odb_read_stream **out,
     + 	struct object_info oi = OBJECT_INFO_INIT;
     + 	struct odb_loose_read_stream *st;
     + 	unsigned long mapsize;
     +-	unsigned long size_ul;
     + 	void *mapped;
     + 
     + 	mapped = odb_source_loose_map_object(loose, oid, &mapsize);
     +@@ odb/source-loose.c: static int odb_source_loose_read_object_stream(struct odb_read_stream **out,
     + 		goto error;
     + 	}
     + 
     +-	/*
     +-	 * object_info.sizep is unsigned long* (32-bit on Windows), but
     +-	 * st->base.size is size_t (64-bit). Use temporary variable.
     +-	 * Note: loose objects >4GB would still truncate here, but such
     +-	 * large loose objects are uncommon (they'd normally be packed).
     +-	 */
     +-	oi.sizep = &size_ul;
     ++	oi.sizep = &st->base.size;
     + 	oi.typep = &st->base.type;
     + 
     + 	if (parse_loose_header(st->hdr, &oi) < 0 || st->base.type < 0)
     + 		goto error;
     +-	st->base.size = size_ul;
     + 
     + 	st->mapped = mapped;
     + 	st->mapsize = mapsize;
     +
       ## odb/streaming.c ##
      @@ odb/streaming.c: static int open_istream_incore(struct odb_read_stream **out,
       		.base.read = read_istream_incore,

-- 
gitgitgadget

^ permalink raw reply

* Re: [PATCH] commit-graph: use timestamp_t for max parent generation accumulator
From: Derrick Stolee @ 2026-06-15 11:44 UTC (permalink / raw)
  To: Patrick Steinhardt, Elijah Newren via GitGitGadget; +Cc: git, Elijah Newren
In-Reply-To: <ai-zzWn9Ls6-j9h8@pks.im>

On 6/15/26 4:11 AM, Patrick Steinhardt wrote:
> On Sun, Jun 14, 2026 at 06:57:50AM +0000, Elijah Newren via GitGitGadget wrote:
>>      commit-graph: use timestamp_t for max parent generation accumulator
>>      
>>      We found a few repositories in the wild with commits whose authors were
>>      apparently on a computer in the year 2120 when they recorded their
>>      commits. Apparently, in a century from now, some folks are going to have
>>      a really weird timezone as well (-13068837), though the timezone doesn't
>>      factor into this patch at all.

>> @@ -1669,7 +1669,7 @@ static void compute_reachable_generation_numbers(
>>   			struct commit *current = list->item;
>>   			struct commit_list *parent;
>>   			int all_parents_computed = 1;
>> -			uint32_t max_gen = 0;
>> +			timestamp_t max_gen = 0;
>>   
>>   			for (parent = current->parents; parent; parent = parent->next) {
>>   				repo_parse_commit(info->r, parent->item);
> 
> This looks obviously correct.

I agree. I was surprised this was the only necessary change, but
your message clearly describes how the timing of the patch that
delivered this change contributed to the mismatch.

Thanks,
-Stolee


^ permalink raw reply

* Re: [PATCH] builtin/history: unuse the commit buffer after use
From: Patrick Steinhardt @ 2026-06-15  9:48 UTC (permalink / raw)
  To: Kaartic Sivaraam; +Cc: Git mailing list
In-Reply-To: <20260614141600.620272-1-kaartic.sivaraam@gmail.com>

On Sun, Jun 14, 2026 at 02:15:40PM +0000, Kaartic Sivaraam wrote:
> While running `git history reword` using a Git built with `SANITIZE` flag set
> to `address,leak`, we could observe the following leak being reported:

Huh, curious. That seems to hint that we're missing test coverage for
this specific scenario, as our test suite doesn't detect this leak.

[snip]
> A deeper investigation on this reveals the following as the root cause.
> 
> As part of rewording a commit in `git history`, we get the commit message
> buffer in the `commit_tree_ext` function. This in turn obtains the buffer
> from `repo_logmsg_reencode`. Given how `commit_tree_ext` is invoking the
> function with the last two parameters as NULL, we are clearly not expecting
> a reencode to happen. In this case, the buffer that we receive from
> `repo_logmsg_reencode` ends up always being obtained from a call to
> `repo_get_commit_buffer`.
> 
> This buffer is expected to be released with an accompanying call to
> `repo_unuse_commit_buffer` which takes care of freeing it. This call
> is missing in the `commit_tree_ext` flow thus resulting in the leak.

So this doesn't really read specific at all, and I would have expected
us to hit this leak. Puzzling.

> Fix this by ensuring we call `repo_unuse_commit_buffer` on the
> original_message buffer.
> 
> Signed-off-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
> ---
> I must mention that I also noticed the following comment in `commit_tree_ext`:
> 
> »       /* We retain authorship of the original commit. */
> »       original_message = repo_logmsg_reencode(repo, commit_with_message, NULL, NULL);
> 
> ... but I'm not quite sure why we don't unuse the buffer after its purpose is
> done. Kindly englighten me in case I missed something.

Did you maybe confuse "authorship" with "ownership" while reading the
comment? The comment only mentions that we retain the original "Author"
commit metadata, it doesn't refer to ownership of the underlying
objects.

> diff --git a/builtin/history.c b/builtin/history.c
> index 091465a59e..0e9259b5d7 100644
> --- a/builtin/history.c
> +++ b/builtin/history.c
> @@ -154,6 +154,7 @@ static int commit_tree_ext(struct repository *repo,
>  	free_commit_extra_headers(original_extra_headers);
>  	strbuf_release(&commit_message);
>  	free(original_author);
> +	repo_unuse_commit_buffer(repo, commit_with_message, original_message);
>  	return ret;
>  }

Yup, this makes sense to me.

Thanks!

Patrick

^ permalink raw reply

* Re: [PATCH v14 3/6] branch: prepare delete_branches for a bulk caller
From: Phillip Wood @ 2026-06-15  9:47 UTC (permalink / raw)
  To: Harald Nordgren via GitGitGadget, git
  Cc: Kristoffer Haugsbakk, Johannes Sixt, Harald Nordgren
In-Reply-To: <259113e304c4085c2bd90cce3a40c965744d5a00.1780999917.git.gitgitgadget@gmail.com>

Hi Harald

On 09/06/2026 11:11, Harald Nordgren via GitGitGadget wrote:
>
> @@ -240,7 +245,7 @@ static int delete_branches(int argc, const char **argv, int kinds,
>   	int i;
>   	int ret = 0;
>   	int remote_branch = 0;
> -	int force, quiet;
> +	int force, quiet, dry_run, no_head_fallback;

As with the previous patch it would be safer to initialize the new 
variables where they are declared.

>   	for_each_string_list_item(item, &refs_to_delete) {
>   		char *describe_ref = item->util;
>   		char *name = item->string;
> -		if (!refs_ref_exists(get_main_ref_store(the_repository), name)) {
> +		if (dry_run) {
> +			if (!quiet)
> +				printf(remote_branch
> +					? _("Would delete remote-tracking branch %s (was %s).\n")
> +					: _("Would delete branch %s (was %s).\n"),

I wondered what the "was %s" was about but it prints the symref target 
or oid of the ref.

Thanks

Phillip

> +					name + branch_name_pos, describe_ref);
> +		} else if (!refs_ref_exists(get_main_ref_store(the_repository), name)) {
>   			char *refname = name + branch_name_pos;
>   			if (!quiet)
>   				printf(remote_branch


^ permalink raw reply

* Re: [PATCH v14 1/6] branch: add --forked filter for --list mode
From: Phillip Wood @ 2026-06-15  9:46 UTC (permalink / raw)
  To: Harald Nordgren via GitGitGadget, git
  Cc: Kristoffer Haugsbakk, Johannes Sixt, Harald Nordgren
In-Reply-To: <7383872f4b2f422ec36b11ab5fb31cce08e6106a.1780999917.git.gitgitgadget@gmail.com>

Hi Harald

On 09/06/2026 11:11, Harald Nordgren via GitGitGadget wrote:
> From: Harald Nordgren <haraldnordgren@gmail.com>
> 
> Add a --forked option to "git branch" list mode that lists only
> branches whose configured upstream matches <branch>. The argument
> can be a ref (e.g. "origin/main", "master") or a shell glob
> (e.g. "origin/*"), and may be repeated to widen the filter.
> 
> It is an ordinary list filter, so it combines with the others:
> 
>      git branch --merged origin/main --forked 'origin/*'
> 
> lists branches forked from origin that are already merged into
> origin/main, and --no-merged inverts the question.
> 
> This is the building block for --prune-merged, which deletes the
> listed branches once they have landed on their upstream.
> 
> Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
> ---
>   Documentation/git-branch.adoc | 10 +++-
>   builtin/branch.c              | 18 ++++++-
>   ref-filter.c                  | 70 ++++++++++++++++++++++++++
>   ref-filter.h                  | 10 ++++
>   t/t3200-branch.sh             | 92 +++++++++++++++++++++++++++++++++++
>   5 files changed, 197 insertions(+), 3 deletions(-)

It's nice to see that moving the code into the ref-filter.c has reduced 
the overall number of additions by ~50 lines. The documentation and 
implementation look fine though I have a couple of thoughts:

  - Previous iterations supported "origin" as a short hand for the branch
    origin/HEAD points to. That was nice because it means we can use the
    same syntax for "git checkout -b" and "git branch --forked". It
    would probably be a good idea to support it.

  - We could probably be a bit smarter about the way we handle patterns
    by copying what dwim_ref() does to support things like
    remotes/origin/* but I don't think we need to do that now.

> diff --git a/t/t3200-branch.sh b/t/t3200-branch.sh
> index e7829c2c4b..4e7deddc04 100755
> --- a/t/t3200-branch.sh
> +++ b/t/t3200-branch.sh
> @@ -1717,4 +1717,96 @@ test_expect_success 'errors if given a bad branch name' '
>   	test_cmp expect actual
>   '
>   
> +test_expect_success '--forked: setup' '
> +	test_create_repo forked-upstream &&
> +	test_commit -C forked-upstream base &&
> +	git -C forked-upstream branch one base &&
> +	git -C forked-upstream branch two base &&
> +
> +	test_create_repo forked-other &&
> +	test_commit -C forked-other other-base &&
> +	git -C forked-other branch foreign other-base &&
> +
> +	git clone forked-upstream forked &&
> +	git -C forked remote add other ../forked-other &&

We can use "add -f" to fetch here rather than doing it separately.

> +	git -C forked fetch other &&
> +	git -C forked branch local-base &&
> +	git -C forked branch --track local-one origin/one &&
> +	git -C forked branch --track local-two origin/two &&
> +	git -C forked branch --track local-foreign other/foreign &&
> +	git -C forked branch detached &&

Normally we use "detached" to mean no branch, lets read on and see how 
this is used ...

> +	git -C forked branch --track local-trunk local-base
> +'
> +
> +test_expect_success '--forked <upstream-tracking-branch> filters by upstream' '
> +	git -C forked branch --forked origin/one --format="%(refname:short)" >actual &&

origin/one and origin/two point to the same commit, so this demonstrates 
that we're checking the branch names, not the topology which is good. 
All of the local branches point at their upstream which isn't very 
realistic - I wonder if we should add some local commits?

The tests all look sensible, but there is no coverage for combining 
--forked with branch names as in

     git branch --forked <arg> <branch>

Thanks

Phillip


> +	echo local-one >expect &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success '--forked <glob> filters by wildmatch' '
> +	git -C forked branch --forked "origin/*" --format="%(refname:short)" >actual &&
> +	cat >expect <<-\EOF &&
> +	local-one
> +	local-two
> +	main
> +	EOF
> +	test_cmp expect actual
> +'
> +
> +test_expect_success '--forked <local-branch> matches branches with local upstream' '
> +	git -C forked branch --forked local-base --format="%(refname:short)" >actual &&
> +	echo local-trunk >expect &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success '--forked can be repeated to widen the filter' '
> +	git -C forked branch --forked origin/one --forked other/foreign --format="%(refname:short)" >actual &&
> +	cat >expect <<-\EOF &&
> +	local-foreign
> +	local-one
> +	EOF
> +	test_cmp expect actual
> +'
> +
> +test_expect_success '--forked combines literal and glob arguments' '
> +	git -C forked branch --forked local-base --forked "other/*" --format="%(refname:short)" >actual &&
> +	cat >expect <<-\EOF &&
> +	local-foreign
> +	local-trunk
> +	EOF
> +	test_cmp expect actual
> +'
> +
> +test_expect_success '--forked "*/*" covers every remote-tracking upstream' '
> +	git -C forked branch --forked "*/*" --format="%(refname:short)" >actual &&
> +	cat >expect <<-\EOF &&
> +	local-foreign
> +	local-one
> +	local-two
> +	main
> +	EOF
> +	test_cmp expect actual
> +'
> +
> +test_expect_success '--forked composes with --no-merged' '
> +	test_when_finished "git -C forked checkout detached" &&
> +	git -C forked checkout local-one &&
> +	test_commit -C forked local-only &&
> +	git -C forked branch --forked "origin/*" --no-merged origin/one \
> +		--format="%(refname:short)" >actual &&
> +	echo local-one >expect &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success '--forked rejects unknown branch/pattern' '
> +	test_must_fail git -C forked branch --forked nope 2>err &&
> +	test_grep "not a valid branch or pattern" err
> +'
> +
> +test_expect_success '--forked requires a value' '
> +	test_must_fail git -C forked branch --forked 2>err &&
> +	test_grep "requires a value" err
> +'
> +
>   test_done


^ permalink raw reply

* Re: [PATCH v14 4/6] branch: add --prune-merged <branch>
From: Phillip Wood @ 2026-06-15  9:46 UTC (permalink / raw)
  To: Harald Nordgren via GitGitGadget, git
  Cc: Kristoffer Haugsbakk, Johannes Sixt, Harald Nordgren
In-Reply-To: <9924373da0a0598cabe4f08f3bc4200833679171.1780999917.git.gitgitgadget@gmail.com>

Hi Harald

On 09/06/2026 11:11, Harald Nordgren via GitGitGadget wrote:
> From: Harald Nordgren <haraldnordgren@gmail.com>
> 
> 	git branch --prune-merged <branch>...

Please see my comments on the previous version about the naming of this 
option. I really think we need to start a discussion to find a better 
name for this option as the other options to delete a branch are named 
"delete" rather than "prune" and this does not remove the branches 
listed by "--merge"

> deletes the local branches that "--forked <branch>" would list,
> keeping only those whose tip is reachable from their configured
> upstream: the work has already landed on the upstream they track,
> so the local copy is no longer needed.
> 
> Reachability is read from local refs; nothing is fetched. Run
> "git fetch" first if you want fresh upstream refs.

I don't  think this sentence adds anything - git never fetches unless 
the user explicitly asks it to.

> 
> Three kinds of branches are spared:
> 
>    * any branch checked out in any worktree;
>    * any branch whose upstream no longer resolves locally, since a
>      missing upstream is not by itself a sign of integration;
>    * any branch whose push destination equals its upstream
>      (<branch>@{push} is the same as <branch>@{upstream}), such as
>      a local "main" that tracks and pushes to "origin/main". Right
>      after a pull it just looks "fully merged", so it is left
>      alone. Only branches that push somewhere other than their
>      upstream, typically topics in a fork workflow, are candidates.
> 
> Branches that are not yet merged into their upstream are reported
> as a short warning and skipped, so one unmerged topic does not
> abort the whole sweep.

I'm not sure about this warning - the user has asked us to delete the 
branches whose upstreams match those passed on the commandline and that 
have been merged so do they really want to hear about the ones that have 
not been merged? It might be useful to have a way to list those that 
have not been merged in the future.

> Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
> ---
>   Documentation/git-branch.adoc |  24 ++++
>   builtin/branch.c              |  67 +++++++++++-
>   t/t3200-branch.sh             | 201 ++++++++++++++++++++++++++++++++++
>   3 files changed, 290 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/git-branch.adoc b/Documentation/git-branch.adoc
> index 62ebab6051..fdaccc9662 100644
> --- a/Documentation/git-branch.adoc
> +++ b/Documentation/git-branch.adoc
> @@ -25,6 +25,7 @@ git branch (-m|-M) [<old-branch>] <new-branch>
>   git branch (-c|-C) [<old-branch>] <new-branch>
>   git branch (-d|-D) [-r] <branch-name>...
>   git branch --edit-description [<branch-name>]
> +git branch --prune-merged <branch>...
>   
>   DESCRIPTION
>   -----------
> @@ -201,6 +202,29 @@ This option is only applicable in non-verbose mode.
>   	Print the name of the current branch. In detached `HEAD` state,
>   	nothing is printed.
>   
> +`--prune-merged <branch>...`::
> +	Delete the local branches that `--forked` would list for the
> +	given _<branch>_ arguments, but only those whose tip is
> +	reachable from their configured upstream. In other words, the
> +	work on the branch has already landed on the upstream it
> +	tracks, so the local copy is no longer needed. Several
> +	_<branch>_ patterns may be given, e.g. `git branch
> +	--prune-merged origin/main 'feature*'`.
> ++
> +Reachability is checked against whatever the upstream refs say
> +locally; nothing is fetched. Run `git fetch` first if you want
> +the upstream refs refreshed.
Maybe

Reachability is checked against the remote-tracking branch. Run `git 
fetch` first if you want update the remote-tracking branch.

> ++
> +A branch is left alone if any of the following holds:

s/left alone/not deleted/

> +its upstream no longer resolves locally; it is checked out in any

s/upstream no longer resolves locally/upstream remote-tracking branch no 
longer exists/

> +worktree; or its push destination (`<branch>@{push}`) equals its
> +upstream (`<branch>@{upstream}`), so it cannot be distinguished
> +from a freshly pulled trunk that just looks "fully merged".

What's a "freshly pulled trunk"? "trunk" does not appear in gitglossary(7)

> ++
> +Branches refused by the "fully merged" safety check are listed as
> +warnings and skipped; pass them to `git branch -D` explicitly if
> +you want them gone.

s/them gone/to delete them/

> +
>   `-v`::
>   `-vv`::
>   `--verbose`::
> diff --git a/builtin/branch.c b/builtin/branch.c
> index 2cc5a8cde0..af37a0ceb7 100644
> --- a/builtin/branch.c
> +++ b/builtin/branch.c
> @@ -38,6 +38,7 @@ static const char * const builtin_branch_usage[] = {
>   	N_("git branch [<options>] (-c | -C) [<old-branch>] <new-branch>"),
>   	N_("git branch [<options>] [-r | -a] [--points-at]"),
>   	N_("git branch [<options>] [-r | -a] [--format]"),
> +	N_("git branch [<options>] --prune-merged <branch>..."),
>   	NULL
>   };
>   
> @@ -715,6 +716,61 @@ static int parse_opt_forked(const struct option *opt, const char *arg, int unset
>   	return 0;
>   }
>   
> +static int prune_merged_branches(int argc, const char **argv,
> +				 int quiet)
> +{
> +	struct ref_store *refs = get_main_ref_store(the_repository);
> +	struct ref_filter filter = REF_FILTER_INIT;
> +	struct ref_array candidates;
> +	struct strvec deletable = STRVEC_INIT;
> +	int i, ret = 0;
> +
> +	if (!argc)
> +		die(_("--prune-merged requires at least one <branch>"));
> +
> +	for (i = 0; i < argc; i++)
> +		if (ref_filter_forked_add(&filter, argv[i]) < 0)
> +			die(_("'%s' is not a valid branch or pattern"), argv[i]);
> +
> +	filter.kind = FILTER_REFS_BRANCHES;
> +	memset(&candidates, 0, sizeof(candidates));

It would be nicer to add "= { 0 }" to the declaration of candidates above.

> +	filter_refs(&candidates, &filter, filter.kind);
> +
> +	for (i = 0; i < candidates.nr; i++) {
> +		const char *full_name = candidates.items[i]->refname;
> +		const char *short_name;
> +		struct branch *branch;
> +		const char *upstream, *push;
> +
> +		if (!skip_prefix(full_name, "refs/heads/", &short_name))
> +			continue;

If we've set filter.kind = FILTER_REFS_BRANCHS how can this condition fail?

> +		if (branch_checked_out(full_name))
> +			continue;
> +
> +		branch = branch_get(short_name);
> +		upstream = branch ? branch_get_upstream(branch, NULL) : NULL;

How can branch be NULL? Don't we require branch_get() to succeed in 
order to filter it?

> +		if (!upstream || !refs_ref_exists(refs, upstream))
> +			continue;
> +		push = branch ? branch_get_push(branch, NULL) : NULL;
> +		if (!push || !strcmp(push, upstream))
> +			continue;

By the time we've reached this point we know that 
branch@{upstream}exists and does not match branch@{push} - good

> +		strvec_push(&deletable, short_name);
> +	}
> +
> +	if (deletable.nr)
> +		ret = delete_branches(deletable.nr, deletable.v,
> +				      FILTER_REFS_BRANCHES,
> +				      DELETE_BRANCH_WARN_ONLY |
> +				      DELETE_BRANCH_NO_HEAD_FALLBACK |
> +				      (quiet ? DELETE_BRANCH_QUIET : 0));

Here we delete the branches - good.
> +		OPT_BOOL(0, "prune-merged", &prune_merged,
> +			N_("delete local branches whose upstream matches <branch> and is merged")),

s/is/are/

Sorry I didn't get round to reviewing these last week, I'll try and take 
a look at the tests and the other patches tomorrow

Thanks

Phillip

> diff --git a/t/t3200-branch.sh b/t/t3200-branch.sh
> index 4e7deddc04..27ea1319bb 100755
> --- a/t/t3200-branch.sh
> +++ b/t/t3200-branch.sh
> @@ -1809,4 +1809,205 @@ test_expect_success '--forked requires a value' '
>   	test_grep "requires a value" err
>   '
>   
> +test_expect_success '--prune-merged: setup' '
> +	test_create_repo pm-upstream &&
> +	test_commit -C pm-upstream base &&
> +	git -C pm-upstream checkout -b next &&
> +	test_commit -C pm-upstream one-commit &&
> +	test_commit -C pm-upstream two-commit &&
> +	git -C pm-upstream branch one HEAD~ &&
> +	git -C pm-upstream branch two HEAD &&
> +	git -C pm-upstream branch wip main &&
> +	git -C pm-upstream checkout main &&
> +	test_create_repo pm-fork
> +'
> +
> +test_expect_success '--prune-merged deletes branches integrated into upstream' '
> +	test_when_finished "rm -rf pm-merged" &&
> +	git clone pm-upstream pm-merged &&
> +	git -C pm-merged remote add fork ../pm-fork &&
> +	test_config -C pm-merged remote.pushDefault fork &&
> +	test_config -C pm-merged push.default current &&
> +	git -C pm-merged branch one one-commit &&
> +	git -C pm-merged branch --set-upstream-to=origin/next one &&
> +	git -C pm-merged branch two two-commit &&
> +	git -C pm-merged branch --set-upstream-to=origin/next two &&
> +
> +	git -C pm-merged branch --prune-merged "origin/*" &&
> +
> +	test_must_fail git -C pm-merged rev-parse --verify refs/heads/one &&
> +	test_must_fail git -C pm-merged rev-parse --verify refs/heads/two
> +'
> +
> +test_expect_success '--prune-merged accepts a literal upstream' '
> +	test_when_finished "rm -rf pm-literal" &&
> +	git clone pm-upstream pm-literal &&
> +	git -C pm-literal remote add fork ../pm-fork &&
> +	test_config -C pm-literal remote.pushDefault fork &&
> +	test_config -C pm-literal push.default current &&
> +	git -C pm-literal branch one one-commit &&
> +	git -C pm-literal branch --set-upstream-to=origin/next one &&
> +
> +	git -C pm-literal branch --prune-merged origin/next &&
> +
> +	test_must_fail git -C pm-literal rev-parse --verify refs/heads/one
> +'
> +
> +test_expect_success '--prune-merged unions multiple <branch> arguments' '
> +	test_when_finished "rm -rf pm-union" &&
> +	git clone pm-upstream pm-union &&
> +	git -C pm-union remote add fork ../pm-fork &&
> +	test_config -C pm-union remote.pushDefault fork &&
> +	test_config -C pm-union push.default current &&
> +	git -C pm-union branch one one-commit &&
> +	git -C pm-union branch --set-upstream-to=origin/next one &&
> +	git -C pm-union branch two base &&
> +	git -C pm-union branch --set-upstream-to=origin/main two &&
> +	git -C pm-union checkout --detach &&
> +
> +	git -C pm-union branch --prune-merged origin/next origin/main &&
> +
> +	test_must_fail git -C pm-union rev-parse --verify refs/heads/one &&
> +	test_must_fail git -C pm-union rev-parse --verify refs/heads/two
> +'
> +
> +test_expect_success '--prune-merged accepts a local upstream' '
> +	test_when_finished "rm -rf pm-local" &&
> +	git clone pm-upstream pm-local &&
> +	git -C pm-local remote add fork ../pm-fork &&
> +	test_config -C pm-local remote.pushDefault fork &&
> +	test_config -C pm-local push.default current &&
> +	git -C pm-local checkout -b trunk &&
> +	git -C pm-local branch one one-commit &&
> +	git -C pm-local branch --set-upstream-to=trunk one &&
> +	git -C pm-local merge --ff-only one-commit &&
> +
> +	git -C pm-local branch --prune-merged trunk &&
> +
> +	test_must_fail git -C pm-local rev-parse --verify refs/heads/one
> +'
> +
> +test_expect_success '--prune-merged warns instead of erroring on un-integrated commits' '
> +	test_when_finished "rm -rf pm-unmerged" &&
> +	git clone pm-upstream pm-unmerged &&
> +	git -C pm-unmerged remote add fork ../pm-fork &&
> +	test_config -C pm-unmerged remote.pushDefault fork &&
> +	test_config -C pm-unmerged push.default current &&
> +	git -C pm-unmerged checkout -b wip origin/wip &&
> +	git -C pm-unmerged branch --set-upstream-to=origin/next wip &&
> +	test_commit -C pm-unmerged local-only &&
> +	git -C pm-unmerged checkout - &&
> +
> +	git -C pm-unmerged branch --prune-merged "origin/*" 2>err &&
> +	test_grep "not fully merged" err &&
> +	test_grep ! "If you are sure you want to delete it" err &&
> +	git -C pm-unmerged rev-parse --verify refs/heads/wip
> +'
> +
> +test_expect_success '--prune-merged is silent about not-merged-to-HEAD' '
> +	test_when_finished "rm -rf pm-nohead" &&
> +	git clone pm-upstream pm-nohead &&
> +	git -C pm-nohead remote add fork ../pm-fork &&
> +	test_config -C pm-nohead remote.pushDefault fork &&
> +	test_config -C pm-nohead push.default current &&
> +	git -C pm-nohead branch topic one-commit &&
> +	git -C pm-nohead branch --set-upstream-to=origin/next topic &&
> +
> +	git -C pm-nohead branch --prune-merged "origin/*" 2>err &&
> +
> +	test_grep ! "not yet merged to HEAD" err &&
> +	test_must_fail git -C pm-nohead rev-parse --verify refs/heads/topic
> +'
> +
> +test_expect_success '--prune-merged skips branches whose upstream is gone' '
> +	test_when_finished "rm -rf pm-upstream-gone" &&
> +	git clone pm-upstream pm-upstream-gone &&
> +	git -C pm-upstream-gone remote add fork ../pm-fork &&
> +	test_config -C pm-upstream-gone remote.pushDefault fork &&
> +	test_config -C pm-upstream-gone push.default current &&
> +	git -C pm-upstream-gone branch one one-commit &&
> +	git -C pm-upstream-gone branch --set-upstream-to=origin/next one &&
> +
> +	git -C pm-upstream-gone update-ref -d refs/remotes/origin/next &&
> +	git -C pm-upstream-gone branch --prune-merged "origin/*" &&
> +
> +	git -C pm-upstream-gone rev-parse --verify refs/heads/one
> +'
> +
> +test_expect_success '--prune-merged never deletes the checked-out branch' '
> +	test_when_finished "rm -rf pm-head" &&
> +	git clone pm-upstream pm-head &&
> +	git -C pm-head remote add fork ../pm-fork &&
> +	test_config -C pm-head remote.pushDefault fork &&
> +	test_config -C pm-head push.default current &&
> +	git -C pm-head checkout -b one one-commit &&
> +	git -C pm-head branch --set-upstream-to=origin/next one &&
> +
> +	git -C pm-head branch --prune-merged "origin/*" &&
> +
> +	git -C pm-head rev-parse --verify refs/heads/one
> +'
> +
> +test_expect_success '--prune-merged spares branches that push back to their upstream' '
> +	test_when_finished "rm -rf pm-push-eq" &&
> +	git clone pm-upstream pm-push-eq &&
> +	git -C pm-push-eq checkout --detach &&
> +
> +	git -C pm-push-eq branch --prune-merged "origin/*" &&
> +
> +	git -C pm-push-eq rev-parse --verify refs/heads/main
> +'
> +
> +test_expect_success '--prune-merged spares a per-branch pushRemote==upstream remote' '
> +	test_when_finished "rm -rf pm-push-branch" &&
> +	git clone pm-upstream pm-push-branch &&
> +	git -C pm-push-branch remote add fork ../pm-fork &&
> +	test_config -C pm-push-branch remote.pushDefault fork &&
> +	test_config -C pm-push-branch push.default current &&
> +	test_config -C pm-push-branch branch.main.pushRemote origin &&
> +	git -C pm-push-branch checkout --detach &&
> +
> +	git -C pm-push-branch branch --prune-merged "origin/*" &&
> +
> +	git -C pm-push-branch rev-parse --verify refs/heads/main
> +'
> +
> +test_expect_success '--prune-merged prunes when @{push} differs from @{upstream}' '
> +	test_when_finished "rm -rf pm-push-diff" &&
> +	git clone pm-upstream pm-push-diff &&
> +	git -C pm-push-diff remote add fork ../pm-fork &&
> +	test_config -C pm-push-diff remote.pushDefault fork &&
> +	test_config -C pm-push-diff push.default current &&
> +	git -C pm-push-diff branch topic one-commit &&
> +	git -C pm-push-diff branch --set-upstream-to=origin/next topic &&
> +	git -C pm-push-diff checkout --detach &&
> +
> +	git -C pm-push-diff branch --prune-merged "origin/*" &&
> +
> +	test_must_fail git -C pm-push-diff rev-parse --verify refs/heads/topic
> +'
> +
> +test_expect_success '--prune-merged requires at least one <branch>' '
> +	test_must_fail git -C forked branch --prune-merged 2>err &&
> +	test_grep "requires at least one <branch>" err
> +'
> +
> +test_expect_success '--prune-merged takes positional <branch> arguments' '
> +	test_when_finished "rm -rf pm-positional" &&
> +	git clone pm-upstream pm-positional &&
> +	git -C pm-positional remote add fork ../pm-fork &&
> +	test_config -C pm-positional remote.pushDefault fork &&
> +	test_config -C pm-positional push.default current &&
> +	git -C pm-positional branch one one-commit &&
> +	git -C pm-positional branch --set-upstream-to=origin/next one &&
> +	git -C pm-positional branch two base &&
> +	git -C pm-positional branch --set-upstream-to=origin/main two &&
> +	git -C pm-positional checkout --detach &&
> +
> +	git -C pm-positional branch --prune-merged origin/next origin/main &&
> +
> +	test_must_fail git -C pm-positional rev-parse --verify refs/heads/one &&
> +	test_must_fail git -C pm-positional rev-parse --verify refs/heads/two
> +'
> +
>   test_done


^ permalink raw reply

* Re: [PATCH 7/7] odb: use size_t for object_info.sizep and the size APIs
From: Johannes Schindelin @ 2026-06-15  9:29 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: Johannes Schindelin via GitGitGadget, git, Kristofer Karlsson
In-Reply-To: <aibJZ8EXoQSD2lsB@pks.im>

Hi Patrick,

On Mon, 15 Jun 2026, Patrick Steinhardt wrote:

> On Thu, Jun 04, 2026 at 10:51:12AM +0000, Johannes Schindelin via GitGitGadget wrote:
> > diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> > index fa45f774d7..fa6e396ddc 100644
> > --- a/builtin/cat-file.c
> > +++ b/builtin/cat-file.c
> > @@ -120,7 +120,7 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
> >  	struct object_id oid;
> >  	enum object_type type;
> >  	char *buf;
> > -	unsigned long size;
> > +	size_t size;
> >  	struct object_context obj_context = {0};
> >  	struct object_info oi = OBJECT_INFO_INIT;
> >  	unsigned flags = OBJECT_INFO_LOOKUP_REPLACE;
> > @@ -166,7 +166,7 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
> >  		if (use_mailmap && (type == OBJ_COMMIT || type == OBJ_TAG)) {
> >  			size_t s = size;
> >  			buf = replace_idents_using_mailmap(buf, &s);
> > -			size = cast_size_t_to_ulong(s);
> > +			size = s;
> >  		}
> >  
> >  		printf("%"PRIuMAX"\n", (uintmax_t)size);
> 
> Can't we drop this local variable completely and instead supply `&size`
> directly?

Well spotted!

> > @@ -219,7 +225,7 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
> >  		if (use_mailmap) {
> >  			size_t s = size;
> >  			buf = replace_idents_using_mailmap(buf, &s);
> > -			size = cast_size_t_to_ulong(s);
> > +			size = s;
> >  		}
> >  
> >  		/* otherwise just spit out the data */
> > @@ -266,7 +272,7 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
> >  		if (use_mailmap) {
> >  			size_t s = size;
> >  			buf = replace_idents_using_mailmap(buf, &s);
> > -			size = cast_size_t_to_ulong(s);
> > +			size = s;
> >  		}
> >  		break;
> >  	}
> > @@ -446,7 +455,7 @@ static void print_object_or_die(struct batch_options *opt, struct expand_data *d
> >  		if (use_mailmap) {
> >  			size_t s = size;
> >  			contents = replace_idents_using_mailmap(contents, &s);
> > -			size = cast_size_t_to_ulong(s);
> > +			size = s;
> >  		}
> >  
> >  		if (type != data->type)
> 
> Likewise for these three instances.

I totally agree.

> > @@ -555,7 +564,7 @@ static void batch_object_write(const char *obj_name,
> >  			if (!buf)
> >  				die(_("unable to read %s"), oid_to_hex(&data->oid));
> >  			buf = replace_idents_using_mailmap(buf, &s);
> > -			data->size = cast_size_t_to_ulong(s);
> > +			data->size = s;
> >  
> >  			free(buf);
> >  		}
> 
> And I think this site here can be adapted, as well.

Indeed!

> > diff --git a/diff.c b/diff.c
> > index 5a584fa1d5..816b89dc6c 100644
> > --- a/diff.c
> > +++ b/diff.c
> > @@ -4594,8 +4594,9 @@ int diff_populate_filespec(struct repository *r,
> >  		}
> >  	}
> >  	else {
> > +		size_t size_st = 0;
> >  		struct object_info info = {
> > -			.sizep = &s->size
> > +			.sizep = &size_st
> >  		};
> >  
> >  		if (!(size_only || check_binary))
> > @@ -4617,6 +4618,7 @@ int diff_populate_filespec(struct repository *r,
> >  			die("unable to read %s", oid_to_hex(&s->oid));
> >  
> >  object_read:
> > +		s->size = cast_size_t_to_ulong(size_st);
> >  		if (size_only || check_binary) {
> >  			if (size_only)
> >  				return 0;
> > @@ -4631,6 +4633,7 @@ object_read:
> >  			if (odb_read_object_info_extended(r->objects, &s->oid, &info,
> >  							  OBJECT_INFO_LOOKUP_REPLACE))
> >  				die("unable to read %s", oid_to_hex(&s->oid));
> > +			s->size = cast_size_t_to_ulong(size_st);
> >  		}
> >  		s->should_free = 1;
> >  	}
> 
> The flow in this function is quite weird if you ask me, but that's a
> preexisting issue. This does look correct to me, even if it's awkward.

Yes, on all four accounts.

Ciao,
Johannes

^ permalink raw reply

* Re: [PATCH 4/7] packfile: widen unpack_entry()'s size out-parameter to size_t
From: Johannes Schindelin @ 2026-06-15  9:29 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: Johannes Schindelin via GitGitGadget, git, Kristofer Karlsson
In-Reply-To: <aibJW3h4PaYhOqFb@pks.im>

Hi Patrick,

On Mon, 15 Jun 2026, Patrick Steinhardt wrote:

> On Thu, Jun 04, 2026 at 10:51:09AM +0000, Johannes Schindelin via GitGitGadget wrote:
> > diff --git a/builtin/fast-import.c b/builtin/fast-import.c
> > index 82bc6dcc00..3dff898c43 100644
> > --- a/builtin/fast-import.c
> > +++ b/builtin/fast-import.c
> > @@ -1239,6 +1239,8 @@ static void *gfi_unpack_entry(
> >  	unsigned long *sizep)
> >  {
> >  	enum object_type type;
> > +	size_t size_st = 0;
> > +	void *data;
> >  	struct packed_git *p = all_packs[oe->pack_id];
> >  	if (p == pack_data && p->pack_size < (pack_size + the_hash_algo->rawsz)) {
> >  		/* The object is stored in the packfile we are writing to
> > @@ -1260,7 +1262,10 @@ static void *gfi_unpack_entry(
> >  		 */
> >  		p->pack_size = pack_size + the_hash_algo->rawsz;
> >  	}
> > -	return unpack_entry(the_repository, p, oe->idx.offset, &type, sizep);
> > +	data = unpack_entry(the_repository, p, oe->idx.offset, &type, &size_st);
> > +	if (sizep)
> > +		*sizep = cast_size_t_to_ulong(size_st);
> > +	return data;
> >  }
> 
> Nit, please feel free to ignore: do we want to add a NEEDSWORK comment
> here?

Hehe... My mind translates the `cast_size_t_to_ulong()` function to
"NEEDSWORK!" already ;-)

Ciao,
Johannes

^ permalink raw reply

* Re: [PATCH 3/7] pack-objects(check_pack_inflate()): use size_t instead of unsigned long
From: Johannes Schindelin @ 2026-06-15  9:29 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: Johannes Schindelin via GitGitGadget, git, Kristofer Karlsson
In-Reply-To: <aibJVSrKPCfDVXw7@pks.im>

Hi Patrick,

On Mon, 15 Jun 2026, Patrick Steinhardt wrote:

> On Thu, Jun 04, 2026 at 10:51:08AM +0000, Johannes Schindelin via GitGitGadget wrote:
> > From: Johannes Schindelin <johannes.schindelin@gmx.de>
> > 
> > `write_reuse_object()` learned to track its packed-object size as
> > `size_t` in 606c192380 (odb, packfile: use size_t for streaming
> > object sizes, 2026-05-08), but the comparison sink it feeds,
> > `check_pack_inflate()`, still takes the expected decompressed size
> > as `unsigned long`. The call site bridges the mismatch with
> > `cast_size_t_to_ulong()`, which on Windows turns a >4 GiB object
> > into an immediate die().
> > 
> > That function only uses `expect` once: as the right-hand side of a
> > `stream.total_out == expect` equality test against zlib's counter.
> > zlib's own `total_out` counter is `uLong` and is therefore still
> > 32-bit-bound on Windows. Widening `expect` to `size_t` cannot fix that,
> > but it is a strict improvement nonetheless: instead of dying outright,
> > an oversized object now simply makes the equality fail and lets
> > `write_reuse_object()` fall back to `write_no_reuse_object()`, which
> > decompresses and re-deflates the content (and which the larger
> > pack-objects widening series targets separately).
> 
> Hm. I wonder whether it's possible to reset `stream.total_out` on every
> iteration and instead have a local `size_t` variable that we use to
> track the total number of inflated bytes?

Possible? Yes. Appropriate? Unlikely. We would now pretend to have
inflated less bytes, _just_ to appease a data type limitation that we
already worked around in d05d666977 (git-zlib: handle data streams larger
than 4GB, 2026-05-08).

Ciao,
Johannes

^ permalink raw reply

* Re: [PATCH 2/7] patch-delta: use size_t for sizes
From: Johannes Schindelin @ 2026-06-15  9:29 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: Johannes Schindelin via GitGitGadget, git, Kristofer Karlsson
In-Reply-To: <aibJTHKsmqe_EJHc@pks.im>

Hi Patrick,

On Mon, 15 Jun 2026, Patrick Steinhardt wrote:

> On Thu, Jun 04, 2026 at 10:51:07AM +0000, Johannes Schindelin via GitGitGadget wrote:
> > From: Johannes Schindelin <johannes.schindelin@gmx.de>
> > 
> > `patch_delta()` takes the source and delta sizes by value and writes
> > back the reconstructed target size through an `unsigned long *`.  That
> > datatype cannot represent a value that exceeds 4 GiB on systems where
> > `unsigned long` is 32-bit (notably 64-bit Windows builds), though, even
> > though the delta encoding itself, the on-disk layout, and the in-memory
> > buffers happily carry such sizes. A `size_t` companion to
> > `get_delta_hdr_size()`, `get_delta_hdr_size_sz()`, was introduced in
> > 17fa077596 (delta, packfile: use size_t for delta header sizes,
> > 2026-05-08) precisely so that `patch_delta()` could be widened without
> > changing the on-the-wire decoding helper's signature.
> > 
> > Widen `patch_delta()`'s three size parameters to `size_t` and switch
> > its internal use of `get_delta_hdr_size()` to the `_sz` variant.
> > Then propagate the wider type through the callers.
> 
> Does `get_delta_hdr_size()` have any remaining callers after this patch
> series? I currently only spot two such callers, and you convert both of
> them in this patch.

As you noticed later on in the review: No, there are no such callers left,
and the `_sz` variant gets renamed, concluding the incremental migration
of that function from `unsigned long` to `size_t`.

> And can we reasonably add a test case that exercises this change?

Not reasonably, no. This would require constructing another artificial
_large_ object, this time with an unpacked Git object with a size >=4GB
that needs to be transmogrified into a different object.

Better leave the verification of this patch to static analysis (GCC or
Clang have become quite good at spotting things like this; Coverity would
be, too, if it ever comes back up from its "upgrades to the Scan servers",
https://web.archive.org/web/20260516152422/https://scan.coverity.com/
seems to be the start date of this update).

> 
> > diff --git a/packfile.c b/packfile.c
> > index 89366abfe3..e202f48837 100644
> > --- a/packfile.c
> > +++ b/packfile.c
> > @@ -1964,10 +1964,8 @@ void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset,
> >  			      (uintmax_t)curpos, p->pack_name);
> >  			data = NULL;
> >  		} else {
> > -			unsigned long sz;
> >  			data = patch_delta(base, base_size, delta_data,
> > -					   delta_size, &sz);
> > -			size = sz;
> > +					   delta_size, &size);
> 
> Nice that we get rid of this awkward construct.

Awkward, but necessary to allow for an incremental, reviewable conversion
;-)

Ciao,
Johannes

^ permalink raw reply

* Re: [PATCH v3] update-ref: add --rename option
From: Patrick Steinhardt @ 2026-06-15  8:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <xmqqqzmbhikj.fsf@gitster.g>

On Fri, Jun 12, 2026 at 08:41:48AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > A slight tangent: this is part of why I really don't like commands that
> > determine their mode via flags: you now have to worry about every
> > combination of flags and whether they even make sense. With subcommands
> > we at least only have to worry about the set of flags that directly
> > apply to that given subcommand.
> >
> > Makes me wonder whether I should have a look at extending git-refs(1)
> > further:
> >
> >     git refs delete <ref> [<oldvalue>]
> >     git refs update <ref> <newvalue> [<oldvalue>]
> >     git refs rename <ref> <oldname> <newname>
> >
> > I always wanted to do this eventually so that we have one top-level
> > command that knows how to do "everything refs".
> 
> That may indeed be a better direction to go, but isn't update-ref
> the "everything refs" command already?

Well, it doesn't handle reading references, which is something that
git-refs(1) already knows to do.

Patrick

^ permalink raw reply

* Re: [PATCH v4 0/3] compat/posix.h: enable UNUSED warning messages for Clang
From: Patrick Steinhardt @ 2026-06-15  8:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Dominik Loidolt, git, asedeno, asedeno, avarab
In-Reply-To: <xmqqse6qe6oo.fsf@gitster.g>

On Sat, Jun 13, 2026 at 09:39:03AM -0700, Junio C Hamano wrote:
> Dominik Loidolt <dominik.loidolt@univie.ac.at> writes:
> 
> > This series enables the intended UNUSED warning message with Clang by
> > adding a dedicated Clang version check. It also cleans up the nearby
> > GIT_GNUC_PREREQ() and UNUSED macros.
> >
> > Changes since v3:
> > - split style-only cleanups into their own patch
> > - fix the UNUSED preprocessor indentation style
> > - simplify the GIT_GNUC_PREREQ() comparison commit message
> > - keep the Clang-specific note in the patch that adds GIT_CLANG_PREREQ()
> >
> > Thanks,
> >  Dominik
> >
> > Dominik Loidolt (3):
> >   compat/posix.h: enable UNUSED warning messages for Clang
> >   compat/posix.h: clean up GIT_GNUC_PREREQ() and UNUSED
> >   compat/posix.h: simplify GIT_GNUC_PREREQ() comparison
> 
> Looking good and all the points Patrick raised during the review of
> the previous round seem to have been addressed nicely.
> 
> Will replace.  Shall we mark it for 'next' now?

Yeah, I'm happy with this version. Thanks!

Patrick

^ permalink raw reply

* Re: [PATCH v2] gitattributes: fix eol attribute for Perl scripts
From: Patrick Steinhardt @ 2026-06-15  8:37 UTC (permalink / raw)
  To: Koutian Wu via GitGitGadget; +Cc: git, Koutian Wu
In-Reply-To: <pull.2151.v2.git.1781510039164.gitgitgadget@gmail.com>

On Mon, Jun 15, 2026 at 07:53:58AM +0000, Koutian Wu via GitGitGadget wrote:
> Range-diff vs v1:
> 
>  1:  92ba4d499d ! 1:  f4b4ca30c7 gitattributes: fix eol attribute for Perl scripts
>      @@
>        ## Metadata ##
>      -Author: ktwu01 <ktwu01@gmail.com>
>      +Author: Koutian Wu <ktwu01@gmail.com>
>       
>        ## Commit message ##
>           gitattributes: fix eol attribute for Perl scripts
>      @@ Commit message
>           Use eol=lf instead, matching the neighboring *.perl and *.pm rules, so
>           Perl scripts are checked out with LF line endings.
>       
>      -    Signed-off-by: ktwu01 <ktwu01@gmail.com>
>      +    Signed-off-by: Koutian Wu <ktwu01@gmail.com>
>       
>        ## .gitattributes ##
>       @@

Thanks, this version looks good to me!

Patrick

^ permalink raw reply

* [PATCH v2 2/2] rebase: add --squash to fold a range
From: Harald Nordgren via GitGitGadget @ 2026-06-15  8:37 UTC (permalink / raw)
  To: git; +Cc: Harald Nordgren, Harald Nordgren
In-Reply-To: <pull.2337.v2.git.git.1781512625.gitgitgadget@gmail.com>

From: Harald Nordgren <haraldnordgren@gmail.com>

Folding a series of commits into one required either an interactive
rebase where each commit after the first was hand-edited to "fixup", or
a "git reset --soft" to the merge base followed by "git commit --amend".

Add "git rebase --squash [<upstream>]" to do this directly. It keeps
the first commit in the range as a "pick" and turns every later commit
into a "fixup", so the whole range collapses into a single commit that
reuses the first commit's message. With no <upstream> argument the range
is "@{upstream}..HEAD", folding all unpushed commits into one.

The option implies the merge backend, so it works on its own without
--autosquash. Fold the commits in their original order, so that any
fixup!/squash! commits already present in the range are folded in as
well. Reject --rebase-merges since a merge commit cannot be folded into
another commit.

Inspired-by: Sergey Chernov <serega.morph@gmail.com>
Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
---
 Documentation/git-rebase.adoc |  11 ++++
 builtin/rebase.c              |  16 ++++-
 sequencer.c                   |  24 ++++++-
 sequencer.h                   |   2 +-
 t/t3415-rebase-autosquash.sh  | 117 ++++++++++++++++++++++++++++++++++
 5 files changed, 165 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-rebase.adoc b/Documentation/git-rebase.adoc
index f6c22d1598..4244288148 100644
--- a/Documentation/git-rebase.adoc
+++ b/Documentation/git-rebase.adoc
@@ -602,6 +602,16 @@ option can be used to override that setting.
 +
 See also INCOMPATIBLE OPTIONS below.
 
+--squash::
+	Keep the first commit in the range as a `pick` and change every later
+	commit to a `fixup`, so the whole range is folded into a single commit
+	that reuses the first commit's message.  With no `<upstream>` argument
+	this folds all commits since `@{upstream}` into one.  The commits are
+	folded in their original order, so any `fixup!`/`squash!` commits
+	already in the range are folded in as well.  Cannot be combined with
+	`--rebase-merges`, as a merge commit cannot be folded into another
+	commit.
+
 --autostash::
 --no-autostash::
 	Automatically create a temporary stash entry before the operation
@@ -652,6 +662,7 @@ are incompatible with the following options:
  * --strategy
  * --strategy-option
  * --autosquash
+ * --squash
  * --rebase-merges
  * --interactive
  * --exec
diff --git a/builtin/rebase.c b/builtin/rebase.c
index fa4f5d9306..2df9f04728 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -118,6 +118,7 @@ struct rebase_options {
 	int allow_rerere_autoupdate;
 	int keep_empty;
 	int autosquash;
+	int squash;
 	char *gpg_sign_opt;
 	int autostash;
 	int committer_date_is_author_date;
@@ -329,7 +330,8 @@ static int do_interactive_rebase(struct rebase_options *opts, unsigned flags)
 	ret = complete_action(the_repository, &replay, flags,
 		shortrevisions, opts->onto_name, opts->onto,
 		&opts->orig_head->object.oid, &opts->exec,
-		opts->autosquash, opts->update_refs, &todo_list);
+		opts->autosquash, opts->squash, opts->update_refs,
+		&todo_list);
 
 cleanup:
 	replay_opts_release(&replay);
@@ -1205,6 +1207,8 @@ int cmd_rebase(int argc,
 		OPT_BOOL(0, "autosquash", &options.autosquash,
 			 N_("move commits that begin with "
 			    "squash!/fixup! under -i")),
+		OPT_BOOL(0, "squash", &options.squash,
+			 N_("fold all commits in the range into the first one")),
 		OPT_BOOL(0, "update-refs", &options.update_refs,
 			 N_("update branches that point to commits "
 			    "that are being rebased")),
@@ -1471,7 +1475,8 @@ int cmd_rebase(int argc,
 	if ((options.flags & REBASE_INTERACTIVE_EXPLICIT) ||
 	    (options.action != ACTION_NONE) ||
 	    (options.exec.nr > 0) ||
-	    options.autosquash == 1) {
+	    options.autosquash == 1 ||
+	    options.squash) {
 		allow_preemptive_ff = 0;
 	}
 	if (options.committer_date_is_author_date || options.ignore_date)
@@ -1594,6 +1599,13 @@ int cmd_rebase(int argc,
 	options.rebase_merges = (options.rebase_merges >= 0) ? options.rebase_merges :
 				((options.config_rebase_merges >= 0) ? options.config_rebase_merges : 0);
 
+	if (options.squash && options.rebase_merges)
+		die(_("options '%s' and '%s' cannot be used together"),
+		    "--squash", "--rebase-merges");
+
+	if (options.squash)
+		imply_merge(&options, "--squash");
+
 	if (options.autosquash == 1) {
 		imply_merge(&options, "--autosquash");
 	} else if (options.autosquash == -1) {
diff --git a/sequencer.c b/sequencer.c
index 57855b0066..bb42b40796 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -6554,11 +6554,29 @@ static int todo_list_add_update_ref_commands(struct todo_list *todo_list)
 	return 0;
 }
 
+static void todo_list_fixup_all_but_first(struct todo_list *todo_list)
+{
+	int i, seen_first = 0;
+
+	for (i = 0; i < todo_list->nr; i++) {
+		struct todo_item *item = todo_list->items + i;
+
+		if (!item->commit || item->command == TODO_DROP)
+			continue;
+		if (!seen_first) {
+			seen_first = 1;
+			item->command = TODO_PICK;
+			continue;
+		}
+		item->command = TODO_FIXUP;
+	}
+}
+
 int complete_action(struct repository *r, struct replay_opts *opts, unsigned flags,
 		    const char *shortrevisions, const char *onto_name,
 		    struct commit *onto, const struct object_id *orig_head,
 		    struct string_list *commands, unsigned autosquash,
-		    unsigned update_refs,
+		    unsigned squash, unsigned update_refs,
 		    struct todo_list *todo_list)
 {
 	char shortonto[GIT_MAX_HEXSZ + 1];
@@ -6581,7 +6599,9 @@ int complete_action(struct repository *r, struct replay_opts *opts, unsigned fla
 	if (update_refs && todo_list_add_update_ref_commands(todo_list))
 		return -1;
 
-	if (autosquash && todo_list_rearrange_squash(todo_list))
+	if (squash)
+		todo_list_fixup_all_but_first(todo_list);
+	else if (autosquash && todo_list_rearrange_squash(todo_list))
 		return -1;
 
 	if (commands->nr)
diff --git a/sequencer.h b/sequencer.h
index 3164bd437d..1d5a164f02 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -196,7 +196,7 @@ int complete_action(struct repository *r, struct replay_opts *opts, unsigned fla
 		    const char *shortrevisions, const char *onto_name,
 		    struct commit *onto, const struct object_id *orig_head,
 		    struct string_list *commands, unsigned autosquash,
-		    unsigned update_refs,
+		    unsigned squash, unsigned update_refs,
 		    struct todo_list *todo_list);
 int todo_list_rearrange_squash(struct todo_list *todo_list);
 
diff --git a/t/t3415-rebase-autosquash.sh b/t/t3415-rebase-autosquash.sh
index 8964d1cc88..ce9abe5147 100755
--- a/t/t3415-rebase-autosquash.sh
+++ b/t/t3415-rebase-autosquash.sh
@@ -511,4 +511,121 @@ test_expect_success 'pick and fixup respect commit.cleanup' '
 	test_commit_message HEAD -m "something"
 '
 
+test_expect_success '--squash folds the range into the first commit' '
+	git reset --hard base &&
+	test_commit --no-tag fold1 file_fold a &&
+	test_commit --no-tag fold2 file_fold b &&
+	test_commit --no-tag fold3 file_fold c &&
+	git rebase --squash HEAD~3 &&
+	test_cmp_rev base HEAD~1 &&
+	test_commit_message HEAD -m "fold1" &&
+	echo c >expect &&
+	test_cmp expect file_fold
+'
+
+test_expect_success '--squash folds smoothly when a fixup! commit is in the series' '
+	git reset --hard base &&
+	test_commit --no-tag foldA file_fold a &&
+	test_commit --no-tag foldB file_fold b &&
+	git commit --allow-empty --fixup HEAD~1 &&
+	git rebase --squash HEAD~3 &&
+	test_cmp_rev base HEAD~1 &&
+	test_commit_message HEAD -m "foldA" &&
+	echo b >expect &&
+	test_cmp expect file_fold
+'
+
+test_expect_success '--squash picks the first commit even if it is a fixup!' '
+	git reset --hard base &&
+	test_commit --no-tag fixupbase file_fix a &&
+	git commit --allow-empty --fixup HEAD &&
+	test_commit --no-tag fixuptail file_fix b &&
+	git rebase --squash HEAD~3 &&
+	test_cmp_rev base HEAD~1 &&
+	echo b >expect &&
+	test_cmp expect file_fix
+'
+
+test_expect_success '--squash with a single commit in range is a no-op' '
+	git reset --hard base &&
+	test_commit --no-tag solo file_solo a &&
+	git rev-parse HEAD >expect &&
+	git rebase --squash HEAD~1 &&
+	git rev-parse HEAD >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success '--squash with an empty range succeeds' '
+	git reset --hard base &&
+	git rebase --squash HEAD &&
+	test_cmp_rev base HEAD
+'
+
+test_expect_success '--squash skips a dropped commit in the range' '
+	git reset --hard base &&
+	test_commit --no-tag fixdrop1 file_drop a &&
+	git commit --allow-empty -m "empty in the middle" &&
+	test_commit --no-tag fixdrop3 file_drop b &&
+	git rebase --squash --empty=drop HEAD~3 &&
+	test_cmp_rev base HEAD~1 &&
+	test_commit_message HEAD -m "fixdrop1" &&
+	echo b >expect &&
+	test_cmp expect file_drop
+'
+
+test_expect_success '--squash folds a merge commit in the middle of the range' '
+	git reset --hard base &&
+	test_commit --no-tag mid-first &&
+	git checkout -b mid-side &&
+	test_commit --no-tag mid-merged &&
+	git checkout - &&
+	git merge --no-ff -m "merge mid-side" mid-side &&
+	test_commit --no-tag mid-last &&
+	git rebase --squash base &&
+	test_cmp_rev base HEAD~1 &&
+	test_commit_message HEAD -m "mid-first" &&
+	test_path_is_file mid-merged.t
+'
+
+test_expect_success '--squash keeps the first flattened commit when a merge sorts first' '
+	git reset --hard base &&
+	git checkout -b head-side &&
+	test_commit --no-tag head-merged &&
+	git checkout - &&
+	git merge --no-ff -m "merge head-side" head-side &&
+	test_commit --no-tag head-last &&
+	git rebase --squash base &&
+	test_cmp_rev base HEAD~1 &&
+	test_commit_message HEAD -m "head-merged" &&
+	test_path_is_file head-merged.t
+'
+
+test_expect_success '--squash takes precedence over --autosquash' '
+	git reset --hard base &&
+	test_commit --no-tag combo-first &&
+	test_commit --no-tag combo-mid &&
+	git commit --allow-empty --fixup HEAD~1 &&
+	test_commit --no-tag combo-last &&
+	git rebase --autosquash --squash base &&
+	test_cmp_rev base HEAD~1 &&
+	test_commit_message HEAD -m "combo-first"
+'
+
+test_expect_success '--squash folds the range with rebase.autosquash set' '
+	test_config rebase.autosquash true &&
+	git reset --hard base &&
+	test_commit --no-tag cfg-first &&
+	test_commit --no-tag cfg-last &&
+	git rebase --squash base &&
+	test_cmp_rev base HEAD~1 &&
+	test_commit_message HEAD -m "cfg-first"
+'
+
+test_expect_success '--squash and --rebase-merges cannot be combined' '
+	git reset --hard base &&
+	test_must_fail git rebase --rebase-merges --squash HEAD~1 2>err &&
+	test_grep "cannot be used together" err &&
+	test_path_is_missing .git/rebase-merge
+'
+
 test_done
-- 
gitgitgadget

^ permalink raw reply related

* [PATCH v2 1/2] t3415: remove prepare-commit-msg hook after use
From: Harald Nordgren via GitGitGadget @ 2026-06-15  8:37 UTC (permalink / raw)
  To: git; +Cc: Harald Nordgren, Harald Nordgren
In-Reply-To: <pull.2337.v2.git.git.1781512625.gitgitgadget@gmail.com>

From: Harald Nordgren <haraldnordgren@gmail.com>

The "pick and fixup respect commit.cleanup" test left its
prepare-commit-msg hook in place, leaking it into later tests. Remove it
with test_when_finished.

Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
---
 t/t3415-rebase-autosquash.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/t/t3415-rebase-autosquash.sh b/t/t3415-rebase-autosquash.sh
index 5033411a43..8964d1cc88 100755
--- a/t/t3415-rebase-autosquash.sh
+++ b/t/t3415-rebase-autosquash.sh
@@ -490,6 +490,7 @@ test_expect_success 'pick and fixup respect commit.cleanup' '
 	git reset --hard base &&
 	test_commit --no-tag "fixup! second commit" file1 fixup &&
 	test_commit something &&
+	test_when_finished "rm -f .git/hooks/prepare-commit-msg" &&
 	write_script .git/hooks/prepare-commit-msg <<-\EOF &&
 	printf "\n# Prepared\n" >> "$1"
 	EOF
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH v2 0/2] rebase: add --squash to fold a range into its first commit
From: Harald Nordgren via GitGitGadget @ 2026-06-15  8:37 UTC (permalink / raw)
  To: git; +Cc: Harald Nordgren
In-Reply-To: <pull.2337.git.git.1781465141.gitgitgadget@gmail.com>

Rename to rebase --squash.

Harald Nordgren (2):
  t3415: remove prepare-commit-msg hook after use
  rebase: add --squash to fold a range

 Documentation/git-rebase.adoc |  11 ++++
 builtin/rebase.c              |  16 ++++-
 sequencer.c                   |  24 ++++++-
 sequencer.h                   |   2 +-
 t/t3415-rebase-autosquash.sh  | 118 ++++++++++++++++++++++++++++++++++
 5 files changed, 166 insertions(+), 5 deletions(-)


base-commit: ea97ad8d017de0c9037451a78008a0fd60abea0c
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2337%2FHaraldNordgren%2Frebase-fixup-fold-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2337/HaraldNordgren/rebase-fixup-fold-v2
Pull-Request: https://github.com/git/git/pull/2337

Range-diff vs v1:

 1:  c55b9cd6f7 = 1:  c55b9cd6f7 t3415: remove prepare-commit-msg hook after use
 2:  bd1bc62aa8 ! 2:  22d4276ff5 rebase: add --fixup-all to fold a range
     @@ Metadata
      Author: Harald Nordgren <haraldnordgren@gmail.com>
      
       ## Commit message ##
     -    rebase: add --fixup-all to fold a range
     +    rebase: add --squash to fold a range
      
          Folding a series of commits into one required either an interactive
          rebase where each commit after the first was hand-edited to "fixup", or
          a "git reset --soft" to the merge base followed by "git commit --amend".
      
     -    Add "git rebase --autosquash --fixup-all [<upstream>]" to do this
     -    directly. It keeps the first commit in the range as a "pick" and turns
     -    every later commit into a "fixup", so the whole range collapses into a
     -    single commit that reuses the first commit's message. With no <upstream>
     -    argument the range is "@{upstream}..HEAD", folding all unpushed commits
     -    into one.
     +    Add "git rebase --squash [<upstream>]" to do this directly. It keeps
     +    the first commit in the range as a "pick" and turns every later commit
     +    into a "fixup", so the whole range collapses into a single commit that
     +    reuses the first commit's message. With no <upstream> argument the range
     +    is "@{upstream}..HEAD", folding all unpushed commits into one.
      
     -    Fold the commits in their original order, so that any fixup!/squash!
     -    commits already present in the range are folded in as well. Allow the
     -    flag only together with --autosquash, and reject --rebase-merges since a
     -    merge commit cannot be folded into another commit.
     +    The option implies the merge backend, so it works on its own without
     +    --autosquash. Fold the commits in their original order, so that any
     +    fixup!/squash! commits already present in the range are folded in as
     +    well. Reject --rebase-merges since a merge commit cannot be folded into
     +    another commit.
      
     +    Inspired-by: Sergey Chernov <serega.morph@gmail.com>
          Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
      
       ## Documentation/git-rebase.adoc ##
     @@ Documentation/git-rebase.adoc: option can be used to override that setting.
       +
       See also INCOMPATIBLE OPTIONS below.
       
     -+--fixup-all::
     -+	Valid only when used with `--autosquash`.  Keep the first commit in
     -+	the range as a `pick` and change every later commit to a `fixup`, so
     -+	the whole range is folded into a single commit that reuses the first
     -+	commit's message.  With no `<upstream>` argument this folds all commits
     -+	since `@{upstream}` into one.  The commits are folded in their original
     -+	order, so any `fixup!`/`squash!` commits already in the range are folded
     -+	in as well.  Cannot be combined with `--rebase-merges`, as a merge
     -+	commit cannot be folded into another commit.
     ++--squash::
     ++	Keep the first commit in the range as a `pick` and change every later
     ++	commit to a `fixup`, so the whole range is folded into a single commit
     ++	that reuses the first commit's message.  With no `<upstream>` argument
     ++	this folds all commits since `@{upstream}` into one.  The commits are
     ++	folded in their original order, so any `fixup!`/`squash!` commits
     ++	already in the range are folded in as well.  Cannot be combined with
     ++	`--rebase-merges`, as a merge commit cannot be folded into another
     ++	commit.
      +
       --autostash::
       --no-autostash::
     @@ Documentation/git-rebase.adoc: are incompatible with the following options:
        * --strategy
        * --strategy-option
        * --autosquash
     -+ * --fixup-all
     ++ * --squash
        * --rebase-merges
        * --interactive
        * --exec
     @@ builtin/rebase.c: struct rebase_options {
       	int allow_rerere_autoupdate;
       	int keep_empty;
       	int autosquash;
     -+	int fixup_all;
     ++	int squash;
       	char *gpg_sign_opt;
       	int autostash;
       	int committer_date_is_author_date;
     @@ builtin/rebase.c: static int do_interactive_rebase(struct rebase_options *opts,
       		shortrevisions, opts->onto_name, opts->onto,
       		&opts->orig_head->object.oid, &opts->exec,
      -		opts->autosquash, opts->update_refs, &todo_list);
     -+		opts->autosquash, opts->fixup_all, opts->update_refs,
     ++		opts->autosquash, opts->squash, opts->update_refs,
      +		&todo_list);
       
       cleanup:
     @@ builtin/rebase.c: int cmd_rebase(int argc,
       		OPT_BOOL(0, "autosquash", &options.autosquash,
       			 N_("move commits that begin with "
       			    "squash!/fixup! under -i")),
     -+		OPT_BOOL(0, "fixup-all", &options.fixup_all,
     ++		OPT_BOOL(0, "squash", &options.squash,
      +			 N_("fold all commits in the range into the first one")),
       		OPT_BOOL(0, "update-refs", &options.update_refs,
       			 N_("update branches that point to commits "
       			    "that are being rebased")),
     +@@ builtin/rebase.c: int cmd_rebase(int argc,
     + 	if ((options.flags & REBASE_INTERACTIVE_EXPLICIT) ||
     + 	    (options.action != ACTION_NONE) ||
     + 	    (options.exec.nr > 0) ||
     +-	    options.autosquash == 1) {
     ++	    options.autosquash == 1 ||
     ++	    options.squash) {
     + 		allow_preemptive_ff = 0;
     + 	}
     + 	if (options.committer_date_is_author_date || options.ignore_date)
      @@ builtin/rebase.c: int cmd_rebase(int argc,
       	options.rebase_merges = (options.rebase_merges >= 0) ? options.rebase_merges :
       				((options.config_rebase_merges >= 0) ? options.config_rebase_merges : 0);
       
     -+	if (options.fixup_all && options.autosquash != 1)
     -+		die(_("--fixup-all requires --autosquash"));
     -+
     -+	if (options.fixup_all && options.rebase_merges)
     ++	if (options.squash && options.rebase_merges)
      +		die(_("options '%s' and '%s' cannot be used together"),
     -+		    "--fixup-all", "--rebase-merges");
     ++		    "--squash", "--rebase-merges");
     ++
     ++	if (options.squash)
     ++		imply_merge(&options, "--squash");
      +
       	if (options.autosquash == 1) {
       		imply_merge(&options, "--autosquash");
     @@ sequencer.c: static int todo_list_add_update_ref_commands(struct todo_list *todo
       		    struct commit *onto, const struct object_id *orig_head,
       		    struct string_list *commands, unsigned autosquash,
      -		    unsigned update_refs,
     -+		    unsigned fixup_all, unsigned update_refs,
     ++		    unsigned squash, unsigned update_refs,
       		    struct todo_list *todo_list)
       {
       	char shortonto[GIT_MAX_HEXSZ + 1];
     @@ sequencer.c: int complete_action(struct repository *r, struct replay_opts *opts,
       		return -1;
       
      -	if (autosquash && todo_list_rearrange_squash(todo_list))
     -+	if (fixup_all)
     ++	if (squash)
      +		todo_list_fixup_all_but_first(todo_list);
      +	else if (autosquash && todo_list_rearrange_squash(todo_list))
       		return -1;
     @@ sequencer.h: int complete_action(struct repository *r, struct replay_opts *opts,
       		    struct commit *onto, const struct object_id *orig_head,
       		    struct string_list *commands, unsigned autosquash,
      -		    unsigned update_refs,
     -+		    unsigned fixup_all, unsigned update_refs,
     ++		    unsigned squash, unsigned update_refs,
       		    struct todo_list *todo_list);
       int todo_list_rearrange_squash(struct todo_list *todo_list);
       
     @@ t/t3415-rebase-autosquash.sh: test_expect_success 'pick and fixup respect commit
       	test_commit_message HEAD -m "something"
       '
       
     -+test_expect_success '--fixup-all folds the range into the first commit' '
     ++test_expect_success '--squash folds the range into the first commit' '
      +	git reset --hard base &&
      +	test_commit --no-tag fold1 file_fold a &&
      +	test_commit --no-tag fold2 file_fold b &&
      +	test_commit --no-tag fold3 file_fold c &&
     -+	git rebase --autosquash --fixup-all HEAD~3 &&
     ++	git rebase --squash HEAD~3 &&
      +	test_cmp_rev base HEAD~1 &&
      +	test_commit_message HEAD -m "fold1" &&
      +	echo c >expect &&
      +	test_cmp expect file_fold
      +'
      +
     -+test_expect_success '--fixup-all folds smoothly when a fixup! commit is in the series' '
     ++test_expect_success '--squash folds smoothly when a fixup! commit is in the series' '
      +	git reset --hard base &&
      +	test_commit --no-tag foldA file_fold a &&
      +	test_commit --no-tag foldB file_fold b &&
      +	git commit --allow-empty --fixup HEAD~1 &&
     -+	git rebase --autosquash --fixup-all HEAD~3 &&
     ++	git rebase --squash HEAD~3 &&
      +	test_cmp_rev base HEAD~1 &&
      +	test_commit_message HEAD -m "foldA" &&
      +	echo b >expect &&
      +	test_cmp expect file_fold
      +'
      +
     -+test_expect_success '--fixup-all picks the first commit even if it is a fixup!' '
     ++test_expect_success '--squash picks the first commit even if it is a fixup!' '
      +	git reset --hard base &&
      +	test_commit --no-tag fixupbase file_fix a &&
      +	git commit --allow-empty --fixup HEAD &&
      +	test_commit --no-tag fixuptail file_fix b &&
     -+	git rebase --autosquash --fixup-all HEAD~3 &&
     ++	git rebase --squash HEAD~3 &&
      +	test_cmp_rev base HEAD~1 &&
      +	echo b >expect &&
      +	test_cmp expect file_fix
      +'
      +
     -+test_expect_success '--fixup-all with a single commit in range is a no-op' '
     ++test_expect_success '--squash with a single commit in range is a no-op' '
      +	git reset --hard base &&
      +	test_commit --no-tag solo file_solo a &&
      +	git rev-parse HEAD >expect &&
     -+	git rebase --autosquash --fixup-all HEAD~1 &&
     ++	git rebase --squash HEAD~1 &&
      +	git rev-parse HEAD >actual &&
      +	test_cmp expect actual
      +'
      +
     -+test_expect_success '--fixup-all with an empty range succeeds' '
     ++test_expect_success '--squash with an empty range succeeds' '
      +	git reset --hard base &&
     -+	git rebase --autosquash --fixup-all HEAD &&
     ++	git rebase --squash HEAD &&
      +	test_cmp_rev base HEAD
      +'
      +
     -+test_expect_success '--fixup-all skips a dropped commit in the range' '
     ++test_expect_success '--squash skips a dropped commit in the range' '
      +	git reset --hard base &&
      +	test_commit --no-tag fixdrop1 file_drop a &&
      +	git commit --allow-empty -m "empty in the middle" &&
      +	test_commit --no-tag fixdrop3 file_drop b &&
     -+	git rebase --autosquash --empty=drop --fixup-all HEAD~3 &&
     ++	git rebase --squash --empty=drop HEAD~3 &&
      +	test_cmp_rev base HEAD~1 &&
      +	test_commit_message HEAD -m "fixdrop1" &&
      +	echo b >expect &&
      +	test_cmp expect file_drop
      +'
      +
     -+test_expect_success '--fixup-all folds a merge commit in the middle of the range' '
     ++test_expect_success '--squash folds a merge commit in the middle of the range' '
      +	git reset --hard base &&
      +	test_commit --no-tag mid-first &&
      +	git checkout -b mid-side &&
     @@ t/t3415-rebase-autosquash.sh: test_expect_success 'pick and fixup respect commit
      +	git checkout - &&
      +	git merge --no-ff -m "merge mid-side" mid-side &&
      +	test_commit --no-tag mid-last &&
     -+	git rebase --autosquash --fixup-all base &&
     ++	git rebase --squash base &&
      +	test_cmp_rev base HEAD~1 &&
      +	test_commit_message HEAD -m "mid-first" &&
      +	test_path_is_file mid-merged.t
      +'
      +
     -+test_expect_success '--fixup-all keeps the first flattened commit when a merge sorts first' '
     ++test_expect_success '--squash keeps the first flattened commit when a merge sorts first' '
      +	git reset --hard base &&
      +	git checkout -b head-side &&
      +	test_commit --no-tag head-merged &&
      +	git checkout - &&
      +	git merge --no-ff -m "merge head-side" head-side &&
      +	test_commit --no-tag head-last &&
     -+	git rebase --autosquash --fixup-all base &&
     ++	git rebase --squash base &&
      +	test_cmp_rev base HEAD~1 &&
      +	test_commit_message HEAD -m "head-merged" &&
      +	test_path_is_file head-merged.t
      +'
      +
     -+test_expect_success '--fixup-all requires --autosquash' '
     ++test_expect_success '--squash takes precedence over --autosquash' '
      +	git reset --hard base &&
     -+	test_must_fail git rebase --fixup-all HEAD~1 2>err &&
     -+	test_grep "fixup-all requires --autosquash" err &&
     -+	test_must_fail git rebase --no-autosquash --fixup-all HEAD~1 2>err &&
     -+	test_grep "fixup-all requires --autosquash" err
     ++	test_commit --no-tag combo-first &&
     ++	test_commit --no-tag combo-mid &&
     ++	git commit --allow-empty --fixup HEAD~1 &&
     ++	test_commit --no-tag combo-last &&
     ++	git rebase --autosquash --squash base &&
     ++	test_cmp_rev base HEAD~1 &&
     ++	test_commit_message HEAD -m "combo-first"
     ++'
     ++
     ++test_expect_success '--squash folds the range with rebase.autosquash set' '
     ++	test_config rebase.autosquash true &&
     ++	git reset --hard base &&
     ++	test_commit --no-tag cfg-first &&
     ++	test_commit --no-tag cfg-last &&
     ++	git rebase --squash base &&
     ++	test_cmp_rev base HEAD~1 &&
     ++	test_commit_message HEAD -m "cfg-first"
      +'
      +
     -+test_expect_success '--fixup-all and --rebase-merges cannot be combined' '
     ++test_expect_success '--squash and --rebase-merges cannot be combined' '
      +	git reset --hard base &&
     -+	test_must_fail git rebase --autosquash --rebase-merges \
     -+		--fixup-all HEAD~1 2>err &&
     ++	test_must_fail git rebase --rebase-merges --squash HEAD~1 2>err &&
      +	test_grep "cannot be used together" err &&
      +	test_path_is_missing .git/rebase-merge
      +'

-- 
gitgitgadget

^ permalink raw reply

* Re: [PATCH 5/6] hash-object: add another >4GB/LLP64 test case
From: Patrick Steinhardt @ 2026-06-15  8:35 UTC (permalink / raw)
  To: Philip Oakley via GitGitGadget; +Cc: git, Johannes Schindelin, Philip Oakley
In-Reply-To: <f48d570bba87f7604158646873b998725a4a9db9.1780593313.git.gitgitgadget@gmail.com>

On Thu, Jun 04, 2026 at 05:15:11PM +0000, Philip Oakley via GitGitGadget wrote:
> diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh
> index 59efee3aff..f2722380ee 100755
> --- a/t/t1007-hash-object.sh
> +++ b/t/t1007-hash-object.sh
> @@ -277,4 +277,12 @@ test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
>  	test_cmp expect actual
>  '
>  
> +test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
> +		'files over 4GB hash correctly' '
> +	{ test -f big || test-tool genzeros $((5*1024*1024*1024)) >big; } &&
> +	test_oid large5GB >expect &&
> +	git hash-object -- big >actual &&
> +	test_cmp expect actual
> +'

Same comment here.

Nit: I feel like we could've easily introduced all of these tests in the
first commit.

Patrick

^ permalink raw reply

* Re: [PATCH 4/6] hash-object --stdin: verify that it works with >4GB/LLP64
From: Patrick Steinhardt @ 2026-06-15  8:35 UTC (permalink / raw)
  To: Philip Oakley via GitGitGadget; +Cc: git, Johannes Schindelin, Philip Oakley
In-Reply-To: <ba629a3f03d59b6d20f1199ec86c140b0db63308.1780593313.git.gitgitgadget@gmail.com>

On Thu, Jun 04, 2026 at 05:15:10PM +0000, Philip Oakley via GitGitGadget wrote:
> diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh
> index 10382a815e..59efee3aff 100755
> --- a/t/t1007-hash-object.sh
> +++ b/t/t1007-hash-object.sh
> @@ -269,4 +269,12 @@ test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
>  	test_cmp expect actual
>  '
>  
> +test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
> +		'files over 4GB hash correctly via --stdin' '
> +	{ test -f big || test-tool genzeros $((5*1024*1024*1024)) >big; } &&
> +	test_oid large5GB >expect &&
> +	git hash-object --stdin <big >actual &&
> +	test_cmp expect actual
> +'

Same comment here: can we drop the `!LONG_IS_64BIT` prereq?

Patrick

^ permalink raw reply

* Re: [PATCH 2/6] object-file.c: use size_t for header lengths
From: Patrick Steinhardt @ 2026-06-15  8:35 UTC (permalink / raw)
  To: Philip Oakley via GitGitGadget; +Cc: git, Johannes Schindelin, Philip Oakley
In-Reply-To: <809d83e46fb46baeb5d0dfcd12eb7fc63580eec4.1780593313.git.gitgitgadget@gmail.com>

On Thu, Jun 04, 2026 at 05:15:08PM +0000, Philip Oakley via GitGitGadget wrote:
> From: Philip Oakley <philipoakley@iee.email>
> 
> Continue walking the code path for the >4GB `hash-object --literally`
> test. The `hash_object_file_literally()` function internally uses both
> `hash_object_file()` and `write_object_file_prepare()`. Both function
> signatures use `unsigned long` rather than `size_t` for the mem buffer
> sizes. Use `size_t` instead, for LLP64 compatibility.
> 
> While at it, convert those function's object's header buffer length to
> `size_t` for consistency. The value is already upcast to `uintmax_t` for
> print format compatibility.

One thing I was wondering is whether we should rather migrate to a size
that is consistent across different platforms. We could e.g. `typedef
uint64_t objsize_t` and then use that going forward.

I guess the question though is whether that'd buy us anything. In other
words, are there any platforms that we care about where `size_t` is only
32 bit wide? And would such platforms even be able to handle such large
objects?

Patrick

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox