git.vger.kernel.org archive mirror
* [PATCH 00/22] Memory leak fixes (pt.4)
@ 2024-08-06  8:59 Patrick Steinhardt
  2024-08-06  8:59 ` [PATCH 01/22] remote: plug memory leak when aliasing URLs Patrick Steinhardt
                   ` (26 more replies)
  0 siblings, 27 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-06  8:59 UTC (permalink / raw)
  To: git

Hi,

the third set of memory leak fixes was merged to `next`, so this is the
next batch of more or less random memory leak fixes from all over the
code base. With this series, we're at ~155 leaking test suites.
Naturally, I've already got part 5 in the pipeline, which brings us down
to ~120.

The series is built on top of 406f326d27 (The second batch, 2024-08-01)
with ps/leakfixes-part-3 at f30bfafcd4 (commit-reach: fix trivial memory
leak when computing reachability, 2024-08-01) merged into it.

Thanks!

Patrick

Patrick Steinhardt (22):
  remote: plug memory leak when aliasing URLs
  git: fix leaking system paths
  object-file: fix memory leak when reading corrupted headers
  object-name: fix leaking symlink paths in object context
  bulk-checkin: fix leaking state TODO
  read-cache: fix leaking hashfile when writing index fails
  submodule-config: fix leaking name entry when traversing submodules
  config: fix leaking comment character config
  builtin/rebase: fix leaking `commit.gpgsign` value
  builtin/notes: fix leaking `struct notes_tree` when merging notes
  builtin/fast-import: plug trivial memory leaks
  builtin/fast-export: fix leaking diff options
  builtin/fast-export: plug leaking tag names
  merge-ort: unconditionally release attributes index
  sequencer: release todo list on error paths
  unpack-trees: clear index when not propagating it
  diff: fix leak when parsing invalid ignore regex option
  builtin/format-patch: fix various trivial memory leaks
  userdiff: fix leaking memory for configured diff drivers
  builtin/log: fix leak when showing converted blob contents
  diff: free state populated via options
  builtin/diff: free symmetric diff members

 builtin/diff.c                        | 10 ++-
 builtin/fast-export.c                 | 19 ++++--
 builtin/fast-import.c                 |  8 ++-
 builtin/log.c                         | 13 +++-
 builtin/notes.c                       |  9 ++-
 builtin/rebase.c                      |  8 +++
 bulk-checkin.c                        |  2 +
 config.c                              |  2 +
 csum-file.c                           |  2 +-
 csum-file.h                           | 10 +++
 diff.c                                | 16 ++++-
 environment.c                         |  3 +-
 environment.h                         |  1 +
 git.c                                 | 12 +++-
 merge-ort.c                           |  3 +-
 object-file.c                         |  1 +
 object-name.c                         |  1 +
 range-diff.c                          |  6 +-
 read-cache.c                          | 97 ++++++++++++++++-----------
 remote.c                              |  2 +
 sequencer.c                           | 65 +++++++++++++-----
 submodule-config.c                    | 18 +++--
 t/t0210-trace2-normal.sh              |  2 +-
 t/t1006-cat-file.sh                   |  1 +
 t/t1050-large.sh                      |  1 +
 t/t1450-fsck.sh                       |  1 +
 t/t1601-index-bogus.sh                |  2 +
 t/t2107-update-index-basic.sh         |  1 +
 t/t3310-notes-merge-manual-resolve.sh |  1 +
 t/t3311-notes-merge-fanout.sh         |  1 +
 t/t3404-rebase-interactive.sh         |  1 +
 t/t3435-rebase-gpg-sign.sh            |  1 +
 t/t3507-cherry-pick-conflict.sh       |  1 +
 t/t3510-cherry-pick-sequence.sh       |  1 +
 t/t3705-add-sparse-checkout.sh        |  1 +
 t/t4013-diff-various.sh               |  1 +
 t/t4014-format-patch.sh               |  1 +
 t/t4018-diff-funcname.sh              |  1 +
 t/t4030-diff-textconv.sh              |  2 +
 t/t4042-diff-textconv-caching.sh      |  2 +
 t/t4048-diff-combined-binary.sh       |  1 +
 t/t4064-diff-oidfind.sh               |  2 +
 t/t4065-diff-anchored.sh              |  1 +
 t/t4068-diff-symmetric-merge-base.sh  |  1 +
 t/t4069-remerge-diff.sh               |  1 +
 t/t4108-apply-threeway.sh             |  1 +
 t/t4209-log-pickaxe.sh                |  2 +
 t/t6421-merge-partial-clone.sh        |  1 +
 t/t6428-merge-conflicts-sparse.sh     |  1 +
 t/t7008-filter-branch-null-sha1.sh    |  1 +
 t/t7030-verify-tag.sh                 |  1 +
 t/t7817-grep-sparse-checkout.sh       |  1 +
 t/t9300-fast-import.sh                |  1 +
 t/t9304-fast-import-marks.sh          |  2 +
 t/t9351-fast-export-anonymize.sh      |  1 +
 unpack-trees.c                        |  2 +
 userdiff.c                            | 38 ++++++++---
 userdiff.h                            |  4 ++
 58 files changed, 289 insertions(+), 103 deletions(-)

-- 
2.46.0.dirty


* [PATCH 01/22] remote: plug memory leak when aliasing URLs
  2024-08-06  8:59 [PATCH 00/22] Memory leak fixes (pt.4) Patrick Steinhardt
@ 2024-08-06  8:59 ` Patrick Steinhardt
  2024-08-06  8:59 ` [PATCH 02/22] git: fix leaking system paths Patrick Steinhardt
                   ` (25 subsequent siblings)
  26 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-06  8:59 UTC (permalink / raw)
  To: git

When we have a `url.*.insteadOf` configuration, then we end up aliasing
URLs when populating remotes. One place where this happens is in
`alias_all_urls()`, where we loop through all remotes and then alias
each of their URLs. The actual aliasing logic is then contained in
`alias_url()`, which returns an allocated string that contains the new
URL. This URL replaces the old URL that we have in the strvec that
contains all remote URLs.

We replace the remote URLs via `strvec_replace()`, which does not hand
over ownership of the new string to the vector. Still, we didn't free
the aliased URL and thus have a memory leak here. Fix it by freeing the
aliased string.
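
The ownership rule described above can be sketched roughly as follows.
This is only an illustration: `struct str_list`, `str_list_replace()`,
`make_alias()` and the "gh:" prefix are hypothetical stand-ins and not
Git's actual `strvec` or aliasing code. The point is that a replace
function which copies its argument leaves the caller responsible for
freeing the original.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a string vector; not Git's `struct strvec`. */
struct str_list {
	char **v;
	size_t nr;
};

/* Copies `s` into slot `idx`; the caller keeps ownership of `s`. */
static void str_list_replace(struct str_list *list, size_t idx, const char *s)
{
	char *copy = malloc(strlen(s) + 1);
	assert(copy && idx < list->nr);
	strcpy(copy, s);
	free(list->v[idx]);
	list->v[idx] = copy;
}

/* Hypothetical aliasing helper: returns a new heap string or NULL. */
static char *make_alias(const char *url)
{
	static const char from[] = "gh:", to[] = "https://github.com/";
	char *out;

	if (strncmp(url, from, strlen(from)))
		return NULL;
	out = malloc(strlen(to) + strlen(url) - strlen(from) + 1);
	assert(out);
	strcpy(out, to);
	strcat(out, url + strlen(from));
	return out;
}

static void alias_all(struct str_list *list)
{
	for (size_t i = 0; i < list->nr; i++) {
		char *alias = make_alias(list->v[i]);
		if (alias)
			str_list_replace(list, i, alias);
		free(alias); /* free(NULL) is a no-op, so no extra check */
	}
}
```

Freeing unconditionally after the `if` keeps the loop body short, since
`free(NULL)` is defined to do nothing.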

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 remote.c                 | 2 ++
 t/t0210-trace2-normal.sh | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/remote.c b/remote.c
index f43cf5e7a4..3b898edd23 100644
--- a/remote.c
+++ b/remote.c
@@ -499,6 +499,7 @@ static void alias_all_urls(struct remote_state *remote_state)
 			if (alias)
 				strvec_replace(&remote_state->remotes[i]->pushurl,
 					       j, alias);
+			free(alias);
 		}
 		add_pushurl_aliases = remote_state->remotes[i]->pushurl.nr == 0;
 		for (j = 0; j < remote_state->remotes[i]->url.nr; j++) {
@@ -512,6 +513,7 @@ static void alias_all_urls(struct remote_state *remote_state)
 			if (alias)
 				strvec_replace(&remote_state->remotes[i]->url,
 					       j, alias);
+			free(alias);
 		}
 	}
 }
diff --git a/t/t0210-trace2-normal.sh b/t/t0210-trace2-normal.sh
index c312657a12..b9adc94aab 100755
--- a/t/t0210-trace2-normal.sh
+++ b/t/t0210-trace2-normal.sh
@@ -2,7 +2,7 @@
 
 test_description='test trace2 facility (normal target)'
 
-TEST_PASSES_SANITIZE_LEAK=false
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 # Turn off any inherited trace2 settings for this test.
-- 
2.46.0.dirty


* [PATCH 02/22] git: fix leaking system paths
  2024-08-06  8:59 [PATCH 00/22] Memory leak fixes (pt.4) Patrick Steinhardt
  2024-08-06  8:59 ` [PATCH 01/22] remote: plug memory leak when aliasing URLs Patrick Steinhardt
@ 2024-08-06  8:59 ` Patrick Steinhardt
  2024-08-07  4:02   ` James Liu
  2024-08-06  8:59 ` [PATCH 03/22] object-file: fix memory leak when reading corrupted headers Patrick Steinhardt
                   ` (24 subsequent siblings)
  26 siblings, 1 reply; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-06  8:59 UTC (permalink / raw)
  To: git

Git has some flags to make it output system paths as they have been
compiled into Git. This is done by calling `system_path()`, which
returns an allocated string. This string isn't ever free'd though,
creating a memory leak.

Plug those leaks. While they are surfaced by t0211, that test suite
exposes further memory leaks and thus does not yet pass with the memory
leak checker enabled.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 git.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/git.c b/git.c
index e35af9b0e5..5eab88b472 100644
--- a/git.c
+++ b/git.c
@@ -173,15 +173,21 @@ static int handle_options(const char ***argv, int *argc, int *envchanged)
 				exit(0);
 			}
 		} else if (!strcmp(cmd, "--html-path")) {
-			puts(system_path(GIT_HTML_PATH));
+			char *path = system_path(GIT_HTML_PATH);
+			puts(path);
+			free(path);
 			trace2_cmd_name("_query_");
 			exit(0);
 		} else if (!strcmp(cmd, "--man-path")) {
-			puts(system_path(GIT_MAN_PATH));
+			char *path = system_path(GIT_MAN_PATH);
+			puts(path);
+			free(path);
 			trace2_cmd_name("_query_");
 			exit(0);
 		} else if (!strcmp(cmd, "--info-path")) {
-			puts(system_path(GIT_INFO_PATH));
+			char *path = system_path(GIT_INFO_PATH);
+			puts(path);
+			free(path);
 			trace2_cmd_name("_query_");
 			exit(0);
 		} else if (!strcmp(cmd, "-p") || !strcmp(cmd, "--paginate")) {
-- 
2.46.0.dirty


* [PATCH 03/22] object-file: fix memory leak when reading corrupted headers
  2024-08-06  8:59 [PATCH 00/22] Memory leak fixes (pt.4) Patrick Steinhardt
  2024-08-06  8:59 ` [PATCH 01/22] remote: plug memory leak when aliasing URLs Patrick Steinhardt
  2024-08-06  8:59 ` [PATCH 02/22] git: fix leaking system paths Patrick Steinhardt
@ 2024-08-06  8:59 ` Patrick Steinhardt
  2024-08-06  8:59 ` [PATCH 04/22] object-name: fix leaking symlink paths in object context Patrick Steinhardt
                   ` (23 subsequent siblings)
  26 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-06  8:59 UTC (permalink / raw)
  To: git

When reading corrupt object headers in `read_loose_object()`, then we
bail out immediately. This causes a memory leak though because we would
have already initialized the zstream in `unpack_loose_header()`, and it
is the caller's responsibility to finish the zstream even on error. While
this feels weird, other callsites do it correctly already.

Fix this leak by ending the zstream even on errors. We may want to
revisit this interface in the future such that the callee handles this
for us already when there was an error.
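
A minimal sketch of this calling convention, under the assumption of a
made-up stream type: `struct stream`, `stream_begin()`, `stream_end()`
and the `live_streams` counter are hypothetical and merely mimic an
interface where the callee initializes the stream and the caller has to
end it, even when initialization reports an error.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stream type standing in for a zlib-backed stream. */
struct stream {
	char *state; /* heap state allocated by stream_begin() */
};

static int live_streams; /* streams that were begun but not yet ended */

/*
 * Initializes `s` and checks the header. Like the interface described
 * above, it leaves `s` initialized even when it reports an error.
 */
static int stream_begin(struct stream *s, const char *data)
{
	s->state = malloc(64);
	assert(s->state);
	live_streams++;
	return strncmp(data, "blob ", 5) ? -1 : 0;
}

static void stream_end(struct stream *s)
{
	free(s->state);
	s->state = NULL;
	live_streams--;
}

static int read_object(const char *data)
{
	struct stream s;
	int ret = -1;

	if (stream_begin(&s, data) < 0) {
		stream_end(&s); /* the caller cleans up even on error */
		goto out;
	}
	/* ... a real reader would consume the body here ... */
	stream_end(&s);
	ret = 0;
out:
	return ret;
}
```

After calling `read_object()` with both a valid and an invalid header,
`live_streams` is back at zero, which is what a leak checker would
verify for the real code.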

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 object-file.c   | 1 +
 t/t1450-fsck.sh | 1 +
 2 files changed, 2 insertions(+)

diff --git a/object-file.c b/object-file.c
index 065103be3e..7c65c435cd 100644
--- a/object-file.c
+++ b/object-file.c
@@ -2954,6 +2954,7 @@ int read_loose_object(const char *path,
 	if (unpack_loose_header(&stream, map, mapsize, hdr, sizeof(hdr),
 				NULL) != ULHR_OK) {
 		error(_("unable to unpack header of %s"), path);
+		git_inflate_end(&stream);
 		goto out;
 	}
 
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index 8a456b1142..280cbf3e03 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -6,6 +6,7 @@ test_description='git fsck random collection of tests
 * (main) A
 '
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success setup '
-- 
2.46.0.dirty


* [PATCH 04/22] object-name: fix leaking symlink paths in object context
  2024-08-06  8:59 [PATCH 00/22] Memory leak fixes (pt.4) Patrick Steinhardt
                   ` (2 preceding siblings ...)
  2024-08-06  8:59 ` [PATCH 03/22] object-file: fix memory leak when reading corrupted headers Patrick Steinhardt
@ 2024-08-06  8:59 ` Patrick Steinhardt
  2024-08-06  8:59 ` [PATCH 05/22] bulk-checkin: fix leaking state TODO Patrick Steinhardt
                   ` (22 subsequent siblings)
  26 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-06  8:59 UTC (permalink / raw)
  To: git

The object context may be populated with symlink contents when reading a
symlink, but the associated strbuf doesn't ever get released when
releasing the object context, causing a memory leak. Plug it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 object-name.c       | 1 +
 t/t1006-cat-file.sh | 1 +
 2 files changed, 2 insertions(+)

diff --git a/object-name.c b/object-name.c
index 240a93e7ce..e39fa50e47 100644
--- a/object-name.c
+++ b/object-name.c
@@ -1765,6 +1765,7 @@ int strbuf_check_branch_ref(struct strbuf *sb, const char *name)
 void object_context_release(struct object_context *ctx)
 {
 	free(ctx->path);
+	strbuf_release(&ctx->symlink_path);
 }
 
 /*
diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index ff9bf213aa..d36cd7c086 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -2,6 +2,7 @@
 
 test_description='git cat-file'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_cmdmode_usage () {
-- 
2.46.0.dirty


* [PATCH 05/22] bulk-checkin: fix leaking state TODO
  2024-08-06  8:59 [PATCH 00/22] Memory leak fixes (pt.4) Patrick Steinhardt
                   ` (3 preceding siblings ...)
  2024-08-06  8:59 ` [PATCH 04/22] object-name: fix leaking symlink paths in object context Patrick Steinhardt
@ 2024-08-06  8:59 ` Patrick Steinhardt
  2024-08-06  9:00 ` [PATCH 06/22] read-cache: fix leaking hashfile when writing index fails Patrick Steinhardt
                   ` (21 subsequent siblings)
  26 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-06  8:59 UTC (permalink / raw)
  To: git

When flushing a bulk-checkin to disk we also reset the `struct
bulk_checkin_packfile` state. But while we free some of its members,
others aren't being free'd, leading to memory leaks:

  - The temporary packfile name is not getting freed.

  - The `struct hashfile` only gets freed in case we end up calling
    `finalize_hashfile()`. There are code paths though where that is not
    the case, namely when nothing has been written. For this, we need to
    make `free_hashfile()` public.

Fix those leaks.
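
The two cases listed above can be sketched as follows. All names here
(`struct writer`, `writer_free()`, `writer_finalize()`, `struct state`,
`flush_state()`) are hypothetical and only model the idea: an object
that is normally freed by its finalizer needs an explicit free on the
path where the finalizer is never reached.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical buffered writer; not Git's `struct hashfile`. */
struct writer {
	unsigned char *buffer;
};

static int live_writers; /* writers allocated but not yet freed */

static struct writer *writer_new(void)
{
	struct writer *w = malloc(sizeof(*w));
	assert(w);
	w->buffer = malloc(4096);
	assert(w->buffer);
	live_writers++;
	return w;
}

/* Frees the writer without flushing anything. */
static void writer_free(struct writer *w)
{
	free(w->buffer);
	free(w);
	live_writers--;
}

/* Flushes (elided here) and then frees the writer. */
static void writer_finalize(struct writer *w)
{
	/* ... flush buffered data ... */
	writer_free(w);
}

struct state {
	struct writer *w;
	char *tmp_name;
	unsigned nr_written;
};

static void flush_state(struct state *st)
{
	if (!st->nr_written)
		writer_free(st->w); /* nothing to finalize, free directly */
	else
		writer_finalize(st->w);
	free(st->tmp_name); /* released on every path */
	st->w = NULL;
	st->tmp_name = NULL;
	st->nr_written = 0;
}
```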

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 bulk-checkin.c   |  2 ++
 csum-file.c      |  2 +-
 csum-file.h      | 10 ++++++++++
 t/t1050-large.sh |  1 +
 4 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/bulk-checkin.c b/bulk-checkin.c
index da8673199b..9089c214fa 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -61,6 +61,7 @@ static void flush_bulk_checkin_packfile(struct bulk_checkin_packfile *state)
 
 	if (state->nr_written == 0) {
 		close(state->f->fd);
+		free_hashfile(state->f);
 		unlink(state->pack_tmp_name);
 		goto clear_exit;
 	} else if (state->nr_written == 1) {
@@ -83,6 +84,7 @@ static void flush_bulk_checkin_packfile(struct bulk_checkin_packfile *state)
 		free(state->written[i]);
 
 clear_exit:
+	free(state->pack_tmp_name);
 	free(state->written);
 	memset(state, 0, sizeof(*state));
 
diff --git a/csum-file.c b/csum-file.c
index 8abbf01325..7e0ece1305 100644
--- a/csum-file.c
+++ b/csum-file.c
@@ -56,7 +56,7 @@ void hashflush(struct hashfile *f)
 	}
 }
 
-static void free_hashfile(struct hashfile *f)
+void free_hashfile(struct hashfile *f)
 {
 	free(f->buffer);
 	free(f->check_buffer);
diff --git a/csum-file.h b/csum-file.h
index 566e05cbd2..ca553eba17 100644
--- a/csum-file.h
+++ b/csum-file.h
@@ -46,6 +46,16 @@ int hashfile_truncate(struct hashfile *, struct hashfile_checkpoint *);
 struct hashfile *hashfd(int fd, const char *name);
 struct hashfile *hashfd_check(const char *name);
 struct hashfile *hashfd_throughput(int fd, const char *name, struct progress *tp);
+
+/*
+ * Free the hashfile without flushing its contents to disk. This only
+ * needs to be called when not calling `finalize_hashfile()`.
+ */
+void free_hashfile(struct hashfile *f);
+
+/*
+ * Finalize the hashfile by flushing data to disk and free'ing it.
+ */
 int finalize_hashfile(struct hashfile *, unsigned char *, enum fsync_component, unsigned int);
 void hashwrite(struct hashfile *, const void *, unsigned int);
 void hashflush(struct hashfile *f);
diff --git a/t/t1050-large.sh b/t/t1050-large.sh
index c71932b024..ed638f6644 100755
--- a/t/t1050-large.sh
+++ b/t/t1050-large.sh
@@ -3,6 +3,7 @@
 
 test_description='adding and checking out large blobs'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'core.bigFileThreshold must be non-negative' '
-- 
2.46.0.dirty


* [PATCH 06/22] read-cache: fix leaking hashfile when writing index fails
  2024-08-06  8:59 [PATCH 00/22] Memory leak fixes (pt.4) Patrick Steinhardt
                   ` (4 preceding siblings ...)
  2024-08-06  8:59 ` [PATCH 05/22] bulk-checkin: fix leaking state TODO Patrick Steinhardt
@ 2024-08-06  9:00 ` Patrick Steinhardt
  2024-08-07  7:01   ` James Liu
  2024-08-06  9:00 ` [PATCH 07/22] submodule-config: fix leaking name entry when traversing submodules Patrick Steinhardt
                   ` (20 subsequent siblings)
  26 siblings, 1 reply; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-06  9:00 UTC (permalink / raw)
  To: git

In `do_write_index()`, we use a `struct hashfile` to write the index
with a trailer hash. In case the write fails though, we never clean up
the allocated `hashfile` state and thus leak memory.

Refactor the code to have a common exit path where we can free this and
other allocated memory. While at it, refactor our use of `strbuf`s such
that we reuse the same buffer to avoid some unneeded allocations.
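
The shape of such a refactoring, reduced to a toy example:
`write_sections()` is a hypothetical function, unrelated to the index
format, that shows a single `out:` label releasing a scratch buffer
which is reused across iterations instead of being allocated and
released per section.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical writer: formats each section into one reused buffer. */
static int write_sections(FILE *out, const char **sections, size_t nr)
{
	char *scratch = NULL; /* reused across sections */
	size_t alloc = 0;
	int ret;

	for (size_t i = 0; i < nr; i++) {
		size_t need = strlen(sections[i]) + 2;

		if (!sections[i][0]) { /* treat an empty section as an error */
			ret = -1;
			goto out;
		}
		if (need > alloc) {
			char *grown = realloc(scratch, need);
			if (!grown) {
				ret = -1;
				goto out;
			}
			scratch = grown;
			alloc = need;
		}
		snprintf(scratch, alloc, "%s\n", sections[i]);
		if (fputs(scratch, out) == EOF) {
			ret = -1;
			goto out;
		}
	}
	ret = 0;

out:
	free(scratch); /* single cleanup point for every exit */
	return ret;
}
```

With one exit label, adding a new error path later cannot forget the
cleanup, which is the main benefit over scattered early returns.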

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 read-cache.c                       | 97 ++++++++++++++++++------------
 t/t1601-index-bogus.sh             |  2 +
 t/t2107-update-index-basic.sh      |  1 +
 t/t7008-filter-branch-null-sha1.sh |  1 +
 4 files changed, 62 insertions(+), 39 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index 48bf24f87c..36821fe5b5 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2840,8 +2840,9 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 	int csum_fsync_flag;
 	int ieot_entries = 1;
 	struct index_entry_offset_table *ieot = NULL;
-	int nr, nr_threads;
 	struct repository *r = istate->repo;
+	struct strbuf sb = STRBUF_INIT;
+	int nr, nr_threads, ret;
 
 	f = hashfd(tempfile->fd, tempfile->filename.buf);
 
@@ -2962,8 +2963,8 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 	strbuf_release(&previous_name_buf);
 
 	if (err) {
-		free(ieot);
-		return err;
+		ret = err;
+		goto out;
 	}
 
 	offset = hashfile_total(f);
@@ -2985,20 +2986,20 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 	 * index.
 	 */
 	if (ieot) {
-		struct strbuf sb = STRBUF_INIT;
+		strbuf_reset(&sb);
 
 		write_ieot_extension(&sb, ieot);
 		err = write_index_ext_header(f, eoie_c, CACHE_EXT_INDEXENTRYOFFSETTABLE, sb.len) < 0;
 		hashwrite(f, sb.buf, sb.len);
-		strbuf_release(&sb);
-		free(ieot);
-		if (err)
-			return -1;
+		if (err) {
+			ret = -1;
+			goto out;
+		}
 	}
 
 	if (write_extensions & WRITE_SPLIT_INDEX_EXTENSION &&
 	    istate->split_index) {
-		struct strbuf sb = STRBUF_INIT;
+		strbuf_reset(&sb);
 
 		if (istate->sparse_index)
 			die(_("cannot write split index for a sparse index"));
@@ -3007,59 +3008,66 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 			write_index_ext_header(f, eoie_c, CACHE_EXT_LINK,
 					       sb.len) < 0;
 		hashwrite(f, sb.buf, sb.len);
-		strbuf_release(&sb);
-		if (err)
-			return -1;
+		if (err) {
+			ret = -1;
+			goto out;
+		}
 	}
 	if (write_extensions & WRITE_CACHE_TREE_EXTENSION &&
 	    !drop_cache_tree && istate->cache_tree) {
-		struct strbuf sb = STRBUF_INIT;
+		strbuf_reset(&sb);
 
 		cache_tree_write(&sb, istate->cache_tree);
 		err = write_index_ext_header(f, eoie_c, CACHE_EXT_TREE, sb.len) < 0;
 		hashwrite(f, sb.buf, sb.len);
-		strbuf_release(&sb);
-		if (err)
-			return -1;
+		if (err) {
+			ret = -1;
+			goto out;
+		}
 	}
 	if (write_extensions & WRITE_RESOLVE_UNDO_EXTENSION &&
 	    istate->resolve_undo) {
-		struct strbuf sb = STRBUF_INIT;
+		strbuf_reset(&sb);
 
 		resolve_undo_write(&sb, istate->resolve_undo);
 		err = write_index_ext_header(f, eoie_c, CACHE_EXT_RESOLVE_UNDO,
 					     sb.len) < 0;
 		hashwrite(f, sb.buf, sb.len);
-		strbuf_release(&sb);
-		if (err)
-			return -1;
+		if (err) {
+			ret = -1;
+			goto out;
+		}
 	}
 	if (write_extensions & WRITE_UNTRACKED_CACHE_EXTENSION &&
 	    istate->untracked) {
-		struct strbuf sb = STRBUF_INIT;
+		strbuf_reset(&sb);
 
 		write_untracked_extension(&sb, istate->untracked);
 		err = write_index_ext_header(f, eoie_c, CACHE_EXT_UNTRACKED,
 					     sb.len) < 0;
 		hashwrite(f, sb.buf, sb.len);
-		strbuf_release(&sb);
-		if (err)
-			return -1;
+		if (err) {
+			ret = -1;
+			goto out;
+		}
 	}
 	if (write_extensions & WRITE_FSMONITOR_EXTENSION &&
 	    istate->fsmonitor_last_update) {
-		struct strbuf sb = STRBUF_INIT;
+		strbuf_reset(&sb);
 
 		write_fsmonitor_extension(&sb, istate);
 		err = write_index_ext_header(f, eoie_c, CACHE_EXT_FSMONITOR, sb.len) < 0;
 		hashwrite(f, sb.buf, sb.len);
-		strbuf_release(&sb);
-		if (err)
-			return -1;
+		if (err) {
+			ret = -1;
+			goto out;
+		}
 	}
 	if (istate->sparse_index) {
-		if (write_index_ext_header(f, eoie_c, CACHE_EXT_SPARSE_DIRECTORIES, 0) < 0)
-			return -1;
+		if (write_index_ext_header(f, eoie_c, CACHE_EXT_SPARSE_DIRECTORIES, 0) < 0) {
+			ret = -1;
+			goto out;
+		}
 	}
 
 	/*
@@ -3069,14 +3077,15 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 	 * when loading the shared index.
 	 */
 	if (eoie_c) {
-		struct strbuf sb = STRBUF_INIT;
+		strbuf_reset(&sb);
 
 		write_eoie_extension(&sb, eoie_c, offset);
 		err = write_index_ext_header(f, NULL, CACHE_EXT_ENDOFINDEXENTRIES, sb.len) < 0;
 		hashwrite(f, sb.buf, sb.len);
-		strbuf_release(&sb);
-		if (err)
-			return -1;
+		if (err) {
+			ret = -1;
+			goto out;
+		}
 	}
 
 	csum_fsync_flag = 0;
@@ -3085,13 +3094,16 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 
 	finalize_hashfile(f, istate->oid.hash, FSYNC_COMPONENT_INDEX,
 			  CSUM_HASH_IN_STREAM | csum_fsync_flag);
+	f = NULL;
 
 	if (close_tempfile_gently(tempfile)) {
-		error(_("could not close '%s'"), get_tempfile_path(tempfile));
-		return -1;
+		ret = error(_("could not close '%s'"), get_tempfile_path(tempfile));
+		goto out;
+	}
+	if (stat(get_tempfile_path(tempfile), &st)) {
+		ret = -1;
+		goto out;
 	}
-	if (stat(get_tempfile_path(tempfile), &st))
-		return -1;
 	istate->timestamp.sec = (unsigned int)st.st_mtime;
 	istate->timestamp.nsec = ST_MTIME_NSEC(st);
 	trace_performance_since(start, "write index, changed mask = %x", istate->cache_changed);
@@ -3105,7 +3117,14 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 	trace2_data_intmax("index", the_repository, "write/cache_nr",
 			   istate->cache_nr);
 
-	return 0;
+	ret = 0;
+
+out:
+	if (f)
+		free_hashfile(f);
+	strbuf_release(&sb);
+	free(ieot);
+	return ret;
 }
 
 void set_alternate_index_output(const char *name)
diff --git a/t/t1601-index-bogus.sh b/t/t1601-index-bogus.sh
index 4171f1e141..5dcc101882 100755
--- a/t/t1601-index-bogus.sh
+++ b/t/t1601-index-bogus.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='test handling of bogus index entries'
+
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'create tree with null sha1' '
diff --git a/t/t2107-update-index-basic.sh b/t/t2107-update-index-basic.sh
index cc72ead79f..f0eab13f96 100755
--- a/t/t2107-update-index-basic.sh
+++ b/t/t2107-update-index-basic.sh
@@ -5,6 +5,7 @@ test_description='basic update-index tests
 Tests for command-line parsing and basic operation.
 '
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'update-index --nonsense fails' '
diff --git a/t/t7008-filter-branch-null-sha1.sh b/t/t7008-filter-branch-null-sha1.sh
index 93fbc92b8d..0ce8fd2c89 100755
--- a/t/t7008-filter-branch-null-sha1.sh
+++ b/t/t7008-filter-branch-null-sha1.sh
@@ -2,6 +2,7 @@
 
 test_description='filter-branch removal of trees with null sha1'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'setup: base commits' '
-- 
2.46.0.dirty


* [PATCH 07/22] submodule-config: fix leaking name entry when traversing submodules
  2024-08-06  8:59 [PATCH 00/22] Memory leak fixes (pt.4) Patrick Steinhardt
                   ` (5 preceding siblings ...)
  2024-08-06  9:00 ` [PATCH 06/22] read-cache: fix leaking hashfile when writing index fails Patrick Steinhardt
@ 2024-08-06  9:00 ` Patrick Steinhardt
  2024-08-06  9:00 ` [PATCH 08/22] config: fix leaking comment character config Patrick Steinhardt
                   ` (19 subsequent siblings)
  26 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-06  9:00 UTC (permalink / raw)
  To: git

We traverse through submodules in the tree via `tree_entry()`, passing
to it a `struct name_entry` that it is supposed to populate with the
tree entry's contents. We unnecessarily allocate this variable instead
of passing a variable that is allocated on the stack, and then ultimately
don't even free that variable. This is unnecessary and leaks memory.

Convert the variable to instead be allocated on the stack to plug the
memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 submodule-config.c | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/submodule-config.c b/submodule-config.c
index 9b0bb0b9f4..c8f2bb2bdd 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -899,27 +899,25 @@ static void traverse_tree_submodules(struct repository *r,
 {
 	struct tree_desc tree;
 	struct submodule_tree_entry *st_entry;
-	struct name_entry *name_entry;
+	struct name_entry name_entry;
 	char *tree_path = NULL;
 
-	name_entry = xmalloc(sizeof(*name_entry));
-
 	fill_tree_descriptor(r, &tree, treeish_name);
-	while (tree_entry(&tree, name_entry)) {
+	while (tree_entry(&tree, &name_entry)) {
 		if (prefix)
 			tree_path =
-				mkpathdup("%s/%s", prefix, name_entry->path);
+				mkpathdup("%s/%s", prefix, name_entry.path);
 		else
-			tree_path = xstrdup(name_entry->path);
+			tree_path = xstrdup(name_entry.path);
 
-		if (S_ISGITLINK(name_entry->mode) &&
+		if (S_ISGITLINK(name_entry.mode) &&
 		    is_tree_submodule_active(r, root_tree, tree_path)) {
 			ALLOC_GROW(out->entries, out->entry_nr + 1,
 				   out->entry_alloc);
 			st_entry = &out->entries[out->entry_nr++];
 
 			st_entry->name_entry = xmalloc(sizeof(*st_entry->name_entry));
-			*st_entry->name_entry = *name_entry;
+			*st_entry->name_entry = name_entry;
 			st_entry->submodule =
 				submodule_from_path(r, root_tree, tree_path);
 			st_entry->repo = xmalloc(sizeof(*st_entry->repo));
@@ -927,9 +925,9 @@ static void traverse_tree_submodules(struct repository *r,
 						root_tree))
 				FREE_AND_NULL(st_entry->repo);
 
-		} else if (S_ISDIR(name_entry->mode))
+		} else if (S_ISDIR(name_entry.mode))
 			traverse_tree_submodules(r, root_tree, tree_path,
-						 &name_entry->oid, out);
+						 &name_entry.oid, out);
 		free(tree_path);
 	}
 }
-- 
2.46.0.dirty


* [PATCH 08/22] config: fix leaking comment character config
  2024-08-06  8:59 [PATCH 00/22] Memory leak fixes (pt.4) Patrick Steinhardt
                   ` (6 preceding siblings ...)
  2024-08-06  9:00 ` [PATCH 07/22] submodule-config: fix leaking name entry when traversing submodules Patrick Steinhardt
@ 2024-08-06  9:00 ` Patrick Steinhardt
  2024-08-07  7:11   ` James Liu
  2024-08-06  9:00 ` [PATCH 09/22] builtin/rebase: fix leaking `commit.gpgsign` value Patrick Steinhardt
                   ` (18 subsequent siblings)
  26 siblings, 1 reply; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-06  9:00 UTC (permalink / raw)
  To: git

When the comment line character has been specified multiple times in the
configuration, then `git_default_core_config()` will cause a memory leak
because it unconditionally copies the string into `comment_line_str`
without free'ing the previous value. In fact, it can't easily free the
value in the first place because it may contain a string constant.

Refactor the code so that we initialize the value with another array.
This allows us to free the value in case the string is not pointing to
that constant array anymore.

This memory leak is being hit in t3404. As there are still other memory
leaks in that file we cannot yet mark it as passing with leak checking
enabled.
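
The technique in isolation looks roughly like this. `marker`,
`marker_default` and `set_marker()` are hypothetical names chosen for
this sketch: the setting starts out pointing at a named static array,
and comparing against that array tells us whether the current value is
a heap copy that may be freed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical setting; the names are illustrative only. */
static const char marker_default[] = "#";
static const char *marker = marker_default;

static void set_marker(const char *value)
{
	char *copy = malloc(strlen(value) + 1);
	assert(copy);
	strcpy(copy, value);

	/* Only heap copies may be freed; the default array is static. */
	if (marker != marker_default)
		free((char *)marker);
	marker = copy;
}
```

A named array is used for the comparison because two occurrences of the
same string literal are not guaranteed to share an address, so comparing
against a literal such as "#" would not be reliable.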

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 config.c      | 2 ++
 environment.c | 3 ++-
 environment.h | 1 +
 3 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/config.c b/config.c
index 6421894614..63e0211c7d 100644
--- a/config.c
+++ b/config.c
@@ -1596,6 +1596,8 @@ static int git_default_core_config(const char *var, const char *value,
 		else if (value[0]) {
 			if (strchr(value, '\n'))
 				return error(_("%s cannot contain newline"), var);
+			if (comment_line_str != comment_line_str_default)
+				free((char *) comment_line_str);
 			comment_line_str = xstrdup(value);
 			auto_comment_line_char = 0;
 		} else
diff --git a/environment.c b/environment.c
index 5cea2c9f54..8297c6e37b 100644
--- a/environment.c
+++ b/environment.c
@@ -113,7 +113,8 @@ int protect_ntfs = PROTECT_NTFS_DEFAULT;
  * The character that begins a commented line in user-editable file
  * that is subject to stripspace.
  */
-const char *comment_line_str = "#";
+const char comment_line_str_default[] = "#";
+const char *comment_line_str = comment_line_str_default;
 int auto_comment_line_char;
 
 /* Parallel index stat data preload? */
diff --git a/environment.h b/environment.h
index e9f01d4d11..5e5d9a8045 100644
--- a/environment.h
+++ b/environment.h
@@ -8,6 +8,7 @@ struct strvec;
  * The character that begins a commented line in user-editable file
  * that is subject to stripspace.
  */
+extern const char comment_line_str_default[];
 extern const char *comment_line_str;
 extern int auto_comment_line_char;
 
-- 
2.46.0.dirty


* [PATCH 09/22] builtin/rebase: fix leaking `commit.gpgsign` value
  2024-08-06  8:59 [PATCH 00/22] Memory leak fixes (pt.4) Patrick Steinhardt
                   ` (7 preceding siblings ...)
  2024-08-06  9:00 ` [PATCH 08/22] config: fix leaking comment character config Patrick Steinhardt
@ 2024-08-06  9:00 ` Patrick Steinhardt
  2024-08-07  7:32   ` James Liu
  2024-08-08 10:07   ` Phillip Wood
  2024-08-06  9:00 ` [PATCH 10/22] builtin/notes: fix leaking `struct notes_tree` when merging notes Patrick Steinhardt
                   ` (17 subsequent siblings)
  26 siblings, 2 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-06  9:00 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 3085 bytes --]

In `get_replay_opts()`, we unconditionally override the `gpg_sign` field
that already got populated by `sequencer_init_config()` in case the user
has "commit.gpgsign" set in their config. It is kind of dubious whether
this is the correct thing to do or a bug. What is clear though is that
this creates a memory leak.

Let's mark this assignment with a TODO comment to figure out whether
this needs to be fixed or not. Meanwhile though, let's plug the memory
leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/rebase.c              | 8 ++++++++
 sequencer.c                   | 1 +
 t/t3404-rebase-interactive.sh | 1 +
 t/t3435-rebase-gpg-sign.sh    | 1 +
 t/t7030-verify-tag.sh         | 1 +
 5 files changed, 12 insertions(+)

diff --git a/builtin/rebase.c b/builtin/rebase.c
index e3a8e74cfc..f65316a023 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -186,7 +186,15 @@ static struct replay_opts get_replay_opts(const struct rebase_options *opts)
 	replay.committer_date_is_author_date =
 					opts->committer_date_is_author_date;
 	replay.ignore_date = opts->ignore_date;
+
+	/*
+	 * TODO: Is it really intentional that we unconditionally override
+	 * `replay.gpg_sign` even if it has already been initialized via the
+	 * configuration?
+	 */
+	free(replay.gpg_sign);
 	replay.gpg_sign = xstrdup_or_null(opts->gpg_sign_opt);
+
 	replay.reflog_action = xstrdup(opts->reflog_action);
 	if (opts->strategy)
 		replay.strategy = xstrdup_or_null(opts->strategy);
diff --git a/sequencer.c b/sequencer.c
index 0291920f0b..cade9b0ca8 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -303,6 +303,7 @@ static int git_sequencer_config(const char *k, const char *v,
 	}
 
 	if (!strcmp(k, "commit.gpgsign")) {
+		free(opts->gpg_sign);
 		opts->gpg_sign = git_config_bool(k, v) ? xstrdup("") : NULL;
 		return 0;
 	}
diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
index f92baad138..f171af3061 100755
--- a/t/t3404-rebase-interactive.sh
+++ b/t/t3404-rebase-interactive.sh
@@ -26,6 +26,7 @@ Initial setup:
  touch file "conflict".
 '
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 . "$TEST_DIRECTORY"/lib-rebase.sh
diff --git a/t/t3435-rebase-gpg-sign.sh b/t/t3435-rebase-gpg-sign.sh
index 6aa2aeb628..6e329fea7c 100755
--- a/t/t3435-rebase-gpg-sign.sh
+++ b/t/t3435-rebase-gpg-sign.sh
@@ -8,6 +8,7 @@ test_description='test rebase --[no-]gpg-sign'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 . "$TEST_DIRECTORY/lib-rebase.sh"
 . "$TEST_DIRECTORY/lib-gpg.sh"
diff --git a/t/t7030-verify-tag.sh b/t/t7030-verify-tag.sh
index 6f526c37c2..effa826744 100755
--- a/t/t7030-verify-tag.sh
+++ b/t/t7030-verify-tag.sh
@@ -4,6 +4,7 @@ test_description='signed tag tests'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 . "$TEST_DIRECTORY/lib-gpg.sh"
 
-- 
2.46.0.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

* [PATCH 10/22] builtin/notes: fix leaking `struct notes_tree` when merging notes
  2024-08-06  8:59 [PATCH 00/22] Memory leak fixes (pt.4) Patrick Steinhardt
                   ` (8 preceding siblings ...)
  2024-08-06  9:00 ` [PATCH 09/22] builtin/rebase: fix leaking `commit.gpgsign` value Patrick Steinhardt
@ 2024-08-06  9:00 ` Patrick Steinhardt
  2024-08-06  9:00 ` [PATCH 11/22] builtin/fast-import: plug trivial memory leaks Patrick Steinhardt
                   ` (16 subsequent siblings)
  26 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-06  9:00 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 2955 bytes --]

We allocate a `struct notes_tree` in `merge_commit()` which we then
initialize via `init_notes()`. It's not really necessary to allocate the
structure though given that we never pass ownership to the caller.
Furthermore, the allocation leads to a memory leak because despite its
name, `free_notes()` doesn't free the `notes_tree` but only clears it.

Fix this issue by converting the code to use an on-stack variable.
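
As a rough sketch of why the on-stack variant cannot leak the container,
consider the following. The `struct tree` type and its helpers are
hypothetical stand-ins and are not Git's `struct notes_tree` API; the
only assumption carried over from the text is that the "clear" function
releases members but not the structure itself.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a structure with owned members. */
struct tree {
	char *ref;
};

void tree_init(struct tree *t, const char *ref)
{
	t->ref = malloc(strlen(ref) + 1);
	if (!t->ref)
		abort();
	strcpy(t->ref, ref);
}

/* Releases the members only; the container itself is never freed. */
void tree_clear(struct tree *t)
{
	free(t->ref);
	t->ref = NULL;
}

/* Returns the length of the ref; there is no heap container to leak. */
size_t use_tree(const char *ref)
{
	struct tree t = {0};
	size_t len;

	tree_init(&t, ref);
	len = strlen(t.ref);
	tree_clear(&t);
	return len;
}
```

Had `t` been allocated on the heap, `tree_clear()` alone would have left
the allocation behind, which is the shape of the leak described above.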

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/notes.c                       | 9 ++++-----
 t/t3310-notes-merge-manual-resolve.sh | 1 +
 t/t3311-notes-merge-fanout.sh         | 1 +
 3 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/builtin/notes.c b/builtin/notes.c
index d9c356e354..81cbaeec6b 100644
--- a/builtin/notes.c
+++ b/builtin/notes.c
@@ -807,7 +807,7 @@ static int merge_commit(struct notes_merge_options *o)
 {
 	struct strbuf msg = STRBUF_INIT;
 	struct object_id oid, parent_oid;
-	struct notes_tree *t;
+	struct notes_tree t = {0};
 	struct commit *partial;
 	struct pretty_print_context pretty_ctx;
 	void *local_ref_to_free;
@@ -830,8 +830,7 @@ static int merge_commit(struct notes_merge_options *o)
 	else
 		oidclr(&parent_oid, the_repository->hash_algo);
 
-	CALLOC_ARRAY(t, 1);
-	init_notes(t, "NOTES_MERGE_PARTIAL", combine_notes_overwrite, 0);
+	init_notes(&t, "NOTES_MERGE_PARTIAL", combine_notes_overwrite, 0);
 
 	o->local_ref = local_ref_to_free =
 		refs_resolve_refdup(get_main_ref_store(the_repository),
@@ -839,7 +838,7 @@ static int merge_commit(struct notes_merge_options *o)
 	if (!o->local_ref)
 		die(_("failed to resolve NOTES_MERGE_REF"));
 
-	if (notes_merge_commit(o, t, partial, &oid))
+	if (notes_merge_commit(o, &t, partial, &oid))
 		die(_("failed to finalize notes merge"));
 
 	/* Reuse existing commit message in reflog message */
@@ -853,7 +852,7 @@ static int merge_commit(struct notes_merge_options *o)
 			is_null_oid(&parent_oid) ? NULL : &parent_oid,
 			0, UPDATE_REFS_DIE_ON_ERR);
 
-	free_notes(t);
+	free_notes(&t);
 	strbuf_release(&msg);
 	ret = merge_abort(o);
 	free(local_ref_to_free);
diff --git a/t/t3310-notes-merge-manual-resolve.sh b/t/t3310-notes-merge-manual-resolve.sh
index 597df5ebc0..04866b89be 100755
--- a/t/t3310-notes-merge-manual-resolve.sh
+++ b/t/t3310-notes-merge-manual-resolve.sh
@@ -5,6 +5,7 @@
 
 test_description='Test notes merging with manual conflict resolution'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 # Set up a notes merge scenario with different kinds of conflicts
diff --git a/t/t3311-notes-merge-fanout.sh b/t/t3311-notes-merge-fanout.sh
index 5b675417e9..ce4144db0f 100755
--- a/t/t3311-notes-merge-fanout.sh
+++ b/t/t3311-notes-merge-fanout.sh
@@ -5,6 +5,7 @@
 
 test_description='Test notes merging at various fanout levels'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 verify_notes () {
-- 
2.46.0.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

* [PATCH 11/22] builtin/fast-import: plug trivial memory leaks
  2024-08-06  8:59 [PATCH 00/22] Memory leak fixes (pt.4) Patrick Steinhardt
                   ` (9 preceding siblings ...)
  2024-08-06  9:00 ` [PATCH 10/22] builtin/notes: fix leaking `struct notes_tree` when merging notes Patrick Steinhardt
@ 2024-08-06  9:00 ` Patrick Steinhardt
  2024-08-06  9:00 ` [PATCH 12/22] builtin/fast-export: fix leaking diff options Patrick Steinhardt
                   ` (15 subsequent siblings)
  26 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-06  9:00 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 2478 bytes --]

Plug some trivial memory leaks in git-fast-import(1).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/fast-import.c        | 8 ++++++--
 t/t9300-fast-import.sh       | 1 +
 t/t9304-fast-import-marks.sh | 2 ++
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/builtin/fast-import.c b/builtin/fast-import.c
index d21c4053a7..6dfeb01665 100644
--- a/builtin/fast-import.c
+++ b/builtin/fast-import.c
@@ -206,8 +206,8 @@ static unsigned int object_entry_alloc = 5000;
 static struct object_entry_pool *blocks;
 static struct hashmap object_table;
 static struct mark_set *marks;
-static const char *export_marks_file;
-static const char *import_marks_file;
+static char *export_marks_file;
+static char *import_marks_file;
 static int import_marks_file_from_stream;
 static int import_marks_file_ignore_missing;
 static int import_marks_file_done;
@@ -3274,6 +3274,7 @@ static void option_import_marks(const char *marks,
 			read_marks();
 	}
 
+	free(import_marks_file);
 	import_marks_file = make_fast_import_path(marks);
 	import_marks_file_from_stream = from_stream;
 	import_marks_file_ignore_missing = ignore_missing;
@@ -3316,6 +3317,7 @@ static void option_active_branches(const char *branches)
 
 static void option_export_marks(const char *marks)
 {
+	free(export_marks_file);
 	export_marks_file = make_fast_import_path(marks);
 }
 
@@ -3357,6 +3359,8 @@ static void option_rewrite_submodules(const char *arg, struct string_list *list)
 	free(f);
 
 	string_list_insert(list, s)->util = ms;
+
+	free(s);
 }
 
 static int parse_one_option(const char *option)
diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
index 1e68426852..3b3c371740 100755
--- a/t/t9300-fast-import.sh
+++ b/t/t9300-fast-import.sh
@@ -7,6 +7,7 @@ test_description='test git fast-import utility'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-diff.sh ;# test-lib chdir's into trash
 
diff --git a/t/t9304-fast-import-marks.sh b/t/t9304-fast-import-marks.sh
index 410a871c52..1f776a80f3 100755
--- a/t/t9304-fast-import-marks.sh
+++ b/t/t9304-fast-import-marks.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='test exotic situations with marks'
+
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'setup dump of basic history' '
-- 
2.46.0.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

* [PATCH 12/22] builtin/fast-export: fix leaking diff options
  2024-08-06  8:59 [PATCH 00/22] Memory leak fixes (pt.4) Patrick Steinhardt
                   ` (10 preceding siblings ...)
  2024-08-06  9:00 ` [PATCH 11/22] builtin/fast-import: plug trivial memory leaks Patrick Steinhardt
@ 2024-08-06  9:00 ` Patrick Steinhardt
  2024-08-06  9:00 ` [PATCH 13/22] builtin/fast-export: plug leaking tag names Patrick Steinhardt
                   ` (14 subsequent siblings)
  26 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-06  9:00 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 1100 bytes --]

Before calling `handle_commit()` in a loop, we set `diffopt.no_free` such
that its contents aren't getting freed inside of `handle_commit()`. We
never unset that flag though, which means that it'll ultimately leak
when calling `release_revisions()`.

Fix this by unsetting the flag after the loop.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/fast-export.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index 4b6e8c6832..fe92d2436c 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -1278,9 +1278,11 @@ int cmd_fast_export(int argc, const char **argv, const char *prefix)
 	revs.diffopt.format_callback = show_filemodify;
 	revs.diffopt.format_callback_data = &paths_of_changed_objects;
 	revs.diffopt.flags.recursive = 1;
+
 	revs.diffopt.no_free = 1;
 	while ((commit = get_revision(&revs)))
 		handle_commit(commit, &revs, &paths_of_changed_objects);
+	revs.diffopt.no_free = 0;
 
 	handle_tags_and_duplicates(&extra_refs);
 	handle_tags_and_duplicates(&tag_refs);
-- 
2.46.0.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

* [PATCH 13/22] builtin/fast-export: plug leaking tag names
  2024-08-06  8:59 [PATCH 00/22] Memory leak fixes (pt.4) Patrick Steinhardt
                   ` (11 preceding siblings ...)
  2024-08-06  9:00 ` [PATCH 12/22] builtin/fast-export: fix leaking diff options Patrick Steinhardt
@ 2024-08-06  9:00 ` Patrick Steinhardt
  2024-08-07  8:31   ` James Liu
  2024-08-06  9:00 ` [PATCH 14/22] merge-ort: unconditionally release attributes index Patrick Steinhardt
                   ` (13 subsequent siblings)
  26 siblings, 1 reply; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-06  9:00 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 3874 bytes --]

When resolving revisions in `get_tags_and_duplicates()`, we only
partially manage the lifetime of `full_name`. In fact, managing its
lifetime properly is almost impossible because we put direct pointers to
that variable into multiple lists without duplicating the string. The
consequence is that these strings will ultimately leak.

Refactor the code to make the lists we put those names into duplicate
the memory. This allows us to properly free the string as required and
thus plugs the memory leak.

While this requires us to allocate more data overall, it shouldn't be
all that bad given that the number of allocations corresponds with the
number of command line parameters, which typically aren't all that many.
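
The ownership model can be illustrated with a small sketch. The
`struct name_list` type below is hypothetical and much simpler than
Git's `struct string_list`; it only assumes what the text states,
namely that the list copies every string it is handed, so the caller
remains free to release its own buffer.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical list type; not Git's string_list. */
struct name_list {
	char **items;
	size_t nr;
};

/* Appends a private copy of `name`; the caller keeps ownership of `name`. */
void name_list_append(struct name_list *list, const char *name)
{
	char *copy = malloc(strlen(name) + 1);
	char **grown = realloc(list->items,
			       (list->nr + 1) * sizeof(char *));
	if (!copy || !grown)
		abort();
	strcpy(copy, name);
	grown[list->nr] = copy;
	list->items = grown;
	list->nr++;
}

/* Frees every copy as well as the array that holds them. */
void name_list_clear(struct name_list *list)
{
	for (size_t i = 0; i < list->nr; i++)
		free(list->items[i]);
	free(list->items);
	list->items = NULL;
	list->nr = 0;
}
```

The cost is one extra allocation per inserted name, which matches the
trade-off mentioned above.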

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/fast-export.c            | 17 ++++++++++++-----
 t/t9351-fast-export-anonymize.sh |  1 +
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index fe92d2436c..f253b79322 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -42,8 +42,8 @@ static int full_tree;
 static int reference_excluded_commits;
 static int show_original_ids;
 static int mark_tags;
-static struct string_list extra_refs = STRING_LIST_INIT_NODUP;
-static struct string_list tag_refs = STRING_LIST_INIT_NODUP;
+static struct string_list extra_refs = STRING_LIST_INIT_DUP;
+static struct string_list tag_refs = STRING_LIST_INIT_DUP;
 static struct refspec refspecs = REFSPEC_INIT_FETCH;
 static int anonymize;
 static struct hashmap anonymized_seeds;
@@ -901,7 +901,7 @@ static void handle_tag(const char *name, struct tag *tag)
 	free(buf);
 }
 
-static struct commit *get_commit(struct rev_cmdline_entry *e, char *full_name)
+static struct commit *get_commit(struct rev_cmdline_entry *e, const char *full_name)
 {
 	switch (e->item->type) {
 	case OBJ_COMMIT:
@@ -932,14 +932,16 @@ static void get_tags_and_duplicates(struct rev_cmdline_info *info)
 		struct rev_cmdline_entry *e = info->rev + i;
 		struct object_id oid;
 		struct commit *commit;
-		char *full_name;
+		char *full_name = NULL;
 
 		if (e->flags & UNINTERESTING)
 			continue;
 
 		if (repo_dwim_ref(the_repository, e->name, strlen(e->name),
-				  &oid, &full_name, 0) != 1)
+				  &oid, &full_name, 0) != 1) {
+			free(full_name);
 			continue;
+		}
 
 		if (refspecs.nr) {
 			char *private;
@@ -955,6 +957,7 @@ static void get_tags_and_duplicates(struct rev_cmdline_info *info)
 			warning("%s: Unexpected object of type %s, skipping.",
 				e->name,
 				type_name(e->item->type));
+			free(full_name);
 			continue;
 		}
 
@@ -963,10 +966,12 @@ static void get_tags_and_duplicates(struct rev_cmdline_info *info)
 			break;
 		case OBJ_BLOB:
 			export_blob(&commit->object.oid);
+			free(full_name);
 			continue;
 		default: /* OBJ_TAG (nested tags) is already handled */
 			warning("Tag points to object of unexpected type %s, skipping.",
 				type_name(commit->object.type));
+			free(full_name);
 			continue;
 		}
 
@@ -979,6 +984,8 @@ static void get_tags_and_duplicates(struct rev_cmdline_info *info)
 
 		if (!*revision_sources_at(&revision_sources, commit))
 			*revision_sources_at(&revision_sources, commit) = full_name;
+		else
+			free(full_name);
 	}
 
 	string_list_sort(&extra_refs);
diff --git a/t/t9351-fast-export-anonymize.sh b/t/t9351-fast-export-anonymize.sh
index 156a647484..c0d9d7be75 100755
--- a/t/t9351-fast-export-anonymize.sh
+++ b/t/t9351-fast-export-anonymize.sh
@@ -4,6 +4,7 @@ test_description='basic tests for fast-export --anonymize'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'setup simple repo' '
-- 
2.46.0.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

* [PATCH 14/22] merge-ort: unconditionally release attributes index
  2024-08-06  8:59 [PATCH 00/22] Memory leak fixes (pt.4) Patrick Steinhardt
                   ` (12 preceding siblings ...)
  2024-08-06  9:00 ` [PATCH 13/22] builtin/fast-export: plug leaking tag names Patrick Steinhardt
@ 2024-08-06  9:00 ` Patrick Steinhardt
  2024-08-06  9:00 ` [PATCH 15/22] sequencer: release todo list on error paths Patrick Steinhardt
                   ` (12 subsequent siblings)
  26 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-06  9:00 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 3197 bytes --]

We conditionally release the index used for reading gitattributes in
merge-ort based on whether or not the index has been populated. This
check uses `cache_nr` as a condition. This isn't sufficient though, as
the variable may be zero even when some other parts of the index have
been populated. This leads to memory leaks when sparse checkouts are in
use, as we may not end up releasing the sparse checkout patterns.

Fix this issue by unconditionally releasing the index.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 merge-ort.c                       | 3 +--
 t/t3507-cherry-pick-conflict.sh   | 1 +
 t/t6421-merge-partial-clone.sh    | 1 +
 t/t6428-merge-conflicts-sparse.sh | 1 +
 t/t7817-grep-sparse-checkout.sh   | 1 +
 5 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index e9d01ac7f7..3752c7e595 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -689,8 +689,7 @@ static void clear_or_reinit_internal_opts(struct merge_options_internal *opti,
 	 */
 	strmap_clear_func(&opti->conflicted, 0);
 
-	if (opti->attr_index.cache_nr) /* true iff opt->renormalize */
-		discard_index(&opti->attr_index);
+	discard_index(&opti->attr_index);
 
 	/* Free memory used by various renames maps */
 	for (i = MERGE_SIDE1; i <= MERGE_SIDE2; ++i) {
diff --git a/t/t3507-cherry-pick-conflict.sh b/t/t3507-cherry-pick-conflict.sh
index f3947b400a..10e9c91dbb 100755
--- a/t/t3507-cherry-pick-conflict.sh
+++ b/t/t3507-cherry-pick-conflict.sh
@@ -13,6 +13,7 @@ GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
 TEST_CREATE_REPO_NO_TEMPLATE=1
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 pristine_detach () {
diff --git a/t/t6421-merge-partial-clone.sh b/t/t6421-merge-partial-clone.sh
index 711b709e75..020375c805 100755
--- a/t/t6421-merge-partial-clone.sh
+++ b/t/t6421-merge-partial-clone.sh
@@ -26,6 +26,7 @@ test_description="limiting blob downloads when merging with partial clones"
 #                     underscore notation is to differentiate different
 #                     files that might be renamed into each other's paths.)
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-merge.sh
 
diff --git a/t/t6428-merge-conflicts-sparse.sh b/t/t6428-merge-conflicts-sparse.sh
index 9919c3fa7c..8a79bc2e92 100755
--- a/t/t6428-merge-conflicts-sparse.sh
+++ b/t/t6428-merge-conflicts-sparse.sh
@@ -22,6 +22,7 @@ test_description="merge cases"
 #                     underscore notation is to differentiate different
 #                     files that might be renamed into each other's paths.)
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-merge.sh
 
diff --git a/t/t7817-grep-sparse-checkout.sh b/t/t7817-grep-sparse-checkout.sh
index eb59564565..0ba7817fb7 100755
--- a/t/t7817-grep-sparse-checkout.sh
+++ b/t/t7817-grep-sparse-checkout.sh
@@ -33,6 +33,7 @@ should leave the following structure in the working tree:
 But note that sub2 should have the SKIP_WORKTREE bit set.
 '
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'setup' '
-- 
2.46.0.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

* [PATCH 15/22] sequencer: release todo list on error paths
  2024-08-06  8:59 [PATCH 00/22] Memory leak fixes (pt.4) Patrick Steinhardt
                   ` (13 preceding siblings ...)
  2024-08-06  9:00 ` [PATCH 14/22] merge-ort: unconditionally release attributes index Patrick Steinhardt
@ 2024-08-06  9:00 ` Patrick Steinhardt
  2024-08-08 10:08   ` Phillip Wood
  2024-08-06  9:00 ` [PATCH 16/22] unpack-trees: clear index when not propagating it Patrick Steinhardt
                   ` (11 subsequent siblings)
  26 siblings, 1 reply; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-06  9:00 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 3408 bytes --]

We're not releasing the `todo_list` in `sequencer_pick_revisions()` when
hitting an error path. Restructure the function to have a common exit
path such that we can easily clean up the list and thus plug this memory
leak.
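
The shape of such a common exit path, reduced to a toy example, might
look like the sketch below. All names (`struct todo`, `todo_release()`,
`process()`) are hypothetical; the release counter exists only so that
the effect is observable and has no counterpart in the sequencer.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical resource; not the sequencer's todo_list. */
struct todo {
	char *buf;
};

/* Counts releases so the effect of the shared exit path is observable. */
int todo_release_calls;

void todo_release(struct todo *t)
{
	free(t->buf);
	t->buf = NULL;
	todo_release_calls++;
}

/* Returns 0 on success, -1 on error; every path goes through `out`. */
int process(const char *input)
{
	struct todo t = {0};
	int res;

	if (!input) {
		res = -1;
		goto out;
	}

	t.buf = malloc(strlen(input) + 1);
	if (!t.buf) {
		res = -1;
		goto out;
	}
	strcpy(t.buf, input);

	/* Treat an empty instruction as an error, after allocating. */
	if (!*t.buf) {
		res = -1;
		goto out;
	}

	res = 0;

out:
	todo_release(&t);
	return res;
}
```

Because the structure is zero-initialized up front, releasing it is safe
even on paths that bail out before anything has been allocated.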

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 sequencer.c                     | 64 +++++++++++++++++++++++----------
 t/t3510-cherry-pick-sequence.sh |  1 +
 2 files changed, 47 insertions(+), 18 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index cade9b0ca8..fec3c5e846 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -5490,8 +5490,10 @@ int sequencer_pick_revisions(struct repository *r,
 	int i, res;
 
 	assert(opts->revs);
-	if (read_and_refresh_cache(r, opts))
-		return -1;
+	if (read_and_refresh_cache(r, opts)) {
+		res = -1;
+		goto out;
+	}
 
 	for (i = 0; i < opts->revs->pending.nr; i++) {
 		struct object_id oid;
@@ -5506,11 +5508,14 @@ int sequencer_pick_revisions(struct repository *r,
 				enum object_type type = oid_object_info(r,
 									&oid,
 									NULL);
-				return error(_("%s: can't cherry-pick a %s"),
+				res = error(_("%s: can't cherry-pick a %s"),
 					name, type_name(type));
+				goto out;
 			}
-		} else
-			return error(_("%s: bad revision"), name);
+		} else {
+			res = error(_("%s: bad revision"), name);
+			goto out;
+		}
 	}
 
 	/*
@@ -5525,14 +5530,23 @@ int sequencer_pick_revisions(struct repository *r,
 	    opts->revs->no_walk &&
 	    !opts->revs->cmdline.rev->flags) {
 		struct commit *cmit;
-		if (prepare_revision_walk(opts->revs))
-			return error(_("revision walk setup failed"));
+
+		if (prepare_revision_walk(opts->revs)) {
+			res = error(_("revision walk setup failed"));
+			goto out;
+		}
+
 		cmit = get_revision(opts->revs);
-		if (!cmit)
-			return error(_("empty commit set passed"));
+		if (!cmit) {
+			res = error(_("empty commit set passed"));
+			goto out;
+		}
+
 		if (get_revision(opts->revs))
 			BUG("unexpected extra commit from walk");
-		return single_pick(r, cmit, opts);
+
+		res = single_pick(r, cmit, opts);
+		goto out;
 	}
 
 	/*
@@ -5542,16 +5556,30 @@ int sequencer_pick_revisions(struct repository *r,
 	 */
 
 	if (walk_revs_populate_todo(&todo_list, opts) ||
-			create_seq_dir(r) < 0)
-		return -1;
-	if (repo_get_oid(r, "HEAD", &oid) && (opts->action == REPLAY_REVERT))
-		return error(_("can't revert as initial commit"));
-	if (save_head(oid_to_hex(&oid)))
-		return -1;
-	if (save_opts(opts))
-		return -1;
+			create_seq_dir(r) < 0) {
+		res = -1;
+		goto out;
+	}
+
+	if (repo_get_oid(r, "HEAD", &oid) && (opts->action == REPLAY_REVERT)) {
+		res = error(_("can't revert as initial commit"));
+		goto out;
+	}
+
+	if (save_head(oid_to_hex(&oid))) {
+		res = -1;
+		goto out;
+	}
+
+	if (save_opts(opts)) {
+		res = -1;
+		goto out;
+	}
+
 	update_abort_safety_file();
 	res = pick_commits(r, &todo_list, opts);
+
+out:
 	todo_list_release(&todo_list);
 	return res;
 }
diff --git a/t/t3510-cherry-pick-sequence.sh b/t/t3510-cherry-pick-sequence.sh
index 7eb52b12ed..93c725bac3 100755
--- a/t/t3510-cherry-pick-sequence.sh
+++ b/t/t3510-cherry-pick-sequence.sh
@@ -12,6 +12,7 @@ test_description='Test cherry-pick continuation features
 
 '
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 # Repeat first match 10 times
-- 
2.46.0.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

* [PATCH 16/22] unpack-trees: clear index when not propagating it
  2024-08-06  8:59 [PATCH 00/22] Memory leak fixes (pt.4) Patrick Steinhardt
                   ` (14 preceding siblings ...)
  2024-08-06  9:00 ` [PATCH 15/22] sequencer: release todo list on error paths Patrick Steinhardt
@ 2024-08-06  9:00 ` Patrick Steinhardt
  2024-08-06  9:00 ` [PATCH 17/22] diff: fix leak when parsing invalid ignore regex option Patrick Steinhardt
                   ` (10 subsequent siblings)
  26 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-06  9:00 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 2128 bytes --]

When provided a pointer to a destination index, then `unpack_trees()`
will end up copying its `o->internal.result` index into the provided
pointer. In those cases it is thus not necessary to free the index, as
we have transferred ownership of it.

There are cases though where we do not end up transferring ownership of
the memory, but `clear_unpack_trees_porcelain()` will never discard the
index in that case and thus cause a memory leak. And right now it cannot
do so in the first place because we have no indicator of whether we did
or didn't transfer ownership of the index.

Adapt the code to zero out the index in case we transfer its ownership.
Like this, we can now unconditionally discard the index when being asked
to clear the `unpack_trees_options`.
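
A minimal sketch of this "zero out after transferring ownership" idiom,
using a hypothetical `struct idx_state` rather than Git's index
structures, could look like this:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical owner of a heap-allocated array. */
struct idx_state {
	int *entries;
	size_t nr;
};

/* Frees the owned memory and resets the structure to its empty state. */
void idx_discard(struct idx_state *idx)
{
	free(idx->entries);
	memset(idx, 0, sizeof(*idx));
}

/* Moves `src` into `dst`, leaving `src` empty so discarding it is harmless. */
void idx_move(struct idx_state *dst, struct idx_state *src)
{
	idx_discard(dst);
	*dst = *src;
	memset(src, 0, sizeof(*src));
}
```

After `idx_move()`, the source no longer points at the transferred
memory, so a later unconditional `idx_discard()` on it cannot cause a
double free.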

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 t/t3705-add-sparse-checkout.sh | 1 +
 unpack-trees.c                 | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/t/t3705-add-sparse-checkout.sh b/t/t3705-add-sparse-checkout.sh
index 2bade9e804..6ae45a788d 100755
--- a/t/t3705-add-sparse-checkout.sh
+++ b/t/t3705-add-sparse-checkout.sh
@@ -2,6 +2,7 @@
 
 test_description='git add in sparse checked out working trees'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 SPARSE_ENTRY_BLOB=""
diff --git a/unpack-trees.c b/unpack-trees.c
index 7dc884fafd..9a55cb6204 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -210,6 +210,7 @@ void clear_unpack_trees_porcelain(struct unpack_trees_options *opts)
 {
 	strvec_clear(&opts->internal.msgs_to_free);
 	memset(opts->internal.msgs, 0, sizeof(opts->internal.msgs));
+	discard_index(&opts->internal.result);
 }
 
 static int do_add_entry(struct unpack_trees_options *o, struct cache_entry *ce,
@@ -2082,6 +2083,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 		o->internal.result.updated_workdir = 1;
 		discard_index(o->dst_index);
 		*o->dst_index = o->internal.result;
+		memset(&o->internal.result, 0, sizeof(o->internal.result));
 	} else {
 		discard_index(&o->internal.result);
 	}
-- 
2.46.0.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

* [PATCH 17/22] diff: fix leak when parsing invalid ignore regex option
  2024-08-06  8:59 [PATCH 00/22] Memory leak fixes (pt.4) Patrick Steinhardt
                   ` (15 preceding siblings ...)
  2024-08-06  9:00 ` [PATCH 16/22] unpack-trees: clear index when not propagating it Patrick Steinhardt
@ 2024-08-06  9:00 ` Patrick Steinhardt
  2024-08-06  9:00 ` [PATCH 18/22] builtin/format-patch: fix various trivial memory leaks Patrick Steinhardt
                   ` (9 subsequent siblings)
  26 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-06  9:00 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 1382 bytes --]

When parsing invalid ignore regexes passed via the `-I` option we don't
free already-allocated memory, leading to a memory leak. Fix this.
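
Outside of Git, the same fix can be sketched with plain POSIX calls. The
helper name `compile_or_null()` is hypothetical, and `malloc()` stands
in for Git's `xmalloc()`:

```c
#include <regex.h>
#include <stdlib.h>

/* Returns a compiled regex the caller must regfree() and free(), or NULL. */
regex_t *compile_or_null(const char *pattern)
{
	regex_t *regex = malloc(sizeof(*regex));
	if (!regex)
		return NULL;
	if (regcomp(regex, pattern, REG_EXTENDED | REG_NEWLINE)) {
		/* regcomp() failed, so only the allocation needs freeing. */
		free(regex);
		return NULL;
	}
	return regex;
}
```

On success the caller owns both the compiled pattern and the allocation
holding it, so it has to call `regfree()` before `free()`.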

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 diff.c                  | 6 +++++-
 t/t4013-diff-various.sh | 1 +
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/diff.c b/diff.c
index ebb7538e04..9251c47b72 100644
--- a/diff.c
+++ b/diff.c
@@ -5464,9 +5464,13 @@ static int diff_opt_ignore_regex(const struct option *opt,
 	regex_t *regex;
 
 	BUG_ON_OPT_NEG(unset);
+
 	regex = xmalloc(sizeof(*regex));
-	if (regcomp(regex, arg, REG_EXTENDED | REG_NEWLINE))
+	if (regcomp(regex, arg, REG_EXTENDED | REG_NEWLINE)) {
+		free(regex);
 		return error(_("invalid regex given to -I: '%s'"), arg);
+	}
+
 	ALLOC_GROW(options->ignore_regex, options->ignore_regex_nr + 1,
 		   options->ignore_regex_alloc);
 	options->ignore_regex[options->ignore_regex_nr++] = regex;
diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
index 3855d68dbc..87d248d034 100755
--- a/t/t4013-diff-various.sh
+++ b/t/t4013-diff-various.sh
@@ -8,6 +8,7 @@ test_description='Various diff formatting options'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-diff.sh
 
-- 
2.46.0.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

* [PATCH 18/22] builtin/format-patch: fix various trivial memory leaks
  2024-08-06  8:59 [PATCH 00/22] Memory leak fixes (pt.4) Patrick Steinhardt
                   ` (16 preceding siblings ...)
  2024-08-06  9:00 ` [PATCH 17/22] diff: fix leak when parsing invalid ignore regex option Patrick Steinhardt
@ 2024-08-06  9:00 ` Patrick Steinhardt
  2024-08-07  8:51   ` James Liu
  2024-08-06  9:01 ` [PATCH 19/22] userdiff: fix leaking memory for configured diff drivers Patrick Steinhardt
                   ` (8 subsequent siblings)
  26 siblings, 1 reply; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-06  9:00 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 2788 bytes --]

There are various memory leaks hit by git-format-patch(1). Basically all
of them are trivial, except that un-setting `diffopt.no_free` requires
us to unset the `diffopt.file` because we manually close it already.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/log.c           | 12 +++++++++---
 t/t4014-format-patch.sh |  1 +
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index a73a767606..ff997a0d0e 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1833,6 +1833,7 @@ static struct commit *get_base_commit(const struct format_config *cfg,
 			}
 
 			rev[i] = merge_base->item;
+			free_commit_list(merge_base);
 		}
 
 		if (rev_nr % 2)
@@ -2023,6 +2024,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 	const char *rfc = NULL;
 	int creation_factor = -1;
 	const char *signature = git_version_string;
+	char *signature_to_free = NULL;
 	char *signature_file_arg = NULL;
 	struct keep_callback_data keep_callback_data = {
 		.cfg = &cfg,
@@ -2443,7 +2445,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 
 		if (strbuf_read_file(&buf, signature_file, 128) < 0)
 			die_errno(_("unable to read signature file '%s'"), signature_file);
-		signature = strbuf_detach(&buf, NULL);
+		signature = signature_to_free = strbuf_detach(&buf, NULL);
 	} else if (cfg.signature) {
 		signature = cfg.signature;
 	}
@@ -2548,12 +2550,13 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 			else
 				print_signature(signature, rev.diffopt.file);
 		}
-		if (output_directory)
+		if (output_directory) {
 			fclose(rev.diffopt.file);
+			rev.diffopt.file = NULL;
+		}
 	}
 	stop_progress(&progress);
 	free(list);
-	free(branch_name);
 	if (ignore_if_in_upstream)
 		free_patch_ids(&ids);
 
@@ -2565,11 +2568,14 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 	strbuf_release(&rdiff_title);
 	free(description_file);
 	free(signature_file_arg);
+	free(signature_to_free);
+	free(branch_name);
 	free(to_free);
 	free(rev.message_id);
 	if (rev.ref_message_ids)
 		string_list_clear(rev.ref_message_ids, 0);
 	free(rev.ref_message_ids);
+	rev.diffopt.no_free = 0;
 	release_revisions(&rev);
 	format_config_release(&cfg);
 	return 0;
diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
index 884f83fb8a..1c46e963e4 100755
--- a/t/t4014-format-patch.sh
+++ b/t/t4014-format-patch.sh
@@ -8,6 +8,7 @@ test_description='various format-patch tests'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-terminal.sh
 
-- 
2.46.0.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH 19/22] userdiff: fix leaking memory for configured diff drivers
  2024-08-06  8:59 [PATCH 00/22] Memory leak fixes (pt.4) Patrick Steinhardt
                   ` (17 preceding siblings ...)
  2024-08-06  9:00 ` [PATCH 18/22] builtin/format-patch: fix various trivial memory leaks Patrick Steinhardt
@ 2024-08-06  9:01 ` Patrick Steinhardt
  2024-08-07  9:25   ` James Liu
  2024-08-06  9:01 ` [PATCH 20/22] builtin/log: fix leak when showing converted blob contents Patrick Steinhardt
                   ` (7 subsequent siblings)
  26 siblings, 1 reply; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-06  9:01 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 5966 bytes --]

The userdiff structures may be initialized either statically or
dynamically via configuration keys. In the latter case we end up leaking
memory because we didn't have any infrastructure to distinguish strings
which have been allocated statically from those which have been
allocated dynamically.

Refactor the code such that we have two pointers for each of these
strings: one that holds the value as accessed by other subsystems, and
one that points to the same string in case it has been allocated. Like
this, we can safely free the second pointer and thus plug those memory
leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 range-diff.c                     |  6 +++--
 t/t4018-diff-funcname.sh         |  1 +
 t/t4042-diff-textconv-caching.sh |  2 ++
 t/t4048-diff-combined-binary.sh  |  1 +
 t/t4209-log-pickaxe.sh           |  2 ++
 userdiff.c                       | 38 ++++++++++++++++++++++++--------
 userdiff.h                       |  4 ++++
 7 files changed, 43 insertions(+), 11 deletions(-)

diff --git a/range-diff.c b/range-diff.c
index 5f01605550..bbb0952264 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -450,8 +450,10 @@ static void output_pair_header(struct diff_options *diffopt,
 }
 
 static struct userdiff_driver section_headers = {
-	.funcname = { "^ ## (.*) ##$\n"
-		      "^.?@@ (.*)$", REG_EXTENDED }
+	.funcname = {
+		.pattern = "^ ## (.*) ##$\n^.?@@ (.*)$",
+		.cflags = REG_EXTENDED,
+	},
 };
 
 static struct diff_filespec *get_filespec(const char *name, const char *p)
diff --git a/t/t4018-diff-funcname.sh b/t/t4018-diff-funcname.sh
index e026fac1f4..8128c30e7f 100755
--- a/t/t4018-diff-funcname.sh
+++ b/t/t4018-diff-funcname.sh
@@ -5,6 +5,7 @@
 
 test_description='Test custom diff function name patterns'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'setup' '
diff --git a/t/t4042-diff-textconv-caching.sh b/t/t4042-diff-textconv-caching.sh
index 8ebfa3c1be..a179205394 100755
--- a/t/t4042-diff-textconv-caching.sh
+++ b/t/t4042-diff-textconv-caching.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='test textconv caching'
+
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 cat >helper <<'EOF'
diff --git a/t/t4048-diff-combined-binary.sh b/t/t4048-diff-combined-binary.sh
index 0260cf64f5..f399484bce 100755
--- a/t/t4048-diff-combined-binary.sh
+++ b/t/t4048-diff-combined-binary.sh
@@ -4,6 +4,7 @@ test_description='combined and merge diff handle binary files and textconv'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'setup binary merge conflict' '
diff --git a/t/t4209-log-pickaxe.sh b/t/t4209-log-pickaxe.sh
index 64e1623733..b42fdc54fc 100755
--- a/t/t4209-log-pickaxe.sh
+++ b/t/t4209-log-pickaxe.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='log --grep/--author/--regexp-ignore-case/-S/-G'
+
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_log () {
diff --git a/userdiff.c b/userdiff.c
index c4ebb9ff73..989629149f 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -399,8 +399,11 @@ static struct userdiff_driver *userdiff_find_by_namelen(const char *name, size_t
 static int parse_funcname(struct userdiff_funcname *f, const char *k,
 		const char *v, int cflags)
 {
-	if (git_config_string((char **) &f->pattern, k, v) < 0)
+	f->pattern = NULL;
+	FREE_AND_NULL(f->pattern_owned);
+	if (git_config_string(&f->pattern_owned, k, v) < 0)
 		return -1;
+	f->pattern = f->pattern_owned;
 	f->cflags = cflags;
 	return 0;
 }
@@ -444,20 +447,37 @@ int userdiff_config(const char *k, const char *v)
 		return parse_funcname(&drv->funcname, k, v, REG_EXTENDED);
 	if (!strcmp(type, "binary"))
 		return parse_tristate(&drv->binary, k, v);
-	if (!strcmp(type, "command"))
-		return git_config_string((char **) &drv->external.cmd, k, v);
+	if (!strcmp(type, "command")) {
+		FREE_AND_NULL(drv->external.cmd);
+		return git_config_string(&drv->external.cmd, k, v);
+	}
 	if (!strcmp(type, "trustexitcode")) {
 		drv->external.trust_exit_code = git_config_bool(k, v);
 		return 0;
 	}
-	if (!strcmp(type, "textconv"))
-		return git_config_string((char **) &drv->textconv, k, v);
+	if (!strcmp(type, "textconv")) {
+		int ret;
+		FREE_AND_NULL(drv->textconv_owned);
+		ret = git_config_string(&drv->textconv_owned, k, v);
+		drv->textconv = drv->textconv_owned;
+		return ret;
+	}
 	if (!strcmp(type, "cachetextconv"))
 		return parse_bool(&drv->textconv_want_cache, k, v);
-	if (!strcmp(type, "wordregex"))
-		return git_config_string((char **) &drv->word_regex, k, v);
-	if (!strcmp(type, "algorithm"))
-		return git_config_string((char **) &drv->algorithm, k, v);
+	if (!strcmp(type, "wordregex")) {
+		int ret;
+		FREE_AND_NULL(drv->word_regex_owned);
+		ret = git_config_string(&drv->word_regex_owned, k, v);
+		drv->word_regex = drv->word_regex_owned;
+		return ret;
+	}
+	if (!strcmp(type, "algorithm")) {
+		int ret;
+		FREE_AND_NULL(drv->algorithm_owned);
+		ret = git_config_string(&drv->algorithm_owned, k, v);
+		drv->algorithm = drv->algorithm_owned;
+		return ret;
+	}
 
 	return 0;
 }
diff --git a/userdiff.h b/userdiff.h
index 7565930337..827361b0bc 100644
--- a/userdiff.h
+++ b/userdiff.h
@@ -8,6 +8,7 @@ struct repository;
 
 struct userdiff_funcname {
 	const char *pattern;
+	char *pattern_owned;
 	int cflags;
 };
 
@@ -20,11 +21,14 @@ struct userdiff_driver {
 	const char *name;
 	struct external_diff external;
 	const char *algorithm;
+	char *algorithm_owned;
 	int binary;
 	struct userdiff_funcname funcname;
 	const char *word_regex;
+	char *word_regex_owned;
 	const char *word_regex_multi_byte;
 	const char *textconv;
+	char *textconv_owned;
 	struct notes_cache *textconv_cache;
 	int textconv_want_cache;
 };
-- 
2.46.0.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH 20/22] builtin/log: fix leak when showing converted blob contents
  2024-08-06  8:59 [PATCH 00/22] Memory leak fixes (pt.4) Patrick Steinhardt
                   ` (18 preceding siblings ...)
  2024-08-06  9:01 ` [PATCH 19/22] userdiff: fix leaking memory for configured diff drivers Patrick Steinhardt
@ 2024-08-06  9:01 ` Patrick Steinhardt
  2024-08-06  9:01 ` [PATCH 21/22] diff: free state populated via options Patrick Steinhardt
                   ` (6 subsequent siblings)
  26 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-06  9:01 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 1189 bytes --]

In `show_blob_object()`, we proactively call `textconv_object()`. In
case we have a textconv driver for this blob we will end up showing the
converted contents, otherwise we'll show the un-converted contents of it
instead.

When the object has been converted we never free the buffer containing
the converted contents. Fix this to plug this memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/log.c            | 1 +
 t/t4030-diff-textconv.sh | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/builtin/log.c b/builtin/log.c
index ff997a0d0e..1a684b68f2 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -707,6 +707,7 @@ static int show_blob_object(const struct object_id *oid, struct rev_info *rev, c
 
 	write_or_die(1, buf, size);
 	object_context_release(&obj_context);
+	free(buf);
 	return 0;
 }
 
diff --git a/t/t4030-diff-textconv.sh b/t/t4030-diff-textconv.sh
index a39a626664..29f6d610c2 100755
--- a/t/t4030-diff-textconv.sh
+++ b/t/t4030-diff-textconv.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='diff.*.textconv tests'
+
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 find_diff() {
-- 
2.46.0.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH 21/22] diff: free state populated via options
  2024-08-06  8:59 [PATCH 00/22] Memory leak fixes (pt.4) Patrick Steinhardt
                   ` (19 preceding siblings ...)
  2024-08-06  9:01 ` [PATCH 20/22] builtin/log: fix leak when showing converted blob contents Patrick Steinhardt
@ 2024-08-06  9:01 ` Patrick Steinhardt
  2024-08-06  9:01 ` [PATCH 22/22] builtin/diff: free symmetric diff members Patrick Steinhardt
                   ` (5 subsequent siblings)
  26 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-06  9:01 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 2019 bytes --]

The `objfind` and `anchors` members of `struct diff_options` are
populated via option parsing, but are never freed in `diff_free()`. Fix
this to plug those memory leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 diff.c                   | 10 ++++++++++
 t/t4064-diff-oidfind.sh  |  2 ++
 t/t4065-diff-anchored.sh |  1 +
 t/t4069-remerge-diff.sh  |  1 +
 4 files changed, 14 insertions(+)

diff --git a/diff.c b/diff.c
index 9251c47b72..4035a9374d 100644
--- a/diff.c
+++ b/diff.c
@@ -6717,6 +6717,16 @@ void diff_free(struct diff_options *options)
 	if (options->no_free)
 		return;
 
+	if (options->objfind) {
+		oidset_clear(options->objfind);
+		FREE_AND_NULL(options->objfind);
+	}
+
+	for (size_t i = 0; i < options->anchors_nr; i++)
+		free(options->anchors[i]);
+	FREE_AND_NULL(options->anchors);
+	options->anchors_nr = options->anchors_alloc = 0;
+
 	diff_free_file(options);
 	diff_free_ignore_regex(options);
 	clear_pathspec(&options->pathspec);
diff --git a/t/t4064-diff-oidfind.sh b/t/t4064-diff-oidfind.sh
index 6d8c8986fc..846f285f77 100755
--- a/t/t4064-diff-oidfind.sh
+++ b/t/t4064-diff-oidfind.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='test finding specific blobs in the revision walking'
+
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'setup ' '
diff --git a/t/t4065-diff-anchored.sh b/t/t4065-diff-anchored.sh
index b3f510f040..647537c12e 100755
--- a/t/t4065-diff-anchored.sh
+++ b/t/t4065-diff-anchored.sh
@@ -2,6 +2,7 @@
 
 test_description='anchored diff algorithm'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success '--anchored' '
diff --git a/t/t4069-remerge-diff.sh b/t/t4069-remerge-diff.sh
index 07323ebafe..888714bbd3 100755
--- a/t/t4069-remerge-diff.sh
+++ b/t/t4069-remerge-diff.sh
@@ -2,6 +2,7 @@
 
 test_description='remerge-diff handling'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 # This test is ort-specific
-- 
2.46.0.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH 22/22] builtin/diff: free symmetric diff members
  2024-08-06  8:59 [PATCH 00/22] Memory leak fixes (pt.4) Patrick Steinhardt
                   ` (20 preceding siblings ...)
  2024-08-06  9:01 ` [PATCH 21/22] diff: free state populated via options Patrick Steinhardt
@ 2024-08-06  9:01 ` Patrick Steinhardt
  2024-08-07  9:27 ` [PATCH 00/22] Memory leak fixes (pt.4) James Liu
                   ` (4 subsequent siblings)
  26 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-06  9:01 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 2502 bytes --]

We populate a `struct symdiff` in case the user has requested a
symmetric diff. Part of this is to populate a `skip` bitmap that
indicates which commits shall be ignored in the diff. But while this
bitmap is dynamically allocated, we never free it.

Fix this by introducing and calling a new `symdiff_release()` function
that does this for us.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/diff.c                       | 10 +++++++++-
 t/t4068-diff-symmetric-merge-base.sh |  1 +
 t/t4108-apply-threeway.sh            |  1 +
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/builtin/diff.c b/builtin/diff.c
index 9b6cdabe15..f87f68a5bc 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -388,6 +388,13 @@ static void symdiff_prepare(struct rev_info *rev, struct symdiff *sym)
 	sym->skip = map;
 }
 
+static void symdiff_release(struct symdiff *sdiff)
+{
+	if (!sdiff)
+		return;
+	bitmap_free(sdiff->skip);
+}
+
 int cmd_diff(int argc, const char **argv, const char *prefix)
 {
 	int i;
@@ -398,7 +405,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
 	struct object_array_entry *blob[2];
 	int nongit = 0, no_index = 0;
 	int result;
-	struct symdiff sdiff;
+	struct symdiff sdiff = {0};
 
 	/*
 	 * We could get N tree-ish in the rev.pending_objects list.
@@ -619,6 +626,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
 		refresh_index_quietly();
 	release_revisions(&rev);
 	object_array_clear(&ent);
+	symdiff_release(&sdiff);
 	UNLEAK(blob);
 	return result;
 }
diff --git a/t/t4068-diff-symmetric-merge-base.sh b/t/t4068-diff-symmetric-merge-base.sh
index eff63c16b0..4d6565e728 100755
--- a/t/t4068-diff-symmetric-merge-base.sh
+++ b/t/t4068-diff-symmetric-merge-base.sh
@@ -5,6 +5,7 @@ test_description='behavior of diff with symmetric-diff setups and --merge-base'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 # build these situations:
diff --git a/t/t4108-apply-threeway.sh b/t/t4108-apply-threeway.sh
index c558282bc0..3211e1e65f 100755
--- a/t/t4108-apply-threeway.sh
+++ b/t/t4108-apply-threeway.sh
@@ -5,6 +5,7 @@ test_description='git apply --3way'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 print_sanitized_conflicted_diff () {
-- 
2.46.0.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 146+ messages in thread

* Re: [PATCH 02/22] git: fix leaking system paths
  2024-08-06  8:59 ` [PATCH 02/22] git: fix leaking system paths Patrick Steinhardt
@ 2024-08-07  4:02   ` James Liu
  0 siblings, 0 replies; 146+ messages in thread
From: James Liu @ 2024-08-07  4:02 UTC (permalink / raw)
  To: Patrick Steinhardt, git

On Tue Aug 6, 2024 at 6:59 PM AEST, Patrick Steinhardt wrote:
> Git has some flags to make it output system paths as they have been
> compiled into Git. This is done by calling `system_path()`, which
> returns an allocated string. This string isn't ever free'd though,
> creating a memory leak.
>
> Plug those leaks. While they are surfaced by t0211, there are more
> memory leaks looming exposed by that test suite and it thus does not yet
> pass with the memory leak checker enabled.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  git.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/git.c b/git.c
> index e35af9b0e5..5eab88b472 100644
> --- a/git.c
> +++ b/git.c
> @@ -173,15 +173,21 @@ static int handle_options(const char ***argv, int *argc, int *envchanged)
>  				exit(0);
>  			}
>  		} else if (!strcmp(cmd, "--html-path")) {
> -			puts(system_path(GIT_HTML_PATH));
> +			char *path = system_path(GIT_HTML_PATH);
> +			puts(path);
> +			free(path);
>  			trace2_cmd_name("_query_");
>  			exit(0);
>  		} else if (!strcmp(cmd, "--man-path")) {
> -			puts(system_path(GIT_MAN_PATH));
> +			char *path = system_path(GIT_MAN_PATH);
> +			puts(path);
> +			free(path);
>  			trace2_cmd_name("_query_");
>  			exit(0);
>  		} else if (!strcmp(cmd, "--info-path")) {
> -			puts(system_path(GIT_INFO_PATH));
> +			char *path = system_path(GIT_INFO_PATH);
> +			puts(path);
> +			free(path);
>  			trace2_cmd_name("_query_");
>  			exit(0);
>  		} else if (!strcmp(cmd, "-p") || !strcmp(cmd, "--paginate")) {

Oh interesting. These don't immediately stand out as leaky due to the
absence of intermediate variables, but nevertheless an allocation took
place that we need to free.
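
For anyone following along, here is a minimal standalone sketch of that
pattern. It is not Git's code: `make_path()` is a hypothetical stand-in
for `system_path()`, and the "/usr/share/" prefix is made up for the
example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for system_path(): returns a heap-allocated
 * string that the caller owns and must free.
 */
static char *make_path(const char *suffix)
{
	const char *prefix = "/usr/share/";
	size_t len = strlen(prefix) + strlen(suffix) + 1;
	char *out = malloc(len);

	assert(out);
	snprintf(out, len, "%s%s", prefix, suffix);
	return out;
}

/*
 * Writing puts(make_path("doc")) would leak: no variable holds the
 * pointer, so nothing can ever free it. Naming the result fixes that.
 */
static int print_path(const char *suffix)
{
	char *path = make_path(suffix);
	int ret = puts(path);

	free(path);
	return ret;
}
```

The intermediate variable exists only so that there is something to
pass to `free()`.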


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH 06/22] read-cache: fix leaking hashfile when writing index fails
  2024-08-06  9:00 ` [PATCH 06/22] read-cache: fix leaking hashfile when writing index fails Patrick Steinhardt
@ 2024-08-07  7:01   ` James Liu
  2024-08-08  5:04     ` Patrick Steinhardt
  0 siblings, 1 reply; 146+ messages in thread
From: James Liu @ 2024-08-07  7:01 UTC (permalink / raw)
  To: Patrick Steinhardt, git

On Tue Aug 6, 2024 at 7:00 PM AEST, Patrick Steinhardt wrote:
> Refactor the code to have a common exit path where we can free this and
> other allocated memory. While at it, refactor our use of `strbuf`s such
> that we reuse the same buffer to avoid some unneeded allocations.
>
> @@ -3105,7 +3117,14 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
>  	trace2_data_intmax("index", the_repository, "write/cache_nr",
>  			   istate->cache_nr);
>  
> -	return 0;
> +	ret = 0;
> +
> +out:
> +	if (f)
> +		free_hashfile(f);
> +	strbuf_release(&sb);
> +	free(ieot);
> +	return ret;
>  }

Is it generally a pattern in Git to use `goto <label>` instead of
returns when there are multiple return points in a function? We're also
performing cleanup duties here and in most of those scenarios but there
are some cases like `reftable_be_pack_refs()` where the goto simply
collapses multiple return points into a single path.
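
To make the shape of the question concrete, a small sketch of the
single-exit-path idiom as it appears in the quoted hunk. This is
illustrative only; `copy_string()` and `join_strings()` are hypothetical
helpers, not Git functions.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of a string. */
static char *copy_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *out = malloc(len);

	assert(out);
	memcpy(out, s, len);
	return out;
}

/* Joins a and b into dst; returns 0 on success, -1 if dst is too small. */
static int join_strings(const char *a, const char *b, char *dst, size_t dstlen)
{
	char *first = NULL, *second = NULL;
	int ret = -1;

	first = copy_string(a);
	second = copy_string(b);

	if (strlen(first) + strlen(second) + 1 > dstlen)
		goto out; /* error path, but both copies still get freed */

	snprintf(dst, dstlen, "%s%s", first, second);
	ret = 0;

out:
	/* Every return goes through here, so cleanup lives in one place. */
	free(first);
	free(second);
	return ret;
}
```

With an early `return -1` in place of the `goto`, each error branch
would need its own copy of the two `free()` calls.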


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH 08/22] config: fix leaking comment character config
  2024-08-06  9:00 ` [PATCH 08/22] config: fix leaking comment character config Patrick Steinhardt
@ 2024-08-07  7:11   ` James Liu
  2024-08-08  5:04     ` Patrick Steinhardt
  0 siblings, 1 reply; 146+ messages in thread
From: James Liu @ 2024-08-07  7:11 UTC (permalink / raw)
  To: Patrick Steinhardt, git

On Tue Aug 6, 2024 at 7:00 PM AEST, Patrick Steinhardt wrote:
> Refactor the code so that we initialize the value with another array.
> This allows us to free the value in case the string is not pointing to
> that constant array anymore.
>
> diff --git a/environment.c b/environment.c
> index 5cea2c9f54..8297c6e37b 100644
> --- a/environment.c
> +++ b/environment.c
> @@ -113,7 +113,8 @@ int protect_ntfs = PROTECT_NTFS_DEFAULT;
>   * The character that begins a commented line in user-editable file
>   * that is subject to stripspace.
>   */
> -const char *comment_line_str = "#";
> +const char comment_line_str_default[] = "#";
> +const char *comment_line_str = comment_line_str_default;
>  int auto_comment_line_char;
>  
>  /* Parallel index stat data preload? */

Is my understanding correct that `comment_line_str` is now just a
pointer to the `comment_line_str_default` array, and thus can be freed
once we're done with it?
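
Going by the quoted commit message, the value may only be freed when it
no longer points at the default array, since passing a non-heap pointer
to `free()` is invalid. A rough standalone sketch of that check follows;
the names `comment_str`, `comment_str_set()` and `comment_str_release()`
are made up for illustration and are not the ones used in Git.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static const char comment_default[] = "#";
static const char *comment_str = comment_default;

/* Free the current value only if it is a heap copy, then reset it. */
static void comment_str_release(void)
{
	if (comment_str != comment_default)
		free((char *)comment_str);
	comment_str = comment_default;
}

/* Replace the current value with a heap copy of `value`. */
static void comment_str_set(const char *value)
{
	size_t len = strlen(value) + 1;
	char *copy = malloc(len);

	assert(copy);
	memcpy(copy, value, len);
	comment_str_release();
	comment_str = copy;
}
```

The address comparison against the named array is what replaces the
previously anonymous "#" literal, which could not be told apart from an
allocated string.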


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH 09/22] builtin/rebase: fix leaking `commit.gpgsign` value
  2024-08-06  9:00 ` [PATCH 09/22] builtin/rebase: fix leaking `commit.gpgsign` value Patrick Steinhardt
@ 2024-08-07  7:32   ` James Liu
  2024-08-08  5:05     ` Patrick Steinhardt
  2024-08-08 10:07   ` Phillip Wood
  1 sibling, 1 reply; 146+ messages in thread
From: James Liu @ 2024-08-07  7:32 UTC (permalink / raw)
  To: Patrick Steinhardt, git

On Tue Aug 6, 2024 at 7:00 PM AEST, Patrick Steinhardt wrote:
> In `get_replay_opts()`, we unconditionally override the `gpg_sign` field
> that already got populated by `sequencer_init_config()` in case the user
> has "commit.gpgsign" set in their config. It is kind of dubious whether
> this is the correct thing to do or a bug. What is clear though is that
> this creates a memory leak.
>
> Let's mark this assignment with a TODO comment to figure out whether
> this needs to be fixed or not. Meanwhile though, let's plug the memory
> leak.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  builtin/rebase.c              | 8 ++++++++
>  sequencer.c                   | 1 +
>  t/t3404-rebase-interactive.sh | 1 +
>  t/t3435-rebase-gpg-sign.sh    | 1 +
>  t/t7030-verify-tag.sh         | 1 +
>  5 files changed, 12 insertions(+)
>
> diff --git a/builtin/rebase.c b/builtin/rebase.c
> index e3a8e74cfc..f65316a023 100644
> --- a/builtin/rebase.c
> +++ b/builtin/rebase.c
> @@ -186,7 +186,15 @@ static struct replay_opts get_replay_opts(const struct rebase_options *opts)
>  	replay.committer_date_is_author_date =
>  					opts->committer_date_is_author_date;
>  	replay.ignore_date = opts->ignore_date;
> +
> +	/*
> +	 * TODO: Is it really intentional that we unconditionally override
> +	 * `replay.gpg_sign` even if it has already been initialized via the
> +	 * configuration?
> +	 */
> +	free(replay.gpg_sign);
>  	replay.gpg_sign = xstrdup_or_null(opts->gpg_sign_opt);
> +
>  	replay.reflog_action = xstrdup(opts->reflog_action);
>  	if (opts->strategy)
>  		replay.strategy = xstrdup_or_null(opts->strategy);
> diff --git a/sequencer.c b/sequencer.c
> index 0291920f0b..cade9b0ca8 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -303,6 +303,7 @@ static int git_sequencer_config(const char *k, const char *v,
>  	}
>  
>  	if (!strcmp(k, "commit.gpgsign")) {
> +		free(opts->gpg_sign);
>  		opts->gpg_sign = git_config_bool(k, v) ? xstrdup("") : NULL;
>  		return 0;
>  	}

It looks like this free'ing would be managed by the caller by invoking
`replay_opts_release()`, but it's not being done consistently.

For example, `do_interactive_rebase()` invokes `replay_opts_release()`,
but `run_sequencer_rebase()` does not. Would it be better to address the
leak here?


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH 13/22] builtin/fast-export: plug leaking tag names
  2024-08-06  9:00 ` [PATCH 13/22] builtin/fast-export: plug leaking tag names Patrick Steinhardt
@ 2024-08-07  8:31   ` James Liu
  2024-08-08  5:05     ` Patrick Steinhardt
  0 siblings, 1 reply; 146+ messages in thread
From: James Liu @ 2024-08-07  8:31 UTC (permalink / raw)
  To: Patrick Steinhardt, git

On Tue Aug 6, 2024 at 7:00 PM AEST, Patrick Steinhardt wrote:
> Refactor the code to make the lists we put those names into duplicate
> the memory. This allows us to properly free the string as required and
> thus plugs the memory leak.
>
> While this requires us to allocate more data overall, it shouldn't be
> all that bad given that the number of allocations corresponds with the
> number of command line parameters, which typically aren't all that many.

Ahh so using the `STRING_LIST_INIT_DUP` initialiser means that every
time we call `string_list_append()` on the list, we retain ownership of
the string and the list gets its own copy.

That means we're able to free our own copy later on.
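
A toy model of that ownership rule, for illustration. `struct toy_list`
and its helpers are hypothetical and much simpler than Git's
`struct string_list`; only the "duplicate on append" behaviour is
modelled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct toy_list {
	char **items;
	size_t nr, alloc;
	int dup_strings; /* non-zero: the list owns copies of its strings */
};

static void toy_list_append(struct toy_list *list, const char *s)
{
	char *item = (char *)s; /* borrowed unless we duplicate below */

	if (list->nr == list->alloc) {
		list->alloc = list->alloc ? 2 * list->alloc : 4;
		list->items = realloc(list->items,
				      list->alloc * sizeof(*list->items));
		assert(list->items);
	}
	if (list->dup_strings) {
		size_t len = strlen(s) + 1;

		item = malloc(len);
		assert(item);
		memcpy(item, s, len);
	}
	list->items[list->nr++] = item;
}

static void toy_list_clear(struct toy_list *list)
{
	size_t i;

	/* Only free the strings if the list made its own copies. */
	if (list->dup_strings)
		for (i = 0; i < list->nr; i++)
			free(list->items[i]);
	free(list->items);
	list->items = NULL;
	list->nr = list->alloc = 0;
}
```

With `dup_strings` set, the caller can free its own string right after
appending, and clearing the list frees the copies.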


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH 18/22] builtin/format-patch: fix various trivial memory leaks
  2024-08-06  9:00 ` [PATCH 18/22] builtin/format-patch: fix various trivial memory leaks Patrick Steinhardt
@ 2024-08-07  8:51   ` James Liu
  2024-08-08  5:05     ` Patrick Steinhardt
  0 siblings, 1 reply; 146+ messages in thread
From: James Liu @ 2024-08-07  8:51 UTC (permalink / raw)
  To: Patrick Steinhardt, git

> diff --git a/builtin/log.c b/builtin/log.c
> index a73a767606..ff997a0d0e 100644
> --- a/builtin/log.c
> +++ b/builtin/log.c
> @@ -2023,6 +2024,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
>  	const char *rfc = NULL;
>  	int creation_factor = -1;
>  	const char *signature = git_version_string;
> +	char *signature_to_free = NULL;
>  	char *signature_file_arg = NULL;
>  	struct keep_callback_data keep_callback_data = {
>  		.cfg = &cfg,
> @@ -2443,7 +2445,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
>  
>  		if (strbuf_read_file(&buf, signature_file, 128) < 0)
>  			die_errno(_("unable to read signature file '%s'"), signature_file);
> -		signature = strbuf_detach(&buf, NULL);
> +		signature = signature_to_free = strbuf_detach(&buf, NULL);

Do I understand this correctly, that the multiple assignment here allows
us to maintain a reference to the pointer returned by `strbuf_detach()`
in `signature_to_free`, and we do this because `signature` can take on a
different value below?

>  	} else if (cfg.signature) {
>  		signature = cfg.signature;
>  	}
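
One way to read the hunk: `signature` is a `const char *` that may point
at memory we do not own (the version string or the config value), so it
cannot be freed unconditionally, and the second pointer records the one
case where we did allocate. A standalone sketch under that assumption;
`format_signature()` and `default_signature` are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static const char default_signature[] = "default";

static void format_signature(const char *file_contents,
			     const char *config_value,
			     char *out, size_t outlen)
{
	const char *signature = default_signature; /* what gets printed */
	char *signature_to_free = NULL;            /* set only if we allocated */

	if (file_contents) {
		size_t len = strlen(file_contents) + 1;
		char *copy = malloc(len);

		assert(copy);
		memcpy(copy, file_contents, len);
		signature = signature_to_free = copy;
	} else if (config_value) {
		signature = config_value; /* borrowed, must not be freed */
	}

	snprintf(out, outlen, "%s", signature);
	free(signature_to_free); /* free(NULL) is a no-op */
}
```

Because `free(NULL)` does nothing, the cleanup needs no branch on which
of the three sources was used.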



^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH 19/22] userdiff: fix leaking memory for configured diff drivers
  2024-08-06  9:01 ` [PATCH 19/22] userdiff: fix leaking memory for configured diff drivers Patrick Steinhardt
@ 2024-08-07  9:25   ` James Liu
  2024-08-08  5:05     ` Patrick Steinhardt
  0 siblings, 1 reply; 146+ messages in thread
From: James Liu @ 2024-08-07  9:25 UTC (permalink / raw)
  To: Patrick Steinhardt, git

> Refactor the code such that we have two pointers for each of these
> strings: one that holds the value as accessed by other subsystems, and
> one that points to the same string in case it has been allocated. Like
> this, we can safely free the second pointer and thus plug those memory
> leaks.
>
> diff --git a/userdiff.c b/userdiff.c
> index c4ebb9ff73..989629149f 100644
> --- a/userdiff.c
> +++ b/userdiff.c
> @@ -399,8 +399,11 @@ static struct userdiff_driver *userdiff_find_by_namelen(const char *name, size_t
>  static int parse_funcname(struct userdiff_funcname *f, const char *k,
>  		const char *v, int cflags)
>  {
> -	if (git_config_string((char **) &f->pattern, k, v) < 0)
> +	f->pattern = NULL;
> +	FREE_AND_NULL(f->pattern_owned);
> +	if (git_config_string(&f->pattern_owned, k, v) < 0)
>  		return -1;
> +	f->pattern = f->pattern_owned;
>  	f->cflags = cflags;
>  	return 0;
>  }

I'm not sure if I understand this change completely. We don't seem to be
using `pattern_owned` (and the other *_owned) fields differently from
their regular counterparts.

Is it because we can't do the following?

        FREE_AND_NULL((char **)f->pattern);
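
For context, a reduced sketch of the two-pointer scheme the commit
message describes. The underlying constraint is that `pattern` may point
at a string literal from a static initializer, and passing that to
`free()` would be invalid. `struct funcname` and its helpers are
hypothetical, and the macro is a local definition modelled on Git's
`FREE_AND_NULL()` so that the example stands alone.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Local macro modelled on Git's FREE_AND_NULL(), defined here to stay standalone. */
#define FREE_AND_NULL(p) do { free(p); (p) = NULL; } while (0)

struct funcname {
	const char *pattern; /* read by callers; may point at a literal */
	char *pattern_owned; /* heap copy, or NULL if nothing to free */
};

/* Replace the pattern with a heap copy, dropping any previous copy. */
static void funcname_set(struct funcname *f, const char *value)
{
	size_t len = strlen(value) + 1;
	char *copy = malloc(len);

	assert(copy);
	memcpy(copy, value, len);

	f->pattern = NULL;
	FREE_AND_NULL(f->pattern_owned);
	f->pattern = f->pattern_owned = copy;
}

static void funcname_release(struct funcname *f)
{
	f->pattern = NULL;
	FREE_AND_NULL(f->pattern_owned);
}
```

A statically initialized instance such as `{ "^builtin$", NULL }` keeps
`pattern_owned` at NULL, so releasing it never touches the literal.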


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH 00/22] Memory leak fixes (pt.4)
  2024-08-06  8:59 [PATCH 00/22] Memory leak fixes (pt.4) Patrick Steinhardt
                   ` (21 preceding siblings ...)
  2024-08-06  9:01 ` [PATCH 22/22] builtin/diff: free symmetric diff members Patrick Steinhardt
@ 2024-08-07  9:27 ` James Liu
  2024-08-08  5:05   ` Patrick Steinhardt
  2024-08-07 16:59 ` Junio C Hamano
                   ` (3 subsequent siblings)
  26 siblings, 1 reply; 146+ messages in thread
From: James Liu @ 2024-08-07  9:27 UTC (permalink / raw)
  To: Patrick Steinhardt, git

On Tue Aug 6, 2024 at 6:59 PM AEST, Patrick Steinhardt wrote:
> Hi,
>
> the third set of memory leak fixes was merged to `next`, so this is the
> next part of more or less random memory leak fixes all over the place.
> With this series, we're at ~155 leaking test suites. Naturally, I've
> already got v5 in the pipeline, which brings us down to ~120.
>
> The series is built on top of 406f326d27 (The second batch, 2024-08-01)
> with ps/leakfixes-part-3 at f30bfafcd4 (commit-reach: fix trivial memory
> leak when computing reachability, 2024-08-01) merged into it.
>
> Thanks!
>
> Patrick

Thanks Patrick, most of these fixes make sense to me! I appreciate that
even the minor changes are accompanied by context.

Cheers,
James

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH 00/22] Memory leak fixes (pt.4)
  2024-08-06  8:59 [PATCH 00/22] Memory leak fixes (pt.4) Patrick Steinhardt
                   ` (22 preceding siblings ...)
  2024-08-07  9:27 ` [PATCH 00/22] Memory leak fixes (pt.4) James Liu
@ 2024-08-07 16:59 ` Junio C Hamano
  2024-08-07 17:03   ` Patrick Steinhardt
  2024-08-08 13:04 ` [PATCH v2 " Patrick Steinhardt
                   ` (2 subsequent siblings)
  26 siblings, 1 reply; 146+ messages in thread
From: Junio C Hamano @ 2024-08-07 16:59 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

Patrick Steinhardt <ps@pks.im> writes:

> The series is built on top of 406f326d27 (The second batch, 2024-08-01)
> with ps/leakfixes-part-3 at f30bfafcd4 (commit-reach: fix trivial memory
> leak when computing reachability, 2024-08-01) merged into it.

A quick question.  Is it on your radar that transport_get() leaks
the helper name when "foo::bar" is given as a remote?

  https://github.com/git/git/actions/runs/10274435719/job/28431161208#step:5:893

If not, I'll handle it separately, whose fix should look something
like the attached.

Thanks.

 transport.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git c/transport.c w/transport.c
index 12cc5b4d96..13bf8183b7 100644
--- c/transport.c
+++ w/transport.c
@@ -1115,6 +1115,7 @@ static struct transport_vtable builtin_smart_vtable = {
 struct transport *transport_get(struct remote *remote, const char *url)
 {
 	const char *helper;
+	char *helper_to_free = NULL;
 	const char *p;
 	struct transport *ret = xcalloc(1, sizeof(*ret));
 
@@ -1139,10 +1140,11 @@ struct transport *transport_get(struct remote *remote, const char *url)
 	while (is_urlschemechar(p == url, *p))
 		p++;
 	if (starts_with(p, "::"))
-		helper = xstrndup(url, p - url);
+		helper_to_free = helper = xstrndup(url, p - url);
 
 	if (helper) {
 		transport_helper_init(ret, helper);
+		free(helper_to_free);
 	} else if (starts_with(url, "rsync:")) {
 		die(_("git-over-rsync is no longer supported"));
 	} else if (url_is_local_not_ssh(url) && is_file(url) && is_bundle(url, 1)) {


^ permalink raw reply related	[flat|nested] 146+ messages in thread

* Re: [PATCH 00/22] Memory leak fixes (pt.4)
  2024-08-07 16:59 ` Junio C Hamano
@ 2024-08-07 17:03   ` Patrick Steinhardt
  2024-08-08  0:32     ` Junio C Hamano
  0 siblings, 1 reply; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-07 17:03 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 963 bytes --]

On Wed, Aug 07, 2024 at 09:59:39AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > The series is built on top of 406f326d27 (The second batch, 2024-08-01)
> > with ps/leakfixes-part-3 at f30bfafcd4 (commit-reach: fix trivial memory
> > leak when computing reachability, 2024-08-01) merged into it.
> 
> A quick question.  Is it on your radar that transport_get() leaks
> the helper name when "foo::bar" is given as a remote?
> 
>   https://github.com/git/git/actions/runs/10274435719/job/28431161208#step:5:893
> 
> If not, I'll handle it separately, whose fix should look something
> like the attached.
> 
> Thanks.

Yeah, it's in part 5 [1], 97613b9cb9 (transport-helper: fix leaking
helper name, 2024-05-27). Feel free to handle it separately though, I'll
wait for part 4 to land first anyway, which likely takes a couple of
days.

Patrick

[1]: https://gitlab.com/gitlab-org/git/-/tree/pks-leak-fixes-pt5

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH 00/22] Memory leak fixes (pt.4)
  2024-08-07 17:03   ` Patrick Steinhardt
@ 2024-08-08  0:32     ` Junio C Hamano
  0 siblings, 0 replies; 146+ messages in thread
From: Junio C Hamano @ 2024-08-08  0:32 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

Patrick Steinhardt <ps@pks.im> writes:

> On Wed, Aug 07, 2024 at 09:59:39AM -0700, Junio C Hamano wrote:
>> Patrick Steinhardt <ps@pks.im> writes:
>> 
>> > The series is built on top of 406f326d27 (The second batch, 2024-08-01)
>> > with ps/leakfixes-part-3 at f30bfafcd4 (commit-reach: fix trivial memory
>> > leak when computing reachability, 2024-08-01) merged into it.
>> 
>> A quick question.  Is it on your radar that transport_get() leaks
>> the helper name when "foo::bar" is given as a remote?
>> 
>>   https://github.com/git/git/actions/runs/10274435719/job/28431161208#step:5:893
>> 
>> If not, I'll handle it separately, whose fix should look something
>> like the attached.
>> 
>> Thanks.
>
> Yeah, it's in part 5 [1], 97613b9cb9 (transport-helper: fix leaking
> helper name, 2024-05-27). Feel free to handle it separately though, I'll
> wait for part 4 to land first anyway, which likely takes a couple of
> days.

OK, will do, as this seems to break CI.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH 06/22] read-cache: fix leaking hashfile when writing index fails
  2024-08-07  7:01   ` James Liu
@ 2024-08-08  5:04     ` Patrick Steinhardt
  0 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-08  5:04 UTC (permalink / raw)
  To: James Liu; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 1354 bytes --]

On Wed, Aug 07, 2024 at 05:01:17PM +1000, James Liu wrote:
> On Tue Aug 6, 2024 at 7:00 PM AEST, Patrick Steinhardt wrote:
> > Refactor the code to have a common exit path where we can free this and
> > other allocated memory. While at it, refactor our use of `strbuf`s such
> > that we reuse the same buffer to avoid some unneeded allocations.
> >
> > @@ -3105,7 +3117,14 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
> >  	trace2_data_intmax("index", the_repository, "write/cache_nr",
> >  			   istate->cache_nr);
> >  
> > -	return 0;
> > +	ret = 0;
> > +
> > +out:
> > +	if (f)
> > +		free_hashfile(f);
> > +	strbuf_release(&sb);
> > +	free(ieot);
> > +	return ret;
> >  }
> 
> Is it generally a pattern in Git to use `goto <label>` instead of
> returns when there are multiple return points in a function? We're also
> performing cleanup duties here and in most of those scenarios but there
> are some cases like `reftable_be_pack_refs()` where the goto simply
> collapses multiple return points into a single path.

Yes, that's usually how we avoid repetitive cleanup code for each of the
return paths. `reftable_be_pack_refs()` is a bit more on the curious
side as the common exit path doesn't really do anything helpful. That one
could just as well have used plain returns.
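
As a minimal sketch of that single-exit pattern (this is not code from the
series; `combined_length()` and its buffers are hypothetical names made up
for illustration), every failure path jumps to one label that frees
whatever has been allocated so far:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not a function from git: copies both inputs to
 * the heap and returns their combined length, or -1 on any failure.
 */
int combined_length(const char *a, const char *b)
{
	char *copy_a = NULL, *copy_b = NULL;
	int ret = -1;

	if (!a || !b)
		goto out;	/* nothing allocated yet; free(NULL) is a no-op */

	copy_a = malloc(strlen(a) + 1);
	if (!copy_a)
		goto out;
	strcpy(copy_a, a);

	copy_b = malloc(strlen(b) + 1);
	if (!copy_b)
		goto out;	/* copy_a is still released below */
	strcpy(copy_b, b);

	ret = (int)(strlen(copy_a) + strlen(copy_b));

out:
	/* Common exit path: runs for success and for every error. */
	free(copy_a);
	free(copy_b);
	return ret;
}
```

Because all pointers start out as NULL, the cleanup code does not need to
know which step failed.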

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH 08/22] config: fix leaking comment character config
  2024-08-07  7:11   ` James Liu
@ 2024-08-08  5:04     ` Patrick Steinhardt
  2024-08-08 15:54       ` Junio C Hamano
  0 siblings, 1 reply; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-08  5:04 UTC (permalink / raw)
  To: James Liu; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 1917 bytes --]

On Wed, Aug 07, 2024 at 05:11:28PM +1000, James Liu wrote:
> On Tue Aug 6, 2024 at 7:00 PM AEST, Patrick Steinhardt wrote:
> > Refactor the code so that we initialize the value with another array.
> > This allows us to free the value in case the string is not pointing to
> > that constant array anymore.
> >
> > diff --git a/environment.c b/environment.c
> > index 5cea2c9f54..8297c6e37b 100644
> > --- a/environment.c
> > +++ b/environment.c
> > @@ -113,7 +113,8 @@ int protect_ntfs = PROTECT_NTFS_DEFAULT;
> >   * The character that begins a commented line in user-editable file
> >   * that is subject to stripspace.
> >   */
> > -const char *comment_line_str = "#";
> > +const char comment_line_str_default[] = "#";
> > +const char *comment_line_str = comment_line_str_default;
> >  int auto_comment_line_char;
> >  
> >  /* Parallel index stat data preload? */
> 
> Is my understanding correct that `comment_line_str` is now just a
> pointer to the `comment_line_str_default` array, and thus can be freed
> once we're done with it?

Not quite. By default, `comment_line_str` also points to
`comment_line_str_default`, which is a string constant and thus neither
of these variables can be free'd. But what this split allows us to do is
to check whether `comment_line_str` has changed from the default, and
thus we can conditionally free it when it does not point to the default
value anymore.

Now that I revisit this commit I'm not quite happy with it anymore. We
still need to have the cast, which is somewhat awkward. I think the
better solution is to instead have a `comment_line_str_allocated`
variable that is non-constant. I'll adapt the code accordingly.

An even better solution would be to have `struct strbuf` provide an
initializer that populates it with a string constant. But that feels
like a larger undertaking, so I'll leave that for the future.
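
A rough sketch of the "separate allocated variable" idea, using
hypothetical names (`comment_str`, `comment_str_allocated`,
`set_comment_str()`) rather than the real ones in environment.c: readers
only ever look at the const pointer, while the non-const twin records
whether there is anything to free.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* What consumers read. Starts out pointing at a string constant. */
const char *comment_str = "#";
/* Owning pointer; stays NULL while the default is in use. */
static char *comment_str_allocated;

/* Hypothetical setter: returns 0 on success, -1 if allocation fails. */
int set_comment_str(const char *value)
{
	char *copy = malloc(strlen(value) + 1);

	if (!copy)
		return -1;
	strcpy(copy, value);

	/* Only the owning pointer is ever passed to free(). */
	free(comment_str_allocated);
	comment_str = comment_str_allocated = copy;
	return 0;
}
```

No cast is needed anywhere, because the constant is never reachable
through the pointer that gets freed.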

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH 09/22] builtin/rebase: fix leaking `commit.gpgsign` value
  2024-08-07  7:32   ` James Liu
@ 2024-08-08  5:05     ` Patrick Steinhardt
  0 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-08  5:05 UTC (permalink / raw)
  To: James Liu; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 2956 bytes --]

On Wed, Aug 07, 2024 at 05:32:25PM +1000, James Liu wrote:
> On Tue Aug 6, 2024 at 7:00 PM AEST, Patrick Steinhardt wrote:
> > In `get_replay_opts()`, we unconditionally override the `gpg_sign` field
> > that already got populated by `sequencer_init_config()` in case the user
> > has "commit.gpgsign" set in their config. It is kind of dubious whether
> > this is the correct thing to do or a bug. What is clear though is that
> > this creates a memory leak.
> >
> > Let's mark this assignment with a TODO comment to figure out whether
> > this needs to be fixed or not. Meanwhile though, let's plug the memory
> > leak.
> >
> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
> > ---
> >  builtin/rebase.c              | 8 ++++++++
> >  sequencer.c                   | 1 +
> >  t/t3404-rebase-interactive.sh | 1 +
> >  t/t3435-rebase-gpg-sign.sh    | 1 +
> >  t/t7030-verify-tag.sh         | 1 +
> >  5 files changed, 12 insertions(+)
> >
> > diff --git a/builtin/rebase.c b/builtin/rebase.c
> > index e3a8e74cfc..f65316a023 100644
> > --- a/builtin/rebase.c
> > +++ b/builtin/rebase.c
> > @@ -186,7 +186,15 @@ static struct replay_opts get_replay_opts(const struct rebase_options *opts)
> >  	replay.committer_date_is_author_date =
> >  					opts->committer_date_is_author_date;
> >  	replay.ignore_date = opts->ignore_date;
> > +
> > +	/*
> > +	 * TODO: Is it really intentional that we unconditionally override
> > +	 * `replay.gpg_sign` even if it has already been initialized via the
> > +	 * configuration?
> > +	 */
> > +	free(replay.gpg_sign);
> >  	replay.gpg_sign = xstrdup_or_null(opts->gpg_sign_opt);
> > +
> >  	replay.reflog_action = xstrdup(opts->reflog_action);
> >  	if (opts->strategy)
> >  		replay.strategy = xstrdup_or_null(opts->strategy);
> > diff --git a/sequencer.c b/sequencer.c
> > index 0291920f0b..cade9b0ca8 100644
> > --- a/sequencer.c
> > +++ b/sequencer.c
> > @@ -303,6 +303,7 @@ static int git_sequencer_config(const char *k, const char *v,
> >  	}
> >  
> >  	if (!strcmp(k, "commit.gpgsign")) {
> > +		free(opts->gpg_sign);
> >  		opts->gpg_sign = git_config_bool(k, v) ? xstrdup("") : NULL;
> >  		return 0;
> >  	}
> 
> It looks like this free'ing would be managed by the caller by invoking
> `replay_opts_release()`, but it's not being done consistently.
> 
> For example, `do_interactive_rebase()` invokes `replay_opts_release()`,
> but `run_sequencer_rebase()` does not. Would it be better to address the
> leak here?

The problem here isn't that `replay_opts_release()` doesn't free the
values, or that the function isn't called consistently. The problem
rather is that we assign `opts->gpg_sign` even though it may already be
assigned an allocated string. Consequently, `replay_opts_release()`
doesn't even have a chance to free the old value because it cannot see
it anymore.
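
To illustrate with a stripped-down sketch (the `struct opts` below and its
helpers are hypothetical stand-ins, not the real `struct replay_opts`
API): once a field is overwritten, the release function can only see the
latest value, so the old one has to be freed at the point of assignment.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an options struct owning one string. */
struct opts {
	char *gpg_sign;
};

/* Replace the stored value; `value` may be NULL to unset it. */
int opts_set_gpg_sign(struct opts *o, const char *value)
{
	char *copy = NULL;

	if (value) {
		copy = malloc(strlen(value) + 1);
		if (!copy)
			return -1;
		strcpy(copy, value);
	}

	/*
	 * Without this free(), a second call would drop the only
	 * pointer to the first allocation and leak it.
	 */
	free(o->gpg_sign);
	o->gpg_sign = copy;
	return 0;
}

/* Can only free what the field points to right now. */
void opts_release(struct opts *o)
{
	free(o->gpg_sign);
	o->gpg_sign = NULL;
}
```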

I'll massage the commit message a bit to clarify this.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH 13/22] builtin/fast-export: plug leaking tag names
  2024-08-07  8:31   ` James Liu
@ 2024-08-08  5:05     ` Patrick Steinhardt
  0 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-08  5:05 UTC (permalink / raw)
  To: James Liu; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 1377 bytes --]

On Wed, Aug 07, 2024 at 06:31:33PM +1000, James Liu wrote:
> On Tue Aug 6, 2024 at 7:00 PM AEST, Patrick Steinhardt wrote:
> > Refactor the code to make the lists we put those names into duplicate
> > the memory. This allows us to properly free the string as required and
> > thus plugs the memory leak.
> >
> > While this requires us to allocate more data overall, it shouldn't be
> > all that bad given that the number of allocations corresponds with the
> > number of command line parameters, which typically aren't all that many.
> 
> Ahh so using the `STRING_LIST_INIT_DUP` initialiser means that every
> time we call `string_list_append()` on the list, we retain ownership of
> the string and the list gets its own copy.
> 
> That means we're able to free our own copy later on.

Yes, exactly. I think that we really should change the naming though.
I've repeatedly seen the pattern that people think initializing the list
with `_NODUP` would transfer ownership of inserted strings. It does not
though, it simply assumes that the strings will be kept alive by the
caller.

This is made worse by the fact that we have `strvec_insert_nodup()`,
which _does_ transfer ownership. We're thus using two different meanings
for "nodup", so I totally get why people are confused by this interface.
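
The difference can be sketched with a toy list (a hypothetical
`struct name_list`, deliberately not the real `string_list` API): in
"dup" mode the list copies and later frees each string, in "nodup" mode
it merely borrows the caller's pointer and never frees it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy list for illustration only; not git's string_list. */
struct name_list {
	char **items;
	size_t nr, alloc;
	int dup;	/* non-zero: the list copies and owns each string */
};

int name_list_append(struct name_list *l, const char *s)
{
	char *stored;

	if (l->nr == l->alloc) {
		size_t n = l->alloc ? 2 * l->alloc : 4;
		char **grown = realloc(l->items, n * sizeof(*grown));
		if (!grown)
			return -1;
		l->items = grown;
		l->alloc = n;
	}

	if (l->dup) {
		/* The list gets its own copy; the caller keeps theirs. */
		stored = malloc(strlen(s) + 1);
		if (!stored)
			return -1;
		strcpy(stored, s);
	} else {
		/* Borrowed: the caller must keep `s` alive. */
		stored = (char *)s;
	}

	l->items[l->nr++] = stored;
	return 0;
}

void name_list_clear(struct name_list *l)
{
	size_t i;

	if (l->dup)
		for (i = 0; i < l->nr; i++)
			free(l->items[i]);
	/* In nodup mode the strings are not ours to free. */
	free(l->items);
	l->items = NULL;
	l->nr = l->alloc = 0;
}
```

With the "dup" variant a caller can free its own string right after
appending it, which is what makes the leak fix possible.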

I'll leave that for a separate series though.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH 18/22] builtin/format-patch: fix various trivial memory leaks
  2024-08-07  8:51   ` James Liu
@ 2024-08-08  5:05     ` Patrick Steinhardt
  0 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-08  5:05 UTC (permalink / raw)
  To: James Liu; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 1618 bytes --]

On Wed, Aug 07, 2024 at 06:51:52PM +1000, James Liu wrote:
> > diff --git a/builtin/log.c b/builtin/log.c
> > index a73a767606..ff997a0d0e 100644
> > --- a/builtin/log.c
> > +++ b/builtin/log.c
> > @@ -2023,6 +2024,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
> >  	const char *rfc = NULL;
> >  	int creation_factor = -1;
> >  	const char *signature = git_version_string;
> > +	char *signature_to_free = NULL;
> >  	char *signature_file_arg = NULL;
> >  	struct keep_callback_data keep_callback_data = {
> >  		.cfg = &cfg,
> > @@ -2443,7 +2445,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
> >  
> >  		if (strbuf_read_file(&buf, signature_file, 128) < 0)
> >  			die_errno(_("unable to read signature file '%s'"), signature_file);
> > -		signature = strbuf_detach(&buf, NULL);
> > +		signature = signature_to_free = strbuf_detach(&buf, NULL);
> 
> Do I understand this correctly, that the multiple assignment here allows
> us to maintain a reference to the pointer returned by `strbuf_detach()`
> in `signature_to_free`, and we do this because `signature` can take on a
> different value below?

Not only below, but also its default value is `git_version_string`,
which is a string constant. So we use the multiple assignment such that
we can avoid freeing `signature`, which may contain string constants,
and unconditionally free `signature_to_free` because that variable
always holds an allocated string or a `NULL` pointer.

Patrick

> >  	} else if (cfg.signature) {
> >  		signature = cfg.signature;
> >  	}
> 
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH 19/22] userdiff: fix leaking memory for configured diff drivers
  2024-08-07  9:25   ` James Liu
@ 2024-08-08  5:05     ` Patrick Steinhardt
  2024-08-08 16:05       ` Junio C Hamano
  0 siblings, 1 reply; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-08  5:05 UTC (permalink / raw)
  To: James Liu; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 1542 bytes --]

On Wed, Aug 07, 2024 at 07:25:51PM +1000, James Liu wrote:
> > Refactor the code such that we have two pointers for each of these
> > strings: one that holds the value as accessed by other subsystems, and
> > one that points to the same string in case it has been allocated. Like
> > this, we can safely free the second pointer and thus plug those memory
> > leaks.
> >
> > diff --git a/userdiff.c b/userdiff.c
> > index c4ebb9ff73..989629149f 100644
> > --- a/userdiff.c
> > +++ b/userdiff.c
> > @@ -399,8 +399,11 @@ static struct userdiff_driver *userdiff_find_by_namelen(const char *name, size_t
> >  static int parse_funcname(struct userdiff_funcname *f, const char *k,
> >  		const char *v, int cflags)
> >  {
> > -	if (git_config_string((char **) &f->pattern, k, v) < 0)
> > +	f->pattern = NULL;
> > +	FREE_AND_NULL(f->pattern_owned);
> > +	if (git_config_string(&f->pattern_owned, k, v) < 0)
> >  		return -1;
> > +	f->pattern = f->pattern_owned;
> >  	f->cflags = cflags;
> >  	return 0;
> >  }
> 
> I'm not sure if I understand this change completely. We don't seem to be
> using `pattern_owned` (and the other *_owned) fields differently from
> their regular counterparts.
> 
> Is it because we can't do the following?
> 
>         FREE_AND_NULL((char **)f->pattern);

Yup. We have a bunch of statically defined userdiff drivers, all of
which use string constants as patterns. We thus cannot reliably free
those and instead have to track the allocated strings in a separate
variable.
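
As a hedged sketch of why the second field is needed (the `struct driver`
below, its field layout and the helper names are assumptions made for
illustration, not the actual userdiff.h definitions): a built-in driver
holds a string constant, a configured one holds heap memory, and only
the `_owned` pointer tells the two apart.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative stand-in for a diff driver; not the real struct. */
struct driver {
	const char *pattern;	/* read by consumers; may be a constant */
	char *pattern_owned;	/* non-NULL only for heap-allocated values */
};

/* A statically defined driver: its pattern must never be freed. */
static struct driver builtin_driver = { "^[A-Za-z_].*\\(", NULL };

/* Hypothetical config hook: store a heap copy of `value`. */
int driver_set_pattern(struct driver *d, const char *value)
{
	char *copy = malloc(strlen(value) + 1);

	if (!copy)
		return -1;
	strcpy(copy, value);

	free(d->pattern_owned);		/* safe: NULL or heap memory */
	d->pattern_owned = copy;
	d->pattern = d->pattern_owned;
	return 0;
}

void driver_release(struct driver *d)
{
	free(d->pattern_owned);		/* never touches a constant */
	d->pattern_owned = NULL;
	d->pattern = NULL;
}
```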

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH 00/22] Memory leak fixes (pt.4)
  2024-08-07  9:27 ` [PATCH 00/22] Memory leak fixes (pt.4) James Liu
@ 2024-08-08  5:05   ` Patrick Steinhardt
  2024-08-08  6:00     ` James Liu
  0 siblings, 1 reply; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-08  5:05 UTC (permalink / raw)
  To: James Liu; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 867 bytes --]

On Wed, Aug 07, 2024 at 07:27:32PM +1000, James Liu wrote:
> On Tue Aug 6, 2024 at 6:59 PM AEST, Patrick Steinhardt wrote:
> > Hi,
> >
> > the third set of memory leak fixes was merged to `next`, so this is the
> > next part of more or less random memory leak fixes all over the place.
> > With this series, we're at ~155 leaking test suites. Naturally, I've
> > already got v5 in the pipeline, which brings us down to ~120.
> >
> > The series is built on top of 406f326d27 (The second batch, 2024-08-01)
> > with ps/leakfixes-part-3 at f30bfafcd4 (commit-reach: fix trivial memory
> > leak when computing reachability, 2024-08-01) merged into it.
> >
> > Thanks!
> >
> > Patrick
> 
> Thanks Patrick, most of these fixes make sense to me! I appreciate that
> even the minor changes are accompanied by context.

Thanks for your review!

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH 00/22] Memory leak fixes (pt.4)
  2024-08-08  5:05   ` Patrick Steinhardt
@ 2024-08-08  6:00     ` James Liu
  0 siblings, 0 replies; 146+ messages in thread
From: James Liu @ 2024-08-08  6:00 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

On Thu Aug 8, 2024 at 3:05 PM AEST, Patrick Steinhardt wrote:
> On Wed, Aug 07, 2024 at 07:27:32PM +1000, James Liu wrote:
> > On Tue Aug 6, 2024 at 6:59 PM AEST, Patrick Steinhardt wrote:
> > > Hi,
> > >
> > > the third set of memory leak fixes was merged to `next`, so this is the
> > > next part of more or less random memory leak fixes all over the place.
> > > With this series, we're at ~155 leaking test suites. Naturally, I've
> > > already got v5 in the pipeline, which brings us down to ~120.
> > >
> > > The series is built on top of 406f326d27 (The second batch, 2024-08-01)
> > > with ps/leakfixes-part-3 at f30bfafcd4 (commit-reach: fix trivial memory
> > > leak when computing reachability, 2024-08-01) merged into it.
> > >
> > > Thanks!
> > >
> > > Patrick
> > 
> > Thanks Patrick, most of these fixes make sense to me! I appreciate that
> > even the minor changes are accompanied by context.
>
> Thanks for your review!
>
> Patrick

Thanks for responding to my questions! I don't have anything further to
add.

Cheers,
James

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH 09/22] builtin/rebase: fix leaking `commit.gpgsign` value
  2024-08-06  9:00 ` [PATCH 09/22] builtin/rebase: fix leaking `commit.gpgsign` value Patrick Steinhardt
  2024-08-07  7:32   ` James Liu
@ 2024-08-08 10:07   ` Phillip Wood
  2024-08-08 12:58     ` Patrick Steinhardt
  1 sibling, 1 reply; 146+ messages in thread
From: Phillip Wood @ 2024-08-08 10:07 UTC (permalink / raw)
  To: Patrick Steinhardt, git; +Cc: James Liu

Hi Patrick

On 06/08/2024 10:00, Patrick Steinhardt wrote:
> @@ -186,7 +186,15 @@ static struct replay_opts get_replay_opts(const struct rebase_options *opts)
>   	replay.committer_date_is_author_date =
>   					opts->committer_date_is_author_date;
>   	replay.ignore_date = opts->ignore_date;
> +
> +	/*
> +	 * TODO: Is it really intentional that we unconditionally override
> +	 * `replay.gpg_sign` even if it has already been initialized via the
> +	 * configuration?
> +	 */
> +	free(replay.gpg_sign);
>   	replay.gpg_sign = xstrdup_or_null(opts->gpg_sign_opt);
> +

The code that handles "-S" could certainly be clearer. The value 
returned from the config is either "" or NULL, not a key name. In 
cmd_main() options.gpg_sign_opt is initialized by rebase_config(), we 
set gpg_sign to "" if options.gpg_sign_opt is non-NULL, free 
options.gpg_sign_opt and then copy gpg_sign back into 
options.gpg_sign_opt after parsing the command line so we're not losing 
anything by unconditionally copying it here. The code changes look good, 
though I'm not sure we need to add the blank lines. It's always nice to 
see more tests marked as leak-free especially a big file like t3404.

Best Wishes

Phillip

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH 15/22] sequencer: release todo list on error paths
  2024-08-06  9:00 ` [PATCH 15/22] sequencer: release todo list on error paths Patrick Steinhardt
@ 2024-08-08 10:08   ` Phillip Wood
  2024-08-08 16:31     ` Junio C Hamano
  0 siblings, 1 reply; 146+ messages in thread
From: Phillip Wood @ 2024-08-08 10:08 UTC (permalink / raw)
  To: Patrick Steinhardt, git

Hi Patrick

On 06/08/2024 10:00, Patrick Steinhardt wrote:
> We're not releasing the `todo_list` in `sequencer_pick_revisions()` when
> hitting an error path. Restructure the function to have a common exit
> path such that we can easily clean up the list and thus plug this memory
> leak.

This looks good, I've left a couple of small formatting comments below 
if you do end up re-rolling.

> @@ -5506,11 +5508,14 @@ int sequencer_pick_revisions(struct repository *r,
>   				enum object_type type = oid_object_info(r,
>   									&oid,
>   									NULL);
> -				return error(_("%s: can't cherry-pick a %s"),
> +				res = error(_("%s: can't cherry-pick a %s"),
>   					name, type_name(type));

This line needs re-indenting to match the changes above.

> +				goto out;
>   			}
> -		} else
> -			return error(_("%s: bad revision"), name);
> +		} else {
> +			res = error(_("%s: bad revision"), name);
> +			goto out;
> +		}
>   	}
>   
>   	/*
> @@ -5525,14 +5530,23 @@ int sequencer_pick_revisions(struct repository *r,
>   	    opts->revs->no_walk &&
>   	    !opts->revs->cmdline.rev->flags) {
>   		struct commit *cmit;
> -		if (prepare_revision_walk(opts->revs))
> -			return error(_("revision walk setup failed"));
> +

This whitespace change is good as it means we now have an empty line 
between the variable declarations and the code, the others I'm not 
fussed about either way.

Best Wishes

Phillip

> +		if (prepare_revision_walk(opts->revs)) {
> +			res = error(_("revision walk setup failed"));
> +			goto out;
> +		}
> +
>   		cmit = get_revision(opts->revs);
> -		if (!cmit)
> -			return error(_("empty commit set passed"));
> +		if (!cmit) {
> +			res = error(_("empty commit set passed"));
> +			goto out;
> +		}
> +
>   		if (get_revision(opts->revs))
>   			BUG("unexpected extra commit from walk");
> -		return single_pick(r, cmit, opts);
> +
> +		res = single_pick(r, cmit, opts);
> +		goto out;
>   	}
>   
>   	/*
> @@ -5542,16 +5556,30 @@ int sequencer_pick_revisions(struct repository *r,
>   	 */
>   
>   	if (walk_revs_populate_todo(&todo_list, opts) ||
> -			create_seq_dir(r) < 0)
> -		return -1;
> -	if (repo_get_oid(r, "HEAD", &oid) && (opts->action == REPLAY_REVERT))
> -		return error(_("can't revert as initial commit"));
> -	if (save_head(oid_to_hex(&oid)))
> -		return -1;
> -	if (save_opts(opts))
> -		return -1;
> +			create_seq_dir(r) < 0) {
> +		res = -1;
> +		goto out;
> +	}
> +
> +	if (repo_get_oid(r, "HEAD", &oid) && (opts->action == REPLAY_REVERT)) {
> +		res = error(_("can't revert as initial commit"));
> +		goto out;
> +	}
> +
> +	if (save_head(oid_to_hex(&oid))) {
> +		res = -1;
> +		goto out;
> +	}
> +
> +	if (save_opts(opts)) {
> +		res = -1;
> +		goto out;
> +	}
> +
>   	update_abort_safety_file();
>   	res = pick_commits(r, &todo_list, opts);
> +
> +out:
>   	todo_list_release(&todo_list);
>   	return res;
>   }
> diff --git a/t/t3510-cherry-pick-sequence.sh b/t/t3510-cherry-pick-sequence.sh
> index 7eb52b12ed..93c725bac3 100755
> --- a/t/t3510-cherry-pick-sequence.sh
> +++ b/t/t3510-cherry-pick-sequence.sh
> @@ -12,6 +12,7 @@ test_description='Test cherry-pick continuation features
>   
>   '
>   
> +TEST_PASSES_SANITIZE_LEAK=true
>   . ./test-lib.sh
>   
>   # Repeat first match 10 times

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH 09/22] builtin/rebase: fix leaking `commit.gpgsign` value
  2024-08-08 10:07   ` Phillip Wood
@ 2024-08-08 12:58     ` Patrick Steinhardt
  0 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-08 12:58 UTC (permalink / raw)
  To: phillip.wood; +Cc: git, James Liu

[-- Attachment #1: Type: text/plain, Size: 1383 bytes --]

On Thu, Aug 08, 2024 at 11:07:53AM +0100, Phillip Wood wrote:
> Hi Patrick
> 
> On 06/08/2024 10:00, Patrick Steinhardt wrote:
> > @@ -186,7 +186,15 @@ static struct replay_opts get_replay_opts(const struct rebase_options *opts)
> >   	replay.committer_date_is_author_date =
> >   					opts->committer_date_is_author_date;
> >   	replay.ignore_date = opts->ignore_date;
> > +
> > +	/*
> > +	 * TODO: Is it really intentional that we unconditionally override
> > +	 * `replay.gpg_sign` even if it has already been initialized via the
> > +	 * configuration?
> > +	 */
> > +	free(replay.gpg_sign);
> >   	replay.gpg_sign = xstrdup_or_null(opts->gpg_sign_opt);
> > +
> 
> The code that handles "-S" could certainly be clearer. The value returned
> from the config is either "" or NULL, not a key name. In cmd_main()
> options.gpg_sign_opt is initialized by rebase_config(), we set gpg_sign to
> "" if options.gpg_sign_opt is non-NULL, free options.gpg_sign_opt and then
> copy gpg_sign back into options.gpg_sign_opt after parsing the command line
> so we're not losing anything by unconditionally copying it here. The code
> changes look good, though I'm not sure we need to add the blank lines. It's
> always nice to see more tests marked as leak-free especially a big file like
> t3404.

Okay. In that case I'll just drop the comment. Thanks!

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 146+ messages in thread

* [PATCH v2 00/22] Memory leak fixes (pt.4)
  2024-08-06  8:59 [PATCH 00/22] Memory leak fixes (pt.4) Patrick Steinhardt
                   ` (23 preceding siblings ...)
  2024-08-07 16:59 ` Junio C Hamano
@ 2024-08-08 13:04 ` Patrick Steinhardt
  2024-08-08 13:04   ` [PATCH v2 01/22] remote: plug memory leak when aliasing URLs Patrick Steinhardt
                     ` (23 more replies)
  2024-08-13  9:31 ` [PATCH v3 " Patrick Steinhardt
  2024-08-14  6:51 ` [PATCH v4 " Patrick Steinhardt
  26 siblings, 24 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-08 13:04 UTC (permalink / raw)
  To: git; +Cc: James Liu, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 12399 bytes --]

Hi,

this is the second version of my fourth batch of patches that fix
various memory leaks.

Changes compared to v1:

  - Adapt the memory leak fix for comment characters to instead use a
    `comment_line_str_allocated` variable.

  - Clarify some commit messages.

  - Drop the TODO comment about `commit.gpgsign`. Turns out that this is
    working as intended, as explained by Phillip.

Thanks!

Patrick

Patrick Steinhardt (22):
  remote: plug memory leak when aliasing URLs
  git: fix leaking system paths
  object-file: fix memory leak when reading corrupted headers
  object-name: fix leaking symlink paths in object context
  bulk-checkin: fix leaking state TODO
  read-cache: fix leaking hashfile when writing index fails
  submodule-config: fix leaking name enrty when traversing submodules
  config: fix leaking comment character config
  builtin/rebase: fix leaking `commit.gpgsign` value
  builtin/notes: fix leaking `struct notes_tree` when merging notes
  builtin/fast-import: plug trivial memory leaks
  builtin/fast-export: fix leaking diff options
  builtin/fast-export: plug leaking tag names
  merge-ort: unconditionally release attributes index
  sequencer: release todo list on error paths
  unpack-trees: clear index when not propagating it
  diff: fix leak when parsing invalid ignore regex option
  builtin/format-patch: fix various trivial memory leaks
  userdiff: fix leaking memory for configured diff drivers
  builtin/log: fix leak when showing converted blob contents
  diff: free state populated via options
  builtin/diff: free symmetric diff members

 builtin/commit.c                      |  7 +-
 builtin/diff.c                        | 10 ++-
 builtin/fast-export.c                 | 19 ++++--
 builtin/fast-import.c                 |  8 ++-
 builtin/log.c                         | 13 +++-
 builtin/notes.c                       |  9 ++-
 builtin/rebase.c                      |  1 +
 bulk-checkin.c                        |  2 +
 config.c                              |  4 +-
 csum-file.c                           |  2 +-
 csum-file.h                           | 10 +++
 diff.c                                | 16 ++++-
 environment.c                         |  1 +
 environment.h                         |  1 +
 git.c                                 | 12 +++-
 merge-ort.c                           |  3 +-
 object-file.c                         |  1 +
 object-name.c                         |  1 +
 range-diff.c                          |  6 +-
 read-cache.c                          | 97 ++++++++++++++++-----------
 remote.c                              |  2 +
 sequencer.c                           | 67 ++++++++++++------
 submodule-config.c                    | 18 +++--
 t/t0210-trace2-normal.sh              |  2 +-
 t/t1006-cat-file.sh                   |  1 +
 t/t1050-large.sh                      |  1 +
 t/t1450-fsck.sh                       |  1 +
 t/t1601-index-bogus.sh                |  2 +
 t/t2107-update-index-basic.sh         |  1 +
 t/t3310-notes-merge-manual-resolve.sh |  1 +
 t/t3311-notes-merge-fanout.sh         |  1 +
 t/t3404-rebase-interactive.sh         |  1 +
 t/t3435-rebase-gpg-sign.sh            |  1 +
 t/t3507-cherry-pick-conflict.sh       |  1 +
 t/t3510-cherry-pick-sequence.sh       |  1 +
 t/t3705-add-sparse-checkout.sh        |  1 +
 t/t4013-diff-various.sh               |  1 +
 t/t4014-format-patch.sh               |  1 +
 t/t4018-diff-funcname.sh              |  1 +
 t/t4030-diff-textconv.sh              |  2 +
 t/t4042-diff-textconv-caching.sh      |  2 +
 t/t4048-diff-combined-binary.sh       |  1 +
 t/t4064-diff-oidfind.sh               |  2 +
 t/t4065-diff-anchored.sh              |  1 +
 t/t4068-diff-symmetric-merge-base.sh  |  1 +
 t/t4069-remerge-diff.sh               |  1 +
 t/t4108-apply-threeway.sh             |  1 +
 t/t4209-log-pickaxe.sh                |  2 +
 t/t6421-merge-partial-clone.sh        |  1 +
 t/t6428-merge-conflicts-sparse.sh     |  1 +
 t/t7008-filter-branch-null-sha1.sh    |  1 +
 t/t7030-verify-tag.sh                 |  1 +
 t/t7817-grep-sparse-checkout.sh       |  1 +
 t/t9300-fast-import.sh                |  1 +
 t/t9304-fast-import-marks.sh          |  2 +
 t/t9351-fast-export-anonymize.sh      |  1 +
 unpack-trees.c                        |  2 +
 userdiff.c                            | 38 ++++++++---
 userdiff.h                            |  4 ++
 59 files changed, 288 insertions(+), 106 deletions(-)

Range-diff against v1:
 1:  6e2fcd85c7 =  1:  2afa51f9ff remote: plug memory leak when aliasing URLs
 2:  9574995a24 =  2:  324140e4fd git: fix leaking system paths
 3:  f7e67d02d2 =  3:  43a38a2281 object-file: fix memory leak when reading corrupted headers
 4:  a9caaaed55 =  4:  9d3dc145e8 object-name: fix leaking symlink paths in object context
 5:  794af66103 =  5:  454139e7a4 bulk-checkin: fix leaking state TODO
 6:  2810cada0a =  6:  f8b7195796 read-cache: fix leaking hashfile when writing index fails
 7:  03f699cf39 =  7:  762fb5aa73 submodule-config: fix leaking name enrty when traversing submodules
 8:  a34c90a552 !  8:  8fbd72a100 config: fix leaking comment character config
    @@ Commit message
         without free'ing the previous value. In fact, it can't easily free the
         value in the first place because it may contain a string constant.
     
    -    Refactor the code so that we initialize the value with another array.
    -    This allows us to free the value in case the string is not pointing to
    -    that constant array anymore.
    +    Refactor the code such that we track allocated comment character strings
    +    via a separate non-constant variable `comment_line_str_allocated`. Adapt
    +    sites that set `comment_line_str` to set both and free the old value
    +    that was stored in `comment_line_str_allocated`.
     
         This memory leak is being hit in t3404. As there are still other memory
         leaks in that file we cannot yet mark it as passing with leak checking
    @@ Commit message
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
    + ## builtin/commit.c ##
    +@@ builtin/commit.c: static void adjust_comment_line_char(const struct strbuf *sb)
    + 	const char *p;
    + 
    + 	if (!memchr(sb->buf, candidates[0], sb->len)) {
    +-		comment_line_str = xstrfmt("%c", candidates[0]);
    ++		free(comment_line_str_allocated);
    ++		comment_line_str = comment_line_str_allocated =
    ++			xstrfmt("%c", candidates[0]);
    + 		return;
    + 	}
    + 
    +@@ builtin/commit.c: static void adjust_comment_line_char(const struct strbuf *sb)
    + 	if (!*p)
    + 		die(_("unable to select a comment character that is not used\n"
    + 		      "in the current commit message"));
    +-	comment_line_str = xstrfmt("%c", *p);
    ++	free(comment_line_str_allocated);
    ++	comment_line_str = comment_line_str_allocated = xstrfmt("%c", *p);
    + }
    + 
    + static void prepare_amend_commit(struct commit *commit, struct strbuf *sb,
    +
      ## config.c ##
     @@ config.c: static int git_default_core_config(const char *var, const char *value,
      		else if (value[0]) {
      			if (strchr(value, '\n'))
      				return error(_("%s cannot contain newline"), var);
    -+			if (comment_line_str != comment_line_str_default)
    -+				free((char *) comment_line_str);
    - 			comment_line_str = xstrdup(value);
    +-			comment_line_str = xstrdup(value);
    ++			free(comment_line_str_allocated);
    ++			comment_line_str = comment_line_str_allocated =
    ++				xstrdup(value);
      			auto_comment_line_char = 0;
      		} else
    + 			return error(_("%s must have at least one character"), var);
     
      ## environment.c ##
     @@ environment.c: int protect_ntfs = PROTECT_NTFS_DEFAULT;
    -  * The character that begins a commented line in user-editable file
       * that is subject to stripspace.
       */
    --const char *comment_line_str = "#";
    -+const char comment_line_str_default[] = "#";
    -+const char *comment_line_str = comment_line_str_default;
    + const char *comment_line_str = "#";
    ++char *comment_line_str_allocated;
      int auto_comment_line_char;
      
      /* Parallel index stat data preload? */
     
      ## environment.h ##
     @@ environment.h: struct strvec;
    -  * The character that begins a commented line in user-editable file
       * that is subject to stripspace.
       */
    -+extern const char comment_line_str_default[];
      extern const char *comment_line_str;
    ++extern char *comment_line_str_allocated;
      extern int auto_comment_line_char;
      
    + /*
 9:  05290fc1f1 !  9:  e497b76e9c builtin/rebase: fix leaking `commit.gpgsign` value
    @@ Metadata
      ## Commit message ##
         builtin/rebase: fix leaking `commit.gpgsign` value
     
    -    In `get_replay_opts()`, we unconditionally override the `gpg_sign` field
    -    that already got populated by `sequencer_init_config()` in case the user
    -    has "commit.gpgsign" set in their config. It is kind of dubious whether
    -    this is the correct thing to do or a bug. What is clear though is that
    -    this creates a memory leak.
    +    In `get_replay_opts()`, we override the `gpg_sign` field that already
    +    got populated by `sequencer_init_config()` in case the user has
    +    "commit.gpgsign" set in their config. This creates a memory leak because
    +    we overwrite the previously assigned value, which may have already
    +    pointed to an allocated string.
     
    -    Let's mark this assignment with a TODO comment to figure out whether
    -    this needs to be fixed or not. Meanwhile though, let's plug the memory
    -    leak.
    +    Let's plug the memory leak by freeing the value before we overwrite it.
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
    @@ builtin/rebase.c: static struct replay_opts get_replay_opts(const struct rebase_
      	replay.committer_date_is_author_date =
      					opts->committer_date_is_author_date;
      	replay.ignore_date = opts->ignore_date;
    -+
    -+	/*
    -+	 * TODO: Is it really intentional that we unconditionally override
    -+	 * `replay.gpg_sign` even if it has already been initialized via the
    -+	 * configuration?
    -+	 */
     +	free(replay.gpg_sign);
      	replay.gpg_sign = xstrdup_or_null(opts->gpg_sign_opt);
    -+
      	replay.reflog_action = xstrdup(opts->reflog_action);
      	if (opts->strategy)
    - 		replay.strategy = xstrdup_or_null(opts->strategy);
     
      ## sequencer.c ##
     @@ sequencer.c: static int git_sequencer_config(const char *k, const char *v,
10:  4f5d490074 = 10:  c886b666f7 builtin/notes: fix leaking `struct notes_tree` when merging notes
11:  798b911f77 = 11:  d1c757157b builtin/fast-import: plug trivial memory leaks
12:  660732d29d = 12:  fa2d5c5d6b builtin/fast-export: fix leaking diff options
13:  64366155de = 13:  d9dd860d2a builtin/fast-export: plug leaking tag names
14:  b12015b3c3 = 14:  8f6860485e merge-ort: unconditionally release attributes index
15:  df4c21b49f ! 15:  ea6a350f31 sequencer: release todo list on error paths
    @@ sequencer.c: int sequencer_pick_revisions(struct repository *r,
      									&oid,
      									NULL);
     -				return error(_("%s: can't cherry-pick a %s"),
    +-					name, type_name(type));
     +				res = error(_("%s: can't cherry-pick a %s"),
    - 					name, type_name(type));
    ++					    name, type_name(type));
     +				goto out;
      			}
     -		} else
16:  1f8553fd43 = 16:  2755023742 unpack-trees: clear index when not propagating it
17:  c6db8df324 = 17:  edf6f148cd diff: fix leak when parsing invalid ignore regex option
18:  bf818a8a79 = 18:  343e3bd4df builtin/format-patch: fix various trivial memory leaks
19:  ef780aa360 = 19:  be2c5b0bca userdiff: fix leaking memory for configured diff drivers
20:  f3882986a3 = 20:  7888203833 builtin/log: fix leak when showing converted blob contents
21:  a49bb2e0cc = 21:  245fc30afb diff: free state populated via options
22:  fb52599404 = 22:  343ddcd17b builtin/diff: free symmetric diff members
-- 
2.46.0.46.g406f326d27.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 146+ messages in thread

* [PATCH v2 01/22] remote: plug memory leak when aliasing URLs
  2024-08-08 13:04 ` [PATCH v2 " Patrick Steinhardt
@ 2024-08-08 13:04   ` Patrick Steinhardt
  2024-08-12  8:27     ` karthik nayak
                       ` (2 more replies)
  2024-08-08 13:04   ` [PATCH v2 02/22] git: fix leaking system paths Patrick Steinhardt
                     ` (22 subsequent siblings)
  23 siblings, 3 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-08 13:04 UTC (permalink / raw)
  To: git; +Cc: James Liu, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 1931 bytes --]

When we have a `url.*.insteadOf` configuration, then we end up aliasing
URLs when populating remotes. One place where this happens is in
`alias_all_urls()`, where we loop through all remotes and then alias
each of their URLs. The actual aliasing logic is then contained in
`alias_url()`, which returns an allocated string that contains the new
URL. This URL replaces the old URL that we have in the strvec that
contains all remote URLs.

We replace the remote URLs via `strvec_replace()`, which does not hand
over ownership of the new string to the vector. Still, we didn't free
the aliased URL and thus have a memory leak here. Fix it by freeing the
aliased string.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
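For illustration only, the ownership rule described above can be sketched
without any of Git's headers. Everything here (`struct str_list`,
`str_list_replace()`, `rewrite_url()`, `rewrite_all()`) is a hypothetical
stand-in and not the real strvec or remote API; the only point is that a
container which copies its input leaves the temporary with the caller.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a string vector that copies what it is given. */
struct str_list {
	char **items;
	size_t nr;
};

/* Duplicate `s`; abort on allocation failure to keep the sketch short. */
static char *dup_or_die(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (!copy)
		abort();
	memcpy(copy, s, len);
	return copy;
}

/* Replace entry `idx` with a copy of `value`; the caller keeps `value`. */
static void str_list_replace(struct str_list *list, size_t idx,
			     const char *value)
{
	assert(idx < list->nr);
	free(list->items[idx]);
	list->items[idx] = dup_or_die(value);
}

/* Return a newly allocated rewritten URL, or NULL when nothing matches. */
static char *rewrite_url(const char *url, const char *from, const char *to)
{
	size_t from_len = strlen(from);
	char *out;

	if (strncmp(url, from, from_len))
		return NULL;
	out = malloc(strlen(to) + strlen(url + from_len) + 1);
	if (!out)
		abort();
	strcpy(out, to);
	strcat(out, url + from_len);
	return out;
}

/* Rewrite every entry; free each temporary because the list made a copy. */
void rewrite_all(struct str_list *list, const char *from, const char *to)
{
	for (size_t i = 0; i < list->nr; i++) {
		char *alias = rewrite_url(list->items[i], from, to);

		if (alias)
			str_list_replace(list, i, alias);
		free(alias); /* free(NULL) is a no-op */
	}
}
```

The unconditional `free(alias)` mirrors the shape of the fix: it is safe for
the no-match case because freeing a NULL pointer does nothing.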
 remote.c                 | 2 ++
 t/t0210-trace2-normal.sh | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/remote.c b/remote.c
index f43cf5e7a4..3b898edd23 100644
--- a/remote.c
+++ b/remote.c
@@ -499,6 +499,7 @@ static void alias_all_urls(struct remote_state *remote_state)
 			if (alias)
 				strvec_replace(&remote_state->remotes[i]->pushurl,
 					       j, alias);
+			free(alias);
 		}
 		add_pushurl_aliases = remote_state->remotes[i]->pushurl.nr == 0;
 		for (j = 0; j < remote_state->remotes[i]->url.nr; j++) {
@@ -512,6 +513,7 @@ static void alias_all_urls(struct remote_state *remote_state)
 			if (alias)
 				strvec_replace(&remote_state->remotes[i]->url,
 					       j, alias);
+			free(alias);
 		}
 	}
 }
diff --git a/t/t0210-trace2-normal.sh b/t/t0210-trace2-normal.sh
index c312657a12..b9adc94aab 100755
--- a/t/t0210-trace2-normal.sh
+++ b/t/t0210-trace2-normal.sh
@@ -2,7 +2,7 @@
 
 test_description='test trace2 facility (normal target)'
 
-TEST_PASSES_SANITIZE_LEAK=false
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 # Turn off any inherited trace2 settings for this test.
-- 
2.46.0.46.g406f326d27.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v2 02/22] git: fix leaking system paths
  2024-08-08 13:04 ` [PATCH v2 " Patrick Steinhardt
  2024-08-08 13:04   ` [PATCH v2 01/22] remote: plug memory leak when aliasing URLs Patrick Steinhardt
@ 2024-08-08 13:04   ` Patrick Steinhardt
  2024-08-12 14:11     ` Taylor Blau
  2024-08-08 13:04   ` [PATCH v2 03/22] object-file: fix memory leak when reading corrupted headers Patrick Steinhardt
                     ` (21 subsequent siblings)
  23 siblings, 1 reply; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-08 13:04 UTC (permalink / raw)
  To: git; +Cc: James Liu, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 1495 bytes --]

Git has some flags to make it output system paths as they have been
compiled into Git. This is done by calling `system_path()`, which
returns an allocated string. This string isn't ever free'd though,
creating a memory leak.

Plug those leaks. While they are surfaced by t0211, that test suite
exposes further memory leaks and thus does not yet pass with the memory
leak checker enabled.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
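As a rough sketch of the pattern, assuming a helper that returns a heap
string: `build_path()` and `print_path()` below are made-up names, not
Git functions. The returned string is bound to a local so that it can be
released once it has been printed.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: join a prefix and a relative path into a new string. */
char *build_path(const char *prefix, const char *rel)
{
	size_t len;
	char *path;

	assert(prefix && rel);
	len = strlen(prefix) + 1 + strlen(rel) + 1;
	path = malloc(len);
	if (!path)
		abort();
	snprintf(path, len, "%s/%s", prefix, rel);
	return path;
}

/* Print the path to `out` and release it; returns what fputs() reported. */
int print_path(FILE *out, const char *prefix, const char *rel)
{
	char *path = build_path(prefix, rel);
	int ret = fputs(path, out);

	free(path);
	return ret;
}
```

Passing the result of `build_path()` straight into a print call would lose
the only pointer to the allocation, which is the situation the patch
addresses.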
 git.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/git.c b/git.c
index e35af9b0e5..5eab88b472 100644
--- a/git.c
+++ b/git.c
@@ -173,15 +173,21 @@ static int handle_options(const char ***argv, int *argc, int *envchanged)
 				exit(0);
 			}
 		} else if (!strcmp(cmd, "--html-path")) {
-			puts(system_path(GIT_HTML_PATH));
+			char *path = system_path(GIT_HTML_PATH);
+			puts(path);
+			free(path);
 			trace2_cmd_name("_query_");
 			exit(0);
 		} else if (!strcmp(cmd, "--man-path")) {
-			puts(system_path(GIT_MAN_PATH));
+			char *path = system_path(GIT_MAN_PATH);
+			puts(path);
+			free(path);
 			trace2_cmd_name("_query_");
 			exit(0);
 		} else if (!strcmp(cmd, "--info-path")) {
-			puts(system_path(GIT_INFO_PATH));
+			char *path = system_path(GIT_INFO_PATH);
+			puts(path);
+			free(path);
 			trace2_cmd_name("_query_");
 			exit(0);
 		} else if (!strcmp(cmd, "-p") || !strcmp(cmd, "--paginate")) {
-- 
2.46.0.46.g406f326d27.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v2 03/22] object-file: fix memory leak when reading corrupted headers
  2024-08-08 13:04 ` [PATCH v2 " Patrick Steinhardt
  2024-08-08 13:04   ` [PATCH v2 01/22] remote: plug memory leak when aliasing URLs Patrick Steinhardt
  2024-08-08 13:04   ` [PATCH v2 02/22] git: fix leaking system paths Patrick Steinhardt
@ 2024-08-08 13:04   ` Patrick Steinhardt
  2024-08-12  8:43     ` karthik nayak
  2024-08-08 13:04   ` [PATCH v2 04/22] object-name: fix leaking symlink paths in object context Patrick Steinhardt
                     ` (20 subsequent siblings)
  23 siblings, 1 reply; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-08 13:04 UTC (permalink / raw)
  To: git; +Cc: James Liu, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 1400 bytes --]

When reading corrupt object headers in `read_loose_object()`, then we
bail out immediately. This causes a memory leak though because we would
have already initialized the zstream in `unpack_loose_header()`, and it
is the caller's responsibility to finish the zstream even on error. While
this feels weird, other callsites do it correctly already.

Fix this leak by ending the zstream even on errors. We may want to
revisit this interface in the future such that the callee handles this
for us already when there was an error.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
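The contract described above can be illustrated with a small stand-in
that does not use zlib at all. `struct stream`, `unpack_header()`,
`stream_end()` and the `live_windows` counter are all hypothetical; the
sketch only shows a callee that leaves the stream initialized on failure,
so that the caller has to end it on both paths.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stream that owns a buffer once it has been initialized. */
struct stream {
	char *window;
};

int live_windows; /* counts buffers that have not been released yet */

void stream_end(struct stream *s)
{
	if (s->window)
		live_windows--;
	free(s->window);
	s->window = NULL;
}

/*
 * Initialize the stream and check the header. On failure the stream stays
 * initialized, so the caller has to call stream_end() either way.
 */
int unpack_header(struct stream *s, const char *data)
{
	assert(s && data);
	s->window = malloc(64);
	if (!s->window)
		abort();
	live_windows++;
	return strncmp(data, "blob ", 5) ? -1 : 0;
}

int read_object(const char *data)
{
	struct stream s = { NULL };
	int ret = -1;

	if (unpack_header(&s, data) < 0) {
		stream_end(&s); /* the cleanup that the error path needs */
		goto out;
	}
	/* ... the body would be inflated here ... */
	stream_end(&s);
	ret = 0;
out:
	return ret;
}
```

An interface where `unpack_header()` cleaned up after itself on failure
would remove this trap, which is the possible follow-up the commit message
hints at.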
 object-file.c   | 1 +
 t/t1450-fsck.sh | 1 +
 2 files changed, 2 insertions(+)

diff --git a/object-file.c b/object-file.c
index 065103be3e..7c65c435cd 100644
--- a/object-file.c
+++ b/object-file.c
@@ -2954,6 +2954,7 @@ int read_loose_object(const char *path,
 	if (unpack_loose_header(&stream, map, mapsize, hdr, sizeof(hdr),
 				NULL) != ULHR_OK) {
 		error(_("unable to unpack header of %s"), path);
+		git_inflate_end(&stream);
 		goto out;
 	}
 
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index 8a456b1142..280cbf3e03 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -6,6 +6,7 @@ test_description='git fsck random collection of tests
 * (main) A
 '
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success setup '
-- 
2.46.0.46.g406f326d27.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v2 04/22] object-name: fix leaking symlink paths in object context
  2024-08-08 13:04 ` [PATCH v2 " Patrick Steinhardt
                     ` (2 preceding siblings ...)
  2024-08-08 13:04   ` [PATCH v2 03/22] object-file: fix memory leak when reading corrupted headers Patrick Steinhardt
@ 2024-08-08 13:04   ` Patrick Steinhardt
  2024-08-08 13:04   ` [PATCH v2 05/22] bulk-checkin: fix leaking state TODO Patrick Steinhardt
                     ` (19 subsequent siblings)
  23 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-08 13:04 UTC (permalink / raw)
  To: git; +Cc: James Liu, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 1015 bytes --]

The object context may be populated with symlink contents when reading a
symlink, but the associated strbuf doesn't ever get released when
releasing the object context, causing a memory leak. Plug it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
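A minimal sketch of the rule behind this fix, using hypothetical types
(`struct buf`, `struct ctx`) rather than Git's `strbuf` and
`object_context`: a release function should cover every member that owns
memory, including embedded buffers.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable buffer, standing in for an embedded string buffer. */
struct buf {
	char *data;
	size_t len;
};

/* Hypothetical context that owns both a path and an embedded buffer. */
struct ctx {
	char *path;
	struct buf symlink_path;
};

void buf_release(struct buf *b)
{
	free(b->data);
	b->data = NULL;
	b->len = 0;
}

/* Release every owned member; forgetting one of them leaks it. */
void ctx_release(struct ctx *c)
{
	assert(c);
	free(c->path);
	c->path = NULL;
	buf_release(&c->symlink_path);
}
```

Resetting the pointers to NULL is a choice made in this sketch so that a
second release is harmless; the patch itself only adds the missing release
call.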
 object-name.c       | 1 +
 t/t1006-cat-file.sh | 1 +
 2 files changed, 2 insertions(+)

diff --git a/object-name.c b/object-name.c
index 240a93e7ce..e39fa50e47 100644
--- a/object-name.c
+++ b/object-name.c
@@ -1765,6 +1765,7 @@ int strbuf_check_branch_ref(struct strbuf *sb, const char *name)
 void object_context_release(struct object_context *ctx)
 {
 	free(ctx->path);
+	strbuf_release(&ctx->symlink_path);
 }
 
 /*
diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index ff9bf213aa..d36cd7c086 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -2,6 +2,7 @@
 
 test_description='git cat-file'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_cmdmode_usage () {
-- 
2.46.0.46.g406f326d27.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v2 05/22] bulk-checkin: fix leaking state TODO
  2024-08-08 13:04 ` [PATCH v2 " Patrick Steinhardt
                     ` (3 preceding siblings ...)
  2024-08-08 13:04   ` [PATCH v2 04/22] object-name: fix leaking symlink paths in object context Patrick Steinhardt
@ 2024-08-08 13:04   ` Patrick Steinhardt
  2024-08-08 13:04   ` [PATCH v2 06/22] read-cache: fix leaking hashfile when writing index fails Patrick Steinhardt
                     ` (18 subsequent siblings)
  23 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-08 13:04 UTC (permalink / raw)
  To: git; +Cc: James Liu, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 2954 bytes --]

When flushing a bulk-checkin to disk we also reset the `struct
bulk_checkin_packfile` state. But while we free some of its members,
others aren't being free'd, leading to memory leaks:

  - The temporary packfile name is not getting freed.

  - The `struct hashfile` only gets freed in case we end up calling
    `finalize_hashfile()`. There are code paths though where that is not
    the case, namely when nothing has been written. For this, we need to
    make `free_hashfile()` public.

Fix those leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
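The two leaks listed above can be sketched as follows. The names
(`struct writer`, `struct state`, `writer_finalize()`, `state_flush()`)
are invented for this note and only loosely mirror the real code: one
function frees the writer without flushing, another consumes it while
finalizing, and the shared exit label frees what is left before the state
is zeroed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical writer that owns an internal buffer. */
struct writer {
	char *buffer;
};

/* Hypothetical flush state, loosely modelled on the description above. */
struct state {
	struct writer *w;
	char *tmp_name;
	char **written;
	size_t nr_written;
};

/* Free the writer without flushing anything. */
void writer_free(struct writer *w)
{
	if (!w)
		return;
	free(w->buffer);
	free(w);
}

/* Finalizing would flush to disk (elided here) and then frees the writer. */
void writer_finalize(struct writer *w)
{
	writer_free(w);
}

void state_flush(struct state *s)
{
	size_t i;

	assert(s);
	if (!s->w)
		return;
	if (s->nr_written == 0) {
		writer_free(s->w); /* nothing was written, so no finalize */
		goto clear_exit;
	}
	writer_finalize(s->w);
	for (i = 0; i < s->nr_written; i++)
		free(s->written[i]);

clear_exit:
	free(s->tmp_name); /* must happen before the memset below */
	free(s->written);
	memset(s, 0, sizeof(*s));
}
```

Zeroing the struct after freeing its members is what makes a missed member
easy to overlook: the pointer disappears together with the only chance to
free it.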
 bulk-checkin.c   |  2 ++
 csum-file.c      |  2 +-
 csum-file.h      | 10 ++++++++++
 t/t1050-large.sh |  1 +
 4 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/bulk-checkin.c b/bulk-checkin.c
index da8673199b..9089c214fa 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -61,6 +61,7 @@ static void flush_bulk_checkin_packfile(struct bulk_checkin_packfile *state)
 
 	if (state->nr_written == 0) {
 		close(state->f->fd);
+		free_hashfile(state->f);
 		unlink(state->pack_tmp_name);
 		goto clear_exit;
 	} else if (state->nr_written == 1) {
@@ -83,6 +84,7 @@ static void flush_bulk_checkin_packfile(struct bulk_checkin_packfile *state)
 		free(state->written[i]);
 
 clear_exit:
+	free(state->pack_tmp_name);
 	free(state->written);
 	memset(state, 0, sizeof(*state));
 
diff --git a/csum-file.c b/csum-file.c
index 8abbf01325..7e0ece1305 100644
--- a/csum-file.c
+++ b/csum-file.c
@@ -56,7 +56,7 @@ void hashflush(struct hashfile *f)
 	}
 }
 
-static void free_hashfile(struct hashfile *f)
+void free_hashfile(struct hashfile *f)
 {
 	free(f->buffer);
 	free(f->check_buffer);
diff --git a/csum-file.h b/csum-file.h
index 566e05cbd2..ca553eba17 100644
--- a/csum-file.h
+++ b/csum-file.h
@@ -46,6 +46,16 @@ int hashfile_truncate(struct hashfile *, struct hashfile_checkpoint *);
 struct hashfile *hashfd(int fd, const char *name);
 struct hashfile *hashfd_check(const char *name);
 struct hashfile *hashfd_throughput(int fd, const char *name, struct progress *tp);
+
+/*
+ * Free the hashfile without flushing its contents to disk. This only
+ * needs to be called when not calling `finalize_hashfile()`.
+ */
+void free_hashfile(struct hashfile *f);
+
+/*
+ * Finalize the hashfile by flushing data to disk and free'ing it.
+ */
 int finalize_hashfile(struct hashfile *, unsigned char *, enum fsync_component, unsigned int);
 void hashwrite(struct hashfile *, const void *, unsigned int);
 void hashflush(struct hashfile *f);
diff --git a/t/t1050-large.sh b/t/t1050-large.sh
index c71932b024..ed638f6644 100755
--- a/t/t1050-large.sh
+++ b/t/t1050-large.sh
@@ -3,6 +3,7 @@
 
 test_description='adding and checking out large blobs'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'core.bigFileThreshold must be non-negative' '
-- 
2.46.0.46.g406f326d27.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v2 06/22] read-cache: fix leaking hashfile when writing index fails
  2024-08-08 13:04 ` [PATCH v2 " Patrick Steinhardt
                     ` (4 preceding siblings ...)
  2024-08-08 13:04   ` [PATCH v2 05/22] bulk-checkin: fix leaking state TODO Patrick Steinhardt
@ 2024-08-08 13:04   ` Patrick Steinhardt
  2024-08-08 13:05   ` [PATCH v2 07/22] submodule-config: fix leaking name enrty when traversing submodules Patrick Steinhardt
                     ` (17 subsequent siblings)
  23 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-08 13:04 UTC (permalink / raw)
  To: git; +Cc: James Liu, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 7458 bytes --]

In `do_write_index()`, we use a `struct hashfile` to write the index
with a trailer hash. In case the write fails though, we never clean up
the allocated `hashfile` state and thus leak memory.

Refactor the code to have a common exit path where we can free this and
other allocated memory. While at it, refactor our use of `strbuf`s such
that we reuse the same buffer to avoid some unneeded allocations.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
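To illustrate the shape of the refactoring without quoting the real
function, here is a small sketch under assumed names (`struct writer`,
`write_all()`, `live_writers`). It shows the three ingredients: a single
exit label, one scratch buffer that is reused across sections, and a
writer pointer that is set to NULL once finalizing has consumed it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical writer; `written` counts the bytes handed to it. */
struct writer {
	size_t written;
};

int live_writers; /* counts writers that have not been freed yet */

static struct writer *writer_new(void)
{
	struct writer *w = calloc(1, sizeof(*w));

	if (!w)
		abort();
	live_writers++;
	return w;
}

static void writer_free(struct writer *w)
{
	free(w);
	live_writers--;
}

/* Finalizing consumes the writer and reports the total byte count. */
static size_t writer_finalize(struct writer *w)
{
	size_t total = w->written;

	writer_free(w);
	return total;
}

/* Write each section; a NULL section simulates a failed write. */
int write_all(const char **sections, size_t nr, size_t *total)
{
	struct writer *w = writer_new();
	char *scratch = NULL; /* reused for every section */
	size_t scratch_alloc = 0;
	size_t i;
	int ret;

	assert(sections && total);
	for (i = 0; i < nr; i++) {
		size_t len;

		if (!sections[i]) {
			ret = -1;
			goto out;
		}
		len = strlen(sections[i]);
		if (len + 1 > scratch_alloc) {
			char *grown = realloc(scratch, len + 1);

			if (!grown)
				abort();
			scratch = grown;
			scratch_alloc = len + 1;
		}
		memcpy(scratch, sections[i], len + 1);
		w->written += len;
	}

	*total = writer_finalize(w);
	w = NULL; /* already freed by finalizing */
	ret = 0;

out:
	if (w)
		writer_free(w);
	free(scratch);
	return ret;
}
```

Clearing `w` after finalizing keeps the exit path from freeing it a second
time, which is the role `f = NULL` plays in the diff below.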
 read-cache.c                       | 97 ++++++++++++++++++------------
 t/t1601-index-bogus.sh             |  2 +
 t/t2107-update-index-basic.sh      |  1 +
 t/t7008-filter-branch-null-sha1.sh |  1 +
 4 files changed, 62 insertions(+), 39 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index 48bf24f87c..36821fe5b5 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2840,8 +2840,9 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 	int csum_fsync_flag;
 	int ieot_entries = 1;
 	struct index_entry_offset_table *ieot = NULL;
-	int nr, nr_threads;
 	struct repository *r = istate->repo;
+	struct strbuf sb = STRBUF_INIT;
+	int nr, nr_threads, ret;
 
 	f = hashfd(tempfile->fd, tempfile->filename.buf);
 
@@ -2962,8 +2963,8 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 	strbuf_release(&previous_name_buf);
 
 	if (err) {
-		free(ieot);
-		return err;
+		ret = err;
+		goto out;
 	}
 
 	offset = hashfile_total(f);
@@ -2985,20 +2986,20 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 	 * index.
 	 */
 	if (ieot) {
-		struct strbuf sb = STRBUF_INIT;
+		strbuf_reset(&sb);
 
 		write_ieot_extension(&sb, ieot);
 		err = write_index_ext_header(f, eoie_c, CACHE_EXT_INDEXENTRYOFFSETTABLE, sb.len) < 0;
 		hashwrite(f, sb.buf, sb.len);
-		strbuf_release(&sb);
-		free(ieot);
-		if (err)
-			return -1;
+		if (err) {
+			ret = -1;
+			goto out;
+		}
 	}
 
 	if (write_extensions & WRITE_SPLIT_INDEX_EXTENSION &&
 	    istate->split_index) {
-		struct strbuf sb = STRBUF_INIT;
+		strbuf_reset(&sb);
 
 		if (istate->sparse_index)
 			die(_("cannot write split index for a sparse index"));
@@ -3007,59 +3008,66 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 			write_index_ext_header(f, eoie_c, CACHE_EXT_LINK,
 					       sb.len) < 0;
 		hashwrite(f, sb.buf, sb.len);
-		strbuf_release(&sb);
-		if (err)
-			return -1;
+		if (err) {
+			ret = -1;
+			goto out;
+		}
 	}
 	if (write_extensions & WRITE_CACHE_TREE_EXTENSION &&
 	    !drop_cache_tree && istate->cache_tree) {
-		struct strbuf sb = STRBUF_INIT;
+		strbuf_reset(&sb);
 
 		cache_tree_write(&sb, istate->cache_tree);
 		err = write_index_ext_header(f, eoie_c, CACHE_EXT_TREE, sb.len) < 0;
 		hashwrite(f, sb.buf, sb.len);
-		strbuf_release(&sb);
-		if (err)
-			return -1;
+		if (err) {
+			ret = -1;
+			goto out;
+		}
 	}
 	if (write_extensions & WRITE_RESOLVE_UNDO_EXTENSION &&
 	    istate->resolve_undo) {
-		struct strbuf sb = STRBUF_INIT;
+		strbuf_reset(&sb);
 
 		resolve_undo_write(&sb, istate->resolve_undo);
 		err = write_index_ext_header(f, eoie_c, CACHE_EXT_RESOLVE_UNDO,
 					     sb.len) < 0;
 		hashwrite(f, sb.buf, sb.len);
-		strbuf_release(&sb);
-		if (err)
-			return -1;
+		if (err) {
+			ret = -1;
+			goto out;
+		}
 	}
 	if (write_extensions & WRITE_UNTRACKED_CACHE_EXTENSION &&
 	    istate->untracked) {
-		struct strbuf sb = STRBUF_INIT;
+		strbuf_reset(&sb);
 
 		write_untracked_extension(&sb, istate->untracked);
 		err = write_index_ext_header(f, eoie_c, CACHE_EXT_UNTRACKED,
 					     sb.len) < 0;
 		hashwrite(f, sb.buf, sb.len);
-		strbuf_release(&sb);
-		if (err)
-			return -1;
+		if (err) {
+			ret = -1;
+			goto out;
+		}
 	}
 	if (write_extensions & WRITE_FSMONITOR_EXTENSION &&
 	    istate->fsmonitor_last_update) {
-		struct strbuf sb = STRBUF_INIT;
+		strbuf_reset(&sb);
 
 		write_fsmonitor_extension(&sb, istate);
 		err = write_index_ext_header(f, eoie_c, CACHE_EXT_FSMONITOR, sb.len) < 0;
 		hashwrite(f, sb.buf, sb.len);
-		strbuf_release(&sb);
-		if (err)
-			return -1;
+		if (err) {
+			ret = -1;
+			goto out;
+		}
 	}
 	if (istate->sparse_index) {
-		if (write_index_ext_header(f, eoie_c, CACHE_EXT_SPARSE_DIRECTORIES, 0) < 0)
-			return -1;
+		if (write_index_ext_header(f, eoie_c, CACHE_EXT_SPARSE_DIRECTORIES, 0) < 0) {
+			ret = -1;
+			goto out;
+		}
 	}
 
 	/*
@@ -3069,14 +3077,15 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 	 * when loading the shared index.
 	 */
 	if (eoie_c) {
-		struct strbuf sb = STRBUF_INIT;
+		strbuf_reset(&sb);
 
 		write_eoie_extension(&sb, eoie_c, offset);
 		err = write_index_ext_header(f, NULL, CACHE_EXT_ENDOFINDEXENTRIES, sb.len) < 0;
 		hashwrite(f, sb.buf, sb.len);
-		strbuf_release(&sb);
-		if (err)
-			return -1;
+		if (err) {
+			ret = -1;
+			goto out;
+		}
 	}
 
 	csum_fsync_flag = 0;
@@ -3085,13 +3094,16 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 
 	finalize_hashfile(f, istate->oid.hash, FSYNC_COMPONENT_INDEX,
 			  CSUM_HASH_IN_STREAM | csum_fsync_flag);
+	f = NULL;
 
 	if (close_tempfile_gently(tempfile)) {
-		error(_("could not close '%s'"), get_tempfile_path(tempfile));
-		return -1;
+		ret = error(_("could not close '%s'"), get_tempfile_path(tempfile));
+		goto out;
+	}
+	if (stat(get_tempfile_path(tempfile), &st)) {
+		ret = -1;
+		goto out;
 	}
-	if (stat(get_tempfile_path(tempfile), &st))
-		return -1;
 	istate->timestamp.sec = (unsigned int)st.st_mtime;
 	istate->timestamp.nsec = ST_MTIME_NSEC(st);
 	trace_performance_since(start, "write index, changed mask = %x", istate->cache_changed);
@@ -3105,7 +3117,14 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 	trace2_data_intmax("index", the_repository, "write/cache_nr",
 			   istate->cache_nr);
 
-	return 0;
+	ret = 0;
+
+out:
+	if (f)
+		free_hashfile(f);
+	strbuf_release(&sb);
+	free(ieot);
+	return ret;
 }
 
 void set_alternate_index_output(const char *name)
diff --git a/t/t1601-index-bogus.sh b/t/t1601-index-bogus.sh
index 4171f1e141..5dcc101882 100755
--- a/t/t1601-index-bogus.sh
+++ b/t/t1601-index-bogus.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='test handling of bogus index entries'
+
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'create tree with null sha1' '
diff --git a/t/t2107-update-index-basic.sh b/t/t2107-update-index-basic.sh
index cc72ead79f..f0eab13f96 100755
--- a/t/t2107-update-index-basic.sh
+++ b/t/t2107-update-index-basic.sh
@@ -5,6 +5,7 @@ test_description='basic update-index tests
 Tests for command-line parsing and basic operation.
 '
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'update-index --nonsense fails' '
diff --git a/t/t7008-filter-branch-null-sha1.sh b/t/t7008-filter-branch-null-sha1.sh
index 93fbc92b8d..0ce8fd2c89 100755
--- a/t/t7008-filter-branch-null-sha1.sh
+++ b/t/t7008-filter-branch-null-sha1.sh
@@ -2,6 +2,7 @@
 
 test_description='filter-branch removal of trees with null sha1'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'setup: base commits' '
-- 
2.46.0.46.g406f326d27.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v2 07/22] submodule-config: fix leaking name enrty when traversing submodules
  2024-08-08 13:04 ` [PATCH v2 " Patrick Steinhardt
                     ` (5 preceding siblings ...)
  2024-08-08 13:04   ` [PATCH v2 06/22] read-cache: fix leaking hashfile when writing index fails Patrick Steinhardt
@ 2024-08-08 13:05   ` Patrick Steinhardt
  2024-08-08 13:05   ` [PATCH v2 08/22] config: fix leaking comment character config Patrick Steinhardt
                     ` (16 subsequent siblings)
  23 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-08 13:05 UTC (permalink / raw)
  To: git; +Cc: James Liu, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 2376 bytes --]

We traverse through submodules in the tree via `tree_entry()`, passing
to it a `struct name_entry` that it is supposed to populate with the
tree entry's contents. We unnecessarily allocate this variable instead
of passing a variable that is allocated on the stack, and then ultimately
don't even free that variable. This is unnecessary and leaks memory.

Convert the variable to instead be allocated on the stack to plug the
memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
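A short sketch of the idea, with made-up types (`struct entry`,
`struct cursor`) in place of the tree-walking API: when a callee only
fills in a caller-provided struct, the caller can hand it the address of
a local variable and has nothing to free afterwards.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tree entry and cursor; not the real tree-walk structures. */
struct entry {
	const char *path;
	unsigned mode;
};

struct cursor {
	const struct entry *items;
	size_t nr, pos;
};

/* Fill `*out` with the next entry; return 0 once the cursor is exhausted. */
int next_entry(struct cursor *c, struct entry *out)
{
	assert(c && out);
	if (c->pos >= c->nr)
		return 0;
	*out = c->items[c->pos++];
	return 1;
}

/* Count directory entries using a stack-allocated entry. */
size_t count_dirs(struct cursor *c)
{
	struct entry e; /* lives on the stack, so there is nothing to free */
	size_t n = 0;

	while (next_entry(c, &e))
		if (e.mode == 040000)
			n++;
	return n;
}
```

Entries that need to outlive the loop still have to be copied, which is
why the real code keeps allocating `st_entry->name_entry` and copies the
struct into it by value.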
 submodule-config.c | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/submodule-config.c b/submodule-config.c
index 9b0bb0b9f4..c8f2bb2bdd 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -899,27 +899,25 @@ static void traverse_tree_submodules(struct repository *r,
 {
 	struct tree_desc tree;
 	struct submodule_tree_entry *st_entry;
-	struct name_entry *name_entry;
+	struct name_entry name_entry;
 	char *tree_path = NULL;
 
-	name_entry = xmalloc(sizeof(*name_entry));
-
 	fill_tree_descriptor(r, &tree, treeish_name);
-	while (tree_entry(&tree, name_entry)) {
+	while (tree_entry(&tree, &name_entry)) {
 		if (prefix)
 			tree_path =
-				mkpathdup("%s/%s", prefix, name_entry->path);
+				mkpathdup("%s/%s", prefix, name_entry.path);
 		else
-			tree_path = xstrdup(name_entry->path);
+			tree_path = xstrdup(name_entry.path);
 
-		if (S_ISGITLINK(name_entry->mode) &&
+		if (S_ISGITLINK(name_entry.mode) &&
 		    is_tree_submodule_active(r, root_tree, tree_path)) {
 			ALLOC_GROW(out->entries, out->entry_nr + 1,
 				   out->entry_alloc);
 			st_entry = &out->entries[out->entry_nr++];
 
 			st_entry->name_entry = xmalloc(sizeof(*st_entry->name_entry));
-			*st_entry->name_entry = *name_entry;
+			*st_entry->name_entry = name_entry;
 			st_entry->submodule =
 				submodule_from_path(r, root_tree, tree_path);
 			st_entry->repo = xmalloc(sizeof(*st_entry->repo));
@@ -927,9 +925,9 @@ static void traverse_tree_submodules(struct repository *r,
 						root_tree))
 				FREE_AND_NULL(st_entry->repo);
 
-		} else if (S_ISDIR(name_entry->mode))
+		} else if (S_ISDIR(name_entry.mode))
 			traverse_tree_submodules(r, root_tree, tree_path,
-						 &name_entry->oid, out);
+						 &name_entry.oid, out);
 		free(tree_path);
 	}
 }
-- 
2.46.0.46.g406f326d27.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v2 08/22] config: fix leaking comment character config
  2024-08-08 13:04 ` [PATCH v2 " Patrick Steinhardt
                     ` (6 preceding siblings ...)
  2024-08-08 13:05   ` [PATCH v2 07/22] submodule-config: fix leaking name enrty when traversing submodules Patrick Steinhardt
@ 2024-08-08 13:05   ` Patrick Steinhardt
  2024-08-08 17:12     ` Junio C Hamano
  2024-08-08 13:05   ` [PATCH v2 09/22] builtin/rebase: fix leaking `commit.gpgsign` value Patrick Steinhardt
                     ` (15 subsequent siblings)
  23 siblings, 1 reply; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-08 13:05 UTC (permalink / raw)
  To: git; +Cc: James Liu, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 3239 bytes --]

When the comment line character has been specified multiple times in the
configuration, then `git_default_core_config()` will cause a memory leak
because it unconditionally copies the string into `comment_line_str`
without free'ing the previous value. In fact, it can't easily free the
value in the first place because it may contain a string constant.

Refactor the code such that we track allocated comment character strings
via a separate non-constant variable `comment_line_str_allocated`. Adapt
sites that set `comment_line_str` to set both and free the old value
that was stored in `comment_line_str_allocated`.

This memory leak is being hit in t3404. As there are still other memory
leaks in that file we cannot yet mark it as passing with leak checking
enabled.
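
As a minimal sketch of the tracking pattern described above, assuming
hypothetical names (`comment_str`, `comment_str_allocated` and
`set_comment_str()` are illustrative stand-ins, not Git's actual code),
the read-only pointer may refer to a string constant while a second,
non-constant pointer records what we own:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the globals; not Git's actual definitions. */
static const char *comment_str = "#";  /* may point at a string constant */
static char *comment_str_allocated;    /* owns heap memory, or is NULL */

/* Replace the setting, freeing any previously allocated value. */
static void set_comment_str(const char *value)
{
	/* Copy first so that `value` may alias the old allocation. */
	char *copy = malloc(strlen(value) + 1);
	if (!copy)
		abort();
	strcpy(copy, value);

	free(comment_str_allocated);  /* free(NULL) is a no-op */
	comment_str = comment_str_allocated = copy;
}
```

Readers keep using the constant pointer, and only the setter ever
touches the allocated one, so the initial string constant is never
passed to `free()`.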

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/commit.c | 7 +++++--
 config.c         | 4 +++-
 environment.c    | 1 +
 environment.h    | 1 +
 4 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index 66427ba82d..025b1c4686 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -684,7 +684,9 @@ static void adjust_comment_line_char(const struct strbuf *sb)
 	const char *p;
 
 	if (!memchr(sb->buf, candidates[0], sb->len)) {
-		comment_line_str = xstrfmt("%c", candidates[0]);
+		free(comment_line_str_allocated);
+		comment_line_str = comment_line_str_allocated =
+			xstrfmt("%c", candidates[0]);
 		return;
 	}
 
@@ -705,7 +707,8 @@ static void adjust_comment_line_char(const struct strbuf *sb)
 	if (!*p)
 		die(_("unable to select a comment character that is not used\n"
 		      "in the current commit message"));
-	comment_line_str = xstrfmt("%c", *p);
+	free(comment_line_str_allocated);
+	comment_line_str = comment_line_str_allocated = xstrfmt("%c", *p);
 }
 
 static void prepare_amend_commit(struct commit *commit, struct strbuf *sb,
diff --git a/config.c b/config.c
index 6421894614..cb78b652ee 100644
--- a/config.c
+++ b/config.c
@@ -1596,7 +1596,9 @@ static int git_default_core_config(const char *var, const char *value,
 		else if (value[0]) {
 			if (strchr(value, '\n'))
 				return error(_("%s cannot contain newline"), var);
-			comment_line_str = xstrdup(value);
+			free(comment_line_str_allocated);
+			comment_line_str = comment_line_str_allocated =
+				xstrdup(value);
 			auto_comment_line_char = 0;
 		} else
 			return error(_("%s must have at least one character"), var);
diff --git a/environment.c b/environment.c
index 5cea2c9f54..1a95798d5f 100644
--- a/environment.c
+++ b/environment.c
@@ -114,6 +114,7 @@ int protect_ntfs = PROTECT_NTFS_DEFAULT;
  * that is subject to stripspace.
  */
 const char *comment_line_str = "#";
+char *comment_line_str_allocated;
 int auto_comment_line_char;
 
 /* Parallel index stat data preload? */
diff --git a/environment.h b/environment.h
index e9f01d4d11..0e0906f125 100644
--- a/environment.h
+++ b/environment.h
@@ -9,6 +9,7 @@ struct strvec;
  * that is subject to stripspace.
  */
 extern const char *comment_line_str;
+extern char *comment_line_str_allocated;
 extern int auto_comment_line_char;
 
 /*
-- 
2.46.0.46.g406f326d27.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v2 09/22] builtin/rebase: fix leaking `commit.gpgsign` value
  2024-08-08 13:04 ` [PATCH v2 " Patrick Steinhardt
                     ` (7 preceding siblings ...)
  2024-08-08 13:05   ` [PATCH v2 08/22] config: fix leaking comment character config Patrick Steinhardt
@ 2024-08-08 13:05   ` Patrick Steinhardt
  2024-08-08 13:05   ` [PATCH v2 10/22] builtin/notes: fix leaking `struct notes_tree` when merging notes Patrick Steinhardt
                     ` (14 subsequent siblings)
  23 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-08 13:05 UTC (permalink / raw)
  To: git; +Cc: James Liu, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 2767 bytes --]

In `get_replay_opts()`, we override the `gpg_sign` field that already
got populated by `sequencer_init_config()` in case the user has
"commit.gpgsign" set in their config. This creates a memory leak because
we overwrite the previously assigned value, which may have already
pointed to an allocated string.

Let's plug the memory leak by freeing the value before we overwrite it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/rebase.c              | 1 +
 sequencer.c                   | 1 +
 t/t3404-rebase-interactive.sh | 1 +
 t/t3435-rebase-gpg-sign.sh    | 1 +
 t/t7030-verify-tag.sh         | 1 +
 5 files changed, 5 insertions(+)

diff --git a/builtin/rebase.c b/builtin/rebase.c
index e3a8e74cfc..2f01d5d3a6 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -186,6 +186,7 @@ static struct replay_opts get_replay_opts(const struct rebase_options *opts)
 	replay.committer_date_is_author_date =
 					opts->committer_date_is_author_date;
 	replay.ignore_date = opts->ignore_date;
+	free(replay.gpg_sign);
 	replay.gpg_sign = xstrdup_or_null(opts->gpg_sign_opt);
 	replay.reflog_action = xstrdup(opts->reflog_action);
 	if (opts->strategy)
diff --git a/sequencer.c b/sequencer.c
index 0291920f0b..cade9b0ca8 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -303,6 +303,7 @@ static int git_sequencer_config(const char *k, const char *v,
 	}
 
 	if (!strcmp(k, "commit.gpgsign")) {
+		free(opts->gpg_sign);
 		opts->gpg_sign = git_config_bool(k, v) ? xstrdup("") : NULL;
 		return 0;
 	}
diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
index f92baad138..f171af3061 100755
--- a/t/t3404-rebase-interactive.sh
+++ b/t/t3404-rebase-interactive.sh
@@ -26,6 +26,7 @@ Initial setup:
  touch file "conflict".
 '
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 . "$TEST_DIRECTORY"/lib-rebase.sh
diff --git a/t/t3435-rebase-gpg-sign.sh b/t/t3435-rebase-gpg-sign.sh
index 6aa2aeb628..6e329fea7c 100755
--- a/t/t3435-rebase-gpg-sign.sh
+++ b/t/t3435-rebase-gpg-sign.sh
@@ -8,6 +8,7 @@ test_description='test rebase --[no-]gpg-sign'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 . "$TEST_DIRECTORY/lib-rebase.sh"
 . "$TEST_DIRECTORY/lib-gpg.sh"
diff --git a/t/t7030-verify-tag.sh b/t/t7030-verify-tag.sh
index 6f526c37c2..effa826744 100755
--- a/t/t7030-verify-tag.sh
+++ b/t/t7030-verify-tag.sh
@@ -4,6 +4,7 @@ test_description='signed tag tests'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 . "$TEST_DIRECTORY/lib-gpg.sh"
 
-- 
2.46.0.46.g406f326d27.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v2 10/22] builtin/notes: fix leaking `struct notes_tree` when merging notes
  2024-08-08 13:04 ` [PATCH v2 " Patrick Steinhardt
                     ` (8 preceding siblings ...)
  2024-08-08 13:05   ` [PATCH v2 09/22] builtin/rebase: fix leaking `commit.gpgsign` value Patrick Steinhardt
@ 2024-08-08 13:05   ` Patrick Steinhardt
  2024-08-08 13:05   ` [PATCH v2 11/22] builtin/fast-import: plug trivial memory leaks Patrick Steinhardt
                     ` (13 subsequent siblings)
  23 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-08 13:05 UTC (permalink / raw)
  To: git; +Cc: James Liu, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 2970 bytes --]

We allocate a `struct notes_tree` in `merge_commit()` which we then
initialize via `init_notes()`. It's not really necessary to allocate the
structure though given that we never pass ownership to the caller.
Furthermore, the allocation leads to a memory leak because despite its
name, `free_notes()` doesn't free the `notes_tree` but only clears it.

Fix this issue by converting the code to use an on-stack variable.
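
A minimal sketch of why the on-stack variant cannot leak the container,
assuming a hypothetical `struct tree_state` whose clear function, like
`free_notes()`, releases only the members and not the structure itself:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical type; only its init/clear shape mirrors the case above. */
struct tree_state {
	char *ref;
};

static void tree_state_init(struct tree_state *t, const char *ref)
{
	t->ref = malloc(strlen(ref) + 1);
	if (!t->ref)
		abort();
	strcpy(t->ref, ref);
}

/* Clears the members but does not free `t` itself. */
static void tree_state_clear(struct tree_state *t)
{
	free(t->ref);
	t->ref = NULL;
}

static size_t use_tree_state(void)
{
	struct tree_state t = {0};  /* on the stack: no container to free */
	size_t len;

	tree_state_init(&t, "NOTES_MERGE_PARTIAL");
	len = strlen(t.ref);
	tree_state_clear(&t);
	return len;
}
```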

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/notes.c                       | 9 ++++-----
 t/t3310-notes-merge-manual-resolve.sh | 1 +
 t/t3311-notes-merge-fanout.sh         | 1 +
 3 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/builtin/notes.c b/builtin/notes.c
index d9c356e354..81cbaeec6b 100644
--- a/builtin/notes.c
+++ b/builtin/notes.c
@@ -807,7 +807,7 @@ static int merge_commit(struct notes_merge_options *o)
 {
 	struct strbuf msg = STRBUF_INIT;
 	struct object_id oid, parent_oid;
-	struct notes_tree *t;
+	struct notes_tree t = {0};
 	struct commit *partial;
 	struct pretty_print_context pretty_ctx;
 	void *local_ref_to_free;
@@ -830,8 +830,7 @@ static int merge_commit(struct notes_merge_options *o)
 	else
 		oidclr(&parent_oid, the_repository->hash_algo);
 
-	CALLOC_ARRAY(t, 1);
-	init_notes(t, "NOTES_MERGE_PARTIAL", combine_notes_overwrite, 0);
+	init_notes(&t, "NOTES_MERGE_PARTIAL", combine_notes_overwrite, 0);
 
 	o->local_ref = local_ref_to_free =
 		refs_resolve_refdup(get_main_ref_store(the_repository),
@@ -839,7 +838,7 @@ static int merge_commit(struct notes_merge_options *o)
 	if (!o->local_ref)
 		die(_("failed to resolve NOTES_MERGE_REF"));
 
-	if (notes_merge_commit(o, t, partial, &oid))
+	if (notes_merge_commit(o, &t, partial, &oid))
 		die(_("failed to finalize notes merge"));
 
 	/* Reuse existing commit message in reflog message */
@@ -853,7 +852,7 @@ static int merge_commit(struct notes_merge_options *o)
 			is_null_oid(&parent_oid) ? NULL : &parent_oid,
 			0, UPDATE_REFS_DIE_ON_ERR);
 
-	free_notes(t);
+	free_notes(&t);
 	strbuf_release(&msg);
 	ret = merge_abort(o);
 	free(local_ref_to_free);
diff --git a/t/t3310-notes-merge-manual-resolve.sh b/t/t3310-notes-merge-manual-resolve.sh
index 597df5ebc0..04866b89be 100755
--- a/t/t3310-notes-merge-manual-resolve.sh
+++ b/t/t3310-notes-merge-manual-resolve.sh
@@ -5,6 +5,7 @@
 
 test_description='Test notes merging with manual conflict resolution'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 # Set up a notes merge scenario with different kinds of conflicts
diff --git a/t/t3311-notes-merge-fanout.sh b/t/t3311-notes-merge-fanout.sh
index 5b675417e9..ce4144db0f 100755
--- a/t/t3311-notes-merge-fanout.sh
+++ b/t/t3311-notes-merge-fanout.sh
@@ -5,6 +5,7 @@
 
 test_description='Test notes merging at various fanout levels'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 verify_notes () {
-- 
2.46.0.46.g406f326d27.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v2 11/22] builtin/fast-import: plug trivial memory leaks
  2024-08-08 13:04 ` [PATCH v2 " Patrick Steinhardt
                     ` (9 preceding siblings ...)
  2024-08-08 13:05   ` [PATCH v2 10/22] builtin/notes: fix leaking `struct notes_tree` when merging notes Patrick Steinhardt
@ 2024-08-08 13:05   ` Patrick Steinhardt
  2024-08-08 13:05   ` [PATCH v2 12/22] builtin/fast-export: fix leaking diff options Patrick Steinhardt
                     ` (12 subsequent siblings)
  23 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-08 13:05 UTC (permalink / raw)
  To: git; +Cc: James Liu, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 2493 bytes --]

Plug some trivial memory leaks in git-fast-import(1).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/fast-import.c        | 8 ++++++--
 t/t9300-fast-import.sh       | 1 +
 t/t9304-fast-import-marks.sh | 2 ++
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/builtin/fast-import.c b/builtin/fast-import.c
index d21c4053a7..6dfeb01665 100644
--- a/builtin/fast-import.c
+++ b/builtin/fast-import.c
@@ -206,8 +206,8 @@ static unsigned int object_entry_alloc = 5000;
 static struct object_entry_pool *blocks;
 static struct hashmap object_table;
 static struct mark_set *marks;
-static const char *export_marks_file;
-static const char *import_marks_file;
+static char *export_marks_file;
+static char *import_marks_file;
 static int import_marks_file_from_stream;
 static int import_marks_file_ignore_missing;
 static int import_marks_file_done;
@@ -3274,6 +3274,7 @@ static void option_import_marks(const char *marks,
 			read_marks();
 	}
 
+	free(import_marks_file);
 	import_marks_file = make_fast_import_path(marks);
 	import_marks_file_from_stream = from_stream;
 	import_marks_file_ignore_missing = ignore_missing;
@@ -3316,6 +3317,7 @@ static void option_active_branches(const char *branches)
 
 static void option_export_marks(const char *marks)
 {
+	free(export_marks_file);
 	export_marks_file = make_fast_import_path(marks);
 }
 
@@ -3357,6 +3359,8 @@ static void option_rewrite_submodules(const char *arg, struct string_list *list)
 	free(f);
 
 	string_list_insert(list, s)->util = ms;
+
+	free(s);
 }
 
 static int parse_one_option(const char *option)
diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
index 1e68426852..3b3c371740 100755
--- a/t/t9300-fast-import.sh
+++ b/t/t9300-fast-import.sh
@@ -7,6 +7,7 @@ test_description='test git fast-import utility'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-diff.sh ;# test-lib chdir's into trash
 
diff --git a/t/t9304-fast-import-marks.sh b/t/t9304-fast-import-marks.sh
index 410a871c52..1f776a80f3 100755
--- a/t/t9304-fast-import-marks.sh
+++ b/t/t9304-fast-import-marks.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='test exotic situations with marks'
+
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'setup dump of basic history' '
-- 
2.46.0.46.g406f326d27.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v2 12/22] builtin/fast-export: fix leaking diff options
  2024-08-08 13:04 ` [PATCH v2 " Patrick Steinhardt
                     ` (10 preceding siblings ...)
  2024-08-08 13:05   ` [PATCH v2 11/22] builtin/fast-import: plug trivial memory leaks Patrick Steinhardt
@ 2024-08-08 13:05   ` Patrick Steinhardt
  2024-08-12  9:05     ` karthik nayak
  2024-08-08 13:05   ` [PATCH v2 13/22] builtin/fast-export: plug leaking tag names Patrick Steinhardt
                     ` (11 subsequent siblings)
  23 siblings, 1 reply; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-08 13:05 UTC (permalink / raw)
  To: git; +Cc: James Liu, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 1115 bytes --]

Before calling `handle_commit()` in a loop, we set `diffopt.no_free` such
that its contents aren't getting freed inside of `handle_commit()`. We
never unset that flag though, which means that it'll ultimately leak
when calling `release_revisions()`.

Fix this by unsetting the flag after the loop.
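
The effect of such a flag can be sketched as follows. This is a
simplified illustration with a hypothetical `struct opts` and
`opts_release()`, not the actual diff options code:

```c
#include <stdlib.h>

/* Hypothetical options structure with a "keep contents" flag. */
struct opts {
	char *buf;
	int no_free;  /* when set, opts_release() keeps the contents */
};

static void opts_release(struct opts *o)
{
	if (o->no_free)
		return;
	free(o->buf);
	o->buf = NULL;
}

static void run(struct opts *o, int rounds)
{
	o->no_free = 1;
	for (int i = 0; i < rounds; i++)
		opts_release(o);  /* per-iteration cleanup is suppressed */
	o->no_free = 0;           /* without this, the final release is a no-op */
	opts_release(o);
}
```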

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/fast-export.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index 4b6e8c6832..fe92d2436c 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -1278,9 +1278,11 @@ int cmd_fast_export(int argc, const char **argv, const char *prefix)
 	revs.diffopt.format_callback = show_filemodify;
 	revs.diffopt.format_callback_data = &paths_of_changed_objects;
 	revs.diffopt.flags.recursive = 1;
+
 	revs.diffopt.no_free = 1;
 	while ((commit = get_revision(&revs)))
 		handle_commit(commit, &revs, &paths_of_changed_objects);
+	revs.diffopt.no_free = 0;
 
 	handle_tags_and_duplicates(&extra_refs);
 	handle_tags_and_duplicates(&tag_refs);
-- 
2.46.0.46.g406f326d27.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v2 13/22] builtin/fast-export: plug leaking tag names
  2024-08-08 13:04 ` [PATCH v2 " Patrick Steinhardt
                     ` (11 preceding siblings ...)
  2024-08-08 13:05   ` [PATCH v2 12/22] builtin/fast-export: fix leaking diff options Patrick Steinhardt
@ 2024-08-08 13:05   ` Patrick Steinhardt
  2024-08-08 13:05   ` [PATCH v2 14/22] merge-ort: unconditionally release attributes index Patrick Steinhardt
                     ` (10 subsequent siblings)
  23 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-08 13:05 UTC (permalink / raw)
  To: git; +Cc: James Liu, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 3889 bytes --]

When resolving revisions in `get_tags_and_duplicates()`, we only
partially manage the lifetime of `full_name`. In fact, managing its
lifetime properly is almost impossible because we put direct pointers to
that variable into multiple lists without duplicating the string. The
consequence is that these strings will ultimately leak.

Refactor the code to make the lists we put those names into duplicate
the memory. This allows us to properly free the string as required and
thus plugs the memory leak.

While this requires us to allocate more data overall, it shouldn't be
all that bad given that the number of allocations corresponds with the
number of command line parameters, which typically aren't all that many.
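
To illustrate the ownership change, here is a minimal sketch of a list
that duplicates what it stores. `struct name_list` and its helpers are
hypothetical and much simpler than `struct string_list`:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical minimal list that copies every string it stores. */
struct name_list {
	char **items;
	size_t nr;
};

static void name_list_append(struct name_list *l, const char *s)
{
	char *copy = malloc(strlen(s) + 1);
	char **grown = realloc(l->items, (l->nr + 1) * sizeof(*l->items));
	if (!copy || !grown)
		abort();
	strcpy(copy, s);
	l->items = grown;
	l->items[l->nr++] = copy;
}

static void name_list_clear(struct name_list *l)
{
	for (size_t i = 0; i < l->nr; i++)
		free(l->items[i]);
	free(l->items);
	l->items = NULL;
	l->nr = 0;
}
```

Because the list holds its own copy, the caller can free its string on
every path once the append has happened, which is what makes the
lifetime of `full_name` manageable.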

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/fast-export.c            | 17 ++++++++++++-----
 t/t9351-fast-export-anonymize.sh |  1 +
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index fe92d2436c..f253b79322 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -42,8 +42,8 @@ static int full_tree;
 static int reference_excluded_commits;
 static int show_original_ids;
 static int mark_tags;
-static struct string_list extra_refs = STRING_LIST_INIT_NODUP;
-static struct string_list tag_refs = STRING_LIST_INIT_NODUP;
+static struct string_list extra_refs = STRING_LIST_INIT_DUP;
+static struct string_list tag_refs = STRING_LIST_INIT_DUP;
 static struct refspec refspecs = REFSPEC_INIT_FETCH;
 static int anonymize;
 static struct hashmap anonymized_seeds;
@@ -901,7 +901,7 @@ static void handle_tag(const char *name, struct tag *tag)
 	free(buf);
 }
 
-static struct commit *get_commit(struct rev_cmdline_entry *e, char *full_name)
+static struct commit *get_commit(struct rev_cmdline_entry *e, const char *full_name)
 {
 	switch (e->item->type) {
 	case OBJ_COMMIT:
@@ -932,14 +932,16 @@ static void get_tags_and_duplicates(struct rev_cmdline_info *info)
 		struct rev_cmdline_entry *e = info->rev + i;
 		struct object_id oid;
 		struct commit *commit;
-		char *full_name;
+		char *full_name = NULL;
 
 		if (e->flags & UNINTERESTING)
 			continue;
 
 		if (repo_dwim_ref(the_repository, e->name, strlen(e->name),
-				  &oid, &full_name, 0) != 1)
+				  &oid, &full_name, 0) != 1) {
+			free(full_name);
 			continue;
+		}
 
 		if (refspecs.nr) {
 			char *private;
@@ -955,6 +957,7 @@ static void get_tags_and_duplicates(struct rev_cmdline_info *info)
 			warning("%s: Unexpected object of type %s, skipping.",
 				e->name,
 				type_name(e->item->type));
+			free(full_name);
 			continue;
 		}
 
@@ -963,10 +966,12 @@ static void get_tags_and_duplicates(struct rev_cmdline_info *info)
 			break;
 		case OBJ_BLOB:
 			export_blob(&commit->object.oid);
+			free(full_name);
 			continue;
 		default: /* OBJ_TAG (nested tags) is already handled */
 			warning("Tag points to object of unexpected type %s, skipping.",
 				type_name(commit->object.type));
+			free(full_name);
 			continue;
 		}
 
@@ -979,6 +984,8 @@ static void get_tags_and_duplicates(struct rev_cmdline_info *info)
 
 		if (!*revision_sources_at(&revision_sources, commit))
 			*revision_sources_at(&revision_sources, commit) = full_name;
+		else
+			free(full_name);
 	}
 
 	string_list_sort(&extra_refs);
diff --git a/t/t9351-fast-export-anonymize.sh b/t/t9351-fast-export-anonymize.sh
index 156a647484..c0d9d7be75 100755
--- a/t/t9351-fast-export-anonymize.sh
+++ b/t/t9351-fast-export-anonymize.sh
@@ -4,6 +4,7 @@ test_description='basic tests for fast-export --anonymize'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'setup simple repo' '
-- 
2.46.0.46.g406f326d27.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v2 14/22] merge-ort: unconditionally release attributes index
  2024-08-08 13:04 ` [PATCH v2 " Patrick Steinhardt
                     ` (12 preceding siblings ...)
  2024-08-08 13:05   ` [PATCH v2 13/22] builtin/fast-export: plug leaking tag names Patrick Steinhardt
@ 2024-08-08 13:05   ` Patrick Steinhardt
  2024-08-08 13:05   ` [PATCH v2 15/22] sequencer: release todo list on error paths Patrick Steinhardt
                     ` (9 subsequent siblings)
  23 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-08 13:05 UTC (permalink / raw)
  To: git; +Cc: James Liu, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 3212 bytes --]

We conditionally release the index used for reading gitattributes in
merge-ort based on whether or not the index has been populated. This check
uses `cache_nr` as a condition. This isn't sufficient though, as the
variable may be zero even when some other parts of the index have been
populated. This leads to memory leaks when sparse checkouts are in use,
as we may not end up releasing the sparse checkout patterns.

Fix this issue by unconditionally releasing the index.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 merge-ort.c                       | 3 +--
 t/t3507-cherry-pick-conflict.sh   | 1 +
 t/t6421-merge-partial-clone.sh    | 1 +
 t/t6428-merge-conflicts-sparse.sh | 1 +
 t/t7817-grep-sparse-checkout.sh   | 1 +
 5 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index e9d01ac7f7..3752c7e595 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -689,8 +689,7 @@ static void clear_or_reinit_internal_opts(struct merge_options_internal *opti,
 	 */
 	strmap_clear_func(&opti->conflicted, 0);
 
-	if (opti->attr_index.cache_nr) /* true iff opt->renormalize */
-		discard_index(&opti->attr_index);
+	discard_index(&opti->attr_index);
 
 	/* Free memory used by various renames maps */
 	for (i = MERGE_SIDE1; i <= MERGE_SIDE2; ++i) {
diff --git a/t/t3507-cherry-pick-conflict.sh b/t/t3507-cherry-pick-conflict.sh
index f3947b400a..10e9c91dbb 100755
--- a/t/t3507-cherry-pick-conflict.sh
+++ b/t/t3507-cherry-pick-conflict.sh
@@ -13,6 +13,7 @@ GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
 TEST_CREATE_REPO_NO_TEMPLATE=1
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 pristine_detach () {
diff --git a/t/t6421-merge-partial-clone.sh b/t/t6421-merge-partial-clone.sh
index 711b709e75..020375c805 100755
--- a/t/t6421-merge-partial-clone.sh
+++ b/t/t6421-merge-partial-clone.sh
@@ -26,6 +26,7 @@ test_description="limiting blob downloads when merging with partial clones"
 #                     underscore notation is to differentiate different
 #                     files that might be renamed into each other's paths.)
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-merge.sh
 
diff --git a/t/t6428-merge-conflicts-sparse.sh b/t/t6428-merge-conflicts-sparse.sh
index 9919c3fa7c..8a79bc2e92 100755
--- a/t/t6428-merge-conflicts-sparse.sh
+++ b/t/t6428-merge-conflicts-sparse.sh
@@ -22,6 +22,7 @@ test_description="merge cases"
 #                     underscore notation is to differentiate different
 #                     files that might be renamed into each other's paths.)
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-merge.sh
 
diff --git a/t/t7817-grep-sparse-checkout.sh b/t/t7817-grep-sparse-checkout.sh
index eb59564565..0ba7817fb7 100755
--- a/t/t7817-grep-sparse-checkout.sh
+++ b/t/t7817-grep-sparse-checkout.sh
@@ -33,6 +33,7 @@ should leave the following structure in the working tree:
 But note that sub2 should have the SKIP_WORKTREE bit set.
 '
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'setup' '
-- 
2.46.0.46.g406f326d27.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v2 15/22] sequencer: release todo list on error paths
  2024-08-08 13:04 ` [PATCH v2 " Patrick Steinhardt
                     ` (13 preceding siblings ...)
  2024-08-08 13:05   ` [PATCH v2 14/22] merge-ort: unconditionally release attributes index Patrick Steinhardt
@ 2024-08-08 13:05   ` Patrick Steinhardt
  2024-08-08 13:05   ` [PATCH v2 16/22] unpack-trees: clear index when not propagating it Patrick Steinhardt
                     ` (8 subsequent siblings)
  23 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-08 13:05 UTC (permalink / raw)
  To: git; +Cc: James Liu, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 3458 bytes --]

We're not releasing the `todo_list` in `sequencer_pick_revisions()` when
hitting an error path. Restructure the function to have a common exit
path such that we can easily clean up the list and thus plug this memory
leak.
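
The common exit path can be sketched like this, assuming a hypothetical
`struct work_list` in place of the todo list and made-up error
conditions:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical list type standing in for a structure that owns memory. */
struct work_list {
	char *buf;
};

static void work_list_release(struct work_list *l)
{
	free(l->buf);
	l->buf = NULL;
}

/* Returns 0 on success, -1 on error; the list is released on every path. */
static int process(const char *input)
{
	struct work_list list = {0};
	int res;

	if (!input) {
		res = -1;
		goto out;
	}

	list.buf = malloc(strlen(input) + 1);
	if (!list.buf) {
		res = -1;
		goto out;
	}
	strcpy(list.buf, input);

	if (!*list.buf) {  /* reject empty input after allocating */
		res = -1;
		goto out;
	}

	res = 0;

out:
	work_list_release(&list);
	return res;
}
```

This only works because the list is zero-initialized up front, so
releasing it is safe even on paths that never populated it.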

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 sequencer.c                     | 66 +++++++++++++++++++++++----------
 t/t3510-cherry-pick-sequence.sh |  1 +
 2 files changed, 48 insertions(+), 19 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index cade9b0ca8..ea559c31f1 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -5490,8 +5490,10 @@ int sequencer_pick_revisions(struct repository *r,
 	int i, res;
 
 	assert(opts->revs);
-	if (read_and_refresh_cache(r, opts))
-		return -1;
+	if (read_and_refresh_cache(r, opts)) {
+		res = -1;
+		goto out;
+	}
 
 	for (i = 0; i < opts->revs->pending.nr; i++) {
 		struct object_id oid;
@@ -5506,11 +5508,14 @@ int sequencer_pick_revisions(struct repository *r,
 				enum object_type type = oid_object_info(r,
 									&oid,
 									NULL);
-				return error(_("%s: can't cherry-pick a %s"),
-					name, type_name(type));
+				res = error(_("%s: can't cherry-pick a %s"),
+					    name, type_name(type));
+				goto out;
 			}
-		} else
-			return error(_("%s: bad revision"), name);
+		} else {
+			res = error(_("%s: bad revision"), name);
+			goto out;
+		}
 	}
 
 	/*
@@ -5525,14 +5530,23 @@ int sequencer_pick_revisions(struct repository *r,
 	    opts->revs->no_walk &&
 	    !opts->revs->cmdline.rev->flags) {
 		struct commit *cmit;
-		if (prepare_revision_walk(opts->revs))
-			return error(_("revision walk setup failed"));
+
+		if (prepare_revision_walk(opts->revs)) {
+			res = error(_("revision walk setup failed"));
+			goto out;
+		}
+
 		cmit = get_revision(opts->revs);
-		if (!cmit)
-			return error(_("empty commit set passed"));
+		if (!cmit) {
+			res = error(_("empty commit set passed"));
+			goto out;
+		}
+
 		if (get_revision(opts->revs))
 			BUG("unexpected extra commit from walk");
-		return single_pick(r, cmit, opts);
+
+		res = single_pick(r, cmit, opts);
+		goto out;
 	}
 
 	/*
@@ -5542,16 +5556,30 @@ int sequencer_pick_revisions(struct repository *r,
 	 */
 
 	if (walk_revs_populate_todo(&todo_list, opts) ||
-			create_seq_dir(r) < 0)
-		return -1;
-	if (repo_get_oid(r, "HEAD", &oid) && (opts->action == REPLAY_REVERT))
-		return error(_("can't revert as initial commit"));
-	if (save_head(oid_to_hex(&oid)))
-		return -1;
-	if (save_opts(opts))
-		return -1;
+			create_seq_dir(r) < 0) {
+		res = -1;
+		goto out;
+	}
+
+	if (repo_get_oid(r, "HEAD", &oid) && (opts->action == REPLAY_REVERT)) {
+		res = error(_("can't revert as initial commit"));
+		goto out;
+	}
+
+	if (save_head(oid_to_hex(&oid))) {
+		res = -1;
+		goto out;
+	}
+
+	if (save_opts(opts)) {
+		res = -1;
+		goto out;
+	}
+
 	update_abort_safety_file();
 	res = pick_commits(r, &todo_list, opts);
+
+out:
 	todo_list_release(&todo_list);
 	return res;
 }
diff --git a/t/t3510-cherry-pick-sequence.sh b/t/t3510-cherry-pick-sequence.sh
index 7eb52b12ed..93c725bac3 100755
--- a/t/t3510-cherry-pick-sequence.sh
+++ b/t/t3510-cherry-pick-sequence.sh
@@ -12,6 +12,7 @@ test_description='Test cherry-pick continuation features
 
 '
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 # Repeat first match 10 times
-- 
2.46.0.46.g406f326d27.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v2 16/22] unpack-trees: clear index when not propagating it
  2024-08-08 13:04 ` [PATCH v2 " Patrick Steinhardt
                     ` (14 preceding siblings ...)
  2024-08-08 13:05   ` [PATCH v2 15/22] sequencer: release todo list on error paths Patrick Steinhardt
@ 2024-08-08 13:05   ` Patrick Steinhardt
  2024-08-08 13:05   ` [PATCH v2 17/22] diff: fix leak when parsing invalid ignore regex option Patrick Steinhardt
                     ` (7 subsequent siblings)
  23 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-08 13:05 UTC (permalink / raw)
  To: git; +Cc: James Liu, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 2143 bytes --]

When provided a pointer to a destination index, then `unpack_trees()`
will end up copying its `o->internal.result` index into the provided
pointer. In those cases it is thus not necessary to free the index, as
we have transferred ownership of it.

There are cases though where we do not end up transferring ownership of
the memory, but `clear_unpack_trees_porcelain()` will never discard the
index in that case and thus cause a memory leak. And right now it cannot
do so in the first place because we have no indicator of whether we did
or didn't transfer ownership of the index.

Adapt the code to zero out the index in case we transfer its ownership.
Like this, we can now unconditionally discard the index when being asked
to clear the `unpack_trees_options`.
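
A minimal sketch of the idea, using hypothetical `struct index_like`
and `struct unpack_state` types rather than the real index and options
structures:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an index that owns heap memory. */
struct index_like {
	char *data;
};

static void index_like_discard(struct index_like *idx)
{
	free(idx->data);
	idx->data = NULL;
}

struct unpack_state {
	struct index_like result;
	struct index_like *dst;  /* optional destination */
};

static void finish(struct unpack_state *s)
{
	if (s->dst) {
		index_like_discard(s->dst);
		*s->dst = s->result;  /* transfer ownership */
		/* The source no longer owns anything. */
		memset(&s->result, 0, sizeof(s->result));
	}
}

static void clear_state(struct unpack_state *s)
{
	/* Safe whether or not ownership was transferred. */
	index_like_discard(&s->result);
}
```

After the transfer the zeroed source discards nothing, so the clearing
function needs no separate flag to know whether it still owns the data.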

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 t/t3705-add-sparse-checkout.sh | 1 +
 unpack-trees.c                 | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/t/t3705-add-sparse-checkout.sh b/t/t3705-add-sparse-checkout.sh
index 2bade9e804..6ae45a788d 100755
--- a/t/t3705-add-sparse-checkout.sh
+++ b/t/t3705-add-sparse-checkout.sh
@@ -2,6 +2,7 @@
 
 test_description='git add in sparse checked out working trees'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 SPARSE_ENTRY_BLOB=""
diff --git a/unpack-trees.c b/unpack-trees.c
index 7dc884fafd..9a55cb6204 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -210,6 +210,7 @@ void clear_unpack_trees_porcelain(struct unpack_trees_options *opts)
 {
 	strvec_clear(&opts->internal.msgs_to_free);
 	memset(opts->internal.msgs, 0, sizeof(opts->internal.msgs));
+	discard_index(&opts->internal.result);
 }
 
 static int do_add_entry(struct unpack_trees_options *o, struct cache_entry *ce,
@@ -2082,6 +2083,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 		o->internal.result.updated_workdir = 1;
 		discard_index(o->dst_index);
 		*o->dst_index = o->internal.result;
+		memset(&o->internal.result, 0, sizeof(o->internal.result));
 	} else {
 		discard_index(&o->internal.result);
 	}
-- 
2.46.0.46.g406f326d27.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v2 17/22] diff: fix leak when parsing invalid ignore regex option
  2024-08-08 13:04 ` [PATCH v2 " Patrick Steinhardt
                     ` (15 preceding siblings ...)
  2024-08-08 13:05   ` [PATCH v2 16/22] unpack-trees: clear index when not propagating it Patrick Steinhardt
@ 2024-08-08 13:05   ` Patrick Steinhardt
  2024-08-08 13:05   ` [PATCH v2 18/22] builtin/format-patch: fix various trivial memory leaks Patrick Steinhardt
                     ` (6 subsequent siblings)
  23 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-08 13:05 UTC (permalink / raw)
  To: git; +Cc: James Liu, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 1397 bytes --]

When parsing invalid ignore regexes passed via the `-I` option we don't
free already-allocated memory, leading to a memory leak. Fix this.
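
A minimal standalone sketch of the same error path, using only POSIX
`regcomp()`. The helpers `compile_or_null()` and `release_regex()` are
hypothetical names chosen for illustration and do not exist in Git.

```c
#include <regex.h>
#include <stdlib.h>

/*
 * Hypothetical helper: returns a heap-allocated compiled regex, or NULL
 * if allocation fails or the pattern is invalid.
 */
static regex_t *compile_or_null(const char *pattern)
{
	regex_t *regex = malloc(sizeof(*regex));

	if (!regex)
		return NULL;
	if (regcomp(regex, pattern, REG_EXTENDED | REG_NEWLINE)) {
		/* Without this free() the container leaks on bad input. */
		free(regex);
		return NULL;
	}
	return regex;
}

/* Releases both the compiled state and the container itself. */
static void release_regex(regex_t *regex)
{
	if (!regex)
		return;
	regfree(regex);
	free(regex);
}
```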

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 diff.c                  | 6 +++++-
 t/t4013-diff-various.sh | 1 +
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/diff.c b/diff.c
index ebb7538e04..9251c47b72 100644
--- a/diff.c
+++ b/diff.c
@@ -5464,9 +5464,13 @@ static int diff_opt_ignore_regex(const struct option *opt,
 	regex_t *regex;
 
 	BUG_ON_OPT_NEG(unset);
+
 	regex = xmalloc(sizeof(*regex));
-	if (regcomp(regex, arg, REG_EXTENDED | REG_NEWLINE))
+	if (regcomp(regex, arg, REG_EXTENDED | REG_NEWLINE)) {
+		free(regex);
 		return error(_("invalid regex given to -I: '%s'"), arg);
+	}
+
 	ALLOC_GROW(options->ignore_regex, options->ignore_regex_nr + 1,
 		   options->ignore_regex_alloc);
 	options->ignore_regex[options->ignore_regex_nr++] = regex;
diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
index 3855d68dbc..87d248d034 100755
--- a/t/t4013-diff-various.sh
+++ b/t/t4013-diff-various.sh
@@ -8,6 +8,7 @@ test_description='Various diff formatting options'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-diff.sh
 
-- 
2.46.0.46.g406f326d27.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v2 18/22] builtin/format-patch: fix various trivial memory leaks
  2024-08-08 13:04 ` [PATCH v2 " Patrick Steinhardt
                     ` (16 preceding siblings ...)
  2024-08-08 13:05   ` [PATCH v2 17/22] diff: fix leak when parsing invalid ignore regex option Patrick Steinhardt
@ 2024-08-08 13:05   ` Patrick Steinhardt
  2024-08-08 13:05   ` [PATCH v2 19/22] userdiff: fix leaking memory for configured diff drivers Patrick Steinhardt
                     ` (5 subsequent siblings)
  23 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-08 13:05 UTC (permalink / raw)
  To: git; +Cc: James Liu, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 2803 bytes --]

There are various memory leaks hit by git-format-patch(1). Basically all
of them are trivial, except that un-setting `diffopt.no_free` requires
us to unset the `diffopt.file` because we manually close it already.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/log.c           | 12 +++++++++---
 t/t4014-format-patch.sh |  1 +
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index a73a767606..ff997a0d0e 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1833,6 +1833,7 @@ static struct commit *get_base_commit(const struct format_config *cfg,
 			}
 
 			rev[i] = merge_base->item;
+			free_commit_list(merge_base);
 		}
 
 		if (rev_nr % 2)
@@ -2023,6 +2024,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 	const char *rfc = NULL;
 	int creation_factor = -1;
 	const char *signature = git_version_string;
+	char *signature_to_free = NULL;
 	char *signature_file_arg = NULL;
 	struct keep_callback_data keep_callback_data = {
 		.cfg = &cfg,
@@ -2443,7 +2445,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 
 		if (strbuf_read_file(&buf, signature_file, 128) < 0)
 			die_errno(_("unable to read signature file '%s'"), signature_file);
-		signature = strbuf_detach(&buf, NULL);
+		signature = signature_to_free = strbuf_detach(&buf, NULL);
 	} else if (cfg.signature) {
 		signature = cfg.signature;
 	}
@@ -2548,12 +2550,13 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 			else
 				print_signature(signature, rev.diffopt.file);
 		}
-		if (output_directory)
+		if (output_directory) {
 			fclose(rev.diffopt.file);
+			rev.diffopt.file = NULL;
+		}
 	}
 	stop_progress(&progress);
 	free(list);
-	free(branch_name);
 	if (ignore_if_in_upstream)
 		free_patch_ids(&ids);
 
@@ -2565,11 +2568,14 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 	strbuf_release(&rdiff_title);
 	free(description_file);
 	free(signature_file_arg);
+	free(signature_to_free);
+	free(branch_name);
 	free(to_free);
 	free(rev.message_id);
 	if (rev.ref_message_ids)
 		string_list_clear(rev.ref_message_ids, 0);
 	free(rev.ref_message_ids);
+	rev.diffopt.no_free = 0;
 	release_revisions(&rev);
 	format_config_release(&cfg);
 	return 0;
diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
index 884f83fb8a..1c46e963e4 100755
--- a/t/t4014-format-patch.sh
+++ b/t/t4014-format-patch.sh
@@ -8,6 +8,7 @@ test_description='various format-patch tests'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-terminal.sh
 
-- 
2.46.0.46.g406f326d27.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v2 19/22] userdiff: fix leaking memory for configured diff drivers
  2024-08-08 13:04 ` [PATCH v2 " Patrick Steinhardt
                     ` (17 preceding siblings ...)
  2024-08-08 13:05   ` [PATCH v2 18/22] builtin/format-patch: fix various trivial memory leaks Patrick Steinhardt
@ 2024-08-08 13:05   ` Patrick Steinhardt
  2024-08-08 13:06   ` [PATCH v2 20/22] builtin/log: fix leak when showing converted blob contents Patrick Steinhardt
                     ` (4 subsequent siblings)
  23 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-08 13:05 UTC (permalink / raw)
  To: git; +Cc: James Liu, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 5981 bytes --]

The userdiff structures may be initialized either statically on the
stack or dynamically via configuration keys. In the latter case we end
up leaking memory because we didn't have any infrastructure to discern
those strings which have been allocated statically and those which have
been allocated dynamically.

Refactor the code such that we have two pointers for each of these
strings: one that holds the value as accessed by other subsystems, and
one that points to the same string in case it has been allocated. Like
this, we can safely free the second pointer and thus plug those memory
leaks.
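
The two-pointer scheme can be illustrated with a reduced example. The
names `struct toy_driver`, `toy_set_pattern()` and `toy_release()` are
assumptions made for this sketch; only the `pattern`/`pattern_owned`
pairing mirrors the patch.

```c
#include <stdlib.h>
#include <string.h>

struct toy_driver {
	const char *pattern; /* what readers use; may be a string constant */
	char *pattern_owned; /* non-NULL only when heap-allocated */
};

/* Replaces the pattern with a heap copy of `value`, freeing any old copy. */
static int toy_set_pattern(struct toy_driver *drv, const char *value)
{
	char *copy = malloc(strlen(value) + 1);

	if (!copy)
		return -1;
	strcpy(copy, value);

	free(drv->pattern_owned);
	drv->pattern_owned = copy;
	drv->pattern = copy;
	return 0;
}

/* Frees only the owned pointer; a static default is never passed to free(). */
static void toy_release(struct toy_driver *drv)
{
	free(drv->pattern_owned);
	drv->pattern_owned = NULL;
	drv->pattern = NULL;
}
```

A driver that was statically initialized with a string constant keeps
`pattern_owned` at NULL, so releasing it is a no-op for that string.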

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 range-diff.c                     |  6 +++--
 t/t4018-diff-funcname.sh         |  1 +
 t/t4042-diff-textconv-caching.sh |  2 ++
 t/t4048-diff-combined-binary.sh  |  1 +
 t/t4209-log-pickaxe.sh           |  2 ++
 userdiff.c                       | 38 ++++++++++++++++++++++++--------
 userdiff.h                       |  4 ++++
 7 files changed, 43 insertions(+), 11 deletions(-)

diff --git a/range-diff.c b/range-diff.c
index 5f01605550..bbb0952264 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -450,8 +450,10 @@ static void output_pair_header(struct diff_options *diffopt,
 }
 
 static struct userdiff_driver section_headers = {
-	.funcname = { "^ ## (.*) ##$\n"
-		      "^.?@@ (.*)$", REG_EXTENDED }
+	.funcname = {
+		.pattern = "^ ## (.*) ##$\n^.?@@ (.*)$",
+		.cflags = REG_EXTENDED,
+	},
 };
 
 static struct diff_filespec *get_filespec(const char *name, const char *p)
diff --git a/t/t4018-diff-funcname.sh b/t/t4018-diff-funcname.sh
index e026fac1f4..8128c30e7f 100755
--- a/t/t4018-diff-funcname.sh
+++ b/t/t4018-diff-funcname.sh
@@ -5,6 +5,7 @@
 
 test_description='Test custom diff function name patterns'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'setup' '
diff --git a/t/t4042-diff-textconv-caching.sh b/t/t4042-diff-textconv-caching.sh
index 8ebfa3c1be..a179205394 100755
--- a/t/t4042-diff-textconv-caching.sh
+++ b/t/t4042-diff-textconv-caching.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='test textconv caching'
+
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 cat >helper <<'EOF'
diff --git a/t/t4048-diff-combined-binary.sh b/t/t4048-diff-combined-binary.sh
index 0260cf64f5..f399484bce 100755
--- a/t/t4048-diff-combined-binary.sh
+++ b/t/t4048-diff-combined-binary.sh
@@ -4,6 +4,7 @@ test_description='combined and merge diff handle binary files and textconv'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'setup binary merge conflict' '
diff --git a/t/t4209-log-pickaxe.sh b/t/t4209-log-pickaxe.sh
index 64e1623733..b42fdc54fc 100755
--- a/t/t4209-log-pickaxe.sh
+++ b/t/t4209-log-pickaxe.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='log --grep/--author/--regexp-ignore-case/-S/-G'
+
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_log () {
diff --git a/userdiff.c b/userdiff.c
index c4ebb9ff73..989629149f 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -399,8 +399,11 @@ static struct userdiff_driver *userdiff_find_by_namelen(const char *name, size_t
 static int parse_funcname(struct userdiff_funcname *f, const char *k,
 		const char *v, int cflags)
 {
-	if (git_config_string((char **) &f->pattern, k, v) < 0)
+	f->pattern = NULL;
+	FREE_AND_NULL(f->pattern_owned);
+	if (git_config_string(&f->pattern_owned, k, v) < 0)
 		return -1;
+	f->pattern = f->pattern_owned;
 	f->cflags = cflags;
 	return 0;
 }
@@ -444,20 +447,37 @@ int userdiff_config(const char *k, const char *v)
 		return parse_funcname(&drv->funcname, k, v, REG_EXTENDED);
 	if (!strcmp(type, "binary"))
 		return parse_tristate(&drv->binary, k, v);
-	if (!strcmp(type, "command"))
-		return git_config_string((char **) &drv->external.cmd, k, v);
+	if (!strcmp(type, "command")) {
+		FREE_AND_NULL(drv->external.cmd);
+		return git_config_string(&drv->external.cmd, k, v);
+	}
 	if (!strcmp(type, "trustexitcode")) {
 		drv->external.trust_exit_code = git_config_bool(k, v);
 		return 0;
 	}
-	if (!strcmp(type, "textconv"))
-		return git_config_string((char **) &drv->textconv, k, v);
+	if (!strcmp(type, "textconv")) {
+		int ret;
+		FREE_AND_NULL(drv->textconv_owned);
+		ret = git_config_string(&drv->textconv_owned, k, v);
+		drv->textconv = drv->textconv_owned;
+		return ret;
+	}
 	if (!strcmp(type, "cachetextconv"))
 		return parse_bool(&drv->textconv_want_cache, k, v);
-	if (!strcmp(type, "wordregex"))
-		return git_config_string((char **) &drv->word_regex, k, v);
-	if (!strcmp(type, "algorithm"))
-		return git_config_string((char **) &drv->algorithm, k, v);
+	if (!strcmp(type, "wordregex")) {
+		int ret;
+		FREE_AND_NULL(drv->word_regex_owned);
+		ret = git_config_string(&drv->word_regex_owned, k, v);
+		drv->word_regex = drv->word_regex_owned;
+		return ret;
+	}
+	if (!strcmp(type, "algorithm")) {
+		int ret;
+		FREE_AND_NULL(drv->algorithm_owned);
+		ret = git_config_string(&drv->algorithm_owned, k, v);
+		drv->algorithm = drv->algorithm_owned;
+		return ret;
+	}
 
 	return 0;
 }
diff --git a/userdiff.h b/userdiff.h
index 7565930337..827361b0bc 100644
--- a/userdiff.h
+++ b/userdiff.h
@@ -8,6 +8,7 @@ struct repository;
 
 struct userdiff_funcname {
 	const char *pattern;
+	char *pattern_owned;
 	int cflags;
 };
 
@@ -20,11 +21,14 @@ struct userdiff_driver {
 	const char *name;
 	struct external_diff external;
 	const char *algorithm;
+	char *algorithm_owned;
 	int binary;
 	struct userdiff_funcname funcname;
 	const char *word_regex;
+	char *word_regex_owned;
 	const char *word_regex_multi_byte;
 	const char *textconv;
+	char *textconv_owned;
 	struct notes_cache *textconv_cache;
 	int textconv_want_cache;
 };
-- 
2.46.0.46.g406f326d27.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v2 20/22] builtin/log: fix leak when showing converted blob contents
  2024-08-08 13:04 ` [PATCH v2 " Patrick Steinhardt
                     ` (18 preceding siblings ...)
  2024-08-08 13:05   ` [PATCH v2 19/22] userdiff: fix leaking memory for configured diff drivers Patrick Steinhardt
@ 2024-08-08 13:06   ` Patrick Steinhardt
  2024-08-08 13:06   ` [PATCH v2 21/22] diff: free state populated via options Patrick Steinhardt
                     ` (3 subsequent siblings)
  23 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-08 13:06 UTC (permalink / raw)
  To: git; +Cc: James Liu, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 1204 bytes --]

In `show_blob_object()`, we proactively call `textconv_object()`. In
case we have a textconv driver for this blob we will end up showing the
converted contents, otherwise we'll show the un-converted contents of it
instead.

When the object has been converted we never free the buffer containing
the converted contents. Fix this to plug this memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/log.c            | 1 +
 t/t4030-diff-textconv.sh | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/builtin/log.c b/builtin/log.c
index ff997a0d0e..1a684b68f2 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -707,6 +707,7 @@ static int show_blob_object(const struct object_id *oid, struct rev_info *rev, c
 
 	write_or_die(1, buf, size);
 	object_context_release(&obj_context);
+	free(buf);
 	return 0;
 }
 
diff --git a/t/t4030-diff-textconv.sh b/t/t4030-diff-textconv.sh
index a39a626664..29f6d610c2 100755
--- a/t/t4030-diff-textconv.sh
+++ b/t/t4030-diff-textconv.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='diff.*.textconv tests'
+
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 find_diff() {
-- 
2.46.0.46.g406f326d27.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v2 21/22] diff: free state populated via options
  2024-08-08 13:04 ` [PATCH v2 " Patrick Steinhardt
                     ` (19 preceding siblings ...)
  2024-08-08 13:06   ` [PATCH v2 20/22] builtin/log: fix leak when showing converted blob contents Patrick Steinhardt
@ 2024-08-08 13:06   ` Patrick Steinhardt
  2024-08-08 13:06   ` [PATCH v2 22/22] builtin/diff: free symmetric diff members Patrick Steinhardt
                     ` (2 subsequent siblings)
  23 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-08 13:06 UTC (permalink / raw)
  To: git; +Cc: James Liu, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 2034 bytes --]

The `objfind` and `anchors` members of `struct diff_options` are
populated via option parsing, but are never freed in `diff_free()`. Fix
this to plug those memory leaks.
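
For the `anchors` array, the cleanup amounts to freeing every element,
then the array, then resetting the counters. A self-contained sketch,
assuming a hypothetical `struct toy_options` in place of
`struct diff_options` and plain `realloc()` in place of `ALLOC_GROW()`:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical reduction of the option state to the anchors members. */
struct toy_options {
	char **anchors;
	size_t anchors_nr, anchors_alloc;
};

/* Appends a heap copy of `s`, growing the array when it is full. */
static int toy_add_anchor(struct toy_options *o, const char *s)
{
	char *copy;

	if (o->anchors_nr == o->anchors_alloc) {
		size_t alloc = o->anchors_alloc ? 2 * o->anchors_alloc : 4;
		char **grown = realloc(o->anchors, alloc * sizeof(*grown));

		if (!grown)
			return -1;
		o->anchors = grown;
		o->anchors_alloc = alloc;
	}

	copy = malloc(strlen(s) + 1);
	if (!copy)
		return -1;
	strcpy(copy, s);
	o->anchors[o->anchors_nr++] = copy;
	return 0;
}

/* Frees each string and the array, then resets the bookkeeping. */
static void toy_free_anchors(struct toy_options *o)
{
	for (size_t i = 0; i < o->anchors_nr; i++)
		free(o->anchors[i]);
	free(o->anchors);
	o->anchors = NULL;
	o->anchors_nr = o->anchors_alloc = 0;
}
```

Resetting the pointer and both counters makes a second call a no-op.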

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 diff.c                   | 10 ++++++++++
 t/t4064-diff-oidfind.sh  |  2 ++
 t/t4065-diff-anchored.sh |  1 +
 t/t4069-remerge-diff.sh  |  1 +
 4 files changed, 14 insertions(+)

diff --git a/diff.c b/diff.c
index 9251c47b72..4035a9374d 100644
--- a/diff.c
+++ b/diff.c
@@ -6717,6 +6717,16 @@ void diff_free(struct diff_options *options)
 	if (options->no_free)
 		return;
 
+	if (options->objfind) {
+		oidset_clear(options->objfind);
+		FREE_AND_NULL(options->objfind);
+	}
+
+	for (size_t i = 0; i < options->anchors_nr; i++)
+		free(options->anchors[i]);
+	FREE_AND_NULL(options->anchors);
+	options->anchors_nr = options->anchors_alloc = 0;
+
 	diff_free_file(options);
 	diff_free_ignore_regex(options);
 	clear_pathspec(&options->pathspec);
diff --git a/t/t4064-diff-oidfind.sh b/t/t4064-diff-oidfind.sh
index 6d8c8986fc..846f285f77 100755
--- a/t/t4064-diff-oidfind.sh
+++ b/t/t4064-diff-oidfind.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='test finding specific blobs in the revision walking'
+
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'setup ' '
diff --git a/t/t4065-diff-anchored.sh b/t/t4065-diff-anchored.sh
index b3f510f040..647537c12e 100755
--- a/t/t4065-diff-anchored.sh
+++ b/t/t4065-diff-anchored.sh
@@ -2,6 +2,7 @@
 
 test_description='anchored diff algorithm'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success '--anchored' '
diff --git a/t/t4069-remerge-diff.sh b/t/t4069-remerge-diff.sh
index 07323ebafe..888714bbd3 100755
--- a/t/t4069-remerge-diff.sh
+++ b/t/t4069-remerge-diff.sh
@@ -2,6 +2,7 @@
 
 test_description='remerge-diff handling'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 # This test is ort-specific
-- 
2.46.0.46.g406f326d27.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v2 22/22] builtin/diff: free symmetric diff members
  2024-08-08 13:04 ` [PATCH v2 " Patrick Steinhardt
                     ` (20 preceding siblings ...)
  2024-08-08 13:06   ` [PATCH v2 21/22] diff: free state populated via options Patrick Steinhardt
@ 2024-08-08 13:06   ` Patrick Steinhardt
  2024-08-12  9:12     ` karthik nayak
  2024-08-12  9:13   ` [PATCH v2 00/22] Memory leak fixes (pt.4) karthik nayak
  2024-08-12 14:01   ` Phillip Wood
  23 siblings, 1 reply; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-08 13:06 UTC (permalink / raw)
  To: git; +Cc: James Liu, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 2517 bytes --]

We populate a `struct symdiff` in case the user has requested a
symmetric diff. Part of this is to populate a `skip` bitmap that
indicates which commits shall be ignored in the diff. But while this
bitmap is dynamically allocated, we never free it.

Fix this by introducing and calling a new `symdiff_release()` function
that does this for us.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/diff.c                       | 10 +++++++++-
 t/t4068-diff-symmetric-merge-base.sh |  1 +
 t/t4108-apply-threeway.sh            |  1 +
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/builtin/diff.c b/builtin/diff.c
index 9b6cdabe15..f87f68a5bc 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -388,6 +388,13 @@ static void symdiff_prepare(struct rev_info *rev, struct symdiff *sym)
 	sym->skip = map;
 }
 
+static void symdiff_release(struct symdiff *sdiff)
+{
+	if (!sdiff)
+		return;
+	bitmap_free(sdiff->skip);
+}
+
 int cmd_diff(int argc, const char **argv, const char *prefix)
 {
 	int i;
@@ -398,7 +405,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
 	struct object_array_entry *blob[2];
 	int nongit = 0, no_index = 0;
 	int result;
-	struct symdiff sdiff;
+	struct symdiff sdiff = {0};
 
 	/*
 	 * We could get N tree-ish in the rev.pending_objects list.
@@ -619,6 +626,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
 		refresh_index_quietly();
 	release_revisions(&rev);
 	object_array_clear(&ent);
+	symdiff_release(&sdiff);
 	UNLEAK(blob);
 	return result;
 }
diff --git a/t/t4068-diff-symmetric-merge-base.sh b/t/t4068-diff-symmetric-merge-base.sh
index eff63c16b0..4d6565e728 100755
--- a/t/t4068-diff-symmetric-merge-base.sh
+++ b/t/t4068-diff-symmetric-merge-base.sh
@@ -5,6 +5,7 @@ test_description='behavior of diff with symmetric-diff setups and --merge-base'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 # build these situations:
diff --git a/t/t4108-apply-threeway.sh b/t/t4108-apply-threeway.sh
index c558282bc0..3211e1e65f 100755
--- a/t/t4108-apply-threeway.sh
+++ b/t/t4108-apply-threeway.sh
@@ -5,6 +5,7 @@ test_description='git apply --3way'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 print_sanitized_conflicted_diff () {
-- 
2.46.0.46.g406f326d27.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 146+ messages in thread

* Re: [PATCH 08/22] config: fix leaking comment character config
  2024-08-08  5:04     ` Patrick Steinhardt
@ 2024-08-08 15:54       ` Junio C Hamano
  0 siblings, 0 replies; 146+ messages in thread
From: Junio C Hamano @ 2024-08-08 15:54 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: James Liu, git

Patrick Steinhardt <ps@pks.im> writes:

> Now that I revisit this commit I'm not quite happy with it anymore. We
> still need to have the cast, which is somewhat awkward. I think the
> better solution is to instead have a `comment_line_str_allocated`
> variable that is non-constant. I'll adapt the code accordingly.
>
> An even better solution would be to have `struct strbuf` provide an
> initializer that populates it with a string constant. But that feels
> like a larger undertaking, so I'll leave that for the future.

FWIW, I found that the "now we have a variable to refer to the address
of the string constant, we can compare to detect if we allocated and
need to free" approach in this round is a good place to stop.

I view the approach to use an auxiliary variable *_allocated as a
regression compared to what we see here.  The approach makes it easy
to forget to futz it when an allocated piece of memory is assigned
to the main variable.
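
For readers following along, the "compare against the address of the
constant" approach could look roughly like the sketch below. All names
(`toy_default_comment`, `toy_comment_str`, `toy_set_comment()`,
`toy_reset_comment()`) are hypothetical and only loosely modelled on
the patch under discussion.

```c
#include <stdlib.h>
#include <string.h>

/* The default lives in a named object so that its address can be compared. */
static const char toy_default_comment[] = "#";
static const char *toy_comment_str = toy_default_comment;

/* Replaces the value with a heap copy, freeing a previous heap copy. */
static int toy_set_comment(const char *value)
{
	char *copy = malloc(strlen(value) + 1);

	if (!copy)
		return -1;
	strcpy(copy, value);

	/* Only free when we are not pointing at the static default. */
	if (toy_comment_str != toy_default_comment)
		free((char *)toy_comment_str);
	toy_comment_str = copy;
	return 0;
}

/* Restores the default, releasing any heap copy. */
static void toy_reset_comment(void)
{
	if (toy_comment_str != toy_default_comment)
		free((char *)toy_comment_str);
	toy_comment_str = toy_default_comment;
}
```

The cast in `free()` is the wart of this variant; the `*_allocated`
variant trades it for a second variable that has to be kept in sync.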

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH 19/22] userdiff: fix leaking memory for configured diff drivers
  2024-08-08  5:05     ` Patrick Steinhardt
@ 2024-08-08 16:05       ` Junio C Hamano
  0 siblings, 0 replies; 146+ messages in thread
From: Junio C Hamano @ 2024-08-08 16:05 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: James Liu, git

Patrick Steinhardt <ps@pks.im> writes:

>> > -	if (git_config_string((char **) &f->pattern, k, v) < 0)
>> > +	f->pattern = NULL;
>> > +	FREE_AND_NULL(f->pattern_owned);
>> > +	if (git_config_string(&f->pattern_owned, k, v) < 0)
>> >  		return -1;
>> > +	f->pattern = f->pattern_owned;
>> >  	f->cflags = cflags;
>> >  	return 0;
>> >  }
>
> Yup. We have a bunch of statically defined userdiff drivers, all of
> which use string constants as patterns. We thus cannot reliably free
> those and instead have to track the allocated strings in a separate
> variable.

In other words, this is the usual "foo is the variable to be used,
and it may point at foo_to_free, when the value is an allocated
string" pattern.  I doubt .pattern_to_free is a better name even in
the name of consistency---foo_to_free is certainly much better than
foo_owned as a name for a temporary variable in a small scope, but a
structure member has a much longer validity and I am OK if we decide
to adopt the convention to call a structure member .foo_owned when
it is used in this manner.

Thanks.


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH 15/22] sequencer: release todo list on error paths
  2024-08-08 10:08   ` Phillip Wood
@ 2024-08-08 16:31     ` Junio C Hamano
  0 siblings, 0 replies; 146+ messages in thread
From: Junio C Hamano @ 2024-08-08 16:31 UTC (permalink / raw)
  To: Phillip Wood; +Cc: Patrick Steinhardt, git

Phillip Wood <phillip.wood123@gmail.com> writes:

> Hi Patrick
>
> On 06/08/2024 10:00, Patrick Steinhardt wrote:
>> We're not releasing the `todo_list` in `sequencer_pick_revisions()` when
>> hitting an error path. Restructure the function to have a common exit
>> path such that we can easily clean up the list and thus plug this memory
>> leak.
>
> This looks good, I've left a couple of small formatting comments below
> if you do end up re-rolling.

Oh, formatting nitpicks, my favourite ;-)

>> @@ -5506,11 +5508,14 @@ int sequencer_pick_revisions(struct repository *r,
>>   				enum object_type type = oid_object_info(r,
>>   									&oid,
>>   									NULL);

Also, if we say

				enum object_type type;

				type = oid_object_info(r, &oid, NULL);

the result is much easier on the eyes using the same three lines.
Yes, initializing while declaring may look nicer and in some cases
it may even be necessary, but not this one.

>> -				return error(_("%s: can't cherry-pick a %s"),
>> +				res = error(_("%s: can't cherry-pick a %s"),
>>   					name, type_name(type));
>
> This line needs re-indenting to match the changes above.

Thanks.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 08/22] config: fix leaking comment character config
  2024-08-08 13:05   ` [PATCH v2 08/22] config: fix leaking comment character config Patrick Steinhardt
@ 2024-08-08 17:12     ` Junio C Hamano
  2024-08-12  7:45       ` Patrick Steinhardt
  0 siblings, 1 reply; 146+ messages in thread
From: Junio C Hamano @ 2024-08-08 17:12 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, James Liu, Phillip Wood

Patrick Steinhardt <ps@pks.im> writes:

> diff --git a/config.c b/config.c
> index 6421894614..cb78b652ee 100644
> --- a/config.c
> +++ b/config.c
> @@ -1596,7 +1596,9 @@ static int git_default_core_config(const char *var, const char *value,
>  		else if (value[0]) {
>  			if (strchr(value, '\n'))
>  				return error(_("%s cannot contain newline"), var);
> -			comment_line_str = xstrdup(value);
> +			free(comment_line_str_allocated);
> +			comment_line_str = comment_line_str_allocated =
> +				xstrdup(value);

If you are to follow the _to_free pattern, you do not have to
allocate here, no?  We borrow the value in the configset and point
at it via comment_line_str, and clear comment_line_str_to_free
because there is nothing to free now.  I.e.

			comment_line_str = value;
			FREE_AND_NULL(comment_line_str_allocated);

I still think the approach taken by the previous iteration was
simpler and much less error prone, though.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 08/22] config: fix leaking comment character config
  2024-08-08 17:12     ` Junio C Hamano
@ 2024-08-12  7:45       ` Patrick Steinhardt
  2024-08-12 20:32         ` Junio C Hamano
  0 siblings, 1 reply; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-12  7:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, James Liu, Phillip Wood

On Thu, Aug 08, 2024 at 10:12:26AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > diff --git a/config.c b/config.c
> > index 6421894614..cb78b652ee 100644
> > --- a/config.c
> > +++ b/config.c
> > @@ -1596,7 +1596,9 @@ static int git_default_core_config(const char *var, const char *value,
> >  		else if (value[0]) {
> >  			if (strchr(value, '\n'))
> >  				return error(_("%s cannot contain newline"), var);
> > -			comment_line_str = xstrdup(value);
> > +			free(comment_line_str_allocated);
> > +			comment_line_str = comment_line_str_allocated =
> > +				xstrdup(value);
> 
> If you are to follow the _to_free pattern, you do not have to
> allocate here, no?  We borrow the value in the configset and point
> at it via comment_line_str, and clear comment_line_str_to_free
> because there is nothing to free now.  I.e.
> 
> 			comment_line_str = value;
> 			FREE_AND_NULL(comment_line_str_allocated);

Only if it is guaranteed that the configuration will never be re-read,
which would end up discarding memory owned by the old string. Which
should be the case already, but to the best of my knowledge we do not
document the expected lifetime of config strings anywhere.

> I still think the approach taken by the previous iteration was
> simpler and much less error prone, though.

I personally prefer this iteration. I feel that it is way more
discoverable to have an explicit indicator that something needs to be
freed, which the `_allocated` suffix brings us. With the old version,
the caller needs to become aware that the constant string may sometimes
need to be freed, and that sometimes is figured out by comparing to a
magic variable, which feels worse to me.

Ultimately, both solutions are okay-ish, but I don't consider either of
them to be great. As mentioned elsewhere, I think the best solution
would be to adapt the `struct strbuf` interface to have an initializer
like `STRBUF_INIT_CONST("foobar")` that allows us to initialize it with
a string constant. There wouldn't be any need to have two variables
anymore, and the `strbuf` API would handle the lifecycle of its contents
for us. In any case, I'd say this is a #leftoverbit and is better done
in a subsequent patch series.

I don't really think it makes sense to reroll this version to swap out
the patch for the first version again, but am happy to adapt if you
prefer that.

Thanks!

Patrick

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 01/22] remote: plug memory leak when aliasing URLs
  2024-08-08 13:04   ` [PATCH v2 01/22] remote: plug memory leak when aliasing URLs Patrick Steinhardt
@ 2024-08-12  8:27     ` karthik nayak
  2024-08-12 14:08     ` Taylor Blau
  2024-08-12 14:37     ` Jeff King
  2 siblings, 0 replies; 146+ messages in thread
From: karthik nayak @ 2024-08-12  8:27 UTC (permalink / raw)
  To: Patrick Steinhardt, git; +Cc: James Liu, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 522 bytes --]

Patrick Steinhardt <ps@pks.im> writes:

> When we have a `url.*.insteadOf` configuration, then we end up aliasing
> URLs when populating remotes. One place where this happens is in
> `alias_all_urls()`, where we loop through all remotes and then alias
> each of their URLs. The actual aliasing logic is then contained in
> `alias_url()`, which returns an allocated string that contains the new
> URL. This URL replaces the old URL that we have in the strvec that
> contanis all remote URLs.
>

s/contanis/contains

[snip]

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 03/22] object-file: fix memory leak when reading corrupted headers
  2024-08-08 13:04   ` [PATCH v2 03/22] object-file: fix memory leak when reading corrupted headers Patrick Steinhardt
@ 2024-08-12  8:43     ` karthik nayak
  0 siblings, 0 replies; 146+ messages in thread
From: karthik nayak @ 2024-08-12  8:43 UTC (permalink / raw)
  To: Patrick Steinhardt, git; +Cc: James Liu, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 603 bytes --]

Patrick Steinhardt <ps@pks.im> writes:

> When reading corrupt object headers in `read_loose_object()`, then we

s/then//

> bail out immediately. This causes a memory leak though because we would
> have already initialized the zstream in `unpack_loose_header()`, and it
> is the callers responsibility to finish the zstream even on error. While
> this feels weird, other callsites do it correctly already.
>
> Fix this leak by ending the zstream even on errors. We may want to
> revisit this interface in the future such that the callee handles this
> for us already when there was an error.
>

[snip]

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 12/22] builtin/fast-export: fix leaking diff options
  2024-08-08 13:05   ` [PATCH v2 12/22] builtin/fast-export: fix leaking diff options Patrick Steinhardt
@ 2024-08-12  9:05     ` karthik nayak
  0 siblings, 0 replies; 146+ messages in thread
From: karthik nayak @ 2024-08-12  9:05 UTC (permalink / raw)
  To: Patrick Steinhardt, git; +Cc: James Liu, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 377 bytes --]

Patrick Steinhardt <ps@pks.im> writes:

> Before caling `handle_commit()` in a loop, we set `diffopt.no_free` such

s/caling/calling

> that its contents aren't getting freed inside of `handle_commit()`. We
> never unset that flag though, which means that it'll ultimately leak
> when calling `release_revisions()`.
>
> Fix this by unsetting the flag after the loop.
>
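
As a rough illustration of the flag handling described in the quoted
message, here is a minimal sketch. All names (`sketch_opts`,
`handle_one()`, `handle_all()`) are made up for this example and only
loosely modelled on `diffopt.no_free`; this is not the code from
builtin/fast-export.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical options struct, loosely modelled on `struct diff_options`. */
struct sketch_opts {
	char *pattern;	/* heap-allocated state owned by the struct */
	int no_free;	/* when set, sketch_opts_release() does nothing */
};

static void sketch_opts_init(struct sketch_opts *o, const char *pattern)
{
	size_t len;

	assert(pattern);
	len = strlen(pattern) + 1;
	o->pattern = malloc(len);
	if (!o->pattern)
		abort();
	memcpy(o->pattern, pattern, len);
	o->no_free = 0;
}

static void sketch_opts_release(struct sketch_opts *o)
{
	if (o->no_free)
		return;
	free(o->pattern);
	o->pattern = NULL;
}

/* One unit of work that releases the options when it is done. */
static int handle_one(struct sketch_opts *o)
{
	int usable = o->pattern != NULL;

	sketch_opts_release(o);
	return usable;
}

/* Returns how many iterations still saw usable options. */
static int handle_all(struct sketch_opts *o, int n)
{
	int i, ok = 0;

	o->no_free = 1;		/* keep the state alive across iterations */
	for (i = 0; i < n; i++)
		ok += handle_one(o);
	o->no_free = 0;		/* without this, the final release is a no-op */
	sketch_opts_release(o);
	return ok;
}
```

If the `o->no_free = 0` line were dropped, the last
`sketch_opts_release()` would return early and `pattern` would leak,
which is the shape of the bug the patch describes.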

[snip]

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 22/22] builtin/diff: free symmetric diff members
  2024-08-08 13:06   ` [PATCH v2 22/22] builtin/diff: free symmetric diff members Patrick Steinhardt
@ 2024-08-12  9:12     ` karthik nayak
  0 siblings, 0 replies; 146+ messages in thread
From: karthik nayak @ 2024-08-12  9:12 UTC (permalink / raw)
  To: Patrick Steinhardt, git; +Cc: James Liu, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 423 bytes --]

Patrick Steinhardt <ps@pks.im> writes:

> We populate a `struct symdiff` in case the user has requested a
> symmetric diff. Part of this is to populate a `skip` bitmap that
> indicates whihc commits shall be ignored in the diff. But while this

s/whihc/which

> bitmap is dynamically allocated, we never free it.
>
> Fix this by introducing and calling a new `symdiff_release()` function
> that does this for us.
>

[snip]

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 00/22] Memory leak fixes (pt.4)
  2024-08-08 13:04 ` [PATCH v2 " Patrick Steinhardt
                     ` (21 preceding siblings ...)
  2024-08-08 13:06   ` [PATCH v2 22/22] builtin/diff: free symmetric diff members Patrick Steinhardt
@ 2024-08-12  9:13   ` karthik nayak
  2024-08-12 15:49     ` Junio C Hamano
  2024-08-12 14:01   ` Phillip Wood
  23 siblings, 1 reply; 146+ messages in thread
From: karthik nayak @ 2024-08-12  9:13 UTC (permalink / raw)
  To: Patrick Steinhardt, git; +Cc: James Liu, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 624 bytes --]

Patrick Steinhardt <ps@pks.im> writes:

> Hi,
>
> this is the second version of my fourth batch of patches that fix
> various memory leaks.
>
> Changes compared to v1:
>
>   - Adapt the memory leak fix for comment characters to instead use a
>     `comment_line_str_allocated` variable.
>
>   - Clarify some commit messages.
>
>   - Drop the TODO comment about `rebase.gpgsign`. Turns out that this is
>     working as intended, as explained by Phillip.
>
> Thanks!
>

I went through the series and apart from some typos, everything looked
great. I don't expect a reroll for those typos though, since they're
minor.

[snip]

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 00/22] Memory leak fixes (pt.4)
  2024-08-08 13:04 ` [PATCH v2 " Patrick Steinhardt
                     ` (22 preceding siblings ...)
  2024-08-12  9:13   ` [PATCH v2 00/22] Memory leak fixes (pt.4) karthik nayak
@ 2024-08-12 14:01   ` Phillip Wood
  2024-08-12 15:50     ` Junio C Hamano
  23 siblings, 1 reply; 146+ messages in thread
From: Phillip Wood @ 2024-08-12 14:01 UTC (permalink / raw)
  To: Patrick Steinhardt, git; +Cc: James Liu

Hi Patrick

On 08/08/2024 14:04, Patrick Steinhardt wrote:
> Hi,
> 
> this is the second version of my fourth batch of patches that fix
> various memory leaks.
> 
> Changes compared to v1:
> 
>    - Adapt the memory leak fix for comment characters to instead use a
>      `comment_line_str_allocated` variable.
> 
>    - Clarify some commit messages.
> 
>    - Drop the TODO comment about `rebase.gpgsign`. Turns out that this is
>      working as intended, as explained by Phillip.

The changes to the rebase and sequencer patches look good to me

Thanks

Phillip

> Thanks!
> 
> Patrick
> 
> Patrick Steinhardt (22):
>    remote: plug memory leak when aliasing URLs
>    git: fix leaking system paths
>    object-file: fix memory leak when reading corrupted headers
>    object-name: fix leaking symlink paths in object context
>    bulk-checkin: fix leaking state TODO
>    read-cache: fix leaking hashfile when writing index fails
>    submodule-config: fix leaking name enrty when traversing submodules
>    config: fix leaking comment character config
>    builtin/rebase: fix leaking `commit.gpgsign` value
>    builtin/notes: fix leaking `struct notes_tree` when merging notes
>    builtin/fast-import: plug trivial memory leaks
>    builtin/fast-export: fix leaking diff options
>    builtin/fast-export: plug leaking tag names
>    merge-ort: unconditionally release attributes index
>    sequencer: release todo list on error paths
>    unpack-trees: clear index when not propagating it
>    diff: fix leak when parsing invalid ignore regex option
>    builtin/format-patch: fix various trivial memory leaks
>    userdiff: fix leaking memory for configured diff drivers
>    builtin/log: fix leak when showing converted blob contents
>    diff: free state populated via options
>    builtin/diff: free symmetric diff members
> 
>   builtin/commit.c                      |  7 +-
>   builtin/diff.c                        | 10 ++-
>   builtin/fast-export.c                 | 19 ++++--
>   builtin/fast-import.c                 |  8 ++-
>   builtin/log.c                         | 13 +++-
>   builtin/notes.c                       |  9 ++-
>   builtin/rebase.c                      |  1 +
>   bulk-checkin.c                        |  2 +
>   config.c                              |  4 +-
>   csum-file.c                           |  2 +-
>   csum-file.h                           | 10 +++
>   diff.c                                | 16 ++++-
>   environment.c                         |  1 +
>   environment.h                         |  1 +
>   git.c                                 | 12 +++-
>   merge-ort.c                           |  3 +-
>   object-file.c                         |  1 +
>   object-name.c                         |  1 +
>   range-diff.c                          |  6 +-
>   read-cache.c                          | 97 ++++++++++++++++-----------
>   remote.c                              |  2 +
>   sequencer.c                           | 67 ++++++++++++------
>   submodule-config.c                    | 18 +++--
>   t/t0210-trace2-normal.sh              |  2 +-
>   t/t1006-cat-file.sh                   |  1 +
>   t/t1050-large.sh                      |  1 +
>   t/t1450-fsck.sh                       |  1 +
>   t/t1601-index-bogus.sh                |  2 +
>   t/t2107-update-index-basic.sh         |  1 +
>   t/t3310-notes-merge-manual-resolve.sh |  1 +
>   t/t3311-notes-merge-fanout.sh         |  1 +
>   t/t3404-rebase-interactive.sh         |  1 +
>   t/t3435-rebase-gpg-sign.sh            |  1 +
>   t/t3507-cherry-pick-conflict.sh       |  1 +
>   t/t3510-cherry-pick-sequence.sh       |  1 +
>   t/t3705-add-sparse-checkout.sh        |  1 +
>   t/t4013-diff-various.sh               |  1 +
>   t/t4014-format-patch.sh               |  1 +
>   t/t4018-diff-funcname.sh              |  1 +
>   t/t4030-diff-textconv.sh              |  2 +
>   t/t4042-diff-textconv-caching.sh      |  2 +
>   t/t4048-diff-combined-binary.sh       |  1 +
>   t/t4064-diff-oidfind.sh               |  2 +
>   t/t4065-diff-anchored.sh              |  1 +
>   t/t4068-diff-symmetric-merge-base.sh  |  1 +
>   t/t4069-remerge-diff.sh               |  1 +
>   t/t4108-apply-threeway.sh             |  1 +
>   t/t4209-log-pickaxe.sh                |  2 +
>   t/t6421-merge-partial-clone.sh        |  1 +
>   t/t6428-merge-conflicts-sparse.sh     |  1 +
>   t/t7008-filter-branch-null-sha1.sh    |  1 +
>   t/t7030-verify-tag.sh                 |  1 +
>   t/t7817-grep-sparse-checkout.sh       |  1 +
>   t/t9300-fast-import.sh                |  1 +
>   t/t9304-fast-import-marks.sh          |  2 +
>   t/t9351-fast-export-anonymize.sh      |  1 +
>   unpack-trees.c                        |  2 +
>   userdiff.c                            | 38 ++++++++---
>   userdiff.h                            |  4 ++
>   59 files changed, 288 insertions(+), 106 deletions(-)
> 
> Range-diff against v1:
>   1:  6e2fcd85c7 =  1:  2afa51f9ff remote: plug memory leak when aliasing URLs
>   2:  9574995a24 =  2:  324140e4fd git: fix leaking system paths
>   3:  f7e67d02d2 =  3:  43a38a2281 object-file: fix memory leak when reading corrupted headers
>   4:  a9caaaed55 =  4:  9d3dc145e8 object-name: fix leaking symlink paths in object context
>   5:  794af66103 =  5:  454139e7a4 bulk-checkin: fix leaking state TODO
>   6:  2810cada0a =  6:  f8b7195796 read-cache: fix leaking hashfile when writing index fails
>   7:  03f699cf39 =  7:  762fb5aa73 submodule-config: fix leaking name enrty when traversing submodules
>   8:  a34c90a552 !  8:  8fbd72a100 config: fix leaking comment character config
>      @@ Commit message
>           without free'ing the previous value. In fact, it can't easily free the
>           value in the first place because it may contain a string constant.
>       
>      -    Refactor the code so that we initialize the value with another array.
>      -    This allows us to free the value in case the string is not pointing to
>      -    that constant array anymore.
>      +    Refactor the code such that we track allocated comment character strings
>      +    via a separate non-constant variable `comment_line_str_allocated`. Adapt
>      +    sites that set `comment_line_str` to set both and free the old value
>      +    that was stored in `comment_line_str_allocated`.
>       
>           This memory leak is being hit in t3404. As there are still other memory
>           leaks in that file we cannot yet mark it as passing with leak checking
>      @@ Commit message
>       
>           Signed-off-by: Patrick Steinhardt <ps@pks.im>
>       
>      + ## builtin/commit.c ##
>      +@@ builtin/commit.c: static void adjust_comment_line_char(const struct strbuf *sb)
>      + 	const char *p;
>      +
>      + 	if (!memchr(sb->buf, candidates[0], sb->len)) {
>      +-		comment_line_str = xstrfmt("%c", candidates[0]);
>      ++		free(comment_line_str_allocated);
>      ++		comment_line_str = comment_line_str_allocated =
>      ++			xstrfmt("%c", candidates[0]);
>      + 		return;
>      + 	}
>      +
>      +@@ builtin/commit.c: static void adjust_comment_line_char(const struct strbuf *sb)
>      + 	if (!*p)
>      + 		die(_("unable to select a comment character that is not used\n"
>      + 		      "in the current commit message"));
>      +-	comment_line_str = xstrfmt("%c", *p);
>      ++	free(comment_line_str_allocated);
>      ++	comment_line_str = comment_line_str_allocated = xstrfmt("%c", *p);
>      + }
>      +
>      + static void prepare_amend_commit(struct commit *commit, struct strbuf *sb,
>      +
>        ## config.c ##
>       @@ config.c: static int git_default_core_config(const char *var, const char *value,
>        		else if (value[0]) {
>        			if (strchr(value, '\n'))
>        				return error(_("%s cannot contain newline"), var);
>      -+			if (comment_line_str != comment_line_str_default)
>      -+				free((char *) comment_line_str);
>      - 			comment_line_str = xstrdup(value);
>      +-			comment_line_str = xstrdup(value);
>      ++			free(comment_line_str_allocated);
>      ++			comment_line_str = comment_line_str_allocated =
>      ++				xstrdup(value);
>        			auto_comment_line_char = 0;
>        		} else
>      + 			return error(_("%s must have at least one character"), var);
>       
>        ## environment.c ##
>       @@ environment.c: int protect_ntfs = PROTECT_NTFS_DEFAULT;
>      -  * The character that begins a commented line in user-editable file
>         * that is subject to stripspace.
>         */
>      --const char *comment_line_str = "#";
>      -+const char comment_line_str_default[] = "#";
>      -+const char *comment_line_str = comment_line_str_default;
>      + const char *comment_line_str = "#";
>      ++char *comment_line_str_allocated;
>        int auto_comment_line_char;
>        
>        /* Parallel index stat data preload? */
>       
>        ## environment.h ##
>       @@ environment.h: struct strvec;
>      -  * The character that begins a commented line in user-editable file
>         * that is subject to stripspace.
>         */
>      -+extern const char comment_line_str_default[];
>        extern const char *comment_line_str;
>      ++extern char *comment_line_str_allocated;
>        extern int auto_comment_line_char;
>        
>      + /*
>   9:  05290fc1f1 !  9:  e497b76e9c builtin/rebase: fix leaking `commit.gpgsign` value
>      @@ Metadata
>        ## Commit message ##
>           builtin/rebase: fix leaking `commit.gpgsign` value
>       
>      -    In `get_replay_opts()`, we unconditionally override the `gpg_sign` field
>      -    that already got populated by `sequencer_init_config()` in case the user
>      -    has "commit.gpgsign" set in their config. It is kind of dubious whether
>      -    this is the correct thing to do or a bug. What is clear though is that
>      -    this creates a memory leak.
>      +    In `get_replay_opts()`, we override the `gpg_sign` field that already
>      +    got populated by `sequencer_init_config()` in case the user has
>      +    "commit.gpgsign" set in their config. This creates a memory leak because
>      +    we overwrite the previously assigned value, which may have already
>      +    pointed to an allocated string.
>       
>      -    Let's mark this assignment with a TODO comment to figure out whether
>      -    this needs to be fixed or not. Meanwhile though, let's plug the memory
>      -    leak.
>      +    Let's plug the memory leak by freeing the value before we overwrite it.
>       
>           Signed-off-by: Patrick Steinhardt <ps@pks.im>
>       
>      @@ builtin/rebase.c: static struct replay_opts get_replay_opts(const struct rebase_
>        	replay.committer_date_is_author_date =
>        					opts->committer_date_is_author_date;
>        	replay.ignore_date = opts->ignore_date;
>      -+
>      -+	/*
>      -+	 * TODO: Is it really intentional that we unconditionally override
>      -+	 * `replay.gpg_sign` even if it has already been initialized via the
>      -+	 * configuration?
>      -+	 */
>       +	free(replay.gpg_sign);
>        	replay.gpg_sign = xstrdup_or_null(opts->gpg_sign_opt);
>      -+
>        	replay.reflog_action = xstrdup(opts->reflog_action);
>        	if (opts->strategy)
>      - 		replay.strategy = xstrdup_or_null(opts->strategy);
>       
>        ## sequencer.c ##
>       @@ sequencer.c: static int git_sequencer_config(const char *k, const char *v,
> 10:  4f5d490074 = 10:  c886b666f7 builtin/notes: fix leaking `struct notes_tree` when merging notes
> 11:  798b911f77 = 11:  d1c757157b builtin/fast-import: plug trivial memory leaks
> 12:  660732d29d = 12:  fa2d5c5d6b builtin/fast-export: fix leaking diff options
> 13:  64366155de = 13:  d9dd860d2a builtin/fast-export: plug leaking tag names
> 14:  b12015b3c3 = 14:  8f6860485e merge-ort: unconditionally release attributes index
> 15:  df4c21b49f ! 15:  ea6a350f31 sequencer: release todo list on error paths
>      @@ sequencer.c: int sequencer_pick_revisions(struct repository *r,
>        									&oid,
>        									NULL);
>       -				return error(_("%s: can't cherry-pick a %s"),
>      +-					name, type_name(type));
>       +				res = error(_("%s: can't cherry-pick a %s"),
>      - 					name, type_name(type));
>      ++					    name, type_name(type));
>       +				goto out;
>        			}
>       -		} else
> 16:  1f8553fd43 = 16:  2755023742 unpack-trees: clear index when not propagating it
> 17:  c6db8df324 = 17:  edf6f148cd diff: fix leak when parsing invalid ignore regex option
> 18:  bf818a8a79 = 18:  343e3bd4df builtin/format-patch: fix various trivial memory leaks
> 19:  ef780aa360 = 19:  be2c5b0bca userdiff: fix leaking memory for configured diff drivers
> 20:  f3882986a3 = 20:  7888203833 builtin/log: fix leak when showing converted blob contents
> 21:  a49bb2e0cc = 21:  245fc30afb diff: free state populated via options
> 22:  fb52599404 = 22:  343ddcd17b builtin/diff: free symmetric diff members

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 01/22] remote: plug memory leak when aliasing URLs
  2024-08-08 13:04   ` [PATCH v2 01/22] remote: plug memory leak when aliasing URLs Patrick Steinhardt
  2024-08-12  8:27     ` karthik nayak
@ 2024-08-12 14:08     ` Taylor Blau
  2024-08-12 14:37     ` Jeff King
  2 siblings, 0 replies; 146+ messages in thread
From: Taylor Blau @ 2024-08-12 14:08 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, James Liu, Phillip Wood

On Thu, Aug 08, 2024 at 03:04:33PM +0200, Patrick Steinhardt wrote:
> When we have a `url.*.insteadOf` configuration, then we end up aliasing
> URLs when populating remotes. One place where this happens is in
> `alias_all_urls()`, where we loop through all remotes and then alias
> each of their URLs. The actual aliasing logic is then contained in
> `alias_url()`, which returns an allocated string that contains the new
> URL. This URL replaces the old URL that we have in the strvec that
> contanis all remote URLs.
>
> We replace the remote URLs via `strvec_replace()`, which does not hand
> over ownership of the new string to the vector. Still, we didn't free
> the aliased URL and thus have a memory leak here. Fix it by freeing the
> aliased string.

Thanks for the detailed explanation here.

> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  remote.c                 | 2 ++
>  t/t0210-trace2-normal.sh | 2 +-
>  2 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/remote.c b/remote.c
> index f43cf5e7a4..3b898edd23 100644
> --- a/remote.c
> +++ b/remote.c
> @@ -499,6 +499,7 @@ static void alias_all_urls(struct remote_state *remote_state)
>  			if (alias)
>  				strvec_replace(&remote_state->remotes[i]->pushurl,
>  					       j, alias);
> +			free(alias);
>  		}
>  		add_pushurl_aliases = remote_state->remotes[i]->pushurl.nr == 0;
>  		for (j = 0; j < remote_state->remotes[i]->url.nr; j++) {
> @@ -512,6 +513,7 @@ static void alias_all_urls(struct remote_state *remote_state)
>  			if (alias)
>  				strvec_replace(&remote_state->remotes[i]->url,
>  					       j, alias);
> +			free(alias);
>  		}
>  	}
>  }

These both make sense to me, since alias_url() allocates the string it
returns via xstrfmt(), so having the caller free it makes sense.
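
For readers less familiar with the ownership rule at play, the
following is a minimal sketch of a string list whose replace function
stores a copy, so the caller still has to free what it passed in. The
`str_list` type and the `rewrite_prefix()` helper are hypothetical
stand-ins written for this example; they are not Git's `strvec` or
`alias_url()` implementations.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a strvec-like container. */
struct str_list {
	char **v;
	size_t nr, alloc;
};

static char *dup_or_die(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (!copy)
		abort();
	memcpy(copy, s, len);
	return copy;
}

static void str_list_push(struct str_list *l, const char *s)
{
	if (l->nr == l->alloc) {
		l->alloc = l->alloc ? 2 * l->alloc : 4;
		l->v = realloc(l->v, l->alloc * sizeof(*l->v));
		if (!l->v)
			abort();
	}
	l->v[l->nr++] = dup_or_die(s);
}

/* Stores a copy of `s`; the caller keeps ownership of `s`. */
static void str_list_replace(struct str_list *l, size_t idx, const char *s)
{
	char *copy;

	assert(idx < l->nr);
	copy = dup_or_die(s);
	free(l->v[idx]);
	l->v[idx] = copy;
}

static void str_list_clear(struct str_list *l)
{
	size_t i;

	for (i = 0; i < l->nr; i++)
		free(l->v[i]);
	free(l->v);
	l->v = NULL;
	l->nr = l->alloc = 0;
}

/* Returns a newly allocated rewritten URL, or NULL if `from` does not match. */
static char *rewrite_prefix(const char *url, const char *from, const char *to)
{
	size_t from_len = strlen(from), to_len = strlen(to);
	char *out;

	if (strncmp(url, from, from_len))
		return NULL;
	out = malloc(to_len + strlen(url + from_len) + 1);
	if (!out)
		abort();
	memcpy(out, to, to_len);
	strcpy(out + to_len, url + from_len);
	return out;
}

static void rewrite_all(struct str_list *l, const char *from, const char *to)
{
	size_t i;

	for (i = 0; i < l->nr; i++) {
		char *alias = rewrite_prefix(l->v[i], from, to);

		if (alias)
			str_list_replace(l, i, alias);
		free(alias);	/* the list holds its own copy; free(NULL) is a no-op */
	}
}
```

The unconditional `free(alias)` after the replace mirrors the two
lines added by the patch: because the container copies, skipping the
free leaks one string per rewritten URL.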

I was wondering if there was a nice way to neaten up these two call
paths that both call alias_url(), check for NULL, call strvec_replace(),
and then free the result. But I think the result here would be pretty
awkward from my attempts at it, so I think this patch looks good as-is.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 02/22] git: fix leaking system paths
  2024-08-08 13:04   ` [PATCH v2 02/22] git: fix leaking system paths Patrick Steinhardt
@ 2024-08-12 14:11     ` Taylor Blau
  2024-08-13  6:30       ` Patrick Steinhardt
  0 siblings, 1 reply; 146+ messages in thread
From: Taylor Blau @ 2024-08-12 14:11 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, James Liu, Phillip Wood

On Thu, Aug 08, 2024 at 03:04:39PM +0200, Patrick Steinhardt wrote:
> Git has some flags to make it output system paths as they have been
> compiled into Git. This is done by calling `system_path()`, which
> returns an allocated string. This string isn't ever free'd though,
> creating a memory leak.
>
> Plug those leaks. While they are surfaced by t0211, there are more
> memory leaks looming exposed by that test suite and it thus does not yet
> pass with the memory leak checker enabled.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  git.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/git.c b/git.c
> index e35af9b0e5..5eab88b472 100644
> --- a/git.c
> +++ b/git.c
> @@ -173,15 +173,21 @@ static int handle_options(const char ***argv, int *argc, int *envchanged)
>  				exit(0);
>  			}
>  		} else if (!strcmp(cmd, "--html-path")) {
> -			puts(system_path(GIT_HTML_PATH));
> +			char *path = system_path(GIT_HTML_PATH);
> +			puts(path);
> +			free(path);
>  			trace2_cmd_name("_query_");
>  			exit(0);
>  		} else if (!strcmp(cmd, "--man-path")) {
> -			puts(system_path(GIT_MAN_PATH));
> +			char *path = system_path(GIT_MAN_PATH);
> +			puts(path);
> +			free(path);
>  			trace2_cmd_name("_query_");
>  			exit(0);
>  		} else if (!strcmp(cmd, "--info-path")) {
> -			puts(system_path(GIT_INFO_PATH));
> +			char *path = system_path(GIT_INFO_PATH);
> +			puts(path);
> +			free(path);
>  			trace2_cmd_name("_query_");
>  			exit(0);
>  		} else if (!strcmp(cmd, "-p") || !strcmp(cmd, "--paginate")) {


Makes sense, though I wonder if this would be slightly cleaner to write
like so (applies on top of this patch):

--- 8< ---
diff --git a/git.c b/git.c
index 5eab88b472..9a618a2740 100644
--- a/git.c
+++ b/git.c
@@ -143,6 +143,13 @@ void setup_auto_pager(const char *cmd, int def)
 	commit_pager_choice();
 }

+static void print_system_path(const char *path)
+{
+	char *s_path = system_path(path);
+	puts(s_path);
+	free(s_path);
+}
+
 static int handle_options(const char ***argv, int *argc, int *envchanged)
 {
 	const char **orig_argv = *argv;
@@ -173,21 +180,15 @@ static int handle_options(const char ***argv, int *argc, int *envchanged)
 				exit(0);
 			}
 		} else if (!strcmp(cmd, "--html-path")) {
-			char *path = system_path(GIT_HTML_PATH);
-			puts(path);
-			free(path);
+			print_system_path(GIT_HTML_PATH);
 			trace2_cmd_name("_query_");
 			exit(0);
 		} else if (!strcmp(cmd, "--man-path")) {
-			char *path = system_path(GIT_MAN_PATH);
-			puts(path);
-			free(path);
+			print_system_path(GIT_MAN_PATH);
 			trace2_cmd_name("_query_");
 			exit(0);
 		} else if (!strcmp(cmd, "--info-path")) {
-			char *path = system_path(GIT_INFO_PATH);
-			puts(path);
-			free(path);
+			print_system_path(GIT_INFO_PATH);
 			trace2_cmd_name("_query_");
 			exit(0);
 		} else if (!strcmp(cmd, "-p") || !strcmp(cmd, "--paginate")) {
--- >8 ---

Thanks,
Taylor

^ permalink raw reply related	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 01/22] remote: plug memory leak when aliasing URLs
  2024-08-08 13:04   ` [PATCH v2 01/22] remote: plug memory leak when aliasing URLs Patrick Steinhardt
  2024-08-12  8:27     ` karthik nayak
  2024-08-12 14:08     ` Taylor Blau
@ 2024-08-12 14:37     ` Jeff King
  2024-08-13  6:34       ` Patrick Steinhardt
  2 siblings, 1 reply; 146+ messages in thread
From: Jeff King @ 2024-08-12 14:37 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, James Liu, Phillip Wood

On Thu, Aug 08, 2024 at 03:04:33PM +0200, Patrick Steinhardt wrote:

> When we have a `url.*.insteadOf` configuration, then we end up aliasing
> URLs when populating remotes. One place where this happens is in
> `alias_all_urls()`, where we loop through all remotes and then alias
> each of their URLs. The actual aliasing logic is then contained in
> `alias_url()`, which returns an allocated string that contains the new
> URL. This URL replaces the old URL that we have in the strvec that
> contanis all remote URLs.
> 
> We replace the remote URLs via `strvec_replace()`, which does not hand
> over ownership of the new string to the vector. Still, we didn't free
> the aliased URL and thus have a memory leak here. Fix it by freeing the
> aliased string.

Thanks, this one is my fault. When I replaced the open-coded replacement
in 8e804415fd (remote: use strvecs to store remote url/pushurl,
2024-06-14), for some reason I thought that strvec_replace() would take
ownership of the pointer. We could make a "_nodup()" variant, but it is
probably not worth the extra API complexity.

Curiously, these are the only calls for strvec_replace(). You added it
in 11ce77b5cc (strvec: add functions to replace and remove strings,
2024-05-27) but I don't see them used in any iteration of that patch
series. So yet another option is to change the semantics of
strvec_replace(), but I think that is an even worse idea. ;)

-Peff

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 00/22] Memory leak fixes (pt.4)
  2024-08-12  9:13   ` [PATCH v2 00/22] Memory leak fixes (pt.4) karthik nayak
@ 2024-08-12 15:49     ` Junio C Hamano
  2024-08-13  6:27       ` Patrick Steinhardt
  0 siblings, 1 reply; 146+ messages in thread
From: Junio C Hamano @ 2024-08-12 15:49 UTC (permalink / raw)
  To: karthik nayak; +Cc: Patrick Steinhardt, git, James Liu, Phillip Wood

karthik nayak <karthik.188@gmail.com> writes:

> I went through the series and apart from some typos, everything looked
> great. I don't expect a reroll for those typos though, since they're
> minor.

Thanks for a review.  A final reroll that shows only the typofixes
in interdiff/range-diff is not a huge burden.  Having to deal with
many separate "here is a typo", "here is another typo" patches over
a period is annoying, and it is even worse when many readers get
distracted by these leftover typos, annoyed but not seriously enough
to send typofix patches, so that the typos stay unfixed and trip up
other readers.  Let's always remember that there are 100x more
readers than those of us who write.

Thanks.


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 00/22] Memory leak fixes (pt.4)
  2024-08-12 14:01   ` Phillip Wood
@ 2024-08-12 15:50     ` Junio C Hamano
  0 siblings, 0 replies; 146+ messages in thread
From: Junio C Hamano @ 2024-08-12 15:50 UTC (permalink / raw)
  To: Phillip Wood; +Cc: Patrick Steinhardt, git, James Liu

Phillip Wood <phillip.wood123@gmail.com> writes:

> On 08/08/2024 14:04, Patrick Steinhardt wrote:
>> Hi,
>> this is the second version of my fourth batch of patches that fix
>> various memory leaks.
>> Changes compared to v1:
>>    - Adapt the memory leak fix for command characters to instead use
>> a
>>      `comment_line_str_allocated` variable.
>>    - Clarify some commit messages.
>>    - Drop the TODO comment about `rebase.gpgsign`. Turns out that
>> this is
>>      working as intended, as explained by Phillip.
>
> The changes to the rebase and sequencer patches look good to me
>
> Thanks
>
> Phillip

Thanks for a review.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 08/22] config: fix leaking comment character config
  2024-08-12  7:45       ` Patrick Steinhardt
@ 2024-08-12 20:32         ` Junio C Hamano
  2024-08-13  6:54           ` Patrick Steinhardt
  0 siblings, 1 reply; 146+ messages in thread
From: Junio C Hamano @ 2024-08-12 20:32 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, James Liu, Phillip Wood

Patrick Steinhardt <ps@pks.im> writes:

> On Thu, Aug 08, 2024 at 10:12:26AM -0700, Junio C Hamano wrote:
>> Patrick Steinhardt <ps@pks.im> writes:
>> 
>> > diff --git a/config.c b/config.c
>> > index 6421894614..cb78b652ee 100644
>> > --- a/config.c
>> > +++ b/config.c
>> > @@ -1596,7 +1596,9 @@ static int git_default_core_config(const char *var, const char *value,
>> >  		else if (value[0]) {
>> >  			if (strchr(value, '\n'))
>> >  				return error(_("%s cannot contain newline"), var);
>> > -			comment_line_str = xstrdup(value);
>> > +			free(comment_line_str_allocated);
>> > +			comment_line_str = comment_line_str_allocated =
>> > +				xstrdup(value);
>> 
>> If you are to follow the _to_free pattern, you do not have to
>> allocate here, no?  We borrow the value in the configset and point
>> at it via comment_line_str, and clear comment_line_str_to_free
>> because there is nothing to free now.  I.e.
>> 
>> 			comment_line_str = value;
>> 			FREE_AND_NULL(comment_line_str_allocated);
>
> Only if it is guaranteed that the configuration will never be re-read,
> which would end up discarding memory owned by the old string. Which
> should be the case already, but to the best of my knowledge we do not
> document the expected lifetime of config strings anywhere.

Then let's mark it as #leftoverbits to document it.  Many other code
paths depend on it.

>> I still think the approach taken by the previous iteration was
>> simpler and much less error prone, though.
>
> I personally prefer this iteration.

If so, then let's fully take advantage of the fact that you have a
to-free variable dedicated for the comment_line_str variable.

I still think it is a maintenance burden to keep them always in sync
(which is another thing the developers have to remember---when they
are updating _this_ particular variable, an extra rule applies and
they need to take care of this _allocated thing associated with it),
and the first approach, by not forcing all the other assignment code
paths to worry about it, simplifies the mental model for developers
greatly (i.e. we know we do not own the initial value, but
everything else we allocate thus we free everything but the initial
value), in exchange for a slightly wasteful allocation.

The approach in the second patch is worse in two counts compared to
the original.  It does wasteful allocation (which we do not have
to---the fix was shown above).  It also burdens the developers to
know that they have to manually manage the _allocated half of the
two-variable pair.
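
To make the trade-off discussed above concrete, here is a small sketch
of both ownership schemes side by side, plus the "borrow" variant
suggested earlier in this message. The variable and function names
(`comment_a`, `set_comment_b()`, and so on) are invented for this
example; the real code uses `comment_line_str` and friends and differs
in detail.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static char *dup_string(const char *s)
{
	size_t len;
	char *copy;

	assert(s);
	len = strlen(s) + 1;
	copy = malloc(len);
	if (!copy)
		abort();
	memcpy(copy, s, len);
	return copy;
}

/*
 * Scheme A (as in v1): the default lives in a named array, so any
 * pointer that differs from it must have been allocated by us.
 */
static const char comment_default[] = "#";
static const char *comment_a = comment_default;

static void set_comment_a(const char *value)
{
	if (comment_a != comment_default)
		free((char *)comment_a);
	comment_a = dup_string(value);
}

/*
 * Scheme B (as in v2): a second, non-const variable remembers what
 * we allocated. Every assignment site has to update both variables.
 */
static const char *comment_b = "#";
static char *comment_b_allocated;

static void set_comment_b(const char *value)
{
	free(comment_b_allocated);
	comment_b = comment_b_allocated = dup_string(value);
}

/*
 * Scheme B without the copy: only safe if `value` is guaranteed to
 * outlive all uses of comment_b, which is the lifetime question
 * raised in the thread.
 */
static void borrow_comment_b(const char *value)
{
	comment_b = value;
	free(comment_b_allocated);
	comment_b_allocated = NULL;
}
```

Scheme A needs no cooperation from other assignment sites as long as
they always allocate; scheme B can avoid the copy but relies on every
writer keeping the pair in sync.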

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 00/22] Memory leak fixes (pt.4)
  2024-08-12 15:49     ` Junio C Hamano
@ 2024-08-13  6:27       ` Patrick Steinhardt
  0 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  6:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: karthik nayak, git, James Liu, Phillip Wood

On Mon, Aug 12, 2024 at 08:49:59AM -0700, Junio C Hamano wrote:
> karthik nayak <karthik.188@gmail.com> writes:
> 
> > I went through the series and apart from some typos, everything looked
> > great. I don't expect a reroll for those typos though, since they're
> > minor.
> 
> Thanks for a review.  A final reroll that shows only the typofixes
> in interdiff/range-diff is not a huge burden.  Having to deal with
> many separate "here is a typo", "here is another typo" patches over
> a period is annoying, and it is even worse when many readers get
> distracted by these leftover typos, annoyed but not seriously enough
> to send typofix patches, so that the typos stay unfixed and trip up
> other readers.  Let's always remember that there are 100x more
> readers than those of us who write.

Clue taken, I'll finally bite the bullet and set up spell correction in
my editor. I'll also reroll this series later today.

Patrick

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 02/22] git: fix leaking system paths
  2024-08-12 14:11     ` Taylor Blau
@ 2024-08-13  6:30       ` Patrick Steinhardt
  2024-08-13 16:02         ` Junio C Hamano
  0 siblings, 1 reply; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  6:30 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, James Liu, Phillip Wood

On Mon, Aug 12, 2024 at 10:11:05AM -0400, Taylor Blau wrote:
> On Thu, Aug 08, 2024 at 03:04:39PM +0200, Patrick Steinhardt wrote:
> > Git has some flags to make it output system paths as they have been
> > compiled into Git. This is done by calling `system_path()`, which
> > returns an allocated string. This string isn't ever free'd though,
> > creating a memory leak.
> >
> > Plug those leaks. While they are surfaced by t0211, there are more
> > memory leaks looming exposed by that test suite and it thus does not yet
> > pass with the memory leak checker enabled.
> >
> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
> > ---
> >  git.c | 12 +++++++++---
> >  1 file changed, 9 insertions(+), 3 deletions(-)
> >
> > diff --git a/git.c b/git.c
> > index e35af9b0e5..5eab88b472 100644
> > --- a/git.c
> > +++ b/git.c
> > @@ -173,15 +173,21 @@ static int handle_options(const char ***argv, int *argc, int *envchanged)
> >  				exit(0);
> >  			}
> >  		} else if (!strcmp(cmd, "--html-path")) {
> > -			puts(system_path(GIT_HTML_PATH));
> > +			char *path = system_path(GIT_HTML_PATH);
> > +			puts(path);
> > +			free(path);
> >  			trace2_cmd_name("_query_");
> >  			exit(0);
> >  		} else if (!strcmp(cmd, "--man-path")) {
> > -			puts(system_path(GIT_MAN_PATH));
> > +			char *path = system_path(GIT_MAN_PATH);
> > +			puts(path);
> > +			free(path);
> >  			trace2_cmd_name("_query_");
> >  			exit(0);
> >  		} else if (!strcmp(cmd, "--info-path")) {
> > -			puts(system_path(GIT_INFO_PATH));
> > +			char *path = system_path(GIT_INFO_PATH);
> > +			puts(path);
> > +			free(path);
> >  			trace2_cmd_name("_query_");
> >  			exit(0);
> >  		} else if (!strcmp(cmd, "-p") || !strcmp(cmd, "--paginate")) {
> 
> 
> Makes sense, though I wonder if this would be slightly cleaner to write
> like so (applies on top of this patch):

It is cleaner indeed, thanks for the proposal!

Patrick


* Re: [PATCH v2 01/22] remote: plug memory leak when aliasing URLs
  2024-08-12 14:37     ` Jeff King
@ 2024-08-13  6:34       ` Patrick Steinhardt
  0 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  6:34 UTC (permalink / raw)
  To: Jeff King; +Cc: git, James Liu, Phillip Wood

On Mon, Aug 12, 2024 at 10:37:10AM -0400, Jeff King wrote:
> On Thu, Aug 08, 2024 at 03:04:33PM +0200, Patrick Steinhardt wrote:
> 
> > When we have a `url.*.insteadOf` configuration, then we end up aliasing
> > URLs when populating remotes. One place where this happens is in
> > `alias_all_urls()`, where we loop through all remotes and then alias
> > each of their URLs. The actual aliasing logic is then contained in
> > `alias_url()`, which returns an allocated string that contains the new
> > URL. This URL replaces the old URL that we have in the strvec that
> > contanis all remote URLs.
> > 
> > We replace the remote URLs via `strvec_replace()`, which does not hand
> > over ownership of the new string to the vector. Still, we didn't free
> > the aliased URL and thus have a memory leak here. Fix it by freeing the
> > aliased string.
> 
> Thanks, this one is my fault. When I replaced the open-coded replacement
> in 8e804415fd (remote: use strvecs to store remote url/pushurl,
> 2024-06-14), for some reason I thought that strvec_replace() would take
> ownership of the pointer. We could make a "_nodup()" variant, but it is
> probably not worth the extra API complexity.
> 
> Curiously, these are the only calls for strvec_replace(). You added it
> in 11ce77b5cc (strvec: add functions to replace and remove strings,
> 2024-05-27) but I don't see them used in any iteration of that patch
> series. So yet another option is to change the semantics of
> strvec_replace(), but I think that is an even worse idea. ;)

Oh, interesting. I certainly wanted to use it back then, but guess that
later iterations removed those callsites again. In any case, it is being
used now, so at least it isn't dead code :)

Patrick


* Re: [PATCH v2 08/22] config: fix leaking comment character config
  2024-08-12 20:32         ` Junio C Hamano
@ 2024-08-13  6:54           ` Patrick Steinhardt
  0 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  6:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, James Liu, Phillip Wood

On Mon, Aug 12, 2024 at 01:32:53PM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > On Thu, Aug 08, 2024 at 10:12:26AM -0700, Junio C Hamano wrote:
> >> Patrick Steinhardt <ps@pks.im> writes:
> >> 
> >> > diff --git a/config.c b/config.c
> >> > index 6421894614..cb78b652ee 100644
> >> > --- a/config.c
> >> > +++ b/config.c
> >> > @@ -1596,7 +1596,9 @@ static int git_default_core_config(const char *var, const char *value,
> >> >  		else if (value[0]) {
> >> >  			if (strchr(value, '\n'))
> >> >  				return error(_("%s cannot contain newline"), var);
> >> > -			comment_line_str = xstrdup(value);
> >> > +			free(comment_line_str_allocated);
> >> > +			comment_line_str = comment_line_str_allocated =
> >> > +				xstrdup(value);
> >> 
> >> If you are to follow the _to_free pattern, you do not have to
> >> allocate here, no?  We borrow the value in the configset and point
> >> at it via comment_line_str, and clear comment_line_str_to_free
> >> because there is nothing to free now.  I.e.
> >> 
> >> 			comment_line_str = value;
> >> 			FREE_AND_NULL(comment_line_str_allocated);
> >
> > Only if it is guaranteed that the configuration will never be re-read,
> > which would end up discarding memory owned by the old string. Which
> > should be the case already, but to the best of my knowledge we do not
> > document the expected lifetime of config strings anywhere.
> 
> Then let's mark it as #leftoverbits to document it.  Many other code
> paths depend on it.

Okay.

> >> I still think the approach taken by the previous iteration was
> >> simpler and much less error prone, though.
> >
> > I personally prefer this iteration.
> 
> If so, then let's fully take advantage of the fact that you have a
> to-free variable dedicated for the comment_line_str variable.

Can do.

> I still think it is a maintenance burden to keep them always in sync
> (which is another thing the developers have to remember---when they
> are updating _this_ particular variable, an extra rule applies and
> they need to take care of this _allocated thing associated with it),
> and the first approach, by not forcing all the other assignment code
> paths to worry about it, simplifies the mental model for developers
> greatly (i.e. we know we do not own the initial value, but
> everything else we allocate thus we free everything but the initial
> value), in exchange for a slightly wasteful allocation.
> 
> The approach in the second patch is worse on two counts compared to
> the original.  It does wasteful allocation (which we do not have
> to---the fix was shown above).  It also burdens the developers to
> know that they have to manually manage the _allocated half of the
> two-variable pair.

Well, the developer has to manage the allocation in both versions of
this patch series. In the first iteration it just wasn't as bad because
I didn't bother to adjust all sites where we set up the comment string.
So I was just about to convert it back to the first iteration, but then
again saw that we now have to carry this ugly construct everywhere:

    if (comment_line_str != comment_line_str_default)
        free((char *) comment_line_str);
    comment_line_str = xstrdup(value);

vs

    free(comment_line_str_to_free);
    comment_line_str = comment_line_str_to_free = xstrdup(value);

I certainly think that the maintenance headache of the first version is
higher than having to maintain both variables. The worst that can happen
for the second version is that we leak memory because we don't update
the `_to_free` string. The worst that can happen in the above code
snippet is that one isn't aware of the condition of when the string
needs to be freed and then unconditionally frees it, leading to a
segfault.

But as said, ultimately I think neither of these versions is great.
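
For illustration, the `_to_free` pattern can be sketched in isolation.
This is a minimal sketch and not the code from the patch: the names
`comment_str`, `comment_str_to_free`, `dup_string()` and the two setter
functions are stand-ins made up for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* May point at a string literal, at borrowed memory, or at our own copy. */
static const char *comment_str = "#";
/* Non-NULL only while we own the string that comment_str points at. */
static char *comment_str_to_free;

/* Hypothetical helper standing in for xstrdup(). */
static char *dup_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);
	assert(copy);
	memcpy(copy, s, len);
	return copy;
}

/* Store an owned copy: release the previous owned copy, track the new one. */
static void set_comment_str_owned(const char *value)
{
	char *copy = dup_string(value); /* copy first in case value aliases the old copy */
	free(comment_str_to_free);      /* free(NULL) is a no-op */
	comment_str = comment_str_to_free = copy;
}

/*
 * Borrow a string whose lifetime is managed elsewhere. Assumes value
 * does not point into the copy we currently own.
 */
static void set_comment_str_borrowed(const char *value)
{
	comment_str = value;
	free(comment_str_to_free);
	comment_str_to_free = NULL;
}
```

Forgetting to update the `_to_free` half in a new setter leaks the old
copy at worst, whereas the compare-against-default variant risks freeing
a string constant.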

Patrick


* [PATCH v3 00/22] Memory leak fixes (pt.4)
  2024-08-06  8:59 [PATCH 00/22] Memory leak fixes (pt.4) Patrick Steinhardt
                   ` (24 preceding siblings ...)
  2024-08-08 13:04 ` [PATCH v2 " Patrick Steinhardt
@ 2024-08-13  9:31 ` Patrick Steinhardt
  2024-08-13  9:31   ` [PATCH v3 01/22] remote: plug memory leak when aliasing URLs Patrick Steinhardt
                     ` (22 more replies)
  2024-08-14  6:51 ` [PATCH v4 " Patrick Steinhardt
  26 siblings, 23 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  9:31 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

Hi,

this is the third version of my fourth batch of patches that fix
various memory leaks.

Changes compared to v2:

  - Various typo fixes in commit messages.

  - Introduce `print_system_path()` as proposed by Taylor, which removes
    some of the repetition when printing system paths.

  - Micro-optimize one allocation for comment char strings away. Also,
    rename the variable to `comment_line_str_to_free` to better match
    how we call such variables in other places.

Thanks!

Patrick

Patrick Steinhardt (22):
  remote: plug memory leak when aliasing URLs
  git: fix leaking system paths
  object-file: fix memory leak when reading corrupted headers
  object-name: fix leaking symlink paths in object context
  bulk-checkin: fix leaking state TODO
  read-cache: fix leaking hashfile when writing index fails
  submodule-config: fix leaking name entry when traversing submodules
  config: fix leaking comment character config
  builtin/rebase: fix leaking `commit.gpgsign` value
  builtin/notes: fix leaking `struct notes_tree` when merging notes
  builtin/fast-import: plug trivial memory leaks
  builtin/fast-export: fix leaking diff options
  builtin/fast-export: plug leaking tag names
  merge-ort: unconditionally release attributes index
  sequencer: release todo list on error paths
  unpack-trees: clear index when not propagating it
  diff: fix leak when parsing invalid ignore regex option
  builtin/format-patch: fix various trivial memory leaks
  userdiff: fix leaking memory for configured diff drivers
  builtin/log: fix leak when showing converted blob contents
  diff: free state populated via options
  builtin/diff: free symmetric diff members

 builtin/commit.c                      |  7 +-
 builtin/diff.c                        | 10 ++-
 builtin/fast-export.c                 | 19 ++++--
 builtin/fast-import.c                 |  8 ++-
 builtin/log.c                         | 13 +++-
 builtin/notes.c                       |  9 ++-
 builtin/rebase.c                      |  1 +
 bulk-checkin.c                        |  2 +
 config.c                              |  3 +-
 csum-file.c                           |  2 +-
 csum-file.h                           | 10 +++
 diff.c                                | 16 ++++-
 environment.c                         |  1 +
 environment.h                         |  1 +
 git.c                                 | 13 +++-
 merge-ort.c                           |  3 +-
 object-file.c                         |  1 +
 object-name.c                         |  1 +
 range-diff.c                          |  6 +-
 read-cache.c                          | 97 ++++++++++++++++-----------
 remote.c                              |  2 +
 sequencer.c                           | 67 ++++++++++++------
 submodule-config.c                    | 18 +++--
 t/t0210-trace2-normal.sh              |  2 +-
 t/t1006-cat-file.sh                   |  1 +
 t/t1050-large.sh                      |  1 +
 t/t1450-fsck.sh                       |  1 +
 t/t1601-index-bogus.sh                |  2 +
 t/t2107-update-index-basic.sh         |  1 +
 t/t3310-notes-merge-manual-resolve.sh |  1 +
 t/t3311-notes-merge-fanout.sh         |  1 +
 t/t3404-rebase-interactive.sh         |  1 +
 t/t3435-rebase-gpg-sign.sh            |  1 +
 t/t3507-cherry-pick-conflict.sh       |  1 +
 t/t3510-cherry-pick-sequence.sh       |  1 +
 t/t3705-add-sparse-checkout.sh        |  1 +
 t/t4013-diff-various.sh               |  1 +
 t/t4014-format-patch.sh               |  1 +
 t/t4018-diff-funcname.sh              |  1 +
 t/t4030-diff-textconv.sh              |  2 +
 t/t4042-diff-textconv-caching.sh      |  2 +
 t/t4048-diff-combined-binary.sh       |  1 +
 t/t4064-diff-oidfind.sh               |  2 +
 t/t4065-diff-anchored.sh              |  1 +
 t/t4068-diff-symmetric-merge-base.sh  |  1 +
 t/t4069-remerge-diff.sh               |  1 +
 t/t4108-apply-threeway.sh             |  1 +
 t/t4209-log-pickaxe.sh                |  2 +
 t/t6421-merge-partial-clone.sh        |  1 +
 t/t6428-merge-conflicts-sparse.sh     |  1 +
 t/t7008-filter-branch-null-sha1.sh    |  1 +
 t/t7030-verify-tag.sh                 |  1 +
 t/t7817-grep-sparse-checkout.sh       |  1 +
 t/t9300-fast-import.sh                |  1 +
 t/t9304-fast-import-marks.sh          |  2 +
 t/t9351-fast-export-anonymize.sh      |  1 +
 unpack-trees.c                        |  2 +
 userdiff.c                            | 38 ++++++++---
 userdiff.h                            |  4 ++
 59 files changed, 288 insertions(+), 106 deletions(-)

Range-diff against v2:
 1:  2afa51f9ff !  1:  02f6da020f remote: plug memory leak when aliasing URLs
    @@ Commit message
         each of their URLs. The actual aliasing logic is then contained in
         `alias_url()`, which returns an allocated string that contains the new
         URL. This URL replaces the old URL that we have in the strvec that
    -    contanis all remote URLs.
    +    contains all remote URLs.
     
         We replace the remote URLs via `strvec_replace()`, which does not hand
         over ownership of the new string to the vector. Still, we didn't free
 2:  324140e4fd !  2:  f36d895948 git: fix leaking system paths
    @@ Commit message
         memory leaks looming exposed by that test suite and it thus does not yet
         pass with the memory leak checker enabled.
     
    +    Helped-by: Taylor Blau <me@ttaylorr.com>
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
      ## git.c ##
    +@@ git.c: void setup_auto_pager(const char *cmd, int def)
    + 	commit_pager_choice();
    + }
    + 
    ++static void print_system_path(const char *path)
    ++{
    ++	char *s_path = system_path(path);
    ++	puts(s_path);
    ++	free(s_path);
    ++}
    ++
    + static int handle_options(const char ***argv, int *argc, int *envchanged)
    + {
    + 	const char **orig_argv = *argv;
     @@ git.c: static int handle_options(const char ***argv, int *argc, int *envchanged)
      				exit(0);
      			}
      		} else if (!strcmp(cmd, "--html-path")) {
     -			puts(system_path(GIT_HTML_PATH));
    -+			char *path = system_path(GIT_HTML_PATH);
    -+			puts(path);
    -+			free(path);
    ++			print_system_path(GIT_HTML_PATH);
      			trace2_cmd_name("_query_");
      			exit(0);
      		} else if (!strcmp(cmd, "--man-path")) {
     -			puts(system_path(GIT_MAN_PATH));
    -+			char *path = system_path(GIT_MAN_PATH);
    -+			puts(path);
    -+			free(path);
    ++			print_system_path(GIT_MAN_PATH);
      			trace2_cmd_name("_query_");
      			exit(0);
      		} else if (!strcmp(cmd, "--info-path")) {
     -			puts(system_path(GIT_INFO_PATH));
    -+			char *path = system_path(GIT_INFO_PATH);
    -+			puts(path);
    -+			free(path);
    ++			print_system_path(GIT_INFO_PATH);
      			trace2_cmd_name("_query_");
      			exit(0);
      		} else if (!strcmp(cmd, "-p") || !strcmp(cmd, "--paginate")) {
 3:  43a38a2281 !  3:  0415ac986d object-file: fix memory leak when reading corrupted headers
    @@ Metadata
      ## Commit message ##
         object-file: fix memory leak when reading corrupted headers
     
    -    When reading corrupt object headers in `read_loose_object()`, then we
    -    bail out immediately. This causes a memory leak though because we would
    -    have already initialized the zstream in `unpack_loose_header()`, and it
    -    is the callers responsibility to finish the zstream even on error. While
    +    When reading corrupt object headers in `read_loose_object()`, we bail
    +    out immediately. This causes a memory leak though because we would have
    +    already initialized the zstream in `unpack_loose_header()`, and it is
    +    the callers responsibility to finish the zstream even on error. While
         this feels weird, other callsites do it correctly already.
     
         Fix this leak by ending the zstream even on errors. We may want to
 4:  9d3dc145e8 =  4:  e5130e50a9 object-name: fix leaking symlink paths in object context
 5:  454139e7a4 =  5:  276c828ad1 bulk-checkin: fix leaking state TODO
 6:  f8b7195796 =  6:  ed0608e705 read-cache: fix leaking hashfile when writing index fails
 7:  762fb5aa73 !  7:  b7a7f88c7d submodule-config: fix leaking name enrty when traversing submodules
    @@ Metadata
     Author: Patrick Steinhardt <ps@pks.im>
     
      ## Commit message ##
    -    submodule-config: fix leaking name enrty when traversing submodules
    +    submodule-config: fix leaking name entry when traversing submodules
     
         We traverse through submodules in the tree via `tree_entry()`, passing
         to it a `struct name_entry` that it is supposed to populate with the
 8:  8fbd72a100 !  8:  9054a459a1 config: fix leaking comment character config
    @@ Commit message
         value in the first place because it may contain a string constant.
     
         Refactor the code such that we track allocated comment character strings
    -    via a separate non-constant variable `comment_line_str_allocated`. Adapt
    +    via a separate non-constant variable `comment_line_str_to_free`. Adapt
         sites that set `comment_line_str` to set both and free the old value
    -    that was stored in `comment_line_str_allocated`.
    +    that was stored in `comment_line_str_to_free`.
     
         This memory leak is being hit in t3404. As there are still other memory
         leaks in that file we cannot yet mark it as passing with leak checking
    @@ builtin/commit.c: static void adjust_comment_line_char(const struct strbuf *sb)
      
      	if (!memchr(sb->buf, candidates[0], sb->len)) {
     -		comment_line_str = xstrfmt("%c", candidates[0]);
    -+		free(comment_line_str_allocated);
    -+		comment_line_str = comment_line_str_allocated =
    ++		free(comment_line_str_to_free);
    ++		comment_line_str = comment_line_str_to_free =
     +			xstrfmt("%c", candidates[0]);
      		return;
      	}
    @@ builtin/commit.c: static void adjust_comment_line_char(const struct strbuf *sb)
      		die(_("unable to select a comment character that is not used\n"
      		      "in the current commit message"));
     -	comment_line_str = xstrfmt("%c", *p);
    -+	free(comment_line_str_allocated);
    -+	comment_line_str = comment_line_str_allocated = xstrfmt("%c", *p);
    ++	free(comment_line_str_to_free);
    ++	comment_line_str = comment_line_str_to_free = xstrfmt("%c", *p);
      }
      
      static void prepare_amend_commit(struct commit *commit, struct strbuf *sb,
    @@ config.c: static int git_default_core_config(const char *var, const char *value,
      			if (strchr(value, '\n'))
      				return error(_("%s cannot contain newline"), var);
     -			comment_line_str = xstrdup(value);
    -+			free(comment_line_str_allocated);
    -+			comment_line_str = comment_line_str_allocated =
    -+				xstrdup(value);
    ++			comment_line_str = value;
    ++			FREE_AND_NULL(comment_line_str_to_free);
      			auto_comment_line_char = 0;
      		} else
      			return error(_("%s must have at least one character"), var);
    @@ environment.c: int protect_ntfs = PROTECT_NTFS_DEFAULT;
       * that is subject to stripspace.
       */
      const char *comment_line_str = "#";
    -+char *comment_line_str_allocated;
    ++char *comment_line_str_to_free;
      int auto_comment_line_char;
      
      /* Parallel index stat data preload? */
    @@ environment.h: struct strvec;
       * that is subject to stripspace.
       */
      extern const char *comment_line_str;
    -+extern char *comment_line_str_allocated;
    ++extern char *comment_line_str_to_free;
      extern int auto_comment_line_char;
      
      /*
 9:  e497b76e9c =  9:  1d3957a5eb builtin/rebase: fix leaking `commit.gpgsign` value
10:  c886b666f7 = 10:  0af1bab5a1 builtin/notes: fix leaking `struct notes_tree` when merging notes
11:  d1c757157b = 11:  30d4e9ed43 builtin/fast-import: plug trivial memory leaks
12:  fa2d5c5d6b ! 12:  9591fb7b5e builtin/fast-export: fix leaking diff options
    @@ Metadata
      ## Commit message ##
         builtin/fast-export: fix leaking diff options
     
    -    Before caling `handle_commit()` in a loop, we set `diffopt.no_free` such
    -    that its contents aren't getting freed inside of `handle_commit()`. We
    -    never unset that flag though, which means that it'll ultimately leak
    +    Before calling `handle_commit()` in a loop, we set `diffopt.no_free`
    +    such that its contents aren't getting freed inside of `handle_commit()`.
    +    We never unset that flag though, which means that it'll ultimately leak
         when calling `release_revisions()`.
     
         Fix this by unsetting the flag after the loop.
13:  d9dd860d2a = 13:  254bbb7f6f builtin/fast-export: plug leaking tag names
14:  8f6860485e = 14:  334c4ed71a merge-ort: unconditionally release attributes index
15:  ea6a350f31 = 15:  9f08a859fb sequencer: release todo list on error paths
16:  2755023742 = 16:  5d4934b1a9 unpack-trees: clear index when not propagating it
17:  edf6f148cd = 17:  e1b6a24fbe diff: fix leak when parsing invalid ignore regex option
18:  343e3bd4df = 18:  c048b54a2c builtin/format-patch: fix various trivial memory leaks
19:  be2c5b0bca = 19:  39b2921e3e userdiff: fix leaking memory for configured diff drivers
20:  7888203833 = 20:  50dea1c98a builtin/log: fix leak when showing converted blob contents
21:  245fc30afb = 21:  d5cb4ad580 diff: free state populated via options
22:  343ddcd17b ! 22:  31e38ba4e1 builtin/diff: free symmetric diff members
    @@ Commit message
     
         We populate a `struct symdiff` in case the user has requested a
         symmetric diff. Part of this is to populate a `skip` bitmap that
    -    indicates whihc commits shall be ignored in the diff. But while this
    +    indicates which commits shall be ignored in the diff. But while this
         bitmap is dynamically allocated, we never free it.
     
         Fix this by introducing and calling a new `symdiff_release()` function
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH v3 01/22] remote: plug memory leak when aliasing URLs
  2024-08-13  9:31 ` [PATCH v3 " Patrick Steinhardt
@ 2024-08-13  9:31   ` Patrick Steinhardt
  2024-08-13  9:31   ` [PATCH v3 02/22] git: fix leaking system paths Patrick Steinhardt
                     ` (21 subsequent siblings)
  22 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  9:31 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

When we have a `url.*.insteadOf` configuration, then we end up aliasing
URLs when populating remotes. One place where this happens is in
`alias_all_urls()`, where we loop through all remotes and then alias
each of their URLs. The actual aliasing logic is then contained in
`alias_url()`, which returns an allocated string that contains the new
URL. This URL replaces the old URL that we have in the strvec that
contains all remote URLs.

We replace the remote URLs via `strvec_replace()`, which does not hand
over ownership of the new string to the vector. Still, we didn't free
the aliased URL and thus have a memory leak here. Fix it by freeing the
aliased string.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
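
The ownership rule from the commit message can be sketched outside of
Git as follows. This is a minimal sketch under stated assumptions:
`struct str_list`, `str_list_replace()`, `rewrite_url()` and the "gh:"
prefix are hypothetical stand-ins and not Git's strvec or
`alias_url()`. The only property carried over is that the replace
function stores its own copy, so the caller still owns the string it
passed in.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a string vector that owns its entries. */
struct str_list {
	char **v;
	size_t nr;
};

static char *dup_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);
	assert(copy);
	memcpy(copy, s, len);
	return copy;
}

/* Append a private copy of s. */
static void str_list_push(struct str_list *l, const char *s)
{
	l->v = realloc(l->v, (l->nr + 1) * sizeof(*l->v));
	assert(l->v);
	l->v[l->nr++] = dup_string(s);
}

/* Replace entry idx with a private copy of s; the caller keeps s. */
static void str_list_replace(struct str_list *l, size_t idx, const char *s)
{
	char *copy = dup_string(s); /* copy first in case s aliases the entry */
	free(l->v[idx]);
	l->v[idx] = copy;
}

static void str_list_clear(struct str_list *l)
{
	for (size_t i = 0; i < l->nr; i++)
		free(l->v[i]);
	free(l->v);
	l->v = NULL;
	l->nr = 0;
}

/* Hypothetical rewrite rule: returns an allocated URL, or NULL if unchanged. */
static char *rewrite_url(const char *url)
{
	const char *from = "gh:", *to = "https://github.com/";
	size_t from_len = strlen(from);
	char *out;

	if (strncmp(url, from, from_len))
		return NULL;
	out = malloc(strlen(to) + strlen(url + from_len) + 1);
	assert(out);
	strcpy(out, to);
	strcat(out, url + from_len);
	return out;
}

static void rewrite_all(struct str_list *urls)
{
	for (size_t i = 0; i < urls->nr; i++) {
		char *alias = rewrite_url(urls->v[i]);
		if (alias)
			str_list_replace(urls, i, alias);
		free(alias); /* the list stored its own copy; free(NULL) is a no-op */
	}
}
```

Without the final `free(alias)` every rewritten URL would be allocated
twice and released once, which is the shape of the leak being fixed.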
 remote.c                 | 2 ++
 t/t0210-trace2-normal.sh | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/remote.c b/remote.c
index f43cf5e7a4..3b898edd23 100644
--- a/remote.c
+++ b/remote.c
@@ -499,6 +499,7 @@ static void alias_all_urls(struct remote_state *remote_state)
 			if (alias)
 				strvec_replace(&remote_state->remotes[i]->pushurl,
 					       j, alias);
+			free(alias);
 		}
 		add_pushurl_aliases = remote_state->remotes[i]->pushurl.nr == 0;
 		for (j = 0; j < remote_state->remotes[i]->url.nr; j++) {
@@ -512,6 +513,7 @@ static void alias_all_urls(struct remote_state *remote_state)
 			if (alias)
 				strvec_replace(&remote_state->remotes[i]->url,
 					       j, alias);
+			free(alias);
 		}
 	}
 }
diff --git a/t/t0210-trace2-normal.sh b/t/t0210-trace2-normal.sh
index c312657a12..b9adc94aab 100755
--- a/t/t0210-trace2-normal.sh
+++ b/t/t0210-trace2-normal.sh
@@ -2,7 +2,7 @@
 
 test_description='test trace2 facility (normal target)'
 
-TEST_PASSES_SANITIZE_LEAK=false
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 # Turn off any inherited trace2 settings for this test.
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH v3 02/22] git: fix leaking system paths
  2024-08-13  9:31 ` [PATCH v3 " Patrick Steinhardt
  2024-08-13  9:31   ` [PATCH v3 01/22] remote: plug memory leak when aliasing URLs Patrick Steinhardt
@ 2024-08-13  9:31   ` Patrick Steinhardt
  2024-08-13  9:31   ` [PATCH v3 03/22] object-file: fix memory leak when reading corrupted headers Patrick Steinhardt
                     ` (20 subsequent siblings)
  22 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  9:31 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

Git has some flags to make it output system paths as they have been
compiled into Git. This is done by calling `system_path()`, which
returns an allocated string. This string isn't ever free'd though,
creating a memory leak.

Plug those leaks. While they are surfaced by t0211, there are more
memory leaks looming exposed by that test suite and it thus does not yet
pass with the memory leak checker enabled.

Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
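
The helper pattern can be sketched standalone as follows. This is a
sketch under assumptions rather than the code in the diff below:
`resolve_path()` is a hypothetical stand-in for `system_path()` that
simply joins a prefix and a relative path, and `print_resolved_path()`
takes a `FILE *` only so that the example is easy to exercise.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for system_path(): returns an allocated string. */
static char *resolve_path(const char *prefix, const char *path)
{
	size_t len = strlen(prefix) + 1 + strlen(path) + 1;
	char *out = malloc(len);
	assert(out);
	snprintf(out, len, "%s/%s", prefix, path);
	return out;
}

/*
 * Print the resolved path followed by a newline and release it again.
 * Returns the number of bytes written, as reported by fprintf().
 */
static int print_resolved_path(FILE *fp, const char *prefix, const char *path)
{
	char *resolved = resolve_path(prefix, path);
	int ret = fprintf(fp, "%s\n", resolved);
	free(resolved);
	return ret;
}
```

Wrapping the allocate, print and free steps in one function means each
call site stays a single line and cannot forget the `free()`.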
 git.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/git.c b/git.c
index e35af9b0e5..9a618a2740 100644
--- a/git.c
+++ b/git.c
@@ -143,6 +143,13 @@ void setup_auto_pager(const char *cmd, int def)
 	commit_pager_choice();
 }
 
+static void print_system_path(const char *path)
+{
+	char *s_path = system_path(path);
+	puts(s_path);
+	free(s_path);
+}
+
 static int handle_options(const char ***argv, int *argc, int *envchanged)
 {
 	const char **orig_argv = *argv;
@@ -173,15 +180,15 @@ static int handle_options(const char ***argv, int *argc, int *envchanged)
 				exit(0);
 			}
 		} else if (!strcmp(cmd, "--html-path")) {
-			puts(system_path(GIT_HTML_PATH));
+			print_system_path(GIT_HTML_PATH);
 			trace2_cmd_name("_query_");
 			exit(0);
 		} else if (!strcmp(cmd, "--man-path")) {
-			puts(system_path(GIT_MAN_PATH));
+			print_system_path(GIT_MAN_PATH);
 			trace2_cmd_name("_query_");
 			exit(0);
 		} else if (!strcmp(cmd, "--info-path")) {
-			puts(system_path(GIT_INFO_PATH));
+			print_system_path(GIT_INFO_PATH);
 			trace2_cmd_name("_query_");
 			exit(0);
 		} else if (!strcmp(cmd, "-p") || !strcmp(cmd, "--paginate")) {
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH v3 03/22] object-file: fix memory leak when reading corrupted headers
  2024-08-13  9:31 ` [PATCH v3 " Patrick Steinhardt
  2024-08-13  9:31   ` [PATCH v3 01/22] remote: plug memory leak when aliasing URLs Patrick Steinhardt
  2024-08-13  9:31   ` [PATCH v3 02/22] git: fix leaking system paths Patrick Steinhardt
@ 2024-08-13  9:31   ` Patrick Steinhardt
  2024-08-13  9:31   ` [PATCH v3 04/22] object-name: fix leaking symlink paths in object context Patrick Steinhardt
                     ` (19 subsequent siblings)
  22 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  9:31 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

When reading corrupt object headers in `read_loose_object()`, we bail
out immediately. This causes a memory leak though because we would have
already initialized the zstream in `unpack_loose_header()`, and it is
the caller's responsibility to finish the zstream even on error. While
this feels weird, other callsites do it correctly already.

Fix this leak by ending the zstream even on errors. We may want to
revisit this interface in the future such that the callee handles this
for us already when there was an error.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
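
The calling convention described above can be sketched without zlib.
This is only an illustration under assumptions: `struct stream`,
`unpack_header()`, `stream_end()` and the `live_streams` counter are
made up for the example and merely mimic an interface where the callee
leaves the stream initialized on failure.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int live_streams; /* streams initialized but not yet ended */

struct stream {
	char *window; /* scratch memory owned by the stream */
};

/*
 * Initializes the stream and then validates the header. The stream
 * stays initialized even when validation fails.
 */
static int unpack_header(struct stream *s, const char *data)
{
	s->window = malloc(64);
	assert(s->window);
	live_streams++;
	return strncmp(data, "blob ", 5) ? -1 : 0;
}

static void stream_end(struct stream *s)
{
	free(s->window);
	s->window = NULL;
	live_streams--;
}

static int read_object(const char *data)
{
	struct stream s;
	int ret = -1;

	if (unpack_header(&s, data) < 0) {
		stream_end(&s); /* the caller cleans up even on error */
		goto out;
	}

	/* ... the object body would be inflated here ... */
	stream_end(&s);
	ret = 0;
out:
	return ret;
}
```

Dropping the `stream_end()` call in the error branch would leave
`live_streams` at one after a corrupt header, which corresponds to the
leak described in the commit message.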
 object-file.c   | 1 +
 t/t1450-fsck.sh | 1 +
 2 files changed, 2 insertions(+)

diff --git a/object-file.c b/object-file.c
index 065103be3e..7c65c435cd 100644
--- a/object-file.c
+++ b/object-file.c
@@ -2954,6 +2954,7 @@ int read_loose_object(const char *path,
 	if (unpack_loose_header(&stream, map, mapsize, hdr, sizeof(hdr),
 				NULL) != ULHR_OK) {
 		error(_("unable to unpack header of %s"), path);
+		git_inflate_end(&stream);
 		goto out;
 	}
 
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index 8a456b1142..280cbf3e03 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -6,6 +6,7 @@ test_description='git fsck random collection of tests
 * (main) A
 '
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success setup '
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH v3 04/22] object-name: fix leaking symlink paths in object context
  2024-08-13  9:31 ` [PATCH v3 " Patrick Steinhardt
                     ` (2 preceding siblings ...)
  2024-08-13  9:31   ` [PATCH v3 03/22] object-file: fix memory leak when reading corrupted headers Patrick Steinhardt
@ 2024-08-13  9:31   ` Patrick Steinhardt
  2024-08-13  9:31   ` [PATCH v3 05/22] bulk-checkin: fix leaking state TODO Patrick Steinhardt
                     ` (18 subsequent siblings)
  22 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  9:31 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

The object context may be populated with symlink contents when reading a
symlink, but the associated strbuf doesn't ever get released when
releasing the object context, causing a memory leak. Plug it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 object-name.c       | 1 +
 t/t1006-cat-file.sh | 1 +
 2 files changed, 2 insertions(+)

diff --git a/object-name.c b/object-name.c
index 240a93e7ce..e39fa50e47 100644
--- a/object-name.c
+++ b/object-name.c
@@ -1765,6 +1765,7 @@ int strbuf_check_branch_ref(struct strbuf *sb, const char *name)
 void object_context_release(struct object_context *ctx)
 {
 	free(ctx->path);
+	strbuf_release(&ctx->symlink_path);
 }
 
 /*
diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index ff9bf213aa..d36cd7c086 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -2,6 +2,7 @@
 
 test_description='git cat-file'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_cmdmode_usage () {
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH v3 05/22] bulk-checkin: fix leaking state TODO
  2024-08-13  9:31 ` [PATCH v3 " Patrick Steinhardt
                     ` (3 preceding siblings ...)
  2024-08-13  9:31   ` [PATCH v3 04/22] object-name: fix leaking symlink paths in object context Patrick Steinhardt
@ 2024-08-13  9:31   ` Patrick Steinhardt
  2024-08-13  9:31   ` [PATCH v3 06/22] read-cache: fix leaking hashfile when writing index fails Patrick Steinhardt
                     ` (17 subsequent siblings)
  22 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  9:31 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

When flushing a bulk-checkin to disk we also reset the `struct
bulk_checkin_packfile` state. But while we free some of its members,
others aren't being free'd, leading to memory leaks:

  - The temporary packfile name is not getting freed.

  - The `struct hashfile` only gets freed in case we end up calling
    `finalize_hashfile()`. There are code paths though where that is not
    the case, namely when nothing has been written. For this, we need to
    make `free_hashfile()` public.

Fix those leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
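
As a rough illustration of the reset logic described above, here is a
minimal sketch that releases every owned member before zeroing the
state. The `struct pack_state` and `pack_state_reset()` names are
hypothetical stand-ins and not part of git.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a state struct that owns several allocations. */
struct pack_state {
	char *tmp_name;		/* owned, heap-allocated */
	char **written;		/* owned array of owned strings */
	size_t nr_written;
};

/* Release every owned member, then zero the struct so it can be reused. */
void pack_state_reset(struct pack_state *state)
{
	assert(state);
	for (size_t i = 0; i < state->nr_written; i++)
		free(state->written[i]);
	free(state->written);
	free(state->tmp_name);	/* easy to forget next to the array above */
	memset(state, 0, sizeof(*state));
}
```

The point of the sketch is that the `memset()` must come last: once the
struct is zeroed, any pointer that was not freed beforehand is lost.
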
 bulk-checkin.c   |  2 ++
 csum-file.c      |  2 +-
 csum-file.h      | 10 ++++++++++
 t/t1050-large.sh |  1 +
 4 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/bulk-checkin.c b/bulk-checkin.c
index da8673199b..9089c214fa 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -61,6 +61,7 @@ static void flush_bulk_checkin_packfile(struct bulk_checkin_packfile *state)
 
 	if (state->nr_written == 0) {
 		close(state->f->fd);
+		free_hashfile(state->f);
 		unlink(state->pack_tmp_name);
 		goto clear_exit;
 	} else if (state->nr_written == 1) {
@@ -83,6 +84,7 @@ static void flush_bulk_checkin_packfile(struct bulk_checkin_packfile *state)
 		free(state->written[i]);
 
 clear_exit:
+	free(state->pack_tmp_name);
 	free(state->written);
 	memset(state, 0, sizeof(*state));
 
diff --git a/csum-file.c b/csum-file.c
index 8abbf01325..7e0ece1305 100644
--- a/csum-file.c
+++ b/csum-file.c
@@ -56,7 +56,7 @@ void hashflush(struct hashfile *f)
 	}
 }
 
-static void free_hashfile(struct hashfile *f)
+void free_hashfile(struct hashfile *f)
 {
 	free(f->buffer);
 	free(f->check_buffer);
diff --git a/csum-file.h b/csum-file.h
index 566e05cbd2..ca553eba17 100644
--- a/csum-file.h
+++ b/csum-file.h
@@ -46,6 +46,16 @@ int hashfile_truncate(struct hashfile *, struct hashfile_checkpoint *);
 struct hashfile *hashfd(int fd, const char *name);
 struct hashfile *hashfd_check(const char *name);
 struct hashfile *hashfd_throughput(int fd, const char *name, struct progress *tp);
+
+/*
+ * Free the hashfile without flushing its contents to disk. This only
+ * needs to be called when not calling `finalize_hashfile()`.
+ */
+void free_hashfile(struct hashfile *f);
+
+/*
+ * Finalize the hashfile by flushing data to disk and free'ing it.
+ */
 int finalize_hashfile(struct hashfile *, unsigned char *, enum fsync_component, unsigned int);
 void hashwrite(struct hashfile *, const void *, unsigned int);
 void hashflush(struct hashfile *f);
diff --git a/t/t1050-large.sh b/t/t1050-large.sh
index c71932b024..ed638f6644 100755
--- a/t/t1050-large.sh
+++ b/t/t1050-large.sh
@@ -3,6 +3,7 @@
 
 test_description='adding and checking out large blobs'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'core.bigFileThreshold must be non-negative' '
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH v3 06/22] read-cache: fix leaking hashfile when writing index fails
  2024-08-13  9:31 ` [PATCH v3 " Patrick Steinhardt
                     ` (4 preceding siblings ...)
  2024-08-13  9:31   ` [PATCH v3 05/22] bulk-checkin: fix leaking state TODO Patrick Steinhardt
@ 2024-08-13  9:31   ` Patrick Steinhardt
  2024-08-13  9:31   ` [PATCH v3 07/22] submodule-config: fix leaking name entry when traversing submodules Patrick Steinhardt
                     ` (16 subsequent siblings)
  22 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  9:31 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

In `do_write_index()`, we use a `struct hashfile` to write the index
with a trailer hash. In case the write fails though, we never clean up
the allocated `hashfile` state and thus leak memory.

Refactor the code to have a common exit path where we can free this and
other allocated memory. While at it, refactor our use of `strbuf`s such
that we reuse the same buffer to avoid some unneeded allocations.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
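
The common-exit-path refactoring described above can be sketched in
isolation roughly as follows. This is only an illustration of the
pattern; `write_sections()` and its scratch buffer are hypothetical and
do not mirror `do_write_index()` in detail.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical writer: formats each section into one reused scratch
 * buffer and leaves through a single cleanup label on any failure.
 */
int write_sections(FILE *out, const char **sections, size_t nr)
{
	char *scratch = NULL;	/* reused across iterations */
	size_t alloc = 0;
	int ret;

	assert(out && sections);

	for (size_t i = 0; i < nr; i++) {
		size_t need;

		if (!sections[i]) {
			ret = -1;
			goto out;
		}

		/* Room for the text, a newline and the terminating NUL. */
		need = strlen(sections[i]) + 2;
		if (need > alloc) {
			char *tmp = realloc(scratch, need);
			if (!tmp) {
				ret = -1;
				goto out;
			}
			scratch = tmp;
			alloc = need;
		}

		snprintf(scratch, alloc, "%s\n", sections[i]);
		if (fputs(scratch, out) == EOF) {
			ret = -1;
			goto out;
		}
	}

	ret = 0;

out:
	free(scratch);	/* runs on success and on every error path */
	return ret;
}
```

With a single label, adding another early exit later cannot forget the
cleanup, and the one buffer replaces a fresh allocation per section.
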
 read-cache.c                       | 97 ++++++++++++++++++------------
 t/t1601-index-bogus.sh             |  2 +
 t/t2107-update-index-basic.sh      |  1 +
 t/t7008-filter-branch-null-sha1.sh |  1 +
 4 files changed, 62 insertions(+), 39 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index 48bf24f87c..36821fe5b5 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2840,8 +2840,9 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 	int csum_fsync_flag;
 	int ieot_entries = 1;
 	struct index_entry_offset_table *ieot = NULL;
-	int nr, nr_threads;
 	struct repository *r = istate->repo;
+	struct strbuf sb = STRBUF_INIT;
+	int nr, nr_threads, ret;
 
 	f = hashfd(tempfile->fd, tempfile->filename.buf);
 
@@ -2962,8 +2963,8 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 	strbuf_release(&previous_name_buf);
 
 	if (err) {
-		free(ieot);
-		return err;
+		ret = err;
+		goto out;
 	}
 
 	offset = hashfile_total(f);
@@ -2985,20 +2986,20 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 	 * index.
 	 */
 	if (ieot) {
-		struct strbuf sb = STRBUF_INIT;
+		strbuf_reset(&sb);
 
 		write_ieot_extension(&sb, ieot);
 		err = write_index_ext_header(f, eoie_c, CACHE_EXT_INDEXENTRYOFFSETTABLE, sb.len) < 0;
 		hashwrite(f, sb.buf, sb.len);
-		strbuf_release(&sb);
-		free(ieot);
-		if (err)
-			return -1;
+		if (err) {
+			ret = -1;
+			goto out;
+		}
 	}
 
 	if (write_extensions & WRITE_SPLIT_INDEX_EXTENSION &&
 	    istate->split_index) {
-		struct strbuf sb = STRBUF_INIT;
+		strbuf_reset(&sb);
 
 		if (istate->sparse_index)
 			die(_("cannot write split index for a sparse index"));
@@ -3007,59 +3008,66 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 			write_index_ext_header(f, eoie_c, CACHE_EXT_LINK,
 					       sb.len) < 0;
 		hashwrite(f, sb.buf, sb.len);
-		strbuf_release(&sb);
-		if (err)
-			return -1;
+		if (err) {
+			ret = -1;
+			goto out;
+		}
 	}
 	if (write_extensions & WRITE_CACHE_TREE_EXTENSION &&
 	    !drop_cache_tree && istate->cache_tree) {
-		struct strbuf sb = STRBUF_INIT;
+		strbuf_reset(&sb);
 
 		cache_tree_write(&sb, istate->cache_tree);
 		err = write_index_ext_header(f, eoie_c, CACHE_EXT_TREE, sb.len) < 0;
 		hashwrite(f, sb.buf, sb.len);
-		strbuf_release(&sb);
-		if (err)
-			return -1;
+		if (err) {
+			ret = -1;
+			goto out;
+		}
 	}
 	if (write_extensions & WRITE_RESOLVE_UNDO_EXTENSION &&
 	    istate->resolve_undo) {
-		struct strbuf sb = STRBUF_INIT;
+		strbuf_reset(&sb);
 
 		resolve_undo_write(&sb, istate->resolve_undo);
 		err = write_index_ext_header(f, eoie_c, CACHE_EXT_RESOLVE_UNDO,
 					     sb.len) < 0;
 		hashwrite(f, sb.buf, sb.len);
-		strbuf_release(&sb);
-		if (err)
-			return -1;
+		if (err) {
+			ret = -1;
+			goto out;
+		}
 	}
 	if (write_extensions & WRITE_UNTRACKED_CACHE_EXTENSION &&
 	    istate->untracked) {
-		struct strbuf sb = STRBUF_INIT;
+		strbuf_reset(&sb);
 
 		write_untracked_extension(&sb, istate->untracked);
 		err = write_index_ext_header(f, eoie_c, CACHE_EXT_UNTRACKED,
 					     sb.len) < 0;
 		hashwrite(f, sb.buf, sb.len);
-		strbuf_release(&sb);
-		if (err)
-			return -1;
+		if (err) {
+			ret = -1;
+			goto out;
+		}
 	}
 	if (write_extensions & WRITE_FSMONITOR_EXTENSION &&
 	    istate->fsmonitor_last_update) {
-		struct strbuf sb = STRBUF_INIT;
+		strbuf_reset(&sb);
 
 		write_fsmonitor_extension(&sb, istate);
 		err = write_index_ext_header(f, eoie_c, CACHE_EXT_FSMONITOR, sb.len) < 0;
 		hashwrite(f, sb.buf, sb.len);
-		strbuf_release(&sb);
-		if (err)
-			return -1;
+		if (err) {
+			ret = -1;
+			goto out;
+		}
 	}
 	if (istate->sparse_index) {
-		if (write_index_ext_header(f, eoie_c, CACHE_EXT_SPARSE_DIRECTORIES, 0) < 0)
-			return -1;
+		if (write_index_ext_header(f, eoie_c, CACHE_EXT_SPARSE_DIRECTORIES, 0) < 0) {
+			ret = -1;
+			goto out;
+		}
 	}
 
 	/*
@@ -3069,14 +3077,15 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 	 * when loading the shared index.
 	 */
 	if (eoie_c) {
-		struct strbuf sb = STRBUF_INIT;
+		strbuf_reset(&sb);
 
 		write_eoie_extension(&sb, eoie_c, offset);
 		err = write_index_ext_header(f, NULL, CACHE_EXT_ENDOFINDEXENTRIES, sb.len) < 0;
 		hashwrite(f, sb.buf, sb.len);
-		strbuf_release(&sb);
-		if (err)
-			return -1;
+		if (err) {
+			ret = -1;
+			goto out;
+		}
 	}
 
 	csum_fsync_flag = 0;
@@ -3085,13 +3094,16 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 
 	finalize_hashfile(f, istate->oid.hash, FSYNC_COMPONENT_INDEX,
 			  CSUM_HASH_IN_STREAM | csum_fsync_flag);
+	f = NULL;
 
 	if (close_tempfile_gently(tempfile)) {
-		error(_("could not close '%s'"), get_tempfile_path(tempfile));
-		return -1;
+		ret = error(_("could not close '%s'"), get_tempfile_path(tempfile));
+		goto out;
+	}
+	if (stat(get_tempfile_path(tempfile), &st)) {
+		ret = -1;
+		goto out;
 	}
-	if (stat(get_tempfile_path(tempfile), &st))
-		return -1;
 	istate->timestamp.sec = (unsigned int)st.st_mtime;
 	istate->timestamp.nsec = ST_MTIME_NSEC(st);
 	trace_performance_since(start, "write index, changed mask = %x", istate->cache_changed);
@@ -3105,7 +3117,14 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 	trace2_data_intmax("index", the_repository, "write/cache_nr",
 			   istate->cache_nr);
 
-	return 0;
+	ret = 0;
+
+out:
+	if (f)
+		free_hashfile(f);
+	strbuf_release(&sb);
+	free(ieot);
+	return ret;
 }
 
 void set_alternate_index_output(const char *name)
diff --git a/t/t1601-index-bogus.sh b/t/t1601-index-bogus.sh
index 4171f1e141..5dcc101882 100755
--- a/t/t1601-index-bogus.sh
+++ b/t/t1601-index-bogus.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='test handling of bogus index entries'
+
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'create tree with null sha1' '
diff --git a/t/t2107-update-index-basic.sh b/t/t2107-update-index-basic.sh
index cc72ead79f..f0eab13f96 100755
--- a/t/t2107-update-index-basic.sh
+++ b/t/t2107-update-index-basic.sh
@@ -5,6 +5,7 @@ test_description='basic update-index tests
 Tests for command-line parsing and basic operation.
 '
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'update-index --nonsense fails' '
diff --git a/t/t7008-filter-branch-null-sha1.sh b/t/t7008-filter-branch-null-sha1.sh
index 93fbc92b8d..0ce8fd2c89 100755
--- a/t/t7008-filter-branch-null-sha1.sh
+++ b/t/t7008-filter-branch-null-sha1.sh
@@ -2,6 +2,7 @@
 
 test_description='filter-branch removal of trees with null sha1'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'setup: base commits' '
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH v3 07/22] submodule-config: fix leaking name entry when traversing submodules
  2024-08-13  9:31 ` [PATCH v3 " Patrick Steinhardt
                     ` (5 preceding siblings ...)
  2024-08-13  9:31   ` [PATCH v3 06/22] read-cache: fix leaking hashfile when writing index fails Patrick Steinhardt
@ 2024-08-13  9:31   ` Patrick Steinhardt
  2024-08-13  9:31   ` [PATCH v3 08/22] config: fix leaking comment character config Patrick Steinhardt
                     ` (15 subsequent siblings)
  22 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  9:31 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

We traverse through submodules in the tree via `tree_entry()`, passing
to it a `struct name_entry` that it is supposed to populate with the
tree entry's contents. We allocate this variable on the heap instead of
passing a variable that is allocated on the stack, and then ultimately
don't even free it. This is unnecessary and leaks memory.

Convert the variable to instead be allocated on the stack to plug the
memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
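
For illustration, the stack-allocated variant of such an iteration
might look like the sketch below. The `struct entry`, `struct cursor`
and `next_entry()` names are hypothetical and only mimic the shape of
the `tree_entry()` interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical entry type that the iterator fills in by value. */
struct entry {
	const char *path;
	unsigned mode;
};

/* Hypothetical iterator over a fixed array of entries. */
struct cursor {
	const struct entry *items;
	size_t nr, pos;
};

/* Copies the next entry into `out`; returns 0 once exhausted. */
int next_entry(struct cursor *c, struct entry *out)
{
	assert(c && out);
	if (c->pos >= c->nr)
		return 0;
	*out = c->items[c->pos++];
	return 1;
}

size_t count_entries(struct cursor *c)
{
	struct entry e;		/* on the stack: nothing to free */
	size_t n = 0;

	while (next_entry(c, &e))
		n++;
	return n;
}
```

Because the callee only writes into the caller's storage, the caller
is free to pick automatic storage and gets cleanup for free on return.
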
 submodule-config.c | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/submodule-config.c b/submodule-config.c
index 9b0bb0b9f4..c8f2bb2bdd 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -899,27 +899,25 @@ static void traverse_tree_submodules(struct repository *r,
 {
 	struct tree_desc tree;
 	struct submodule_tree_entry *st_entry;
-	struct name_entry *name_entry;
+	struct name_entry name_entry;
 	char *tree_path = NULL;
 
-	name_entry = xmalloc(sizeof(*name_entry));
-
 	fill_tree_descriptor(r, &tree, treeish_name);
-	while (tree_entry(&tree, name_entry)) {
+	while (tree_entry(&tree, &name_entry)) {
 		if (prefix)
 			tree_path =
-				mkpathdup("%s/%s", prefix, name_entry->path);
+				mkpathdup("%s/%s", prefix, name_entry.path);
 		else
-			tree_path = xstrdup(name_entry->path);
+			tree_path = xstrdup(name_entry.path);
 
-		if (S_ISGITLINK(name_entry->mode) &&
+		if (S_ISGITLINK(name_entry.mode) &&
 		    is_tree_submodule_active(r, root_tree, tree_path)) {
 			ALLOC_GROW(out->entries, out->entry_nr + 1,
 				   out->entry_alloc);
 			st_entry = &out->entries[out->entry_nr++];
 
 			st_entry->name_entry = xmalloc(sizeof(*st_entry->name_entry));
-			*st_entry->name_entry = *name_entry;
+			*st_entry->name_entry = name_entry;
 			st_entry->submodule =
 				submodule_from_path(r, root_tree, tree_path);
 			st_entry->repo = xmalloc(sizeof(*st_entry->repo));
@@ -927,9 +925,9 @@ static void traverse_tree_submodules(struct repository *r,
 						root_tree))
 				FREE_AND_NULL(st_entry->repo);
 
-		} else if (S_ISDIR(name_entry->mode))
+		} else if (S_ISDIR(name_entry.mode))
 			traverse_tree_submodules(r, root_tree, tree_path,
-						 &name_entry->oid, out);
+						 &name_entry.oid, out);
 		free(tree_path);
 	}
 }
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH v3 08/22] config: fix leaking comment character config
  2024-08-13  9:31 ` [PATCH v3 " Patrick Steinhardt
                     ` (6 preceding siblings ...)
  2024-08-13  9:31   ` [PATCH v3 07/22] submodule-config: fix leaking name entry when traversing submodules Patrick Steinhardt
@ 2024-08-13  9:31   ` Patrick Steinhardt
  2024-08-13  9:31   ` [PATCH v3 09/22] builtin/rebase: fix leaking `commit.gpgsign` value Patrick Steinhardt
                     ` (14 subsequent siblings)
  22 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  9:31 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

When the comment line character has been specified multiple times in the
configuration, then `git_default_core_config()` will cause a memory leak
because it unconditionally copies the string into `comment_line_str`
without free'ing the previous value. In fact, it can't easily free the
value in the first place because it may contain a string constant.

Refactor the code such that we track allocated comment character strings
via a separate non-constant variable `comment_line_str_to_free`. Adapt
sites that set `comment_line_str` to set both and free the old value
that was stored in `comment_line_str_to_free`.

This memory leak is being hit in t3404. As there are still other memory
leaks in that file we cannot yet mark it as passing with leak checking
enabled.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
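
The "shadow pointer" technique described above can be sketched on its
own as follows. The variable and function names are hypothetical and
deliberately differ from the ones used in the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* May point at a string literal, so it must stay `const`. */
const char *comment_str = "#";
/* Non-NULL only while `comment_str` points at memory we own. */
char *comment_str_to_free;

/* Hypothetical setter that replaces the value without leaking. */
void set_comment_str(const char *value)
{
	size_t len;
	char *copy;

	assert(value);
	len = strlen(value) + 1;
	copy = malloc(len);
	if (!copy)
		abort();
	/* Copy first, in case `value` aliases the string we free below. */
	memcpy(copy, value, len);

	free(comment_str_to_free);	/* free(NULL) is a no-op */
	comment_str = comment_str_to_free = copy;
}
```

Readers keep using the `const` pointer, while only the setter touches
the owning pointer, so the string literal default is never passed to
`free()`.
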
 builtin/commit.c | 7 +++++--
 config.c         | 3 ++-
 environment.c    | 1 +
 environment.h    | 1 +
 4 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index 66427ba82d..b2033c4887 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -684,7 +684,9 @@ static void adjust_comment_line_char(const struct strbuf *sb)
 	const char *p;
 
 	if (!memchr(sb->buf, candidates[0], sb->len)) {
-		comment_line_str = xstrfmt("%c", candidates[0]);
+		free(comment_line_str_to_free);
+		comment_line_str = comment_line_str_to_free =
+			xstrfmt("%c", candidates[0]);
 		return;
 	}
 
@@ -705,7 +707,8 @@ static void adjust_comment_line_char(const struct strbuf *sb)
 	if (!*p)
 		die(_("unable to select a comment character that is not used\n"
 		      "in the current commit message"));
-	comment_line_str = xstrfmt("%c", *p);
+	free(comment_line_str_to_free);
+	comment_line_str = comment_line_str_to_free = xstrfmt("%c", *p);
 }
 
 static void prepare_amend_commit(struct commit *commit, struct strbuf *sb,
diff --git a/config.c b/config.c
index 6421894614..205660a8fb 100644
--- a/config.c
+++ b/config.c
@@ -1596,7 +1596,8 @@ static int git_default_core_config(const char *var, const char *value,
 		else if (value[0]) {
 			if (strchr(value, '\n'))
 				return error(_("%s cannot contain newline"), var);
-			comment_line_str = xstrdup(value);
+			comment_line_str = value;
+			FREE_AND_NULL(comment_line_str_to_free);
 			auto_comment_line_char = 0;
 		} else
 			return error(_("%s must have at least one character"), var);
diff --git a/environment.c b/environment.c
index 5cea2c9f54..1d6c48b52d 100644
--- a/environment.c
+++ b/environment.c
@@ -114,6 +114,7 @@ int protect_ntfs = PROTECT_NTFS_DEFAULT;
  * that is subject to stripspace.
  */
 const char *comment_line_str = "#";
+char *comment_line_str_to_free;
 int auto_comment_line_char;
 
 /* Parallel index stat data preload? */
diff --git a/environment.h b/environment.h
index e9f01d4d11..0148738ed6 100644
--- a/environment.h
+++ b/environment.h
@@ -9,6 +9,7 @@ struct strvec;
  * that is subject to stripspace.
  */
 extern const char *comment_line_str;
+extern char *comment_line_str_to_free;
 extern int auto_comment_line_char;
 
 /*
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH v3 09/22] builtin/rebase: fix leaking `commit.gpgsign` value
  2024-08-13  9:31 ` [PATCH v3 " Patrick Steinhardt
                     ` (7 preceding siblings ...)
  2024-08-13  9:31   ` [PATCH v3 08/22] config: fix leaking comment character config Patrick Steinhardt
@ 2024-08-13  9:31   ` Patrick Steinhardt
  2024-08-13  9:31   ` [PATCH v3 10/22] builtin/notes: fix leaking `struct notes_tree` when merging notes Patrick Steinhardt
                     ` (13 subsequent siblings)
  22 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  9:31 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

In `get_replay_opts()`, we override the `gpg_sign` field that already
got populated by `sequencer_init_config()` in case the user has
"commit.gpgsign" set in their config. This creates a memory leak because
we overwrite the previously assigned value, which may have already
pointed to an allocated string.

Let's plug the memory leak by freeing the value before we overwrite it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/rebase.c              | 1 +
 sequencer.c                   | 1 +
 t/t3404-rebase-interactive.sh | 1 +
 t/t3435-rebase-gpg-sign.sh    | 1 +
 t/t7030-verify-tag.sh         | 1 +
 5 files changed, 5 insertions(+)

diff --git a/builtin/rebase.c b/builtin/rebase.c
index e3a8e74cfc..2f01d5d3a6 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -186,6 +186,7 @@ static struct replay_opts get_replay_opts(const struct rebase_options *opts)
 	replay.committer_date_is_author_date =
 					opts->committer_date_is_author_date;
 	replay.ignore_date = opts->ignore_date;
+	free(replay.gpg_sign);
 	replay.gpg_sign = xstrdup_or_null(opts->gpg_sign_opt);
 	replay.reflog_action = xstrdup(opts->reflog_action);
 	if (opts->strategy)
diff --git a/sequencer.c b/sequencer.c
index 0291920f0b..cade9b0ca8 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -303,6 +303,7 @@ static int git_sequencer_config(const char *k, const char *v,
 	}
 
 	if (!strcmp(k, "commit.gpgsign")) {
+		free(opts->gpg_sign);
 		opts->gpg_sign = git_config_bool(k, v) ? xstrdup("") : NULL;
 		return 0;
 	}
diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
index f92baad138..f171af3061 100755
--- a/t/t3404-rebase-interactive.sh
+++ b/t/t3404-rebase-interactive.sh
@@ -26,6 +26,7 @@ Initial setup:
  touch file "conflict".
 '
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 . "$TEST_DIRECTORY"/lib-rebase.sh
diff --git a/t/t3435-rebase-gpg-sign.sh b/t/t3435-rebase-gpg-sign.sh
index 6aa2aeb628..6e329fea7c 100755
--- a/t/t3435-rebase-gpg-sign.sh
+++ b/t/t3435-rebase-gpg-sign.sh
@@ -8,6 +8,7 @@ test_description='test rebase --[no-]gpg-sign'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 . "$TEST_DIRECTORY/lib-rebase.sh"
 . "$TEST_DIRECTORY/lib-gpg.sh"
diff --git a/t/t7030-verify-tag.sh b/t/t7030-verify-tag.sh
index 6f526c37c2..effa826744 100755
--- a/t/t7030-verify-tag.sh
+++ b/t/t7030-verify-tag.sh
@@ -4,6 +4,7 @@ test_description='signed tag tests'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 . "$TEST_DIRECTORY/lib-gpg.sh"
 
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH v3 10/22] builtin/notes: fix leaking `struct notes_tree` when merging notes
  2024-08-13  9:31 ` [PATCH v3 " Patrick Steinhardt
                     ` (8 preceding siblings ...)
  2024-08-13  9:31   ` [PATCH v3 09/22] builtin/rebase: fix leaking `commit.gpgsign` value Patrick Steinhardt
@ 2024-08-13  9:31   ` Patrick Steinhardt
  2024-08-13  9:31   ` [PATCH v3 11/22] builtin/fast-import: plug trivial memory leaks Patrick Steinhardt
                     ` (12 subsequent siblings)
  22 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  9:31 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

We allocate a `struct notes_tree` in `merge_commit()` which we then
initialize via `init_notes()`. It's not really necessary to allocate the
structure though given that we never pass ownership to the caller.
Furthermore, the allocation leads to a memory leak because despite its
name, `free_notes()` doesn't free the `notes_tree` but only clears it.

Fix this issue by converting the code to use an on-stack variable.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
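
To illustrate why an on-stack variable pairs well with a function that
only clears a structure, here is a small sketch. The `struct int_bag`
type and its helpers are hypothetical and unrelated to the notes code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container that owns a heap-allocated array. */
struct int_bag {
	int *items;
	size_t nr;
};

/* Returns 0 on success, -1 if the allocation fails. */
int int_bag_init(struct int_bag *bag, size_t nr)
{
	assert(bag);
	bag->items = calloc(nr ? nr : 1, sizeof(*bag->items));
	bag->nr = bag->items ? nr : 0;
	return bag->items ? 0 : -1;
}

/* Releases what the bag owns, but not the bag itself. */
void int_bag_release(struct int_bag *bag)
{
	assert(bag);
	free(bag->items);
	memset(bag, 0, sizeof(*bag));
}

/* The bag lives on the stack, so releasing its contents is enough. */
int use_bag(void)
{
	struct int_bag bag = {0};
	int first;

	if (int_bag_init(&bag, 16) < 0)
		return -1;
	bag.items[0] = 42;
	first = bag.items[0];
	int_bag_release(&bag);
	return first;
}
```

Had `bag` been allocated with `malloc()`, the release call alone would
have left the struct itself behind, which is the shape of the leak
described above.
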
 builtin/notes.c                       | 9 ++++-----
 t/t3310-notes-merge-manual-resolve.sh | 1 +
 t/t3311-notes-merge-fanout.sh         | 1 +
 3 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/builtin/notes.c b/builtin/notes.c
index d9c356e354..81cbaeec6b 100644
--- a/builtin/notes.c
+++ b/builtin/notes.c
@@ -807,7 +807,7 @@ static int merge_commit(struct notes_merge_options *o)
 {
 	struct strbuf msg = STRBUF_INIT;
 	struct object_id oid, parent_oid;
-	struct notes_tree *t;
+	struct notes_tree t = {0};
 	struct commit *partial;
 	struct pretty_print_context pretty_ctx;
 	void *local_ref_to_free;
@@ -830,8 +830,7 @@ static int merge_commit(struct notes_merge_options *o)
 	else
 		oidclr(&parent_oid, the_repository->hash_algo);
 
-	CALLOC_ARRAY(t, 1);
-	init_notes(t, "NOTES_MERGE_PARTIAL", combine_notes_overwrite, 0);
+	init_notes(&t, "NOTES_MERGE_PARTIAL", combine_notes_overwrite, 0);
 
 	o->local_ref = local_ref_to_free =
 		refs_resolve_refdup(get_main_ref_store(the_repository),
@@ -839,7 +838,7 @@ static int merge_commit(struct notes_merge_options *o)
 	if (!o->local_ref)
 		die(_("failed to resolve NOTES_MERGE_REF"));
 
-	if (notes_merge_commit(o, t, partial, &oid))
+	if (notes_merge_commit(o, &t, partial, &oid))
 		die(_("failed to finalize notes merge"));
 
 	/* Reuse existing commit message in reflog message */
@@ -853,7 +852,7 @@ static int merge_commit(struct notes_merge_options *o)
 			is_null_oid(&parent_oid) ? NULL : &parent_oid,
 			0, UPDATE_REFS_DIE_ON_ERR);
 
-	free_notes(t);
+	free_notes(&t);
 	strbuf_release(&msg);
 	ret = merge_abort(o);
 	free(local_ref_to_free);
diff --git a/t/t3310-notes-merge-manual-resolve.sh b/t/t3310-notes-merge-manual-resolve.sh
index 597df5ebc0..04866b89be 100755
--- a/t/t3310-notes-merge-manual-resolve.sh
+++ b/t/t3310-notes-merge-manual-resolve.sh
@@ -5,6 +5,7 @@
 
 test_description='Test notes merging with manual conflict resolution'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 # Set up a notes merge scenario with different kinds of conflicts
diff --git a/t/t3311-notes-merge-fanout.sh b/t/t3311-notes-merge-fanout.sh
index 5b675417e9..ce4144db0f 100755
--- a/t/t3311-notes-merge-fanout.sh
+++ b/t/t3311-notes-merge-fanout.sh
@@ -5,6 +5,7 @@
 
 test_description='Test notes merging at various fanout levels'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 verify_notes () {
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH v3 11/22] builtin/fast-import: plug trivial memory leaks
  2024-08-13  9:31 ` [PATCH v3 " Patrick Steinhardt
                     ` (9 preceding siblings ...)
  2024-08-13  9:31   ` [PATCH v3 10/22] builtin/notes: fix leaking `struct notes_tree` when merging notes Patrick Steinhardt
@ 2024-08-13  9:31   ` Patrick Steinhardt
  2024-08-13  9:31   ` [PATCH v3 12/22] builtin/fast-export: fix leaking diff options Patrick Steinhardt
                     ` (11 subsequent siblings)
  22 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  9:31 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

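Plug some trivial memory leaks in git-fast-import(1): the import and
export marks file paths are overwritten without freeing the previous
value, and the string passed to `string_list_insert()` in
`option_rewrite_submodules()` is never freed afterwards.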

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/fast-import.c        | 8 ++++++--
 t/t9300-fast-import.sh       | 1 +
 t/t9304-fast-import-marks.sh | 2 ++
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/builtin/fast-import.c b/builtin/fast-import.c
index d21c4053a7..6dfeb01665 100644
--- a/builtin/fast-import.c
+++ b/builtin/fast-import.c
@@ -206,8 +206,8 @@ static unsigned int object_entry_alloc = 5000;
 static struct object_entry_pool *blocks;
 static struct hashmap object_table;
 static struct mark_set *marks;
-static const char *export_marks_file;
-static const char *import_marks_file;
+static char *export_marks_file;
+static char *import_marks_file;
 static int import_marks_file_from_stream;
 static int import_marks_file_ignore_missing;
 static int import_marks_file_done;
@@ -3274,6 +3274,7 @@ static void option_import_marks(const char *marks,
 			read_marks();
 	}
 
+	free(import_marks_file);
 	import_marks_file = make_fast_import_path(marks);
 	import_marks_file_from_stream = from_stream;
 	import_marks_file_ignore_missing = ignore_missing;
@@ -3316,6 +3317,7 @@ static void option_active_branches(const char *branches)
 
 static void option_export_marks(const char *marks)
 {
+	free(export_marks_file);
 	export_marks_file = make_fast_import_path(marks);
 }
 
@@ -3357,6 +3359,8 @@ static void option_rewrite_submodules(const char *arg, struct string_list *list)
 	free(f);
 
 	string_list_insert(list, s)->util = ms;
+
+	free(s);
 }
 
 static int parse_one_option(const char *option)
diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
index 1e68426852..3b3c371740 100755
--- a/t/t9300-fast-import.sh
+++ b/t/t9300-fast-import.sh
@@ -7,6 +7,7 @@ test_description='test git fast-import utility'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-diff.sh ;# test-lib chdir's into trash
 
diff --git a/t/t9304-fast-import-marks.sh b/t/t9304-fast-import-marks.sh
index 410a871c52..1f776a80f3 100755
--- a/t/t9304-fast-import-marks.sh
+++ b/t/t9304-fast-import-marks.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='test exotic situations with marks'
+
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'setup dump of basic history' '
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH v3 12/22] builtin/fast-export: fix leaking diff options
  2024-08-13  9:31 ` [PATCH v3 " Patrick Steinhardt
                     ` (10 preceding siblings ...)
  2024-08-13  9:31   ` [PATCH v3 11/22] builtin/fast-import: plug trivial memory leaks Patrick Steinhardt
@ 2024-08-13  9:31   ` Patrick Steinhardt
  2024-08-13 16:34     ` Junio C Hamano
  2024-08-13  9:31   ` [PATCH v3 13/22] builtin/fast-export: plug leaking tag names Patrick Steinhardt
                     ` (10 subsequent siblings)
  22 siblings, 1 reply; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  9:31 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

Before calling `handle_commit()` in a loop, we set `diffopt.no_free`
such that its contents aren't getting freed inside of `handle_commit()`.
We never unset that flag though, which means that the contents will
ultimately leak when calling `release_revisions()`.

Fix this by unsetting the flag after the loop.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/fast-export.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index 4b6e8c6832..fe92d2436c 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -1278,9 +1278,11 @@ int cmd_fast_export(int argc, const char **argv, const char *prefix)
 	revs.diffopt.format_callback = show_filemodify;
 	revs.diffopt.format_callback_data = &paths_of_changed_objects;
 	revs.diffopt.flags.recursive = 1;
+
 	revs.diffopt.no_free = 1;
 	while ((commit = get_revision(&revs)))
 		handle_commit(commit, &revs, &paths_of_changed_objects);
+	revs.diffopt.no_free = 0;
 
 	handle_tags_and_duplicates(&extra_refs);
 	handle_tags_and_duplicates(&tag_refs);
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH v3 13/22] builtin/fast-export: plug leaking tag names
  2024-08-13  9:31 ` [PATCH v3 " Patrick Steinhardt
                     ` (11 preceding siblings ...)
  2024-08-13  9:31   ` [PATCH v3 12/22] builtin/fast-export: fix leaking diff options Patrick Steinhardt
@ 2024-08-13  9:31   ` Patrick Steinhardt
  2024-08-13  9:31   ` [PATCH v3 14/22] merge-ort: unconditionally release attributes index Patrick Steinhardt
                     ` (9 subsequent siblings)
  22 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  9:31 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

When resolving revisions in `get_tags_and_duplicates()`, we only
partially manage the lifetime of `full_name`. In fact, managing its
lifetime properly is almost impossible because we put direct pointers to
that variable into multiple lists without duplicating the string. The
consequence is that these strings will ultimately leak.

Refactor the code to make the lists we put those names into duplicate
the memory. This allows us to properly free the string as required and
thus plugs the memory leak.

While this requires us to allocate more data overall, it shouldn't be
all that bad given that the number of allocations corresponds with the
number of command line parameters, which typically aren't all that many.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
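
The ownership change described above, where the list duplicates what
it stores so that the caller can free its own copy, can be sketched
like this. The `struct str_list` type is a hypothetical, much simpler
stand-in for git's `struct string_list`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical list that owns a private copy of every string. */
struct str_list {
	char **items;
	size_t nr, alloc;
};

/* Appends a duplicate of `s`; returns 0 on success, -1 on failure. */
int str_list_append_dup(struct str_list *list, const char *s)
{
	size_t len;
	char *copy;

	assert(list && s);
	if (list->nr == list->alloc) {
		size_t alloc = list->alloc ? 2 * list->alloc : 4;
		char **items = realloc(list->items, alloc * sizeof(*items));
		if (!items)
			return -1;
		list->items = items;
		list->alloc = alloc;
	}

	len = strlen(s) + 1;
	copy = malloc(len);
	if (!copy)
		return -1;
	memcpy(copy, s, len);
	list->items[list->nr++] = copy;
	return 0;
}

/* Frees all copies and the array, leaving an empty, reusable list. */
void str_list_clear(struct str_list *list)
{
	assert(list);
	for (size_t i = 0; i < list->nr; i++)
		free(list->items[i]);
	free(list->items);
	memset(list, 0, sizeof(*list));
}
```

Once every container holds its own copy, each `continue` or fall-through
in the caller can free the original string unconditionally, at the cost
of one extra allocation per stored name.
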
 builtin/fast-export.c            | 17 ++++++++++++-----
 t/t9351-fast-export-anonymize.sh |  1 +
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index fe92d2436c..f253b79322 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -42,8 +42,8 @@ static int full_tree;
 static int reference_excluded_commits;
 static int show_original_ids;
 static int mark_tags;
-static struct string_list extra_refs = STRING_LIST_INIT_NODUP;
-static struct string_list tag_refs = STRING_LIST_INIT_NODUP;
+static struct string_list extra_refs = STRING_LIST_INIT_DUP;
+static struct string_list tag_refs = STRING_LIST_INIT_DUP;
 static struct refspec refspecs = REFSPEC_INIT_FETCH;
 static int anonymize;
 static struct hashmap anonymized_seeds;
@@ -901,7 +901,7 @@ static void handle_tag(const char *name, struct tag *tag)
 	free(buf);
 }
 
-static struct commit *get_commit(struct rev_cmdline_entry *e, char *full_name)
+static struct commit *get_commit(struct rev_cmdline_entry *e, const char *full_name)
 {
 	switch (e->item->type) {
 	case OBJ_COMMIT:
@@ -932,14 +932,16 @@ static void get_tags_and_duplicates(struct rev_cmdline_info *info)
 		struct rev_cmdline_entry *e = info->rev + i;
 		struct object_id oid;
 		struct commit *commit;
-		char *full_name;
+		char *full_name = NULL;
 
 		if (e->flags & UNINTERESTING)
 			continue;
 
 		if (repo_dwim_ref(the_repository, e->name, strlen(e->name),
-				  &oid, &full_name, 0) != 1)
+				  &oid, &full_name, 0) != 1) {
+			free(full_name);
 			continue;
+		}
 
 		if (refspecs.nr) {
 			char *private;
@@ -955,6 +957,7 @@ static void get_tags_and_duplicates(struct rev_cmdline_info *info)
 			warning("%s: Unexpected object of type %s, skipping.",
 				e->name,
 				type_name(e->item->type));
+			free(full_name);
 			continue;
 		}
 
@@ -963,10 +966,12 @@ static void get_tags_and_duplicates(struct rev_cmdline_info *info)
 			break;
 		case OBJ_BLOB:
 			export_blob(&commit->object.oid);
+			free(full_name);
 			continue;
 		default: /* OBJ_TAG (nested tags) is already handled */
 			warning("Tag points to object of unexpected type %s, skipping.",
 				type_name(commit->object.type));
+			free(full_name);
 			continue;
 		}
 
@@ -979,6 +984,8 @@ static void get_tags_and_duplicates(struct rev_cmdline_info *info)
 
 		if (!*revision_sources_at(&revision_sources, commit))
 			*revision_sources_at(&revision_sources, commit) = full_name;
+		else
+			free(full_name);
 	}
 
 	string_list_sort(&extra_refs);
diff --git a/t/t9351-fast-export-anonymize.sh b/t/t9351-fast-export-anonymize.sh
index 156a647484..c0d9d7be75 100755
--- a/t/t9351-fast-export-anonymize.sh
+++ b/t/t9351-fast-export-anonymize.sh
@@ -4,6 +4,7 @@ test_description='basic tests for fast-export --anonymize'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'setup simple repo' '
-- 
2.46.0.46.g406f326d27.dirty


^ permalink raw reply related	[flat|nested] 146+ messages in thread
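
The ownership change in the fast-export patch above (lists that duplicate
what they store, so the caller can free its own string) can be sketched
outside of Git as follows. This is a minimal illustration only: `struct
name_list`, `name_list_append_dup()` and `name_list_clear()` are
hypothetical names invented for this sketch and are not Git's
`string_list` API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a list that copies every string it stores. */
struct name_list {
	char **items;
	size_t nr, alloc;
};

/* Append a private copy of `name`; the caller keeps ownership of `name`. */
static int name_list_append_dup(struct name_list *list, const char *name)
{
	char *copy;

	if (list->nr == list->alloc) {
		size_t alloc = list->alloc ? 2 * list->alloc : 4;
		char **items = realloc(list->items, alloc * sizeof(*items));
		if (!items)
			return -1;
		list->items = items;
		list->alloc = alloc;
	}

	copy = malloc(strlen(name) + 1);
	if (!copy)
		return -1;
	strcpy(copy, name);
	list->items[list->nr++] = copy;
	return 0;
}

/* Free every stored copy and reset the list to its empty state. */
static void name_list_clear(struct name_list *list)
{
	for (size_t i = 0; i < list->nr; i++)
		free(list->items[i]);
	free(list->items);
	memset(list, 0, sizeof(*list));
}
```

Because the list owns its copies, the caller may free its string right
after appending it, which is what makes the added `free(full_name)` calls
on each `continue` path safe.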

* [PATCH v3 14/22] merge-ort: unconditionally release attributes index
  2024-08-13  9:31 ` [PATCH v3 " Patrick Steinhardt
                     ` (12 preceding siblings ...)
  2024-08-13  9:31   ` [PATCH v3 13/22] builtin/fast-export: plug leaking tag names Patrick Steinhardt
@ 2024-08-13  9:31   ` Patrick Steinhardt
  2024-08-13  9:31   ` [PATCH v3 15/22] sequencer: release todo list on error paths Patrick Steinhardt
                     ` (8 subsequent siblings)
  22 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  9:31 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

We conditionally release the index used for reading gitattributes in
merge-ort based on whether or not the index has been populated. This check
uses `cache_nr` as a condition. This isn't sufficient though, as the
variable may be zero even when some other parts of the index have been
populated. This leads to memory leaks when sparse checkouts are in use,
as we may not end up releasing the sparse checkout patterns.

Fix this issue by unconditionally releasing the index.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 merge-ort.c                       | 3 +--
 t/t3507-cherry-pick-conflict.sh   | 1 +
 t/t6421-merge-partial-clone.sh    | 1 +
 t/t6428-merge-conflicts-sparse.sh | 1 +
 t/t7817-grep-sparse-checkout.sh   | 1 +
 5 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index e9d01ac7f7..3752c7e595 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -689,8 +689,7 @@ static void clear_or_reinit_internal_opts(struct merge_options_internal *opti,
 	 */
 	strmap_clear_func(&opti->conflicted, 0);
 
-	if (opti->attr_index.cache_nr) /* true iff opt->renormalize */
-		discard_index(&opti->attr_index);
+	discard_index(&opti->attr_index);
 
 	/* Free memory used by various renames maps */
 	for (i = MERGE_SIDE1; i <= MERGE_SIDE2; ++i) {
diff --git a/t/t3507-cherry-pick-conflict.sh b/t/t3507-cherry-pick-conflict.sh
index f3947b400a..10e9c91dbb 100755
--- a/t/t3507-cherry-pick-conflict.sh
+++ b/t/t3507-cherry-pick-conflict.sh
@@ -13,6 +13,7 @@ GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
 TEST_CREATE_REPO_NO_TEMPLATE=1
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 pristine_detach () {
diff --git a/t/t6421-merge-partial-clone.sh b/t/t6421-merge-partial-clone.sh
index 711b709e75..020375c805 100755
--- a/t/t6421-merge-partial-clone.sh
+++ b/t/t6421-merge-partial-clone.sh
@@ -26,6 +26,7 @@ test_description="limiting blob downloads when merging with partial clones"
 #                     underscore notation is to differentiate different
 #                     files that might be renamed into each other's paths.)
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-merge.sh
 
diff --git a/t/t6428-merge-conflicts-sparse.sh b/t/t6428-merge-conflicts-sparse.sh
index 9919c3fa7c..8a79bc2e92 100755
--- a/t/t6428-merge-conflicts-sparse.sh
+++ b/t/t6428-merge-conflicts-sparse.sh
@@ -22,6 +22,7 @@ test_description="merge cases"
 #                     underscore notation is to differentiate different
 #                     files that might be renamed into each other's paths.)
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-merge.sh
 
diff --git a/t/t7817-grep-sparse-checkout.sh b/t/t7817-grep-sparse-checkout.sh
index eb59564565..0ba7817fb7 100755
--- a/t/t7817-grep-sparse-checkout.sh
+++ b/t/t7817-grep-sparse-checkout.sh
@@ -33,6 +33,7 @@ should leave the following structure in the working tree:
 But note that sub2 should have the SKIP_WORKTREE bit set.
 '
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'setup' '
-- 
2.46.0.46.g406f326d27.dirty


^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v3 15/22] sequencer: release todo list on error paths
  2024-08-13  9:31 ` [PATCH v3 " Patrick Steinhardt
                     ` (13 preceding siblings ...)
  2024-08-13  9:31   ` [PATCH v3 14/22] merge-ort: unconditionally release attributes index Patrick Steinhardt
@ 2024-08-13  9:31   ` Patrick Steinhardt
  2024-08-13  9:31   ` [PATCH v3 16/22] unpack-trees: clear index when not propagating it Patrick Steinhardt
                     ` (7 subsequent siblings)
  22 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  9:31 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

We're not releasing the `todo_list` in `sequencer_pick_revisions()` when
hitting an error path. Restructure the function to have a common exit
path such that we can easily clean up the list and thus plug this memory
leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 sequencer.c                     | 66 +++++++++++++++++++++++----------
 t/t3510-cherry-pick-sequence.sh |  1 +
 2 files changed, 48 insertions(+), 19 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index cade9b0ca8..ea559c31f1 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -5490,8 +5490,10 @@ int sequencer_pick_revisions(struct repository *r,
 	int i, res;
 
 	assert(opts->revs);
-	if (read_and_refresh_cache(r, opts))
-		return -1;
+	if (read_and_refresh_cache(r, opts)) {
+		res = -1;
+		goto out;
+	}
 
 	for (i = 0; i < opts->revs->pending.nr; i++) {
 		struct object_id oid;
@@ -5506,11 +5508,14 @@ int sequencer_pick_revisions(struct repository *r,
 				enum object_type type = oid_object_info(r,
 									&oid,
 									NULL);
-				return error(_("%s: can't cherry-pick a %s"),
-					name, type_name(type));
+				res = error(_("%s: can't cherry-pick a %s"),
+					    name, type_name(type));
+				goto out;
 			}
-		} else
-			return error(_("%s: bad revision"), name);
+		} else {
+			res = error(_("%s: bad revision"), name);
+			goto out;
+		}
 	}
 
 	/*
@@ -5525,14 +5530,23 @@ int sequencer_pick_revisions(struct repository *r,
 	    opts->revs->no_walk &&
 	    !opts->revs->cmdline.rev->flags) {
 		struct commit *cmit;
-		if (prepare_revision_walk(opts->revs))
-			return error(_("revision walk setup failed"));
+
+		if (prepare_revision_walk(opts->revs)) {
+			res = error(_("revision walk setup failed"));
+			goto out;
+		}
+
 		cmit = get_revision(opts->revs);
-		if (!cmit)
-			return error(_("empty commit set passed"));
+		if (!cmit) {
+			res = error(_("empty commit set passed"));
+			goto out;
+		}
+
 		if (get_revision(opts->revs))
 			BUG("unexpected extra commit from walk");
-		return single_pick(r, cmit, opts);
+
+		res = single_pick(r, cmit, opts);
+		goto out;
 	}
 
 	/*
@@ -5542,16 +5556,30 @@ int sequencer_pick_revisions(struct repository *r,
 	 */
 
 	if (walk_revs_populate_todo(&todo_list, opts) ||
-			create_seq_dir(r) < 0)
-		return -1;
-	if (repo_get_oid(r, "HEAD", &oid) && (opts->action == REPLAY_REVERT))
-		return error(_("can't revert as initial commit"));
-	if (save_head(oid_to_hex(&oid)))
-		return -1;
-	if (save_opts(opts))
-		return -1;
+			create_seq_dir(r) < 0) {
+		res = -1;
+		goto out;
+	}
+
+	if (repo_get_oid(r, "HEAD", &oid) && (opts->action == REPLAY_REVERT)) {
+		res = error(_("can't revert as initial commit"));
+		goto out;
+	}
+
+	if (save_head(oid_to_hex(&oid))) {
+		res = -1;
+		goto out;
+	}
+
+	if (save_opts(opts)) {
+		res = -1;
+		goto out;
+	}
+
 	update_abort_safety_file();
 	res = pick_commits(r, &todo_list, opts);
+
+out:
 	todo_list_release(&todo_list);
 	return res;
 }
diff --git a/t/t3510-cherry-pick-sequence.sh b/t/t3510-cherry-pick-sequence.sh
index 7eb52b12ed..93c725bac3 100755
--- a/t/t3510-cherry-pick-sequence.sh
+++ b/t/t3510-cherry-pick-sequence.sh
@@ -12,6 +12,7 @@ test_description='Test cherry-pick continuation features
 
 '
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 # Repeat first match 10 times
-- 
2.46.0.46.g406f326d27.dirty


^ permalink raw reply related	[flat|nested] 146+ messages in thread
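
The "common exit path" restructuring in the sequencer patch above follows
a general C idiom: every error branch sets the result and jumps to one
label that releases the resource. A minimal sketch, assuming a
hypothetical `struct work_list` and a `live_allocations` counter that
exists only to make the cleanup observable (neither is part of Git):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical resource that must be released on every return path. */
struct work_list {
	int *items;
	size_t nr;
};

/* Instrumentation for this sketch only: number of unreleased lists. */
static int live_allocations;

static void work_list_release(struct work_list *list)
{
	if (list->items)
		live_allocations--;
	free(list->items);
	list->items = NULL;
	list->nr = 0;
}

/* Sum `nr` non-negative values; returns the sum, or -1 on any error. */
static int sum_non_negative(const int *values, size_t nr)
{
	struct work_list list = {0};
	int res = 0;

	if (!nr) {
		res = -1;
		goto out;
	}

	list.items = malloc(nr * sizeof(*list.items));
	if (!list.items) {
		res = -1;
		goto out;
	}
	live_allocations++;

	for (size_t i = 0; i < nr; i++) {
		if (values[i] < 0) {
			/* Error after allocating: still leaves via `out`. */
			res = -1;
			goto out;
		}
		list.items[list.nr++] = values[i];
	}

	for (size_t i = 0; i < list.nr; i++)
		res += list.items[i];

out:
	/* Releasing a zero-initialized list is a no-op, so this is always safe. */
	work_list_release(&list);
	return res;
}
```

The idiom depends on the resource being zero-initialized before the first
`goto out`, so that the release function can run even when nothing was
allocated yet.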

* [PATCH v3 16/22] unpack-trees: clear index when not propagating it
  2024-08-13  9:31 ` [PATCH v3 " Patrick Steinhardt
                     ` (14 preceding siblings ...)
  2024-08-13  9:31   ` [PATCH v3 15/22] sequencer: release todo list on error paths Patrick Steinhardt
@ 2024-08-13  9:31   ` Patrick Steinhardt
  2024-08-13  9:31   ` [PATCH v3 17/22] diff: fix leak when parsing invalid ignore regex option Patrick Steinhardt
                     ` (6 subsequent siblings)
  22 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  9:31 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

When provided a pointer to a destination index, `unpack_trees()`
will end up copying its `o->internal.result` index into the provided
pointer. In those cases it is thus not necessary to free the index, as
we have transferred ownership of it.

There are cases though where we do not end up transferring ownership of
the memory, but `clear_unpack_trees_porcelain()` will never discard the
index in that case and thus cause a memory leak. And right now it cannot
do so in the first place because we have no indicator of whether we did
or didn't transfer ownership of the index.

Adapt the code to zero out the index in case we transfer its ownership.
Like this, we can now unconditionally discard the index when being asked
to clear the `unpack_trees_options`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 t/t3705-add-sparse-checkout.sh | 1 +
 unpack-trees.c                 | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/t/t3705-add-sparse-checkout.sh b/t/t3705-add-sparse-checkout.sh
index 2bade9e804..6ae45a788d 100755
--- a/t/t3705-add-sparse-checkout.sh
+++ b/t/t3705-add-sparse-checkout.sh
@@ -2,6 +2,7 @@
 
 test_description='git add in sparse checked out working trees'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 SPARSE_ENTRY_BLOB=""
diff --git a/unpack-trees.c b/unpack-trees.c
index 7dc884fafd..9a55cb6204 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -210,6 +210,7 @@ void clear_unpack_trees_porcelain(struct unpack_trees_options *opts)
 {
 	strvec_clear(&opts->internal.msgs_to_free);
 	memset(opts->internal.msgs, 0, sizeof(opts->internal.msgs));
+	discard_index(&opts->internal.result);
 }
 
 static int do_add_entry(struct unpack_trees_options *o, struct cache_entry *ce,
@@ -2082,6 +2083,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 		o->internal.result.updated_workdir = 1;
 		discard_index(o->dst_index);
 		*o->dst_index = o->internal.result;
+		memset(&o->internal.result, 0, sizeof(o->internal.result));
 	} else {
 		discard_index(&o->internal.result);
 	}
-- 
2.46.0.46.g406f326d27.dirty


^ permalink raw reply related	[flat|nested] 146+ messages in thread
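
The unpack-trees patch above transfers ownership with a struct copy
followed by zeroing the source, so that a later unconditional discard of
the source is harmless. A minimal sketch of that idea, using a
hypothetical `struct index_like` rather than Git's `struct index_state`:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a structure that owns heap memory. */
struct index_like {
	char *entries;
	size_t nr;
};

/* Free owned memory and return the structure to its empty state. */
static void index_like_discard(struct index_like *idx)
{
	free(idx->entries);
	memset(idx, 0, sizeof(*idx));
}

/*
 * Move `src` into `dst`. Afterwards `dst` owns the memory and `src` is
 * empty, so discarding `src` later cannot free the memory a second time.
 */
static void index_like_move(struct index_like *dst, struct index_like *src)
{
	index_like_discard(dst);
	*dst = *src;
	memset(src, 0, sizeof(*src));
}
```

With the source zeroed after the move, the cleanup code no longer needs a
flag that records whether ownership was transferred.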

* [PATCH v3 17/22] diff: fix leak when parsing invalid ignore regex option
  2024-08-13  9:31 ` [PATCH v3 " Patrick Steinhardt
                     ` (15 preceding siblings ...)
  2024-08-13  9:31   ` [PATCH v3 16/22] unpack-trees: clear index when not propagating it Patrick Steinhardt
@ 2024-08-13  9:31   ` Patrick Steinhardt
  2024-08-13  9:32   ` [PATCH v3 18/22] builtin/format-patch: fix various trivial memory leaks Patrick Steinhardt
                     ` (5 subsequent siblings)
  22 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  9:31 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

When parsing invalid ignore regexes passed via the `-I` option we don't
free already-allocated memory, leading to a memory leak. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 diff.c                  | 6 +++++-
 t/t4013-diff-various.sh | 1 +
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/diff.c b/diff.c
index ebb7538e04..9251c47b72 100644
--- a/diff.c
+++ b/diff.c
@@ -5464,9 +5464,13 @@ static int diff_opt_ignore_regex(const struct option *opt,
 	regex_t *regex;
 
 	BUG_ON_OPT_NEG(unset);
+
 	regex = xmalloc(sizeof(*regex));
-	if (regcomp(regex, arg, REG_EXTENDED | REG_NEWLINE))
+	if (regcomp(regex, arg, REG_EXTENDED | REG_NEWLINE)) {
+		free(regex);
 		return error(_("invalid regex given to -I: '%s'"), arg);
+	}
+
 	ALLOC_GROW(options->ignore_regex, options->ignore_regex_nr + 1,
 		   options->ignore_regex_alloc);
 	options->ignore_regex[options->ignore_regex_nr++] = regex;
diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
index 3855d68dbc..87d248d034 100755
--- a/t/t4013-diff-various.sh
+++ b/t/t4013-diff-various.sh
@@ -8,6 +8,7 @@ test_description='Various diff formatting options'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-diff.sh
 
-- 
2.46.0.46.g406f326d27.dirty


^ permalink raw reply related	[flat|nested] 146+ messages in thread
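
The leak fixed in the `-I` patch above is the heap-allocated `regex_t`
that was left behind when `regcomp()` failed. A standalone sketch of the
corrected pattern using only POSIX `<regex.h>`; the helper names
`compile_ignore_regex()` and `free_ignore_regex()` are hypothetical and
plain `malloc()` stands in for Git's `xmalloc()`:

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>

/*
 * Compile `pattern` into a heap-allocated regex_t. Returns NULL on
 * failure, in which case the container allocated here has been freed.
 */
static regex_t *compile_ignore_regex(const char *pattern)
{
	regex_t *regex = malloc(sizeof(*regex));

	if (!regex)
		return NULL;
	if (regcomp(regex, pattern, REG_EXTENDED | REG_NEWLINE)) {
		/* Compilation failed: free the container we allocated. */
		free(regex);
		return NULL;
	}
	return regex;
}

/* Release both the compiled pattern and its container. */
static void free_ignore_regex(regex_t *regex)
{
	if (!regex)
		return;
	regfree(regex);
	free(regex);
}
```

Only the container is freed on the failure path; `regfree()` is reserved
for patterns that compiled successfully.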

* [PATCH v3 18/22] builtin/format-patch: fix various trivial memory leaks
  2024-08-13  9:31 ` [PATCH v3 " Patrick Steinhardt
                     ` (16 preceding siblings ...)
  2024-08-13  9:31   ` [PATCH v3 17/22] diff: fix leak when parsing invalid ignore regex option Patrick Steinhardt
@ 2024-08-13  9:32   ` Patrick Steinhardt
  2024-08-13 16:55     ` Junio C Hamano
  2024-08-13 16:55     ` Junio C Hamano
  2024-08-13  9:32   ` [PATCH v3 19/22] userdiff: fix leaking memory for configured diff drivers Patrick Steinhardt
                     ` (4 subsequent siblings)
  22 siblings, 2 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  9:32 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

There are various memory leaks hit by git-format-patch(1). Basically all
of them are trivial, except that un-setting `diffopt.no_free` requires
us to unset the `diffopt.file` because we manually close it already.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/log.c           | 12 +++++++++---
 t/t4014-format-patch.sh |  1 +
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index a73a767606..ff997a0d0e 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1833,6 +1833,7 @@ static struct commit *get_base_commit(const struct format_config *cfg,
 			}
 
 			rev[i] = merge_base->item;
+			free_commit_list(merge_base);
 		}
 
 		if (rev_nr % 2)
@@ -2023,6 +2024,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 	const char *rfc = NULL;
 	int creation_factor = -1;
 	const char *signature = git_version_string;
+	char *signature_to_free = NULL;
 	char *signature_file_arg = NULL;
 	struct keep_callback_data keep_callback_data = {
 		.cfg = &cfg,
@@ -2443,7 +2445,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 
 		if (strbuf_read_file(&buf, signature_file, 128) < 0)
 			die_errno(_("unable to read signature file '%s'"), signature_file);
-		signature = strbuf_detach(&buf, NULL);
+		signature = signature_to_free = strbuf_detach(&buf, NULL);
 	} else if (cfg.signature) {
 		signature = cfg.signature;
 	}
@@ -2548,12 +2550,13 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 			else
 				print_signature(signature, rev.diffopt.file);
 		}
-		if (output_directory)
+		if (output_directory) {
 			fclose(rev.diffopt.file);
+			rev.diffopt.file = NULL;
+		}
 	}
 	stop_progress(&progress);
 	free(list);
-	free(branch_name);
 	if (ignore_if_in_upstream)
 		free_patch_ids(&ids);
 
@@ -2565,11 +2568,14 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 	strbuf_release(&rdiff_title);
 	free(description_file);
 	free(signature_file_arg);
+	free(signature_to_free);
+	free(branch_name);
 	free(to_free);
 	free(rev.message_id);
 	if (rev.ref_message_ids)
 		string_list_clear(rev.ref_message_ids, 0);
 	free(rev.ref_message_ids);
+	rev.diffopt.no_free = 0;
 	release_revisions(&rev);
 	format_config_release(&cfg);
 	return 0;
diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
index 884f83fb8a..1c46e963e4 100755
--- a/t/t4014-format-patch.sh
+++ b/t/t4014-format-patch.sh
@@ -8,6 +8,7 @@ test_description='various format-patch tests'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-terminal.sh
 
-- 
2.46.0.46.g406f326d27.dirty


^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v3 19/22] userdiff: fix leaking memory for configured diff drivers
  2024-08-13  9:31 ` [PATCH v3 " Patrick Steinhardt
                     ` (17 preceding siblings ...)
  2024-08-13  9:32   ` [PATCH v3 18/22] builtin/format-patch: fix various trivial memory leaks Patrick Steinhardt
@ 2024-08-13  9:32   ` Patrick Steinhardt
  2024-08-13  9:32   ` [PATCH v3 20/22] builtin/log: fix leak when showing converted blob contents Patrick Steinhardt
                     ` (3 subsequent siblings)
  22 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  9:32 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

The userdiff structures may be initialized either statically at compile
time or dynamically via configuration keys. In the latter case we end
up leaking memory because we didn't have any infrastructure to discern
those strings which have been allocated statically and those which have
been allocated dynamically.

Refactor the code such that we have two pointers for each of these
strings: one that holds the value as accessed by other subsystems, and
one that points to the same string in case it has been allocated. Like
this, we can safely free the second pointer and thus plug those memory
leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 range-diff.c                     |  6 +++--
 t/t4018-diff-funcname.sh         |  1 +
 t/t4042-diff-textconv-caching.sh |  2 ++
 t/t4048-diff-combined-binary.sh  |  1 +
 t/t4209-log-pickaxe.sh           |  2 ++
 userdiff.c                       | 38 ++++++++++++++++++++++++--------
 userdiff.h                       |  4 ++++
 7 files changed, 43 insertions(+), 11 deletions(-)

diff --git a/range-diff.c b/range-diff.c
index 5f01605550..bbb0952264 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -450,8 +450,10 @@ static void output_pair_header(struct diff_options *diffopt,
 }
 
 static struct userdiff_driver section_headers = {
-	.funcname = { "^ ## (.*) ##$\n"
-		      "^.?@@ (.*)$", REG_EXTENDED }
+	.funcname = {
+		.pattern = "^ ## (.*) ##$\n^.?@@ (.*)$",
+		.cflags = REG_EXTENDED,
+	},
 };
 
 static struct diff_filespec *get_filespec(const char *name, const char *p)
diff --git a/t/t4018-diff-funcname.sh b/t/t4018-diff-funcname.sh
index e026fac1f4..8128c30e7f 100755
--- a/t/t4018-diff-funcname.sh
+++ b/t/t4018-diff-funcname.sh
@@ -5,6 +5,7 @@
 
 test_description='Test custom diff function name patterns'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'setup' '
diff --git a/t/t4042-diff-textconv-caching.sh b/t/t4042-diff-textconv-caching.sh
index 8ebfa3c1be..a179205394 100755
--- a/t/t4042-diff-textconv-caching.sh
+++ b/t/t4042-diff-textconv-caching.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='test textconv caching'
+
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 cat >helper <<'EOF'
diff --git a/t/t4048-diff-combined-binary.sh b/t/t4048-diff-combined-binary.sh
index 0260cf64f5..f399484bce 100755
--- a/t/t4048-diff-combined-binary.sh
+++ b/t/t4048-diff-combined-binary.sh
@@ -4,6 +4,7 @@ test_description='combined and merge diff handle binary files and textconv'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'setup binary merge conflict' '
diff --git a/t/t4209-log-pickaxe.sh b/t/t4209-log-pickaxe.sh
index 64e1623733..b42fdc54fc 100755
--- a/t/t4209-log-pickaxe.sh
+++ b/t/t4209-log-pickaxe.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='log --grep/--author/--regexp-ignore-case/-S/-G'
+
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_log () {
diff --git a/userdiff.c b/userdiff.c
index c4ebb9ff73..989629149f 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -399,8 +399,11 @@ static struct userdiff_driver *userdiff_find_by_namelen(const char *name, size_t
 static int parse_funcname(struct userdiff_funcname *f, const char *k,
 		const char *v, int cflags)
 {
-	if (git_config_string((char **) &f->pattern, k, v) < 0)
+	f->pattern = NULL;
+	FREE_AND_NULL(f->pattern_owned);
+	if (git_config_string(&f->pattern_owned, k, v) < 0)
 		return -1;
+	f->pattern = f->pattern_owned;
 	f->cflags = cflags;
 	return 0;
 }
@@ -444,20 +447,37 @@ int userdiff_config(const char *k, const char *v)
 		return parse_funcname(&drv->funcname, k, v, REG_EXTENDED);
 	if (!strcmp(type, "binary"))
 		return parse_tristate(&drv->binary, k, v);
-	if (!strcmp(type, "command"))
-		return git_config_string((char **) &drv->external.cmd, k, v);
+	if (!strcmp(type, "command")) {
+		FREE_AND_NULL(drv->external.cmd);
+		return git_config_string(&drv->external.cmd, k, v);
+	}
 	if (!strcmp(type, "trustexitcode")) {
 		drv->external.trust_exit_code = git_config_bool(k, v);
 		return 0;
 	}
-	if (!strcmp(type, "textconv"))
-		return git_config_string((char **) &drv->textconv, k, v);
+	if (!strcmp(type, "textconv")) {
+		int ret;
+		FREE_AND_NULL(drv->textconv_owned);
+		ret = git_config_string(&drv->textconv_owned, k, v);
+		drv->textconv = drv->textconv_owned;
+		return ret;
+	}
 	if (!strcmp(type, "cachetextconv"))
 		return parse_bool(&drv->textconv_want_cache, k, v);
-	if (!strcmp(type, "wordregex"))
-		return git_config_string((char **) &drv->word_regex, k, v);
-	if (!strcmp(type, "algorithm"))
-		return git_config_string((char **) &drv->algorithm, k, v);
+	if (!strcmp(type, "wordregex")) {
+		int ret;
+		FREE_AND_NULL(drv->word_regex_owned);
+		ret = git_config_string(&drv->word_regex_owned, k, v);
+		drv->word_regex = drv->word_regex_owned;
+		return ret;
+	}
+	if (!strcmp(type, "algorithm")) {
+		int ret;
+		FREE_AND_NULL(drv->algorithm_owned);
+		ret = git_config_string(&drv->algorithm_owned, k, v);
+		drv->algorithm = drv->algorithm_owned;
+		return ret;
+	}
 
 	return 0;
 }
diff --git a/userdiff.h b/userdiff.h
index 7565930337..827361b0bc 100644
--- a/userdiff.h
+++ b/userdiff.h
@@ -8,6 +8,7 @@ struct repository;
 
 struct userdiff_funcname {
 	const char *pattern;
+	char *pattern_owned;
 	int cflags;
 };
 
@@ -20,11 +21,14 @@ struct userdiff_driver {
 	const char *name;
 	struct external_diff external;
 	const char *algorithm;
+	char *algorithm_owned;
 	int binary;
 	struct userdiff_funcname funcname;
 	const char *word_regex;
+	char *word_regex_owned;
 	const char *word_regex_multi_byte;
 	const char *textconv;
+	char *textconv_owned;
 	struct notes_cache *textconv_cache;
 	int textconv_want_cache;
 };
-- 
2.46.0.46.g406f326d27.dirty


^ permalink raw reply related	[flat|nested] 146+ messages in thread
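
The two-pointer scheme described in the userdiff patch above (one pointer
for readers, one that is set only when the string was heap-allocated) can
be sketched in isolation. `struct driver_like` and its helpers are
hypothetical names for this illustration, and `malloc()`/`strcpy()` stand
in for `git_config_string()`:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical driver with one configurable string. */
struct driver_like {
	const char *textconv;	/* what other code reads */
	char *textconv_owned;	/* non-NULL only when heap-allocated */
};

/* Replace the value with a heap copy of `value`, freeing any older copy. */
static int driver_set_textconv(struct driver_like *drv, const char *value)
{
	char *copy = malloc(strlen(value) + 1);

	if (!copy)
		return -1;
	strcpy(copy, value);

	/* Copy first, free second, in case `value` aliases the old copy. */
	free(drv->textconv_owned);
	drv->textconv_owned = copy;
	drv->textconv = copy;
	return 0;
}

/* Safe for both static and configured values: only the owned copy is freed. */
static void driver_release(struct driver_like *drv)
{
	free(drv->textconv_owned);
	drv->textconv_owned = NULL;
	drv->textconv = NULL;
}
```

A driver initialized from a string literal leaves `textconv_owned` as
NULL, so releasing it never passes a literal to `free()`.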

* [PATCH v3 20/22] builtin/log: fix leak when showing converted blob contents
  2024-08-13  9:31 ` [PATCH v3 " Patrick Steinhardt
                     ` (18 preceding siblings ...)
  2024-08-13  9:32   ` [PATCH v3 19/22] userdiff: fix leaking memory for configured diff drivers Patrick Steinhardt
@ 2024-08-13  9:32   ` Patrick Steinhardt
  2024-08-13  9:32   ` [PATCH v3 21/22] diff: free state populated via options Patrick Steinhardt
                     ` (2 subsequent siblings)
  22 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  9:32 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

In `show_blob_object()`, we proactively call `textconv_object()`. In
case we have a textconv driver for this blob we will end up showing the
converted contents, otherwise we'll show the un-converted contents of it
instead.

When the object has been converted we never free the buffer containing
the converted contents. Fix this to plug this memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/log.c            | 1 +
 t/t4030-diff-textconv.sh | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/builtin/log.c b/builtin/log.c
index ff997a0d0e..1a684b68f2 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -707,6 +707,7 @@ static int show_blob_object(const struct object_id *oid, struct rev_info *rev, c
 
 	write_or_die(1, buf, size);
 	object_context_release(&obj_context);
+	free(buf);
 	return 0;
 }
 
diff --git a/t/t4030-diff-textconv.sh b/t/t4030-diff-textconv.sh
index a39a626664..29f6d610c2 100755
--- a/t/t4030-diff-textconv.sh
+++ b/t/t4030-diff-textconv.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='diff.*.textconv tests'
+
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 find_diff() {
-- 
2.46.0.46.g406f326d27.dirty


^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v3 21/22] diff: free state populated via options
  2024-08-13  9:31 ` [PATCH v3 " Patrick Steinhardt
                     ` (19 preceding siblings ...)
  2024-08-13  9:32   ` [PATCH v3 20/22] builtin/log: fix leak when showing converted blob contents Patrick Steinhardt
@ 2024-08-13  9:32   ` Patrick Steinhardt
  2024-08-13 16:31     ` Junio C Hamano
  2024-08-13  9:32   ` [PATCH v3 22/22] builtin/diff: free symmetric diff members Patrick Steinhardt
  2024-08-13 16:58   ` [PATCH v3 00/22] Memory leak fixes (pt.4) Junio C Hamano
  22 siblings, 1 reply; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  9:32 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

The `objfind` and `anchors` members of `struct diff_options` are
populated via option parsing, but are never freed in `diff_free()`. Fix
this to plug those memory leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 diff.c                   | 10 ++++++++++
 t/t4064-diff-oidfind.sh  |  2 ++
 t/t4065-diff-anchored.sh |  1 +
 t/t4069-remerge-diff.sh  |  1 +
 4 files changed, 14 insertions(+)

diff --git a/diff.c b/diff.c
index 9251c47b72..4035a9374d 100644
--- a/diff.c
+++ b/diff.c
@@ -6717,6 +6717,16 @@ void diff_free(struct diff_options *options)
 	if (options->no_free)
 		return;
 
+	if (options->objfind) {
+		oidset_clear(options->objfind);
+		FREE_AND_NULL(options->objfind);
+	}
+
+	for (size_t i = 0; i < options->anchors_nr; i++)
+		free(options->anchors[i]);
+	FREE_AND_NULL(options->anchors);
+	options->anchors_nr = options->anchors_alloc = 0;
+
 	diff_free_file(options);
 	diff_free_ignore_regex(options);
 	clear_pathspec(&options->pathspec);
diff --git a/t/t4064-diff-oidfind.sh b/t/t4064-diff-oidfind.sh
index 6d8c8986fc..846f285f77 100755
--- a/t/t4064-diff-oidfind.sh
+++ b/t/t4064-diff-oidfind.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='test finding specific blobs in the revision walking'
+
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'setup ' '
diff --git a/t/t4065-diff-anchored.sh b/t/t4065-diff-anchored.sh
index b3f510f040..647537c12e 100755
--- a/t/t4065-diff-anchored.sh
+++ b/t/t4065-diff-anchored.sh
@@ -2,6 +2,7 @@
 
 test_description='anchored diff algorithm'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success '--anchored' '
diff --git a/t/t4069-remerge-diff.sh b/t/t4069-remerge-diff.sh
index 07323ebafe..888714bbd3 100755
--- a/t/t4069-remerge-diff.sh
+++ b/t/t4069-remerge-diff.sh
@@ -2,6 +2,7 @@
 
 test_description='remerge-diff handling'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 # This test is ort-specific
-- 
2.46.0.46.g406f326d27.dirty


^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v3 22/22] builtin/diff: free symmetric diff members
  2024-08-13  9:31 ` [PATCH v3 " Patrick Steinhardt
                     ` (20 preceding siblings ...)
  2024-08-13  9:32   ` [PATCH v3 21/22] diff: free state populated via options Patrick Steinhardt
@ 2024-08-13  9:32   ` Patrick Steinhardt
  2024-08-13 16:25     ` Junio C Hamano
  2024-08-13 16:58   ` [PATCH v3 00/22] Memory leak fixes (pt.4) Junio C Hamano
  22 siblings, 1 reply; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  9:32 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

We populate a `struct symdiff` in case the user has requested a
symmetric diff. Part of this is to populate a `skip` bitmap that
indicates which commits shall be ignored in the diff. But while this
bitmap is dynamically allocated, we never free it.

Fix this by introducing and calling a new `symdiff_release()` function
that does this for us.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/diff.c                       | 10 +++++++++-
 t/t4068-diff-symmetric-merge-base.sh |  1 +
 t/t4108-apply-threeway.sh            |  1 +
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/builtin/diff.c b/builtin/diff.c
index 9b6cdabe15..f87f68a5bc 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -388,6 +388,13 @@ static void symdiff_prepare(struct rev_info *rev, struct symdiff *sym)
 	sym->skip = map;
 }
 
+static void symdiff_release(struct symdiff *sdiff)
+{
+	if (!sdiff)
+		return;
+	bitmap_free(sdiff->skip);
+}
+
 int cmd_diff(int argc, const char **argv, const char *prefix)
 {
 	int i;
@@ -398,7 +405,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
 	struct object_array_entry *blob[2];
 	int nongit = 0, no_index = 0;
 	int result;
-	struct symdiff sdiff;
+	struct symdiff sdiff = {0};
 
 	/*
 	 * We could get N tree-ish in the rev.pending_objects list.
@@ -619,6 +626,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
 		refresh_index_quietly();
 	release_revisions(&rev);
 	object_array_clear(&ent);
+	symdiff_release(&sdiff);
 	UNLEAK(blob);
 	return result;
 }
diff --git a/t/t4068-diff-symmetric-merge-base.sh b/t/t4068-diff-symmetric-merge-base.sh
index eff63c16b0..4d6565e728 100755
--- a/t/t4068-diff-symmetric-merge-base.sh
+++ b/t/t4068-diff-symmetric-merge-base.sh
@@ -5,6 +5,7 @@ test_description='behavior of diff with symmetric-diff setups and --merge-base'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 # build these situations:
diff --git a/t/t4108-apply-threeway.sh b/t/t4108-apply-threeway.sh
index c558282bc0..3211e1e65f 100755
--- a/t/t4108-apply-threeway.sh
+++ b/t/t4108-apply-threeway.sh
@@ -5,6 +5,7 @@ test_description='git apply --3way'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 print_sanitized_conflicted_diff () {
-- 
2.46.0.46.g406f326d27.dirty


* Re: [PATCH v2 02/22] git: fix leaking system paths
  2024-08-13  6:30       ` Patrick Steinhardt
@ 2024-08-13 16:02         ` Junio C Hamano
  0 siblings, 0 replies; 146+ messages in thread
From: Junio C Hamano @ 2024-08-13 16:02 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Taylor Blau, git, James Liu, Phillip Wood

Patrick Steinhardt <ps@pks.im> writes:

>> Makes sense, though I wonder if this would be slightly cleaner to write
>> like so (applies on top of this patch):
>
> It is cleaner indeed, thanks for the proposal!

Yup, it is only 3 repetitions but they are repetitions nevertheless.
A small helper function that gets rid of them is worth it.

Thanks.

* Re: [PATCH v3 22/22] builtin/diff: free symmetric diff members
  2024-08-13  9:32   ` [PATCH v3 22/22] builtin/diff: free symmetric diff members Patrick Steinhardt
@ 2024-08-13 16:25     ` Junio C Hamano
  2024-08-14  5:01       ` Patrick Steinhardt
  0 siblings, 1 reply; 146+ messages in thread
From: Junio C Hamano @ 2024-08-13 16:25 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, James Liu, karthik nayak, Phillip Wood, Taylor Blau

Patrick Steinhardt <ps@pks.im> writes:

> We populate a `struct symdiff` in case the user has requested a
> symmetric diff. Part of this is to populate a `skip` bitmap that
> indicates which commits shall be ignored in the diff. But while this
> bitmap is dynamically allocated, we never free it.
>
> Fix this by introducing and calling a new `symdiff_release()` function
> that does this for us.

OK.

> +static void symdiff_release(struct symdiff *sdiff)
> +{
> +	if (!sdiff)
> +		return;
> +	bitmap_free(sdiff->skip);
> +}

Hmph, wouldn't it be a BUG if any caller feeds a NULL pointer to it,
though?  When symdiff was prepared but not used, sdiff->skip will be
NULL but sdiff is never NULL even in such a case.

> @@ -398,7 +405,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
>  	struct object_array_entry *blob[2];
>  	int nongit = 0, no_index = 0;
>  	int result;
> -	struct symdiff sdiff;
> +	struct symdiff sdiff = {0};

And symdiff_prepare() at least clears its .skip member to NULL, so
this pre-initialization is probably not needed.  If we are preparing
ourselves for future changes of the flow in this function (e.g.
goto's that jump to the clean-up label from which symdiff_release()
is always called, even when we did not call symdiff_prepare() on
this thing), this is probably not sufficient to convey that
intention (instead I'd use an explicit ".skip = NULL" to say "we
might not even call _prepare() but this one is prepared to be passed
to _release() even in such a case").

Given that no such goto exists, and that _prepare() always
sets up the .skip member appropriately, I wonder if we are much
better off leaving sdiff uninitialized at the declaration site here.
If we add such a goto that bypasses _prepare() in the future, the
compiler will notice that we are passing an uninitialized sdiff to
_release(), no?

> @@ -619,6 +626,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
>  		refresh_index_quietly();
>  	release_revisions(&rev);
>  	object_array_clear(&ent);
> +	symdiff_release(&sdiff);
>  	UNLEAK(blob);
>  	return result;
>  }

Other than that, this looks cleanly done.  Thanks.

* Re: [PATCH v3 21/22] diff: free state populated via options
  2024-08-13  9:32   ` [PATCH v3 21/22] diff: free state populated via options Patrick Steinhardt
@ 2024-08-13 16:31     ` Junio C Hamano
  0 siblings, 0 replies; 146+ messages in thread
From: Junio C Hamano @ 2024-08-13 16:31 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, James Liu, karthik nayak, Phillip Wood, Taylor Blau

Patrick Steinhardt <ps@pks.im> writes:

> The `objfind` and `anchors` members of `struct diff_options` are
> populated via option parsing, but are never freed in `diff_free()`. Fix
> this to plug those memory leaks.

Thanks.

Even though "diff" is generally my bailiwick, I've never paid
attention to these features (hence the resources they consume and
leak).  The patch looks good to me.


* Re: [PATCH v3 12/22] builtin/fast-export: fix leaking diff options
  2024-08-13  9:31   ` [PATCH v3 12/22] builtin/fast-export: fix leaking diff options Patrick Steinhardt
@ 2024-08-13 16:34     ` Junio C Hamano
  2024-08-14  4:49       ` Patrick Steinhardt
  0 siblings, 1 reply; 146+ messages in thread
From: Junio C Hamano @ 2024-08-13 16:34 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, James Liu, karthik nayak, Phillip Wood, Taylor Blau

Patrick Steinhardt <ps@pks.im> writes:

> Before calling `handle_commit()` in a loop, we set `diffopt.no_free`
> such that its contents aren't getting freed inside of `handle_commit()`.
> We never unset that flag though, which means that it'll ultimately leak
> when calling `release_revisions()`.
>
> Fix this by unsetting the flag after the loop.

If I grep for 

    $ git grep -nH -E -e '(\.|->)no_free' \*.c

I notice that in a lot of places there is a pattern of doing

    set .no_free to 1
    cause a bunch of diff using the same set of options
    set .no_free to 0
    call diff_free().

I am curious why we do not need any diff_free() here?

> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  builtin/fast-export.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/builtin/fast-export.c b/builtin/fast-export.c
> index 4b6e8c6832..fe92d2436c 100644
> --- a/builtin/fast-export.c
> +++ b/builtin/fast-export.c
> @@ -1278,9 +1278,11 @@ int cmd_fast_export(int argc, const char **argv, const char *prefix)
>  	revs.diffopt.format_callback = show_filemodify;
>  	revs.diffopt.format_callback_data = &paths_of_changed_objects;
>  	revs.diffopt.flags.recursive = 1;
> +
>  	revs.diffopt.no_free = 1;
>  	while ((commit = get_revision(&revs)))
>  		handle_commit(commit, &revs, &paths_of_changed_objects);
> +	revs.diffopt.no_free = 0;
>  
>  	handle_tags_and_duplicates(&extra_refs);
>  	handle_tags_and_duplicates(&tag_refs);

* Re: [PATCH v3 18/22] builtin/format-patch: fix various trivial memory leaks
  2024-08-13  9:32   ` [PATCH v3 18/22] builtin/format-patch: fix various trivial memory leaks Patrick Steinhardt
@ 2024-08-13 16:55     ` Junio C Hamano
  2024-08-14  4:56       ` Patrick Steinhardt
  2024-08-13 16:55     ` Junio C Hamano
  1 sibling, 1 reply; 146+ messages in thread
From: Junio C Hamano @ 2024-08-13 16:55 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, James Liu, karthik nayak, Phillip Wood, Taylor Blau

Patrick Steinhardt <ps@pks.im> writes:

> There are various memory leaks hit by git-format-patch(1). Basically all
> of them are trivial, except that un-setting `diffopt.no_free` requires
> us to unset the `diffopt.file` because we manually close it already.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  builtin/log.c           | 12 +++++++++---
>  t/t4014-format-patch.sh |  1 +
>  2 files changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/builtin/log.c b/builtin/log.c
> index a73a767606..ff997a0d0e 100644
> --- a/builtin/log.c
> +++ b/builtin/log.c
> @@ -1833,6 +1833,7 @@ static struct commit *get_base_commit(const struct format_config *cfg,
>  			}
>  
>  			rev[i] = merge_base->item;
> +			free_commit_list(merge_base);
>  		}
>  
>  		if (rev_nr % 2)

This is correct, but isn't merge_base leaking when there are
multiple and we are not dying on failure?  Perhaps something along
this line?

			struct commit_list *merge_base = NULL;
			if (repo_get_merge_bases(the_repository,
						 rev[2 * i],
						 rev[2 * i + 1], &merge_base) < 0 ||
			    !merge_base || merge_base->next) {
				if (die_on_failure) {
					die(_("failed to find exact merge base"));
				} else {
                 +               	free_commit_list(merge_base);
					free(rev);
					return NULL;
				}
			}

> @@ -2548,12 +2550,13 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
>  			else
>  				print_signature(signature, rev.diffopt.file);
>  		}
> -		if (output_directory)
> +		if (output_directory) {
>  			fclose(rev.diffopt.file);
> +			rev.diffopt.file = NULL;

Is this a leakfix, or just a general code hygiene improvement?

> +		}
>  	}
>  	stop_progress(&progress);
>  	free(list);
> -	free(branch_name);
>  	if (ignore_if_in_upstream)
>  		free_patch_ids(&ids);

Good eyes. branch_name can be set and then "goto done" can jump over
this one, so it makes sense to move it below and make it part of the
centralized clean-up.  list is not leaking in the current code, and
there is no "goto done" or "return" after it gets allocated before
this point, so it can stay here.  On the other hand, it appears to
me that everything below stop_progress() we see above can be moved
below the "done:" label, except for that ids may still be left
uninitialized without getting populated by get_patch_ids() when
ignore-if-in-upstream is asked but head and upstream are the same
when we jump to the "done:" label, so it needs a bit more work _if_
we wanted to go that route.  I think the postimage of this patch,
i.e.  freeing the "list" and "ids" here before the "done:" label, is
a good place to stop.

Thanks.

* Re: [PATCH v3 18/22] builtin/format-patch: fix various trivial memory leaks
  2024-08-13  9:32   ` [PATCH v3 18/22] builtin/format-patch: fix various trivial memory leaks Patrick Steinhardt
  2024-08-13 16:55     ` Junio C Hamano
@ 2024-08-13 16:55     ` Junio C Hamano
  1 sibling, 0 replies; 146+ messages in thread
From: Junio C Hamano @ 2024-08-13 16:55 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, James Liu, karthik nayak, Phillip Wood, Taylor Blau

Patrick Steinhardt <ps@pks.im> writes:

> There are various memory leaks hit by git-format-patch(1). Basically all
> of them are trivial, except that un-setting `diffopt.no_free` requires
> us to unset the `diffopt.file` because we manually close it already.

Ah, I misread the patch.  Clearly diffopt.file is about
double-closing.  Makes sense.

* Re: [PATCH v3 00/22] Memory leak fixes (pt.4)
  2024-08-13  9:31 ` [PATCH v3 " Patrick Steinhardt
                     ` (21 preceding siblings ...)
  2024-08-13  9:32   ` [PATCH v3 22/22] builtin/diff: free symmetric diff members Patrick Steinhardt
@ 2024-08-13 16:58   ` Junio C Hamano
  22 siblings, 0 replies; 146+ messages in thread
From: Junio C Hamano @ 2024-08-13 16:58 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, James Liu, karthik nayak, Phillip Wood, Taylor Blau

Patrick Steinhardt <ps@pks.im> writes:

> Patrick Steinhardt (22):
>   remote: plug memory leak when aliasing URLs
>   git: fix leaking system paths
>   object-file: fix memory leak when reading corrupted headers
>   object-name: fix leaking symlink paths in object context
>   bulk-checkin: fix leaking state TODO
>   read-cache: fix leaking hashfile when writing index fails
>   submodule-config: fix leaking name entry when traversing submodules
>   config: fix leaking comment character config
>   builtin/rebase: fix leaking `commit.gpgsign` value
>   builtin/notes: fix leaking `struct notes_tree` when merging notes
>   builtin/fast-import: plug trivial memory leaks
>   builtin/fast-export: fix leaking diff options
>   builtin/fast-export: plug leaking tag names
>   merge-ort: unconditionally release attributes index
>   sequencer: release todo list on error paths
>   unpack-trees: clear index when not propagating it
>   diff: fix leak when parsing invalid ignore regex option
>   builtin/format-patch: fix various trivial memory leaks
>   userdiff: fix leaking memory for configured diff drivers
>   builtin/log: fix leak when showing converted blob contents
>   diff: free state populated via options
>   builtin/diff: free symmetric diff members

Thanks for a pleasant read.  I had a few minor comments but they
likely show more of my misreading of the patches than real problems
;-)


* Re: [PATCH v3 12/22] builtin/fast-export: fix leaking diff options
  2024-08-13 16:34     ` Junio C Hamano
@ 2024-08-14  4:49       ` Patrick Steinhardt
  0 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-14  4:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, James Liu, karthik nayak, Phillip Wood, Taylor Blau

On Tue, Aug 13, 2024 at 09:34:40AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > Before calling `handle_commit()` in a loop, we set `diffopt.no_free`
> > such that its contents aren't getting freed inside of `handle_commit()`.
> > We never unset that flag though, which means that it'll ultimately leak
> > when calling `release_revisions()`.
> >
> > Fix this by unsetting the flag after the loop.
> 
> If I grep for 
> 
>     $ git grep -nH -E -e '(\.|->)no_free' \*.c
> 
> I notice that in a lot of places there is a pattern of doing
> 
>     set .no_free to 1
>     cause a bunch of diff using the same set of options
>     set .no_free to 0
>     call diff_free().
> 
> I am curious why we do not need any diff_free() here?

Because it's already being called via `release_revisions()`.
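
To illustrate the pattern discussed above, here is a minimal,
self-contained sketch. It is a toy model, not Git's code: the `toy_*`
names and the `pattern` member are made up for illustration, and the
only behaviour assumed from the thread is that the free function does
nothing while `no_free` is set and that releasing the revision walk ends
up calling it.

```c
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for `struct diff_options`; not Git's real definition. */
struct toy_diffopt {
	char *pattern;        /* some heap-allocated option state */
	unsigned no_free : 1; /* when set, toy_diff_free() is a no-op */
};

/* Models diff_free(): releases state unless asked not to. */
static void toy_diff_free(struct toy_diffopt *opt)
{
	if (opt->no_free)
		return;
	free(opt->pattern);
	opt->pattern = NULL;
}

/* Models one diff run that cleans up after itself. */
static void toy_run_diff(struct toy_diffopt *opt)
{
	/* ... a real implementation would produce a diff here ... */
	toy_diff_free(opt);
}

/* Models release_revisions(), which also frees the diff options. */
static void toy_release_revisions(struct toy_diffopt *opt)
{
	toy_diff_free(opt);
}

/*
 * Returns 1 if the option state survived all iterations and was
 * released by the final cleanup, 0 otherwise.
 */
static int toy_walk(int ncommits)
{
	struct toy_diffopt opt = { 0 };
	int survived;

	opt.pattern = malloc(sizeof("example"));
	if (!opt.pattern)
		return 0;
	strcpy(opt.pattern, "example");

	opt.no_free = 1; /* keep the state alive across iterations */
	for (int i = 0; i < ncommits; i++)
		toy_run_diff(&opt);
	survived = opt.pattern != NULL;
	opt.no_free = 0; /* without this, the release below would leak */

	toy_release_revisions(&opt);
	return survived && opt.pattern == NULL;
}
```

In this model no separate free call is needed after the loop, because
the final release already performs it once the flag has been cleared.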

Patrick

* Re: [PATCH v3 18/22] builtin/format-patch: fix various trivial memory leaks
  2024-08-13 16:55     ` Junio C Hamano
@ 2024-08-14  4:56       ` Patrick Steinhardt
  0 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-14  4:56 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, James Liu, karthik nayak, Phillip Wood, Taylor Blau

On Tue, Aug 13, 2024 at 09:55:10AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > There are various memory leaks hit by git-format-patch(1). Basically all
> > of them are trivial, except that un-setting `diffopt.no_free` requires
> > us to unset the `diffopt.file` because we manually close it already.
> >
> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
> > ---
> >  builtin/log.c           | 12 +++++++++---
> >  t/t4014-format-patch.sh |  1 +
> >  2 files changed, 10 insertions(+), 3 deletions(-)
> >
> > diff --git a/builtin/log.c b/builtin/log.c
> > index a73a767606..ff997a0d0e 100644
> > --- a/builtin/log.c
> > +++ b/builtin/log.c
> > @@ -1833,6 +1833,7 @@ static struct commit *get_base_commit(const struct format_config *cfg,
> >  			}
> >  
> >  			rev[i] = merge_base->item;
> > +			free_commit_list(merge_base);
> >  		}
> >  
> >  		if (rev_nr % 2)
> 
> This is correct, but isn't merge_base leaking when there are
> multiple and we are not dying on failure?  Perhaps something along
> this line?

Yes, good catch.

> 			struct commit_list *merge_base = NULL;
> 			if (repo_get_merge_bases(the_repository,
> 						 rev[2 * i],
> 						 rev[2 * i + 1], &merge_base) < 0 ||
> 			    !merge_base || merge_base->next) {
> 				if (die_on_failure) {
> 					die(_("failed to find exact merge base"));
> 				} else {
>                  +               	free_commit_list(merge_base);
> 					free(rev);
> 					return NULL;
> 				}
> 			}
> 
> > @@ -2548,12 +2550,13 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
> >  			else
> >  				print_signature(signature, rev.diffopt.file);
> >  		}
> > -		if (output_directory)
> > +		if (output_directory) {
> >  			fclose(rev.diffopt.file);
> > +			rev.diffopt.file = NULL;
> 
> Is this a leakfix, or just a general code hygiene improvement?

Not a leak fix, but required because of the leak fix. As we now unset
`rev.diffopt.no_free`, `release_revisions()` will call `diff_free()` and
try to close the file pointer. But as we have already closed it at this
point, that would close it a second time and cause a segfault.
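
A small sketch of why resetting the pointer matters, using a toy
structure rather than Git's actual `struct diff_options` (all `toy_*`
names are hypothetical, and the cleanup logic is an assumption modelled
on the description above). Calling fclose() twice on the same stream is
undefined behaviour in C, so the later cleanup needs a way to see that
the stream is already gone.

```c
#include <stdio.h>

/* Toy stand-in for the file-related part of the diff options. */
struct toy_diffopt {
	FILE *file;
	int close_file; /* cleanup is responsible for closing `file` */
};

/*
 * Models the later cleanup step: closes the stream only if one is
 * still attached. Returns 1 if it closed something, 0 otherwise.
 */
static int toy_diff_free(struct toy_diffopt *opt)
{
	int closed = 0;

	if (opt->close_file && opt->file) {
		fclose(opt->file);
		closed = 1;
	}
	opt->file = NULL;
	return closed;
}

/*
 * Writes to a temporary stream, closes it manually and then runs the
 * cleanup. Returns -1 if no stream could be created, otherwise the
 * result of the cleanup (0 means the stream was not closed twice).
 */
static int toy_write_patch(void)
{
	struct toy_diffopt opt = { 0 };

	opt.file = tmpfile();
	if (!opt.file)
		return -1;
	opt.close_file = 1;

	fputs("patch body\n", opt.file);

	fclose(opt.file);
	opt.file = NULL; /* tell the cleanup there is nothing left to close */

	return toy_diff_free(&opt);
}
```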

Patrick

* Re: [PATCH v3 22/22] builtin/diff: free symmetric diff members
  2024-08-13 16:25     ` Junio C Hamano
@ 2024-08-14  5:01       ` Patrick Steinhardt
  2024-08-14 15:28         ` Junio C Hamano
  0 siblings, 1 reply; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-14  5:01 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, James Liu, karthik nayak, Phillip Wood, Taylor Blau

On Tue, Aug 13, 2024 at 09:25:41AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > We populate a `struct symdiff` in case the user has requested a
> > symmetric diff. Part of this is to populate a `skip` bitmap that
> > indicates which commits shall be ignored in the diff. But while this
> > bitmap is dynamically allocated, we never free it.
> >
> > Fix this by introducing and calling a new `symdiff_release()` function
> > that does this for us.
> 
> OK.
> 
> > +static void symdiff_release(struct symdiff *sdiff)
> > +{
> > +	if (!sdiff)
> > +		return;
> > +	bitmap_free(sdiff->skip);
> > +}
> 
> Hmph, wouldn't it be a BUG if any caller feeds a NULL pointer to it,
> though?  When symdiff was prepared but not used, sdiff->skip will be
> NULL but sdiff is never NULL even in such a case.

Good point. It does make sense for `_free()` functions to handle NULL
pointers, but doesn't quite for `_release()` ones.
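
As a sketch of that convention (toy types with hypothetical `toy_*`
names, not the real bitmap or symdiff code): a `_free()` function owns
the allocation itself and tolerates NULL the way free(3) does, while a
`_release()` function only releases the members of a structure owned by
the caller, so a NULL argument would be a caller bug.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy heap-allocated bitmap. */
struct toy_bitmap {
	unsigned long *words;
	size_t nr;
};

static struct toy_bitmap *toy_bitmap_new(size_t nr)
{
	struct toy_bitmap *b = calloc(1, sizeof(*b));

	if (!b)
		return NULL;
	b->words = calloc(nr ? nr : 1, sizeof(*b->words));
	if (!b->words) {
		free(b);
		return NULL;
	}
	b->nr = nr;
	return b;
}

/* _free(): frees the object itself, so NULL is accepted. */
static void toy_bitmap_free(struct toy_bitmap *b)
{
	if (!b)
		return;
	free(b->words);
	free(b);
}

/* Toy structure that typically lives on the caller's stack. */
struct toy_symdiff {
	struct toy_bitmap *skip; /* may be NULL when unused */
};

/*
 * _release(): frees only the members. The structure belongs to the
 * caller, so passing NULL is treated as a programming error.
 */
static void toy_symdiff_release(struct toy_symdiff *sdiff)
{
	assert(sdiff);
	toy_bitmap_free(sdiff->skip);
	sdiff->skip = NULL;
}
```

The member may still be NULL when the structure was prepared but never
used, which is why the NULL handling sits in the `_free()` function
rather than in `_release()`.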

> > @@ -398,7 +405,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
> >  	struct object_array_entry *blob[2];
> >  	int nongit = 0, no_index = 0;
> >  	int result;
> > -	struct symdiff sdiff;
> > +	struct symdiff sdiff = {0};
> 
> And symdiff_prepare() at least clears its .skip member to NULL, so
> this pre-initialization is probably not needed.  If we are preparing
> ourselves for future changes of the flow in this function (e.g.
> goto's that jump to the clean-up label from which symdiff_release()
> is always called, even when we did not call symdiff_prepare() on
> this thing), this is probably not sufficient to convey that
> intention (instead I'd use an explicit ".skip = NULL" to say "we
> might not even call _prepare() but this one is prepared to be passed
> to _release() even in such a case").
> 
> Given that no such goto exists, and that _prepare() always
> sets up the .skip member appropriately, I wonder if we are much
> better off leaving sdiff uninitialized at the declaration site here.
> If we add such a goto that bypasses _prepare() in the future, the
> compiler will notice that we are passing an uninitialized sdiff to
> _release(), no?

You'd hope it does, but it certainly depends on your compiler flags.
Various hardening flags for example implicitly initialize variables, and
I have a feeling that this also causes them to not emit any warnings
anymore. At least I only spot such warnings in CI.

In any case, yes, we can drop the initialization here.

Patrick

* [PATCH v4 00/22] Memory leak fixes (pt.4)
  2024-08-06  8:59 [PATCH 00/22] Memory leak fixes (pt.4) Patrick Steinhardt
                   ` (25 preceding siblings ...)
  2024-08-13  9:31 ` [PATCH v3 " Patrick Steinhardt
@ 2024-08-14  6:51 ` Patrick Steinhardt
  2024-08-14  6:51   ` [PATCH v4 01/22] remote: plug memory leak when aliasing URLs Patrick Steinhardt
                     ` (21 more replies)
  26 siblings, 22 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-14  6:51 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

Hi,

this is the fourth version of my fourth set of memory leak fixes. There
are only minor changes compared to v3:

  - Amend a commit message to explain that `release_revisions()` takes
    care of releasing `rev.diffopt` for us.

  - Fix another memory leak in `get_base_commit()`.

  - Remove `NULL` pointer check in `symdiff_release()` and stop
    zero-initializing `struct symdiff`.

Thanks!

Patrick

Patrick Steinhardt (22):
  remote: plug memory leak when aliasing URLs
  git: fix leaking system paths
  object-file: fix memory leak when reading corrupted headers
  object-name: fix leaking symlink paths in object context
  bulk-checkin: fix leaking state TODO
  read-cache: fix leaking hashfile when writing index fails
  submodule-config: fix leaking name entry when traversing submodules
  config: fix leaking comment character config
  builtin/rebase: fix leaking `commit.gpgsign` value
  builtin/notes: fix leaking `struct notes_tree` when merging notes
  builtin/fast-import: plug trivial memory leaks
  builtin/fast-export: fix leaking diff options
  builtin/fast-export: plug leaking tag names
  merge-ort: unconditionally release attributes index
  sequencer: release todo list on error paths
  unpack-trees: clear index when not propagating it
  diff: fix leak when parsing invalid ignore regex option
  builtin/format-patch: fix various trivial memory leaks
  userdiff: fix leaking memory for configured diff drivers
  builtin/log: fix leak when showing converted blob contents
  diff: free state populated via options
  builtin/diff: free symmetric diff members

 builtin/commit.c                      |  7 +-
 builtin/diff.c                        |  6 ++
 builtin/fast-export.c                 | 19 ++++--
 builtin/fast-import.c                 |  8 ++-
 builtin/log.c                         | 14 +++-
 builtin/notes.c                       |  9 ++-
 builtin/rebase.c                      |  1 +
 bulk-checkin.c                        |  2 +
 config.c                              |  3 +-
 csum-file.c                           |  2 +-
 csum-file.h                           | 10 +++
 diff.c                                | 16 ++++-
 environment.c                         |  1 +
 environment.h                         |  1 +
 git.c                                 | 13 +++-
 merge-ort.c                           |  3 +-
 object-file.c                         |  1 +
 object-name.c                         |  1 +
 range-diff.c                          |  6 +-
 read-cache.c                          | 97 ++++++++++++++++-----------
 remote.c                              |  2 +
 sequencer.c                           | 67 ++++++++++++------
 submodule-config.c                    | 18 +++--
 t/t0210-trace2-normal.sh              |  2 +-
 t/t1006-cat-file.sh                   |  1 +
 t/t1050-large.sh                      |  1 +
 t/t1450-fsck.sh                       |  1 +
 t/t1601-index-bogus.sh                |  2 +
 t/t2107-update-index-basic.sh         |  1 +
 t/t3310-notes-merge-manual-resolve.sh |  1 +
 t/t3311-notes-merge-fanout.sh         |  1 +
 t/t3404-rebase-interactive.sh         |  1 +
 t/t3435-rebase-gpg-sign.sh            |  1 +
 t/t3507-cherry-pick-conflict.sh       |  1 +
 t/t3510-cherry-pick-sequence.sh       |  1 +
 t/t3705-add-sparse-checkout.sh        |  1 +
 t/t4013-diff-various.sh               |  1 +
 t/t4014-format-patch.sh               |  1 +
 t/t4018-diff-funcname.sh              |  1 +
 t/t4030-diff-textconv.sh              |  2 +
 t/t4042-diff-textconv-caching.sh      |  2 +
 t/t4048-diff-combined-binary.sh       |  1 +
 t/t4064-diff-oidfind.sh               |  2 +
 t/t4065-diff-anchored.sh              |  1 +
 t/t4068-diff-symmetric-merge-base.sh  |  1 +
 t/t4069-remerge-diff.sh               |  1 +
 t/t4108-apply-threeway.sh             |  1 +
 t/t4209-log-pickaxe.sh                |  2 +
 t/t6421-merge-partial-clone.sh        |  1 +
 t/t6428-merge-conflicts-sparse.sh     |  1 +
 t/t7008-filter-branch-null-sha1.sh    |  1 +
 t/t7030-verify-tag.sh                 |  1 +
 t/t7817-grep-sparse-checkout.sh       |  1 +
 t/t9300-fast-import.sh                |  1 +
 t/t9304-fast-import-marks.sh          |  2 +
 t/t9351-fast-export-anonymize.sh      |  1 +
 unpack-trees.c                        |  2 +
 userdiff.c                            | 38 ++++++++---
 userdiff.h                            |  4 ++
 59 files changed, 286 insertions(+), 105 deletions(-)

Range-diff against v3:
 1:  02f6da020f =  1:  02f6da020f remote: plug memory leak when aliasing URLs
 2:  f36d895948 =  2:  f36d895948 git: fix leaking system paths
 3:  0415ac986d =  3:  0415ac986d object-file: fix memory leak when reading corrupted headers
 4:  e5130e50a9 =  4:  e5130e50a9 object-name: fix leaking symlink paths in object context
 5:  276c828ad1 =  5:  276c828ad1 bulk-checkin: fix leaking state TODO
 6:  ed0608e705 =  6:  ed0608e705 read-cache: fix leaking hashfile when writing index fails
 7:  b7a7f88c7d =  7:  b7a7f88c7d submodule-config: fix leaking name entry when traversing submodules
 8:  9054a459a1 =  8:  9054a459a1 config: fix leaking comment character config
 9:  1d3957a5eb =  9:  1d3957a5eb builtin/rebase: fix leaking `commit.gpgsign` value
10:  0af1bab5a1 = 10:  0af1bab5a1 builtin/notes: fix leaking `struct notes_tree` when merging notes
11:  30d4e9ed43 = 11:  30d4e9ed43 builtin/fast-import: plug trivial memory leaks
12:  9591fb7b5e ! 12:  070813a740 builtin/fast-export: fix leaking diff options
    @@ Commit message
     
         Before calling `handle_commit()` in a loop, we set `diffopt.no_free`
         such that its contents aren't getting freed inside of `handle_commit()`.
    -    We never unset that flag though, which means that it'll ultimately leak
    -    when calling `release_revisions()`.
    +    We never unset that flag though, which means that the structure's
    +    allocated resources will ultimately leak.
     
    -    Fix this by unsetting the flag after the loop.
    +    Fix this by unsetting the flag after the loop such that we release its
    +    resources via `release_revisions()`.
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
13:  254bbb7f6f = 13:  b4096e971f builtin/fast-export: plug leaking tag names
14:  334c4ed71a = 14:  bdfdf53313 merge-ort: unconditionally release attributes index
15:  9f08a859fb = 15:  f6c1055805 sequencer: release todo list on error paths
16:  5d4934b1a9 = 16:  9db41181a6 unpack-trees: clear index when not propagating it
17:  e1b6a24fbe = 17:  85f6ffd610 diff: fix leak when parsing invalid ignore regex option
18:  c048b54a2c ! 18:  e00aa1ef06 builtin/format-patch: fix various trivial memory leaks
    @@ Commit message
     
      ## builtin/log.c ##
     @@ builtin/log.c: static struct commit *get_base_commit(const struct format_config *cfg,
    + 				if (die_on_failure) {
    + 					die(_("failed to find exact merge base"));
    + 				} else {
    ++					free_commit_list(merge_base);
    + 					free(rev);
    + 					return NULL;
    + 				}
      			}
      
      			rev[i] = merge_base->item;
19:  39b2921e3e = 19:  cc04751134 userdiff: fix leaking memory for configured diff drivers
20:  50dea1c98a = 20:  0e2d3e523f builtin/log: fix leak when showing converted blob contents
21:  d5cb4ad580 = 21:  9faffa7a62 diff: free state populated via options
22:  31e38ba4e1 ! 22:  ee252e752c builtin/diff: free symmetric diff members
    @@ builtin/diff.c: static void symdiff_prepare(struct rev_info *rev, struct symdiff
      
     +static void symdiff_release(struct symdiff *sdiff)
     +{
    -+	if (!sdiff)
    -+		return;
     +	bitmap_free(sdiff->skip);
     +}
     +
      int cmd_diff(int argc, const char **argv, const char *prefix)
      {
      	int i;
    -@@ builtin/diff.c: int cmd_diff(int argc, const char **argv, const char *prefix)
    - 	struct object_array_entry *blob[2];
    - 	int nongit = 0, no_index = 0;
    - 	int result;
    --	struct symdiff sdiff;
    -+	struct symdiff sdiff = {0};
    - 
    - 	/*
    - 	 * We could get N tree-ish in the rev.pending_objects list.
     @@ builtin/diff.c: int cmd_diff(int argc, const char **argv, const char *prefix)
      		refresh_index_quietly();
      	release_revisions(&rev);
-- 
2.46.0.46.g406f326d27.dirty


* [PATCH v4 01/22] remote: plug memory leak when aliasing URLs
  2024-08-14  6:51 ` [PATCH v4 " Patrick Steinhardt
@ 2024-08-14  6:51   ` Patrick Steinhardt
  2024-08-14  6:51   ` [PATCH v4 02/22] git: fix leaking system paths Patrick Steinhardt
                     ` (20 subsequent siblings)
  21 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-14  6:51 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

When we have a `url.*.insteadOf` configuration, then we end up aliasing
URLs when populating remotes. One place where this happens is in
`alias_all_urls()`, where we loop through all remotes and then alias
each of their URLs. The actual aliasing logic is then contained in
`alias_url()`, which returns an allocated string that contains the new
URL. This URL replaces the old URL that we have in the strvec that
contains all remote URLs.

We replace the remote URLs via `strvec_replace()`, which does not hand
over ownership of the new string to the vector. Still, we didn't free
the aliased URL and thus have a memory leak here. Fix it by freeing the
aliased string.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
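
Illustration only, not part of the patch: a minimal sketch of the
ownership rule described in the commit message, using a toy string
vector. The `toy_*` helpers and the "gh:" rewrite rule are hypothetical;
the only property taken from the message is that replacing an entry
stores a copy, so the caller still owns the string it passed in.

```c
#include <stdlib.h>
#include <string.h>

/* Toy vector of heap-allocated strings. */
struct toy_strvec {
	char **v;
	size_t nr;
};

static char *toy_strdup(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/* Stores a copy of `s`; ownership of `s` stays with the caller. */
static void toy_strvec_replace(struct toy_strvec *vec, size_t idx,
			       const char *s)
{
	char *copy = toy_strdup(s);

	if (!copy)
		return; /* keep the old entry on allocation failure */
	free(vec->v[idx]);
	vec->v[idx] = copy;
}

/* Returns a newly allocated rewritten URL, or NULL if no rule matches. */
static char *toy_alias_url(const char *url)
{
	static const char from[] = "gh:";
	static const char to[] = "https://github.com/";
	size_t rest;
	char *out;

	if (strncmp(url, from, sizeof(from) - 1))
		return NULL;
	rest = strlen(url + sizeof(from) - 1);
	out = malloc(sizeof(to) + rest); /* sizeof(to) includes the NUL */
	if (!out)
		return NULL;
	memcpy(out, to, sizeof(to) - 1);
	memcpy(out + sizeof(to) - 1, url + sizeof(from) - 1, rest + 1);
	return out;
}

static void toy_alias_all(struct toy_strvec *vec)
{
	for (size_t i = 0; i < vec->nr; i++) {
		char *alias = toy_alias_url(vec->v[i]);

		if (alias)
			toy_strvec_replace(vec, i, alias);
		free(alias); /* the vector kept its own copy */
	}
}
```

Because free(NULL) is a no-op, the free can sit outside the conditional,
which matches the shape of the change in the diff below.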
 remote.c                 | 2 ++
 t/t0210-trace2-normal.sh | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/remote.c b/remote.c
index f43cf5e7a4..3b898edd23 100644
--- a/remote.c
+++ b/remote.c
@@ -499,6 +499,7 @@ static void alias_all_urls(struct remote_state *remote_state)
 			if (alias)
 				strvec_replace(&remote_state->remotes[i]->pushurl,
 					       j, alias);
+			free(alias);
 		}
 		add_pushurl_aliases = remote_state->remotes[i]->pushurl.nr == 0;
 		for (j = 0; j < remote_state->remotes[i]->url.nr; j++) {
@@ -512,6 +513,7 @@ static void alias_all_urls(struct remote_state *remote_state)
 			if (alias)
 				strvec_replace(&remote_state->remotes[i]->url,
 					       j, alias);
+			free(alias);
 		}
 	}
 }
diff --git a/t/t0210-trace2-normal.sh b/t/t0210-trace2-normal.sh
index c312657a12..b9adc94aab 100755
--- a/t/t0210-trace2-normal.sh
+++ b/t/t0210-trace2-normal.sh
@@ -2,7 +2,7 @@
 
 test_description='test trace2 facility (normal target)'
 
-TEST_PASSES_SANITIZE_LEAK=false
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 # Turn off any inherited trace2 settings for this test.
-- 
2.46.0.46.g406f326d27.dirty


* [PATCH v4 02/22] git: fix leaking system paths
  2024-08-14  6:51 ` [PATCH v4 " Patrick Steinhardt
  2024-08-14  6:51   ` [PATCH v4 01/22] remote: plug memory leak when aliasing URLs Patrick Steinhardt
@ 2024-08-14  6:51   ` Patrick Steinhardt
  2024-08-14  6:51   ` [PATCH v4 03/22] object-file: fix memory leak when reading corrupted headers Patrick Steinhardt
                     ` (19 subsequent siblings)
  21 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-14  6:51 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

Git has some flags to make it output system paths as they have been
compiled into Git. This is done by calling `system_path()`, which
returns an allocated string. This string isn't ever free'd though,
creating a memory leak.

Plug those leaks. While they are surfaced by t0211, that test suite
exposes further memory leaks and thus does not yet pass with the memory
leak checker enabled.

Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
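For readers skimming the archive, the pattern applied here can be sketched
in isolation. This is only an illustration under assumptions:
`demo_system_path()` is a hypothetical stand-in that returns a newly
allocated string, and the "/usr/local/" prefix is made up. It is not Git's
actual `system_path()`.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in: returns a newly allocated string owned by the caller. */
char *demo_system_path(const char *path)
{
	const char *prefix = "/usr/local/"; /* made-up prefix for the sketch */
	size_t len;
	char *out;

	assert(path);
	len = strlen(prefix) + strlen(path) + 1;
	out = malloc(len);
	if (!out)
		abort();
	snprintf(out, len, "%s%s", prefix, path);
	return out;
}

/* Print the resolved path, then free it. Returns the printed length. */
size_t demo_print_system_path(const char *path)
{
	char *s_path = demo_system_path(path);
	size_t len = strlen(s_path);

	puts(s_path);
	free(s_path);
	return len;
}
```

Wrapping "resolve, print, free" in one helper keeps every call site a
one-liner, so no call site can forget the `free()`.
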
 git.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/git.c b/git.c
index e35af9b0e5..9a618a2740 100644
--- a/git.c
+++ b/git.c
@@ -143,6 +143,13 @@ void setup_auto_pager(const char *cmd, int def)
 	commit_pager_choice();
 }
 
+static void print_system_path(const char *path)
+{
+	char *s_path = system_path(path);
+	puts(s_path);
+	free(s_path);
+}
+
 static int handle_options(const char ***argv, int *argc, int *envchanged)
 {
 	const char **orig_argv = *argv;
@@ -173,15 +180,15 @@ static int handle_options(const char ***argv, int *argc, int *envchanged)
 				exit(0);
 			}
 		} else if (!strcmp(cmd, "--html-path")) {
-			puts(system_path(GIT_HTML_PATH));
+			print_system_path(GIT_HTML_PATH);
 			trace2_cmd_name("_query_");
 			exit(0);
 		} else if (!strcmp(cmd, "--man-path")) {
-			puts(system_path(GIT_MAN_PATH));
+			print_system_path(GIT_MAN_PATH);
 			trace2_cmd_name("_query_");
 			exit(0);
 		} else if (!strcmp(cmd, "--info-path")) {
-			puts(system_path(GIT_INFO_PATH));
+			print_system_path(GIT_INFO_PATH);
 			trace2_cmd_name("_query_");
 			exit(0);
 		} else if (!strcmp(cmd, "-p") || !strcmp(cmd, "--paginate")) {
-- 
2.46.0.46.g406f326d27.dirty


^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v4 03/22] object-file: fix memory leak when reading corrupted headers
  2024-08-14  6:51 ` [PATCH v4 " Patrick Steinhardt
  2024-08-14  6:51   ` [PATCH v4 01/22] remote: plug memory leak when aliasing URLs Patrick Steinhardt
  2024-08-14  6:51   ` [PATCH v4 02/22] git: fix leaking system paths Patrick Steinhardt
@ 2024-08-14  6:51   ` Patrick Steinhardt
  2024-08-14  6:52   ` [PATCH v4 04/22] object-name: fix leaking symlink paths in object context Patrick Steinhardt
                     ` (18 subsequent siblings)
  21 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-14  6:51 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

When reading corrupt object headers in `read_loose_object()`, we bail
out immediately. This causes a memory leak though because we would have
already initialized the zstream in `unpack_loose_header()`, and it is
the caller's responsibility to finish the zstream even on error. While
this feels weird, other callsites do it correctly already.

Fix this leak by ending the zstream even on errors. We may want to
revisit this interface in the future such that the callee handles this
for us already when there was an error.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
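The ownership contract described above can be sketched as follows. All
names (`demo_stream`, `demo_unpack_header()`, `demo_stream_end()`) are
hypothetical and only model the idea; this is neither zlib's nor Git's API.
The counter exists solely so the sketch can check that nothing stays alive.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stream type for the sketch. */
struct demo_stream {
	char *window;
};

int demo_live_streams; /* streams that were initialized but not yet ended */

/*
 * Initializes the stream, then validates the header. The stream is
 * initialized even when -1 is returned, so the caller must end it.
 */
int demo_unpack_header(struct demo_stream *s, const char *hdr)
{
	assert(s && hdr);
	s->window = malloc(64);
	if (!s->window)
		abort();
	demo_live_streams++;
	return strncmp(hdr, "blob ", 5) ? -1 : 0;
}

void demo_stream_end(struct demo_stream *s)
{
	free(s->window);
	s->window = NULL;
	demo_live_streams--;
}

int demo_read_object(const char *hdr)
{
	struct demo_stream s;
	int ret = -1;

	if (demo_unpack_header(&s, hdr) < 0) {
		demo_stream_end(&s); /* the caller cleans up even on error */
		goto out;
	}
	/* ... the object body would be inflated here ... */
	demo_stream_end(&s);
	ret = 0;
out:
	return ret;
}
```

If the callee ended the stream itself on failure, the error branch in
the caller would shrink to a plain `goto out`.
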
 object-file.c   | 1 +
 t/t1450-fsck.sh | 1 +
 2 files changed, 2 insertions(+)

diff --git a/object-file.c b/object-file.c
index 065103be3e..7c65c435cd 100644
--- a/object-file.c
+++ b/object-file.c
@@ -2954,6 +2954,7 @@ int read_loose_object(const char *path,
 	if (unpack_loose_header(&stream, map, mapsize, hdr, sizeof(hdr),
 				NULL) != ULHR_OK) {
 		error(_("unable to unpack header of %s"), path);
+		git_inflate_end(&stream);
 		goto out;
 	}
 
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index 8a456b1142..280cbf3e03 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -6,6 +6,7 @@ test_description='git fsck random collection of tests
 * (main) A
 '
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success setup '
-- 
2.46.0.46.g406f326d27.dirty


^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v4 04/22] object-name: fix leaking symlink paths in object context
  2024-08-14  6:51 ` [PATCH v4 " Patrick Steinhardt
                     ` (2 preceding siblings ...)
  2024-08-14  6:51   ` [PATCH v4 03/22] object-file: fix memory leak when reading corrupted headers Patrick Steinhardt
@ 2024-08-14  6:52   ` Patrick Steinhardt
  2024-08-14  6:52   ` [PATCH v4 05/22] bulk-checkin: fix leaking state TODO Patrick Steinhardt
                     ` (17 subsequent siblings)
  21 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-14  6:52 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

The object context may be populated with symlink contents when reading a
symlink, but the associated strbuf doesn't ever get released when
releasing the object context, causing a memory leak. Plug it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 object-name.c       | 1 +
 t/t1006-cat-file.sh | 1 +
 2 files changed, 2 insertions(+)

diff --git a/object-name.c b/object-name.c
index 240a93e7ce..e39fa50e47 100644
--- a/object-name.c
+++ b/object-name.c
@@ -1765,6 +1765,7 @@ int strbuf_check_branch_ref(struct strbuf *sb, const char *name)
 void object_context_release(struct object_context *ctx)
 {
 	free(ctx->path);
+	strbuf_release(&ctx->symlink_path);
 }
 
 /*
diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index ff9bf213aa..d36cd7c086 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -2,6 +2,7 @@
 
 test_description='git cat-file'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_cmdmode_usage () {
-- 
2.46.0.46.g406f326d27.dirty


^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v4 05/22] bulk-checkin: fix leaking state TODO
  2024-08-14  6:51 ` [PATCH v4 " Patrick Steinhardt
                     ` (3 preceding siblings ...)
  2024-08-14  6:52   ` [PATCH v4 04/22] object-name: fix leaking symlink paths in object context Patrick Steinhardt
@ 2024-08-14  6:52   ` Patrick Steinhardt
  2024-08-14  6:52   ` [PATCH v4 06/22] read-cache: fix leaking hashfile when writing index fails Patrick Steinhardt
                     ` (16 subsequent siblings)
  21 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-14  6:52 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

When flushing a bulk-checkin to disk we also reset the `struct
bulk_checkin_packfile` state. But while we free some of its members,
others aren't being free'd, leading to memory leaks:

  - The temporary packfile name is not getting freed.

  - The `struct hashfile` only gets freed in case we end up calling
    `finalize_hashfile()`. There are code paths though where that is not
    the case, namely when nothing has been written. For this, we need to
    make `free_hashfile()` public.

Fix those leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 bulk-checkin.c   |  2 ++
 csum-file.c      |  2 +-
 csum-file.h      | 10 ++++++++++
 t/t1050-large.sh |  1 +
 4 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/bulk-checkin.c b/bulk-checkin.c
index da8673199b..9089c214fa 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -61,6 +61,7 @@ static void flush_bulk_checkin_packfile(struct bulk_checkin_packfile *state)
 
 	if (state->nr_written == 0) {
 		close(state->f->fd);
+		free_hashfile(state->f);
 		unlink(state->pack_tmp_name);
 		goto clear_exit;
 	} else if (state->nr_written == 1) {
@@ -83,6 +84,7 @@ static void flush_bulk_checkin_packfile(struct bulk_checkin_packfile *state)
 		free(state->written[i]);
 
 clear_exit:
+	free(state->pack_tmp_name);
 	free(state->written);
 	memset(state, 0, sizeof(*state));
 
diff --git a/csum-file.c b/csum-file.c
index 8abbf01325..7e0ece1305 100644
--- a/csum-file.c
+++ b/csum-file.c
@@ -56,7 +56,7 @@ void hashflush(struct hashfile *f)
 	}
 }
 
-static void free_hashfile(struct hashfile *f)
+void free_hashfile(struct hashfile *f)
 {
 	free(f->buffer);
 	free(f->check_buffer);
diff --git a/csum-file.h b/csum-file.h
index 566e05cbd2..ca553eba17 100644
--- a/csum-file.h
+++ b/csum-file.h
@@ -46,6 +46,16 @@ int hashfile_truncate(struct hashfile *, struct hashfile_checkpoint *);
 struct hashfile *hashfd(int fd, const char *name);
 struct hashfile *hashfd_check(const char *name);
 struct hashfile *hashfd_throughput(int fd, const char *name, struct progress *tp);
+
+/*
+ * Free the hashfile without flushing its contents to disk. This only
+ * needs to be called when not calling `finalize_hashfile()`.
+ */
+void free_hashfile(struct hashfile *f);
+
+/*
+ * Finalize the hashfile by flushing data to disk and free'ing it.
+ */
 int finalize_hashfile(struct hashfile *, unsigned char *, enum fsync_component, unsigned int);
 void hashwrite(struct hashfile *, const void *, unsigned int);
 void hashflush(struct hashfile *f);
diff --git a/t/t1050-large.sh b/t/t1050-large.sh
index c71932b024..ed638f6644 100755
--- a/t/t1050-large.sh
+++ b/t/t1050-large.sh
@@ -3,6 +3,7 @@
 
 test_description='adding and checking out large blobs'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'core.bigFileThreshold must be non-negative' '
-- 
2.46.0.46.g406f326d27.dirty


^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v4 06/22] read-cache: fix leaking hashfile when writing index fails
  2024-08-14  6:51 ` [PATCH v4 " Patrick Steinhardt
                     ` (4 preceding siblings ...)
  2024-08-14  6:52   ` [PATCH v4 05/22] bulk-checkin: fix leaking state TODO Patrick Steinhardt
@ 2024-08-14  6:52   ` Patrick Steinhardt
  2024-08-14  6:52   ` [PATCH v4 07/22] submodule-config: fix leaking name entry when traversing submodules Patrick Steinhardt
                     ` (15 subsequent siblings)
  21 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-14  6:52 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

In `do_write_index()`, we use a `struct hashfile` to write the index
with a trailer hash. In case the write fails though, we never clean up
the allocated `hashfile` state and thus leak memory.

Refactor the code to have a common exit path where we can free this and
other allocated memory. While at it, refactor our use of `strbuf`s such
that we reuse the same buffer to avoid some unneeded allocations.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
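A reduced sketch of the two ideas in this patch, a single exit label that
releases everything and one buffer that is reset rather than reallocated
per section. `demo_buf` and its helpers are hypothetical and only loosely
modelled on the idea of a `strbuf`; the allocation counter is there purely
to make the reuse observable.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer for the sketch. */
struct demo_buf {
	char *data;
	size_t len, alloc;
};

int demo_buf_allocations; /* how often a buffer had to (re)allocate */

void demo_buf_reset(struct demo_buf *b)
{
	b->len = 0; /* keep the allocation around for reuse */
}

void demo_buf_add(struct demo_buf *b, const char *s)
{
	size_t n = strlen(s);

	if (b->len + n + 1 > b->alloc) {
		b->alloc = (b->len + n + 1) * 2;
		b->data = realloc(b->data, b->alloc);
		if (!b->data)
			abort();
		demo_buf_allocations++;
	}
	memcpy(b->data + b->len, s, n + 1);
	b->len += n;
}

void demo_buf_release(struct demo_buf *b)
{
	free(b->data);
	memset(b, 0, sizeof(*b));
}

/* "Writes" each section via one shared buffer; an empty section is an error. */
int demo_write_sections(const char **sections, size_t nr, size_t *written)
{
	struct demo_buf sb = { 0 };
	char *table = malloc(16); /* stands in for other state to free */
	size_t i;
	int ret;

	assert(written);
	if (!table)
		abort();
	*written = 0;
	for (i = 0; i < nr; i++) {
		demo_buf_reset(&sb);
		demo_buf_add(&sb, sections[i]);
		if (!sb.len) {
			ret = -1;
			goto out;
		}
		*written += sb.len;
	}
	ret = 0;
out:
	demo_buf_release(&sb);
	free(table);
	return ret;
}
```

Every early return becomes `ret = ...; goto out;`, so adding another
resource later only requires touching the `out` label.
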
 read-cache.c                       | 97 ++++++++++++++++++------------
 t/t1601-index-bogus.sh             |  2 +
 t/t2107-update-index-basic.sh      |  1 +
 t/t7008-filter-branch-null-sha1.sh |  1 +
 4 files changed, 62 insertions(+), 39 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index 48bf24f87c..36821fe5b5 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2840,8 +2840,9 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 	int csum_fsync_flag;
 	int ieot_entries = 1;
 	struct index_entry_offset_table *ieot = NULL;
-	int nr, nr_threads;
 	struct repository *r = istate->repo;
+	struct strbuf sb = STRBUF_INIT;
+	int nr, nr_threads, ret;
 
 	f = hashfd(tempfile->fd, tempfile->filename.buf);
 
@@ -2962,8 +2963,8 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 	strbuf_release(&previous_name_buf);
 
 	if (err) {
-		free(ieot);
-		return err;
+		ret = err;
+		goto out;
 	}
 
 	offset = hashfile_total(f);
@@ -2985,20 +2986,20 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 	 * index.
 	 */
 	if (ieot) {
-		struct strbuf sb = STRBUF_INIT;
+		strbuf_reset(&sb);
 
 		write_ieot_extension(&sb, ieot);
 		err = write_index_ext_header(f, eoie_c, CACHE_EXT_INDEXENTRYOFFSETTABLE, sb.len) < 0;
 		hashwrite(f, sb.buf, sb.len);
-		strbuf_release(&sb);
-		free(ieot);
-		if (err)
-			return -1;
+		if (err) {
+			ret = -1;
+			goto out;
+		}
 	}
 
 	if (write_extensions & WRITE_SPLIT_INDEX_EXTENSION &&
 	    istate->split_index) {
-		struct strbuf sb = STRBUF_INIT;
+		strbuf_reset(&sb);
 
 		if (istate->sparse_index)
 			die(_("cannot write split index for a sparse index"));
@@ -3007,59 +3008,66 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 			write_index_ext_header(f, eoie_c, CACHE_EXT_LINK,
 					       sb.len) < 0;
 		hashwrite(f, sb.buf, sb.len);
-		strbuf_release(&sb);
-		if (err)
-			return -1;
+		if (err) {
+			ret = -1;
+			goto out;
+		}
 	}
 	if (write_extensions & WRITE_CACHE_TREE_EXTENSION &&
 	    !drop_cache_tree && istate->cache_tree) {
-		struct strbuf sb = STRBUF_INIT;
+		strbuf_reset(&sb);
 
 		cache_tree_write(&sb, istate->cache_tree);
 		err = write_index_ext_header(f, eoie_c, CACHE_EXT_TREE, sb.len) < 0;
 		hashwrite(f, sb.buf, sb.len);
-		strbuf_release(&sb);
-		if (err)
-			return -1;
+		if (err) {
+			ret = -1;
+			goto out;
+		}
 	}
 	if (write_extensions & WRITE_RESOLVE_UNDO_EXTENSION &&
 	    istate->resolve_undo) {
-		struct strbuf sb = STRBUF_INIT;
+		strbuf_reset(&sb);
 
 		resolve_undo_write(&sb, istate->resolve_undo);
 		err = write_index_ext_header(f, eoie_c, CACHE_EXT_RESOLVE_UNDO,
 					     sb.len) < 0;
 		hashwrite(f, sb.buf, sb.len);
-		strbuf_release(&sb);
-		if (err)
-			return -1;
+		if (err) {
+			ret = -1;
+			goto out;
+		}
 	}
 	if (write_extensions & WRITE_UNTRACKED_CACHE_EXTENSION &&
 	    istate->untracked) {
-		struct strbuf sb = STRBUF_INIT;
+		strbuf_reset(&sb);
 
 		write_untracked_extension(&sb, istate->untracked);
 		err = write_index_ext_header(f, eoie_c, CACHE_EXT_UNTRACKED,
 					     sb.len) < 0;
 		hashwrite(f, sb.buf, sb.len);
-		strbuf_release(&sb);
-		if (err)
-			return -1;
+		if (err) {
+			ret = -1;
+			goto out;
+		}
 	}
 	if (write_extensions & WRITE_FSMONITOR_EXTENSION &&
 	    istate->fsmonitor_last_update) {
-		struct strbuf sb = STRBUF_INIT;
+		strbuf_reset(&sb);
 
 		write_fsmonitor_extension(&sb, istate);
 		err = write_index_ext_header(f, eoie_c, CACHE_EXT_FSMONITOR, sb.len) < 0;
 		hashwrite(f, sb.buf, sb.len);
-		strbuf_release(&sb);
-		if (err)
-			return -1;
+		if (err) {
+			ret = -1;
+			goto out;
+		}
 	}
 	if (istate->sparse_index) {
-		if (write_index_ext_header(f, eoie_c, CACHE_EXT_SPARSE_DIRECTORIES, 0) < 0)
-			return -1;
+		if (write_index_ext_header(f, eoie_c, CACHE_EXT_SPARSE_DIRECTORIES, 0) < 0) {
+			ret = -1;
+			goto out;
+		}
 	}
 
 	/*
@@ -3069,14 +3077,15 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 	 * when loading the shared index.
 	 */
 	if (eoie_c) {
-		struct strbuf sb = STRBUF_INIT;
+		strbuf_reset(&sb);
 
 		write_eoie_extension(&sb, eoie_c, offset);
 		err = write_index_ext_header(f, NULL, CACHE_EXT_ENDOFINDEXENTRIES, sb.len) < 0;
 		hashwrite(f, sb.buf, sb.len);
-		strbuf_release(&sb);
-		if (err)
-			return -1;
+		if (err) {
+			ret = -1;
+			goto out;
+		}
 	}
 
 	csum_fsync_flag = 0;
@@ -3085,13 +3094,16 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 
 	finalize_hashfile(f, istate->oid.hash, FSYNC_COMPONENT_INDEX,
 			  CSUM_HASH_IN_STREAM | csum_fsync_flag);
+	f = NULL;
 
 	if (close_tempfile_gently(tempfile)) {
-		error(_("could not close '%s'"), get_tempfile_path(tempfile));
-		return -1;
+		ret = error(_("could not close '%s'"), get_tempfile_path(tempfile));
+		goto out;
+	}
+	if (stat(get_tempfile_path(tempfile), &st)) {
+		ret = -1;
+		goto out;
 	}
-	if (stat(get_tempfile_path(tempfile), &st))
-		return -1;
 	istate->timestamp.sec = (unsigned int)st.st_mtime;
 	istate->timestamp.nsec = ST_MTIME_NSEC(st);
 	trace_performance_since(start, "write index, changed mask = %x", istate->cache_changed);
@@ -3105,7 +3117,14 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 	trace2_data_intmax("index", the_repository, "write/cache_nr",
 			   istate->cache_nr);
 
-	return 0;
+	ret = 0;
+
+out:
+	if (f)
+		free_hashfile(f);
+	strbuf_release(&sb);
+	free(ieot);
+	return ret;
 }
 
 void set_alternate_index_output(const char *name)
diff --git a/t/t1601-index-bogus.sh b/t/t1601-index-bogus.sh
index 4171f1e141..5dcc101882 100755
--- a/t/t1601-index-bogus.sh
+++ b/t/t1601-index-bogus.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='test handling of bogus index entries'
+
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'create tree with null sha1' '
diff --git a/t/t2107-update-index-basic.sh b/t/t2107-update-index-basic.sh
index cc72ead79f..f0eab13f96 100755
--- a/t/t2107-update-index-basic.sh
+++ b/t/t2107-update-index-basic.sh
@@ -5,6 +5,7 @@ test_description='basic update-index tests
 Tests for command-line parsing and basic operation.
 '
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'update-index --nonsense fails' '
diff --git a/t/t7008-filter-branch-null-sha1.sh b/t/t7008-filter-branch-null-sha1.sh
index 93fbc92b8d..0ce8fd2c89 100755
--- a/t/t7008-filter-branch-null-sha1.sh
+++ b/t/t7008-filter-branch-null-sha1.sh
@@ -2,6 +2,7 @@
 
 test_description='filter-branch removal of trees with null sha1'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'setup: base commits' '
-- 
2.46.0.46.g406f326d27.dirty


^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v4 07/22] submodule-config: fix leaking name entry when traversing submodules
  2024-08-14  6:51 ` [PATCH v4 " Patrick Steinhardt
                     ` (5 preceding siblings ...)
  2024-08-14  6:52   ` [PATCH v4 06/22] read-cache: fix leaking hashfile when writing index fails Patrick Steinhardt
@ 2024-08-14  6:52   ` Patrick Steinhardt
  2024-08-14  6:52   ` [PATCH v4 08/22] config: fix leaking comment character config Patrick Steinhardt
                     ` (14 subsequent siblings)
  21 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-14  6:52 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

We traverse through submodules in the tree via `tree_entry()`, passing
to it a `struct name_entry` that it is supposed to populate with the
tree entry's contents. We unnecessarily allocate this variable on the
heap instead of passing one that is allocated on the stack, and then
ultimately don't even free it. This leaks memory.

Convert the variable to instead be allocated on the stack to plug the
memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
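As a small illustration of the conversion, the sketch below passes the
address of an on-stack struct to an iterator that fills it in. The types
and functions (`demo_entry`, `demo_iter`, `demo_next()`) are hypothetical
and much simpler than `struct name_entry` and `tree_entry()`.

```c
#include <assert.h>
#include <stddef.h>

struct demo_entry {
	const char *path;
	unsigned mode;
};

struct demo_iter {
	const struct demo_entry *items;
	size_t nr, pos;
};

/* Copies the next item into *out. Returns 1 if filled, 0 at the end. */
int demo_next(struct demo_iter *it, struct demo_entry *out)
{
	assert(it && out);
	if (it->pos >= it->nr)
		return 0;
	*out = it->items[it->pos++];
	return 1;
}

/* Counts entries whose mode marks them as a directory (040000). */
size_t demo_count_dirs(const struct demo_entry *items, size_t nr)
{
	struct demo_iter it = { items, nr, 0 };
	struct demo_entry entry; /* lives on the stack: nothing to free */
	size_t dirs = 0;

	while (demo_next(&it, &entry))
		if (entry.mode == 040000)
			dirs++;
	return dirs;
}
```

Since the callee only needs somewhere to write to, the caller's stack
frame is enough and the lifetime question disappears.
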
 submodule-config.c | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/submodule-config.c b/submodule-config.c
index 9b0bb0b9f4..c8f2bb2bdd 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -899,27 +899,25 @@ static void traverse_tree_submodules(struct repository *r,
 {
 	struct tree_desc tree;
 	struct submodule_tree_entry *st_entry;
-	struct name_entry *name_entry;
+	struct name_entry name_entry;
 	char *tree_path = NULL;
 
-	name_entry = xmalloc(sizeof(*name_entry));
-
 	fill_tree_descriptor(r, &tree, treeish_name);
-	while (tree_entry(&tree, name_entry)) {
+	while (tree_entry(&tree, &name_entry)) {
 		if (prefix)
 			tree_path =
-				mkpathdup("%s/%s", prefix, name_entry->path);
+				mkpathdup("%s/%s", prefix, name_entry.path);
 		else
-			tree_path = xstrdup(name_entry->path);
+			tree_path = xstrdup(name_entry.path);
 
-		if (S_ISGITLINK(name_entry->mode) &&
+		if (S_ISGITLINK(name_entry.mode) &&
 		    is_tree_submodule_active(r, root_tree, tree_path)) {
 			ALLOC_GROW(out->entries, out->entry_nr + 1,
 				   out->entry_alloc);
 			st_entry = &out->entries[out->entry_nr++];
 
 			st_entry->name_entry = xmalloc(sizeof(*st_entry->name_entry));
-			*st_entry->name_entry = *name_entry;
+			*st_entry->name_entry = name_entry;
 			st_entry->submodule =
 				submodule_from_path(r, root_tree, tree_path);
 			st_entry->repo = xmalloc(sizeof(*st_entry->repo));
@@ -927,9 +925,9 @@ static void traverse_tree_submodules(struct repository *r,
 						root_tree))
 				FREE_AND_NULL(st_entry->repo);
 
-		} else if (S_ISDIR(name_entry->mode))
+		} else if (S_ISDIR(name_entry.mode))
 			traverse_tree_submodules(r, root_tree, tree_path,
-						 &name_entry->oid, out);
+						 &name_entry.oid, out);
 		free(tree_path);
 	}
 }
-- 
2.46.0.46.g406f326d27.dirty


^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v4 08/22] config: fix leaking comment character config
  2024-08-14  6:51 ` [PATCH v4 " Patrick Steinhardt
                     ` (6 preceding siblings ...)
  2024-08-14  6:52   ` [PATCH v4 07/22] submodule-config: fix leaking name entry when traversing submodules Patrick Steinhardt
@ 2024-08-14  6:52   ` Patrick Steinhardt
  2024-08-14  6:52   ` [PATCH v4 09/22] builtin/rebase: fix leaking `commit.gpgsign` value Patrick Steinhardt
                     ` (13 subsequent siblings)
  21 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-14  6:52 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

When the comment line character has been specified multiple times in the
configuration, then `git_default_core_config()` will cause a memory leak
because it unconditionally copies the string into `comment_line_str`
without free'ing the previous value. In fact, it can't easily free the
value in the first place because it may contain a string constant.

Refactor the code such that we track allocated comment character strings
via a separate non-constant variable `comment_line_str_to_free`. Adapt
sites that set `comment_line_str` to set both and free the old value
that was stored in `comment_line_str_to_free`.

This memory leak is being hit in t3404. As there are still other memory
leaks in that file we cannot yet mark it as passing with leak checking
enabled.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
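The "const pointer plus separate owner" idea can be sketched on its own.
The `demo_` names below are hypothetical and only mirror the shape of
`comment_line_str` and `comment_line_str_to_free`. The sketch assumes that
a borrowed value outlives its use and is never the currently owned copy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

const char *demo_comment_str = "#"; /* may point at a string literal */
char *demo_comment_str_to_free;     /* the copy we own, or NULL */

/* Switch to a heap copy that we own, freeing the previous owned copy. */
void demo_set_comment_owned(const char *value)
{
	size_t len;
	char *copy;

	assert(value);
	len = strlen(value) + 1;
	copy = malloc(len);
	if (!copy)
		abort();
	memcpy(copy, value, len); /* copy first: value may alias the old copy */
	free(demo_comment_str_to_free);
	demo_comment_str = demo_comment_str_to_free = copy;
}

/* Point at storage owned by someone else and drop any copy we own. */
void demo_set_comment_borrowed(const char *value)
{
	assert(value);
	demo_comment_str = value;
	free(demo_comment_str_to_free);
	demo_comment_str_to_free = NULL;
}
```

Readers keep using the `const char *`, and only the setters need to
know whether the current value is a literal, borrowed, or owned.
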
 builtin/commit.c | 7 +++++--
 config.c         | 3 ++-
 environment.c    | 1 +
 environment.h    | 1 +
 4 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index 66427ba82d..b2033c4887 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -684,7 +684,9 @@ static void adjust_comment_line_char(const struct strbuf *sb)
 	const char *p;
 
 	if (!memchr(sb->buf, candidates[0], sb->len)) {
-		comment_line_str = xstrfmt("%c", candidates[0]);
+		free(comment_line_str_to_free);
+		comment_line_str = comment_line_str_to_free =
+			xstrfmt("%c", candidates[0]);
 		return;
 	}
 
@@ -705,7 +707,8 @@ static void adjust_comment_line_char(const struct strbuf *sb)
 	if (!*p)
 		die(_("unable to select a comment character that is not used\n"
 		      "in the current commit message"));
-	comment_line_str = xstrfmt("%c", *p);
+	free(comment_line_str_to_free);
+	comment_line_str = comment_line_str_to_free = xstrfmt("%c", *p);
 }
 
 static void prepare_amend_commit(struct commit *commit, struct strbuf *sb,
diff --git a/config.c b/config.c
index 6421894614..205660a8fb 100644
--- a/config.c
+++ b/config.c
@@ -1596,7 +1596,8 @@ static int git_default_core_config(const char *var, const char *value,
 		else if (value[0]) {
 			if (strchr(value, '\n'))
 				return error(_("%s cannot contain newline"), var);
-			comment_line_str = xstrdup(value);
+			comment_line_str = value;
+			FREE_AND_NULL(comment_line_str_to_free);
 			auto_comment_line_char = 0;
 		} else
 			return error(_("%s must have at least one character"), var);
diff --git a/environment.c b/environment.c
index 5cea2c9f54..1d6c48b52d 100644
--- a/environment.c
+++ b/environment.c
@@ -114,6 +114,7 @@ int protect_ntfs = PROTECT_NTFS_DEFAULT;
  * that is subject to stripspace.
  */
 const char *comment_line_str = "#";
+char *comment_line_str_to_free;
 int auto_comment_line_char;
 
 /* Parallel index stat data preload? */
diff --git a/environment.h b/environment.h
index e9f01d4d11..0148738ed6 100644
--- a/environment.h
+++ b/environment.h
@@ -9,6 +9,7 @@ struct strvec;
  * that is subject to stripspace.
  */
 extern const char *comment_line_str;
+extern char *comment_line_str_to_free;
 extern int auto_comment_line_char;
 
 /*
-- 
2.46.0.46.g406f326d27.dirty


^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v4 09/22] builtin/rebase: fix leaking `commit.gpgsign` value
  2024-08-14  6:51 ` [PATCH v4 " Patrick Steinhardt
                     ` (7 preceding siblings ...)
  2024-08-14  6:52   ` [PATCH v4 08/22] config: fix leaking comment character config Patrick Steinhardt
@ 2024-08-14  6:52   ` Patrick Steinhardt
  2024-08-14  6:52   ` [PATCH v4 10/22] builtin/notes: fix leaking `struct notes_tree` when merging notes Patrick Steinhardt
                     ` (12 subsequent siblings)
  21 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-14  6:52 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

In `get_replay_opts()`, we override the `gpg_sign` field that already
got populated by `sequencer_init_config()` in case the user has
"commit.gpgsign" set in their config. This creates a memory leak because
we overwrite the previously assigned value, which may have already
pointed to an allocated string.

Let's plug the memory leak by freeing the value before we overwrite it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
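A minimal sketch of "free before overwrite" for an owned string field.
`struct demo_opts` and the helpers are hypothetical; the counter exists
only so the sketch can show that repeated assignment stays balanced.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct demo_opts {
	char *sign_key; /* owned by the struct, may be NULL */
};

int demo_live_strings; /* strings allocated via demo_dup_or_null() and not yet freed */

char *demo_dup_or_null(const char *s)
{
	size_t len;
	char *copy;

	if (!s)
		return NULL;
	len = strlen(s) + 1;
	copy = malloc(len);
	if (!copy)
		abort();
	memcpy(copy, s, len);
	demo_live_strings++;
	return copy;
}

void demo_free_string(char *s)
{
	if (s)
		demo_live_strings--;
	free(s);
}

/* Release the old value before storing the new one. */
void demo_set_sign_key(struct demo_opts *o, const char *key)
{
	char *copy;

	assert(o);
	copy = demo_dup_or_null(key);
	demo_free_string(o->sign_key);
	o->sign_key = copy;
}
```

Calling the setter twice without the `demo_free_string()` line would leave
the first copy unreachable, which is the kind of leak fixed here.
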
 builtin/rebase.c              | 1 +
 sequencer.c                   | 1 +
 t/t3404-rebase-interactive.sh | 1 +
 t/t3435-rebase-gpg-sign.sh    | 1 +
 t/t7030-verify-tag.sh         | 1 +
 5 files changed, 5 insertions(+)

diff --git a/builtin/rebase.c b/builtin/rebase.c
index e3a8e74cfc..2f01d5d3a6 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -186,6 +186,7 @@ static struct replay_opts get_replay_opts(const struct rebase_options *opts)
 	replay.committer_date_is_author_date =
 					opts->committer_date_is_author_date;
 	replay.ignore_date = opts->ignore_date;
+	free(replay.gpg_sign);
 	replay.gpg_sign = xstrdup_or_null(opts->gpg_sign_opt);
 	replay.reflog_action = xstrdup(opts->reflog_action);
 	if (opts->strategy)
diff --git a/sequencer.c b/sequencer.c
index 0291920f0b..cade9b0ca8 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -303,6 +303,7 @@ static int git_sequencer_config(const char *k, const char *v,
 	}
 
 	if (!strcmp(k, "commit.gpgsign")) {
+		free(opts->gpg_sign);
 		opts->gpg_sign = git_config_bool(k, v) ? xstrdup("") : NULL;
 		return 0;
 	}
diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
index f92baad138..f171af3061 100755
--- a/t/t3404-rebase-interactive.sh
+++ b/t/t3404-rebase-interactive.sh
@@ -26,6 +26,7 @@ Initial setup:
  touch file "conflict".
 '
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 . "$TEST_DIRECTORY"/lib-rebase.sh
diff --git a/t/t3435-rebase-gpg-sign.sh b/t/t3435-rebase-gpg-sign.sh
index 6aa2aeb628..6e329fea7c 100755
--- a/t/t3435-rebase-gpg-sign.sh
+++ b/t/t3435-rebase-gpg-sign.sh
@@ -8,6 +8,7 @@ test_description='test rebase --[no-]gpg-sign'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 . "$TEST_DIRECTORY/lib-rebase.sh"
 . "$TEST_DIRECTORY/lib-gpg.sh"
diff --git a/t/t7030-verify-tag.sh b/t/t7030-verify-tag.sh
index 6f526c37c2..effa826744 100755
--- a/t/t7030-verify-tag.sh
+++ b/t/t7030-verify-tag.sh
@@ -4,6 +4,7 @@ test_description='signed tag tests'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 . "$TEST_DIRECTORY/lib-gpg.sh"
 
-- 
2.46.0.46.g406f326d27.dirty


^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v4 10/22] builtin/notes: fix leaking `struct notes_tree` when merging notes
  2024-08-14  6:51 ` [PATCH v4 " Patrick Steinhardt
                     ` (8 preceding siblings ...)
  2024-08-14  6:52   ` [PATCH v4 09/22] builtin/rebase: fix leaking `commit.gpgsign` value Patrick Steinhardt
@ 2024-08-14  6:52   ` Patrick Steinhardt
  2024-08-14  6:52   ` [PATCH v4 11/22] builtin/fast-import: plug trivial memory leaks Patrick Steinhardt
                     ` (11 subsequent siblings)
  21 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-14  6:52 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

We allocate a `struct notes_tree` in `merge_commit()` which we then
initialize via `init_notes()`. It's not really necessary to allocate the
structure though given that we never pass ownership to the caller.
Furthermore, the allocation leads to a memory leak because despite its
name, `free_notes()` doesn't free the `notes_tree` but only clears it.

Fix this issue by converting the code to use an on-stack variable.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
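The distinction between clearing a structure and freeing it can be sketched
as below. `struct demo_tree` and its functions are hypothetical and far
simpler than `struct notes_tree`; the point is only that a "clear" style
function releases the members, not the container.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct demo_tree {
	char *ref;
	int initialized;
};

void demo_tree_init(struct demo_tree *t, const char *ref)
{
	size_t len;

	assert(t && ref);
	len = strlen(ref) + 1;
	t->ref = malloc(len);
	if (!t->ref)
		abort();
	memcpy(t->ref, ref, len);
	t->initialized = 1;
}

/* Frees what the tree owns and zeroes it. Does not free `t` itself. */
void demo_tree_clear(struct demo_tree *t)
{
	free(t->ref);
	memset(t, 0, sizeof(*t));
}

/* Uses a tree that never leaves this function, so it can live on the stack. */
int demo_use_tree(const char *ref)
{
	struct demo_tree t = { 0 };
	int len;

	demo_tree_init(&t, ref);
	len = (int)strlen(t.ref);
	demo_tree_clear(&t);
	return len;
}
```

With a heap-allocated `t`, the caller would need an extra `free(t)` after
clearing it. The on-stack variant has no container to forget.
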
 builtin/notes.c                       | 9 ++++-----
 t/t3310-notes-merge-manual-resolve.sh | 1 +
 t/t3311-notes-merge-fanout.sh         | 1 +
 3 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/builtin/notes.c b/builtin/notes.c
index d9c356e354..81cbaeec6b 100644
--- a/builtin/notes.c
+++ b/builtin/notes.c
@@ -807,7 +807,7 @@ static int merge_commit(struct notes_merge_options *o)
 {
 	struct strbuf msg = STRBUF_INIT;
 	struct object_id oid, parent_oid;
-	struct notes_tree *t;
+	struct notes_tree t = {0};
 	struct commit *partial;
 	struct pretty_print_context pretty_ctx;
 	void *local_ref_to_free;
@@ -830,8 +830,7 @@ static int merge_commit(struct notes_merge_options *o)
 	else
 		oidclr(&parent_oid, the_repository->hash_algo);
 
-	CALLOC_ARRAY(t, 1);
-	init_notes(t, "NOTES_MERGE_PARTIAL", combine_notes_overwrite, 0);
+	init_notes(&t, "NOTES_MERGE_PARTIAL", combine_notes_overwrite, 0);
 
 	o->local_ref = local_ref_to_free =
 		refs_resolve_refdup(get_main_ref_store(the_repository),
@@ -839,7 +838,7 @@ static int merge_commit(struct notes_merge_options *o)
 	if (!o->local_ref)
 		die(_("failed to resolve NOTES_MERGE_REF"));
 
-	if (notes_merge_commit(o, t, partial, &oid))
+	if (notes_merge_commit(o, &t, partial, &oid))
 		die(_("failed to finalize notes merge"));
 
 	/* Reuse existing commit message in reflog message */
@@ -853,7 +852,7 @@ static int merge_commit(struct notes_merge_options *o)
 			is_null_oid(&parent_oid) ? NULL : &parent_oid,
 			0, UPDATE_REFS_DIE_ON_ERR);
 
-	free_notes(t);
+	free_notes(&t);
 	strbuf_release(&msg);
 	ret = merge_abort(o);
 	free(local_ref_to_free);
diff --git a/t/t3310-notes-merge-manual-resolve.sh b/t/t3310-notes-merge-manual-resolve.sh
index 597df5ebc0..04866b89be 100755
--- a/t/t3310-notes-merge-manual-resolve.sh
+++ b/t/t3310-notes-merge-manual-resolve.sh
@@ -5,6 +5,7 @@
 
 test_description='Test notes merging with manual conflict resolution'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 # Set up a notes merge scenario with different kinds of conflicts
diff --git a/t/t3311-notes-merge-fanout.sh b/t/t3311-notes-merge-fanout.sh
index 5b675417e9..ce4144db0f 100755
--- a/t/t3311-notes-merge-fanout.sh
+++ b/t/t3311-notes-merge-fanout.sh
@@ -5,6 +5,7 @@
 
 test_description='Test notes merging at various fanout levels'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 verify_notes () {
-- 
2.46.0.46.g406f326d27.dirty


^ permalink raw reply related	[flat|nested] 146+ messages in thread

* [PATCH v4 11/22] builtin/fast-import: plug trivial memory leaks
  2024-08-14  6:51 ` [PATCH v4 " Patrick Steinhardt
                     ` (9 preceding siblings ...)
  2024-08-14  6:52   ` [PATCH v4 10/22] builtin/notes: fix leaking `struct notes_tree` when merging notes Patrick Steinhardt
@ 2024-08-14  6:52   ` Patrick Steinhardt
  2024-08-14  6:52   ` [PATCH v4 12/22] builtin/fast-export: fix leaking diff options Patrick Steinhardt
                     ` (10 subsequent siblings)
  21 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-14  6:52 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

Plug some trivial memory leaks in git-fast-import(1).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/fast-import.c        | 8 ++++++--
 t/t9300-fast-import.sh       | 1 +
 t/t9304-fast-import-marks.sh | 2 ++
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/builtin/fast-import.c b/builtin/fast-import.c
index d21c4053a7..6dfeb01665 100644
--- a/builtin/fast-import.c
+++ b/builtin/fast-import.c
@@ -206,8 +206,8 @@ static unsigned int object_entry_alloc = 5000;
 static struct object_entry_pool *blocks;
 static struct hashmap object_table;
 static struct mark_set *marks;
-static const char *export_marks_file;
-static const char *import_marks_file;
+static char *export_marks_file;
+static char *import_marks_file;
 static int import_marks_file_from_stream;
 static int import_marks_file_ignore_missing;
 static int import_marks_file_done;
@@ -3274,6 +3274,7 @@ static void option_import_marks(const char *marks,
 			read_marks();
 	}
 
+	free(import_marks_file);
 	import_marks_file = make_fast_import_path(marks);
 	import_marks_file_from_stream = from_stream;
 	import_marks_file_ignore_missing = ignore_missing;
@@ -3316,6 +3317,7 @@ static void option_active_branches(const char *branches)
 
 static void option_export_marks(const char *marks)
 {
+	free(export_marks_file);
 	export_marks_file = make_fast_import_path(marks);
 }
 
@@ -3357,6 +3359,8 @@ static void option_rewrite_submodules(const char *arg, struct string_list *list)
 	free(f);
 
 	string_list_insert(list, s)->util = ms;
+
+	free(s);
 }
 
 static int parse_one_option(const char *option)
diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
index 1e68426852..3b3c371740 100755
--- a/t/t9300-fast-import.sh
+++ b/t/t9300-fast-import.sh
@@ -7,6 +7,7 @@ test_description='test git fast-import utility'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-diff.sh ;# test-lib chdir's into trash
 
diff --git a/t/t9304-fast-import-marks.sh b/t/t9304-fast-import-marks.sh
index 410a871c52..1f776a80f3 100755
--- a/t/t9304-fast-import-marks.sh
+++ b/t/t9304-fast-import-marks.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='test exotic situations with marks'
+
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'setup dump of basic history' '
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH v4 12/22] builtin/fast-export: fix leaking diff options
  2024-08-14  6:51 ` [PATCH v4 " Patrick Steinhardt
                     ` (10 preceding siblings ...)
  2024-08-14  6:52   ` [PATCH v4 11/22] builtin/fast-import: plug trivial memory leaks Patrick Steinhardt
@ 2024-08-14  6:52   ` Patrick Steinhardt
  2024-08-14  6:52   ` [PATCH v4 13/22] builtin/fast-export: plug leaking tag names Patrick Steinhardt
                     ` (9 subsequent siblings)
  21 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-14  6:52 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

Before calling `handle_commit()` in a loop, we set `diffopt.no_free`
such that its contents aren't getting freed inside of `handle_commit()`.
We never unset that flag though, which means that the structure's
allocated resources will ultimately leak.

Fix this by unsetting the flag after the loop such that we release its
resources via `release_revisions()`.
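
As a rough illustration of the pattern, here is a minimal sketch. The
`struct opts` and its helpers are hypothetical stand-ins, not Git's
actual diff options API. The flag keeps the resources alive while the
loop runs, and only once it is cleared does the final release free
anything.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the diff options; not Git's real struct. */
struct opts {
	char *buf;   /* resource owned by the options */
	int no_free; /* while set, opts_release() keeps `buf` alive */
};

static void opts_release(struct opts *o)
{
	if (o->no_free)
		return;
	free(o->buf);
	o->buf = NULL;
}

/* Stand-in for a per-commit handler that releases the options itself. */
static void handle_one(struct opts *o)
{
	opts_release(o);
}

/* Returns 1 if the final release freed `buf`, 0 if it was kept. */
static int run(int unset_flag_after_loop)
{
	struct opts o = { NULL, 0 };
	int i, released;

	o.buf = malloc(16);
	if (!o.buf)
		abort();

	o.no_free = 1;
	for (i = 0; i < 3; i++)
		handle_one(&o);
	assert(o.buf); /* the flag kept the buffer alive across the loop */

	if (unset_flag_after_loop)
		o.no_free = 0;
	opts_release(&o);

	released = !o.buf;
	free(o.buf); /* demo cleanup for the case where the flag stayed set */
	return released;
}
```

Calling `run(0)` models the old behaviour, where the final release is
silently skipped, whereas `run(1)` models the fixed one.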

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/fast-export.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index 4b6e8c6832..fe92d2436c 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -1278,9 +1278,11 @@ int cmd_fast_export(int argc, const char **argv, const char *prefix)
 	revs.diffopt.format_callback = show_filemodify;
 	revs.diffopt.format_callback_data = &paths_of_changed_objects;
 	revs.diffopt.flags.recursive = 1;
+
 	revs.diffopt.no_free = 1;
 	while ((commit = get_revision(&revs)))
 		handle_commit(commit, &revs, &paths_of_changed_objects);
+	revs.diffopt.no_free = 0;
 
 	handle_tags_and_duplicates(&extra_refs);
 	handle_tags_and_duplicates(&tag_refs);
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH v4 13/22] builtin/fast-export: plug leaking tag names
  2024-08-14  6:51 ` [PATCH v4 " Patrick Steinhardt
                     ` (11 preceding siblings ...)
  2024-08-14  6:52   ` [PATCH v4 12/22] builtin/fast-export: fix leaking diff options Patrick Steinhardt
@ 2024-08-14  6:52   ` Patrick Steinhardt
  2024-08-14  6:52   ` [PATCH v4 14/22] merge-ort: unconditionally release attributes index Patrick Steinhardt
                     ` (8 subsequent siblings)
  21 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-14  6:52 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

When resolving revisions in `get_tags_and_duplicates()`, we only
partially manage the lifetime of `full_name`. In fact, managing its
lifetime properly is almost impossible because we put direct pointers to
that variable into multiple lists without duplicating the string. The
consequence is that these strings will ultimately leak.

Refactor the code to make the lists we put those names into duplicate
the memory. This allows us to properly free the string as required and
thus plugs the memory leak.

While this requires us to allocate more data overall, it shouldn't be
all that bad given that the number of allocations corresponds with the
number of command line parameters, which typically aren't all that many.
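
The difference between a borrowing and a duplicating list can be
sketched roughly as follows. `struct name_list` is a hypothetical,
much simplified stand-in for `struct string_list`, and all names here
are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified list; Git's `struct string_list` is richer. */
struct name_list {
	char **items;
	size_t nr, alloc;
	int dup; /* if set, the list copies strings and frees them on clear */
};

static void name_list_append(struct name_list *l, const char *name)
{
	char *item = (char *)name; /* borrowed pointer unless `dup` is set */

	assert(l && name);
	if (l->nr == l->alloc) {
		l->alloc = l->alloc ? 2 * l->alloc : 4;
		l->items = realloc(l->items, l->alloc * sizeof(*l->items));
		if (!l->items)
			abort();
	}
	if (l->dup) {
		size_t len = strlen(name) + 1;
		item = malloc(len);
		if (!item)
			abort();
		memcpy(item, name, len);
	}
	l->items[l->nr++] = item;
}

static void name_list_clear(struct name_list *l)
{
	size_t i;

	if (l->dup)
		for (i = 0; i < l->nr; i++)
			free(l->items[i]);
	free(l->items);
	l->items = NULL;
	l->nr = l->alloc = 0;
}

/* Returns 1 if the list still holds a valid copy after the caller freed its string. */
static int dup_list_outlives_caller_string(void)
{
	struct name_list refs = { NULL, 0, 0, 1 };
	char *full_name = malloc(16);
	int ok;

	if (!full_name)
		abort();
	strcpy(full_name, "refs/tags/v1.0");
	name_list_append(&refs, full_name);
	free(full_name); /* safe: the list owns its own copy */

	ok = refs.nr == 1 && !strcmp(refs.items[0], "refs/tags/v1.0");
	name_list_clear(&refs);
	return ok;
}
```

With `dup` unset, the same `free(full_name)` would leave a dangling
pointer in the list, which is why the caller previously could not free
the string at all.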

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/fast-export.c            | 17 ++++++++++++-----
 t/t9351-fast-export-anonymize.sh |  1 +
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index fe92d2436c..f253b79322 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -42,8 +42,8 @@ static int full_tree;
 static int reference_excluded_commits;
 static int show_original_ids;
 static int mark_tags;
-static struct string_list extra_refs = STRING_LIST_INIT_NODUP;
-static struct string_list tag_refs = STRING_LIST_INIT_NODUP;
+static struct string_list extra_refs = STRING_LIST_INIT_DUP;
+static struct string_list tag_refs = STRING_LIST_INIT_DUP;
 static struct refspec refspecs = REFSPEC_INIT_FETCH;
 static int anonymize;
 static struct hashmap anonymized_seeds;
@@ -901,7 +901,7 @@ static void handle_tag(const char *name, struct tag *tag)
 	free(buf);
 }
 
-static struct commit *get_commit(struct rev_cmdline_entry *e, char *full_name)
+static struct commit *get_commit(struct rev_cmdline_entry *e, const char *full_name)
 {
 	switch (e->item->type) {
 	case OBJ_COMMIT:
@@ -932,14 +932,16 @@ static void get_tags_and_duplicates(struct rev_cmdline_info *info)
 		struct rev_cmdline_entry *e = info->rev + i;
 		struct object_id oid;
 		struct commit *commit;
-		char *full_name;
+		char *full_name = NULL;
 
 		if (e->flags & UNINTERESTING)
 			continue;
 
 		if (repo_dwim_ref(the_repository, e->name, strlen(e->name),
-				  &oid, &full_name, 0) != 1)
+				  &oid, &full_name, 0) != 1) {
+			free(full_name);
 			continue;
+		}
 
 		if (refspecs.nr) {
 			char *private;
@@ -955,6 +957,7 @@ static void get_tags_and_duplicates(struct rev_cmdline_info *info)
 			warning("%s: Unexpected object of type %s, skipping.",
 				e->name,
 				type_name(e->item->type));
+			free(full_name);
 			continue;
 		}
 
@@ -963,10 +966,12 @@ static void get_tags_and_duplicates(struct rev_cmdline_info *info)
 			break;
 		case OBJ_BLOB:
 			export_blob(&commit->object.oid);
+			free(full_name);
 			continue;
 		default: /* OBJ_TAG (nested tags) is already handled */
 			warning("Tag points to object of unexpected type %s, skipping.",
 				type_name(commit->object.type));
+			free(full_name);
 			continue;
 		}
 
@@ -979,6 +984,8 @@ static void get_tags_and_duplicates(struct rev_cmdline_info *info)
 
 		if (!*revision_sources_at(&revision_sources, commit))
 			*revision_sources_at(&revision_sources, commit) = full_name;
+		else
+			free(full_name);
 	}
 
 	string_list_sort(&extra_refs);
diff --git a/t/t9351-fast-export-anonymize.sh b/t/t9351-fast-export-anonymize.sh
index 156a647484..c0d9d7be75 100755
--- a/t/t9351-fast-export-anonymize.sh
+++ b/t/t9351-fast-export-anonymize.sh
@@ -4,6 +4,7 @@ test_description='basic tests for fast-export --anonymize'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'setup simple repo' '
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH v4 14/22] merge-ort: unconditionally release attributes index
  2024-08-14  6:51 ` [PATCH v4 " Patrick Steinhardt
                     ` (12 preceding siblings ...)
  2024-08-14  6:52   ` [PATCH v4 13/22] builtin/fast-export: plug leaking tag names Patrick Steinhardt
@ 2024-08-14  6:52   ` Patrick Steinhardt
  2024-08-14  6:52   ` [PATCH v4 15/22] sequencer: release todo list on error paths Patrick Steinhardt
                     ` (7 subsequent siblings)
  21 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-14  6:52 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

We conditionally release the index used for reading gitattributes in
merge-ort based on whether the index has been populated. This check
uses `cache_nr` as a condition. This isn't sufficient though, as the
variable may be zero even when some other parts of the index have been
populated. This leads to memory leaks when sparse checkouts are in use,
as we may not end up releasing the sparse checkout patterns.

Fix this issue by unconditionally releasing the index.
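
A minimal sketch of why an entry count is a poor guard for cleanup.
The `struct attr_index` below is a hypothetical, simplified stand-in
and does not mirror Git's `struct index_state`; it only assumes that
some member can hold memory while the entry count is still zero.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified index; not Git's `struct index_state`. */
struct attr_index {
	int *cache;
	size_t cache_nr;
	char *sparse_patterns; /* may be populated even when cache_nr == 0 */
};

/* Safe to call on an empty or zero-initialized index. */
static void attr_index_discard(struct attr_index *idx)
{
	free(idx->cache);
	free(idx->sparse_patterns);
	memset(idx, 0, sizeof(*idx));
}

/* Returns 1 if the sparse patterns were released, 0 if they were skipped. */
static int cleanup(int unconditional)
{
	struct attr_index idx = { NULL, 0, NULL };
	int released;

	idx.sparse_patterns = malloc(8);
	if (!idx.sparse_patterns)
		abort();
	assert(idx.cache_nr == 0); /* no entries, yet memory is held */

	if (unconditional || idx.cache_nr)
		attr_index_discard(&idx);

	released = !idx.sparse_patterns;
	attr_index_discard(&idx); /* demo cleanup; a no-op if already discarded */
	return released;
}
```

Because the discard function copes with an empty structure, dropping
the guard costs nothing in the common case.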

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 merge-ort.c                       | 3 +--
 t/t3507-cherry-pick-conflict.sh   | 1 +
 t/t6421-merge-partial-clone.sh    | 1 +
 t/t6428-merge-conflicts-sparse.sh | 1 +
 t/t7817-grep-sparse-checkout.sh   | 1 +
 5 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index e9d01ac7f7..3752c7e595 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -689,8 +689,7 @@ static void clear_or_reinit_internal_opts(struct merge_options_internal *opti,
 	 */
 	strmap_clear_func(&opti->conflicted, 0);
 
-	if (opti->attr_index.cache_nr) /* true iff opt->renormalize */
-		discard_index(&opti->attr_index);
+	discard_index(&opti->attr_index);
 
 	/* Free memory used by various renames maps */
 	for (i = MERGE_SIDE1; i <= MERGE_SIDE2; ++i) {
diff --git a/t/t3507-cherry-pick-conflict.sh b/t/t3507-cherry-pick-conflict.sh
index f3947b400a..10e9c91dbb 100755
--- a/t/t3507-cherry-pick-conflict.sh
+++ b/t/t3507-cherry-pick-conflict.sh
@@ -13,6 +13,7 @@ GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
 TEST_CREATE_REPO_NO_TEMPLATE=1
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 pristine_detach () {
diff --git a/t/t6421-merge-partial-clone.sh b/t/t6421-merge-partial-clone.sh
index 711b709e75..020375c805 100755
--- a/t/t6421-merge-partial-clone.sh
+++ b/t/t6421-merge-partial-clone.sh
@@ -26,6 +26,7 @@ test_description="limiting blob downloads when merging with partial clones"
 #                     underscore notation is to differentiate different
 #                     files that might be renamed into each other's paths.)
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-merge.sh
 
diff --git a/t/t6428-merge-conflicts-sparse.sh b/t/t6428-merge-conflicts-sparse.sh
index 9919c3fa7c..8a79bc2e92 100755
--- a/t/t6428-merge-conflicts-sparse.sh
+++ b/t/t6428-merge-conflicts-sparse.sh
@@ -22,6 +22,7 @@ test_description="merge cases"
 #                     underscore notation is to differentiate different
 #                     files that might be renamed into each other's paths.)
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-merge.sh
 
diff --git a/t/t7817-grep-sparse-checkout.sh b/t/t7817-grep-sparse-checkout.sh
index eb59564565..0ba7817fb7 100755
--- a/t/t7817-grep-sparse-checkout.sh
+++ b/t/t7817-grep-sparse-checkout.sh
@@ -33,6 +33,7 @@ should leave the following structure in the working tree:
 But note that sub2 should have the SKIP_WORKTREE bit set.
 '
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'setup' '
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH v4 15/22] sequencer: release todo list on error paths
  2024-08-14  6:51 ` [PATCH v4 " Patrick Steinhardt
                     ` (13 preceding siblings ...)
  2024-08-14  6:52   ` [PATCH v4 14/22] merge-ort: unconditionally release attributes index Patrick Steinhardt
@ 2024-08-14  6:52   ` Patrick Steinhardt
  2024-08-14  6:52   ` [PATCH v4 16/22] unpack-trees: clear index when not propagating it Patrick Steinhardt
                     ` (6 subsequent siblings)
  21 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-14  6:52 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

We're not releasing the `todo_list` in `sequencer_pick_revisions()` when
hitting an error path. Restructure the function to have a common exit
path such that we can easily clean up the list and thus plug this memory
leak.
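
The common exit path can be sketched in isolation as below. This is
an illustration under assumed names (`struct todo`, `pick()`,
`fail_step`), not the sequencer's real code: every early return
becomes an assignment to `res` followed by a jump to a single label
that performs the cleanup.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical todo list; not the sequencer's `struct todo_list`. */
struct todo {
	char *buf;
};

static int release_calls; /* counts cleanups, only for the demonstration */

static void todo_release(struct todo *t)
{
	assert(t);
	free(t->buf);
	t->buf = NULL;
	release_calls++;
}

/* `fail_step` selects which step fails; 0 means every step succeeds. */
static int pick(int fail_step)
{
	struct todo list = { NULL };
	int res;

	if (fail_step == 1) {
		res = -1; /* fails before anything was allocated */
		goto out;
	}

	list.buf = malloc(32);
	if (!list.buf) {
		res = -1;
		goto out;
	}

	if (fail_step == 2) {
		res = -2; /* fails while the list holds memory */
		goto out;
	}

	res = 0;

out:
	todo_release(&list);
	return res;
}
```

This only works because the list is initialized before the first
`goto out` and because releasing an empty list is harmless.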

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 sequencer.c                     | 66 +++++++++++++++++++++++----------
 t/t3510-cherry-pick-sequence.sh |  1 +
 2 files changed, 48 insertions(+), 19 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index cade9b0ca8..ea559c31f1 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -5490,8 +5490,10 @@ int sequencer_pick_revisions(struct repository *r,
 	int i, res;
 
 	assert(opts->revs);
-	if (read_and_refresh_cache(r, opts))
-		return -1;
+	if (read_and_refresh_cache(r, opts)) {
+		res = -1;
+		goto out;
+	}
 
 	for (i = 0; i < opts->revs->pending.nr; i++) {
 		struct object_id oid;
@@ -5506,11 +5508,14 @@ int sequencer_pick_revisions(struct repository *r,
 				enum object_type type = oid_object_info(r,
 									&oid,
 									NULL);
-				return error(_("%s: can't cherry-pick a %s"),
-					name, type_name(type));
+				res = error(_("%s: can't cherry-pick a %s"),
+					    name, type_name(type));
+				goto out;
 			}
-		} else
-			return error(_("%s: bad revision"), name);
+		} else {
+			res = error(_("%s: bad revision"), name);
+			goto out;
+		}
 	}
 
 	/*
@@ -5525,14 +5530,23 @@ int sequencer_pick_revisions(struct repository *r,
 	    opts->revs->no_walk &&
 	    !opts->revs->cmdline.rev->flags) {
 		struct commit *cmit;
-		if (prepare_revision_walk(opts->revs))
-			return error(_("revision walk setup failed"));
+
+		if (prepare_revision_walk(opts->revs)) {
+			res = error(_("revision walk setup failed"));
+			goto out;
+		}
+
 		cmit = get_revision(opts->revs);
-		if (!cmit)
-			return error(_("empty commit set passed"));
+		if (!cmit) {
+			res = error(_("empty commit set passed"));
+			goto out;
+		}
+
 		if (get_revision(opts->revs))
 			BUG("unexpected extra commit from walk");
-		return single_pick(r, cmit, opts);
+
+		res = single_pick(r, cmit, opts);
+		goto out;
 	}
 
 	/*
@@ -5542,16 +5556,30 @@ int sequencer_pick_revisions(struct repository *r,
 	 */
 
 	if (walk_revs_populate_todo(&todo_list, opts) ||
-			create_seq_dir(r) < 0)
-		return -1;
-	if (repo_get_oid(r, "HEAD", &oid) && (opts->action == REPLAY_REVERT))
-		return error(_("can't revert as initial commit"));
-	if (save_head(oid_to_hex(&oid)))
-		return -1;
-	if (save_opts(opts))
-		return -1;
+			create_seq_dir(r) < 0) {
+		res = -1;
+		goto out;
+	}
+
+	if (repo_get_oid(r, "HEAD", &oid) && (opts->action == REPLAY_REVERT)) {
+		res = error(_("can't revert as initial commit"));
+		goto out;
+	}
+
+	if (save_head(oid_to_hex(&oid))) {
+		res = -1;
+		goto out;
+	}
+
+	if (save_opts(opts)) {
+		res = -1;
+		goto out;
+	}
+
 	update_abort_safety_file();
 	res = pick_commits(r, &todo_list, opts);
+
+out:
 	todo_list_release(&todo_list);
 	return res;
 }
diff --git a/t/t3510-cherry-pick-sequence.sh b/t/t3510-cherry-pick-sequence.sh
index 7eb52b12ed..93c725bac3 100755
--- a/t/t3510-cherry-pick-sequence.sh
+++ b/t/t3510-cherry-pick-sequence.sh
@@ -12,6 +12,7 @@ test_description='Test cherry-pick continuation features
 
 '
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 # Repeat first match 10 times
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH v4 16/22] unpack-trees: clear index when not propagating it
  2024-08-14  6:51 ` [PATCH v4 " Patrick Steinhardt
                     ` (14 preceding siblings ...)
  2024-08-14  6:52   ` [PATCH v4 15/22] sequencer: release todo list on error paths Patrick Steinhardt
@ 2024-08-14  6:52   ` Patrick Steinhardt
  2024-08-14  6:52   ` [PATCH v4 17/22] diff: fix leak when parsing invalid ignore regex option Patrick Steinhardt
                     ` (5 subsequent siblings)
  21 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-14  6:52 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

When provided a pointer to a destination index, `unpack_trees()`
will end up copying its `o->internal.result` index into the provided
pointer. In those cases it is thus not necessary to free the index, as
we have transferred ownership of it.

There are cases though where we do not end up transferring ownership of
the memory, but `clear_unpack_trees_porcelain()` will never discard the
index in that case and thus cause a memory leak. And right now it cannot
do so in the first place because we have no indicator of whether we did
or didn't transfer ownership of the index.

Adapt the code to zero out the index in case we transfer its ownership.
Like this, we can now unconditionally discard the index when being asked
to clear the `unpack_trees_options`.
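
The idea of zeroing out the source after a transfer can be sketched
as follows. `struct idx` and its helpers are hypothetical and only
assume that discarding a zeroed structure is a no-op.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified index; not Git's `struct index_state`. */
struct idx {
	int *entries;
	size_t nr;
};

static void idx_discard(struct idx *i)
{
	free(i->entries);
	memset(i, 0, sizeof(*i));
}

/* Hand `src` over to `dst` and zero `src` so it no longer owns anything. */
static void idx_move(struct idx *dst, struct idx *src)
{
	assert(dst != src);
	idx_discard(dst);
	*dst = *src;
	memset(src, 0, sizeof(*src));
}

/* Returns 1 if `dst` ended up with the entries and `result` with none. */
static int move_then_discard(void)
{
	struct idx result = { NULL, 0 }, dst = { NULL, 0 };
	int ok;

	result.entries = calloc(4, sizeof(*result.entries));
	if (!result.entries)
		abort();
	result.nr = 4;

	idx_move(&dst, &result);
	idx_discard(&result); /* unconditional, and not a double free */

	ok = dst.nr == 4 && dst.entries && !result.entries;
	idx_discard(&dst);
	return ok;
}
```

Without the `memset()` in `idx_move()`, the unconditional discard of
`result` would free memory that `dst` still points to.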

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 t/t3705-add-sparse-checkout.sh | 1 +
 unpack-trees.c                 | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/t/t3705-add-sparse-checkout.sh b/t/t3705-add-sparse-checkout.sh
index 2bade9e804..6ae45a788d 100755
--- a/t/t3705-add-sparse-checkout.sh
+++ b/t/t3705-add-sparse-checkout.sh
@@ -2,6 +2,7 @@
 
 test_description='git add in sparse checked out working trees'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 SPARSE_ENTRY_BLOB=""
diff --git a/unpack-trees.c b/unpack-trees.c
index 7dc884fafd..9a55cb6204 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -210,6 +210,7 @@ void clear_unpack_trees_porcelain(struct unpack_trees_options *opts)
 {
 	strvec_clear(&opts->internal.msgs_to_free);
 	memset(opts->internal.msgs, 0, sizeof(opts->internal.msgs));
+	discard_index(&opts->internal.result);
 }
 
 static int do_add_entry(struct unpack_trees_options *o, struct cache_entry *ce,
@@ -2082,6 +2083,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 		o->internal.result.updated_workdir = 1;
 		discard_index(o->dst_index);
 		*o->dst_index = o->internal.result;
+		memset(&o->internal.result, 0, sizeof(o->internal.result));
 	} else {
 		discard_index(&o->internal.result);
 	}
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH v4 17/22] diff: fix leak when parsing invalid ignore regex option
  2024-08-14  6:51 ` [PATCH v4 " Patrick Steinhardt
                     ` (15 preceding siblings ...)
  2024-08-14  6:52   ` [PATCH v4 16/22] unpack-trees: clear index when not propagating it Patrick Steinhardt
@ 2024-08-14  6:52   ` Patrick Steinhardt
  2024-08-14  6:52   ` [PATCH v4 18/22] builtin/format-patch: fix various trivial memory leaks Patrick Steinhardt
                     ` (4 subsequent siblings)
  21 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-14  6:52 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

When parsing invalid ignore regexes passed via the `-I` option we don't
free already-allocated memory, leading to a memory leak. Fix this.
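
Outside of Git, the same fix looks roughly like this with plain POSIX
`regcomp()`. The helper name `compile_ignore_regex()` is made up for
this sketch, and it uses `malloc()` where Git would use `xmalloc()`.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>

/*
 * Returns a compiled regex that the caller must regfree() and free(),
 * or NULL if allocation fails or the pattern is invalid.
 */
static regex_t *compile_ignore_regex(const char *arg)
{
	regex_t *regex;

	assert(arg);
	regex = malloc(sizeof(*regex));
	if (!regex)
		return NULL;

	if (regcomp(regex, arg, REG_EXTENDED | REG_NEWLINE)) {
		free(regex); /* the allocation must not outlive the error */
		return NULL;
	}
	return regex;
}
```

The error path frees only the allocation itself; on success the
caller is expected to pair `regfree()` with `free()`.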

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 diff.c                  | 6 +++++-
 t/t4013-diff-various.sh | 1 +
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/diff.c b/diff.c
index ebb7538e04..9251c47b72 100644
--- a/diff.c
+++ b/diff.c
@@ -5464,9 +5464,13 @@ static int diff_opt_ignore_regex(const struct option *opt,
 	regex_t *regex;
 
 	BUG_ON_OPT_NEG(unset);
+
 	regex = xmalloc(sizeof(*regex));
-	if (regcomp(regex, arg, REG_EXTENDED | REG_NEWLINE))
+	if (regcomp(regex, arg, REG_EXTENDED | REG_NEWLINE)) {
+		free(regex);
 		return error(_("invalid regex given to -I: '%s'"), arg);
+	}
+
 	ALLOC_GROW(options->ignore_regex, options->ignore_regex_nr + 1,
 		   options->ignore_regex_alloc);
 	options->ignore_regex[options->ignore_regex_nr++] = regex;
diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
index 3855d68dbc..87d248d034 100755
--- a/t/t4013-diff-various.sh
+++ b/t/t4013-diff-various.sh
@@ -8,6 +8,7 @@ test_description='Various diff formatting options'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-diff.sh
 
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH v4 18/22] builtin/format-patch: fix various trivial memory leaks
  2024-08-14  6:51 ` [PATCH v4 " Patrick Steinhardt
                     ` (16 preceding siblings ...)
  2024-08-14  6:52   ` [PATCH v4 17/22] diff: fix leak when parsing invalid ignore regex option Patrick Steinhardt
@ 2024-08-14  6:52   ` Patrick Steinhardt
  2024-08-14  6:52   ` [PATCH v4 19/22] userdiff: fix leaking memory for configured diff drivers Patrick Steinhardt
                     ` (3 subsequent siblings)
  21 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-14  6:52 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

There are various memory leaks hit by git-format-patch(1). Basically all
of them are trivial, except that un-setting `diffopt.no_free` requires
us to also reset `diffopt.file` to NULL because we manually close it
already.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/log.c           | 13 ++++++++++---
 t/t4014-format-patch.sh |  1 +
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index a73a767606..f5cb00c643 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1827,12 +1827,14 @@ static struct commit *get_base_commit(const struct format_config *cfg,
 				if (die_on_failure) {
 					die(_("failed to find exact merge base"));
 				} else {
+					free_commit_list(merge_base);
 					free(rev);
 					return NULL;
 				}
 			}
 
 			rev[i] = merge_base->item;
+			free_commit_list(merge_base);
 		}
 
 		if (rev_nr % 2)
@@ -2023,6 +2025,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 	const char *rfc = NULL;
 	int creation_factor = -1;
 	const char *signature = git_version_string;
+	char *signature_to_free = NULL;
 	char *signature_file_arg = NULL;
 	struct keep_callback_data keep_callback_data = {
 		.cfg = &cfg,
@@ -2443,7 +2446,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 
 		if (strbuf_read_file(&buf, signature_file, 128) < 0)
 			die_errno(_("unable to read signature file '%s'"), signature_file);
-		signature = strbuf_detach(&buf, NULL);
+		signature = signature_to_free = strbuf_detach(&buf, NULL);
 	} else if (cfg.signature) {
 		signature = cfg.signature;
 	}
@@ -2548,12 +2551,13 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 			else
 				print_signature(signature, rev.diffopt.file);
 		}
-		if (output_directory)
+		if (output_directory) {
 			fclose(rev.diffopt.file);
+			rev.diffopt.file = NULL;
+		}
 	}
 	stop_progress(&progress);
 	free(list);
-	free(branch_name);
 	if (ignore_if_in_upstream)
 		free_patch_ids(&ids);
 
@@ -2565,11 +2569,14 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 	strbuf_release(&rdiff_title);
 	free(description_file);
 	free(signature_file_arg);
+	free(signature_to_free);
+	free(branch_name);
 	free(to_free);
 	free(rev.message_id);
 	if (rev.ref_message_ids)
 		string_list_clear(rev.ref_message_ids, 0);
 	free(rev.ref_message_ids);
+	rev.diffopt.no_free = 0;
 	release_revisions(&rev);
 	format_config_release(&cfg);
 	return 0;
diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
index 884f83fb8a..1c46e963e4 100755
--- a/t/t4014-format-patch.sh
+++ b/t/t4014-format-patch.sh
@@ -8,6 +8,7 @@ test_description='various format-patch tests'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-terminal.sh
 
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH v4 19/22] userdiff: fix leaking memory for configured diff drivers
  2024-08-14  6:51 ` [PATCH v4 " Patrick Steinhardt
                     ` (17 preceding siblings ...)
  2024-08-14  6:52   ` [PATCH v4 18/22] builtin/format-patch: fix various trivial memory leaks Patrick Steinhardt
@ 2024-08-14  6:52   ` Patrick Steinhardt
  2024-08-14  6:52   ` [PATCH v4 20/22] builtin/log: fix leak when showing converted blob contents Patrick Steinhardt
                     ` (2 subsequent siblings)
  21 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-14  6:52 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

The userdiff structures may be initialized either statically or
dynamically via configuration keys. In the latter case we end up leaking
memory because we didn't have any infrastructure to distinguish strings
which have been allocated statically from those which have been
allocated dynamically.

Refactor the code such that we have two pointers for each of these
strings: one that holds the value as accessed by other subsystems, and
one that points to the same string in case it has been allocated. Like
this, we can safely free the second pointer and thus plug those memory
leaks.
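
A minimal sketch of the two-pointer scheme, using a hypothetical
`struct driver` with a single string member. The helper names are
assumptions for illustration and do not exist in userdiff.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical driver with one string; not Git's `struct userdiff_driver`. */
struct driver {
	const char *pattern; /* what readers use; may point at a literal */
	char *pattern_owned; /* the same string if heap-allocated, else NULL */
};

/* Replace the pattern with a heap copy of `value`; returns 0 on success. */
static int driver_set_pattern(struct driver *d, const char *value)
{
	size_t len;
	char *copy;

	assert(d && value);
	len = strlen(value) + 1;
	copy = malloc(len);
	if (!copy)
		return -1;
	memcpy(copy, value, len);

	free(d->pattern_owned); /* only ever frees what we allocated */
	d->pattern_owned = copy;
	d->pattern = copy;
	return 0;
}

static void driver_release(struct driver *d)
{
	free(d->pattern_owned);
	d->pattern_owned = NULL;
	d->pattern = NULL;
}
```

A statically initialized driver leaves `pattern_owned` as NULL, so
`driver_release()` never attempts to free a string literal.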

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 range-diff.c                     |  6 +++--
 t/t4018-diff-funcname.sh         |  1 +
 t/t4042-diff-textconv-caching.sh |  2 ++
 t/t4048-diff-combined-binary.sh  |  1 +
 t/t4209-log-pickaxe.sh           |  2 ++
 userdiff.c                       | 38 ++++++++++++++++++++++++--------
 userdiff.h                       |  4 ++++
 7 files changed, 43 insertions(+), 11 deletions(-)

diff --git a/range-diff.c b/range-diff.c
index 5f01605550..bbb0952264 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -450,8 +450,10 @@ static void output_pair_header(struct diff_options *diffopt,
 }
 
 static struct userdiff_driver section_headers = {
-	.funcname = { "^ ## (.*) ##$\n"
-		      "^.?@@ (.*)$", REG_EXTENDED }
+	.funcname = {
+		.pattern = "^ ## (.*) ##$\n^.?@@ (.*)$",
+		.cflags = REG_EXTENDED,
+	},
 };
 
 static struct diff_filespec *get_filespec(const char *name, const char *p)
diff --git a/t/t4018-diff-funcname.sh b/t/t4018-diff-funcname.sh
index e026fac1f4..8128c30e7f 100755
--- a/t/t4018-diff-funcname.sh
+++ b/t/t4018-diff-funcname.sh
@@ -5,6 +5,7 @@
 
 test_description='Test custom diff function name patterns'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'setup' '
diff --git a/t/t4042-diff-textconv-caching.sh b/t/t4042-diff-textconv-caching.sh
index 8ebfa3c1be..a179205394 100755
--- a/t/t4042-diff-textconv-caching.sh
+++ b/t/t4042-diff-textconv-caching.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='test textconv caching'
+
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 cat >helper <<'EOF'
diff --git a/t/t4048-diff-combined-binary.sh b/t/t4048-diff-combined-binary.sh
index 0260cf64f5..f399484bce 100755
--- a/t/t4048-diff-combined-binary.sh
+++ b/t/t4048-diff-combined-binary.sh
@@ -4,6 +4,7 @@ test_description='combined and merge diff handle binary files and textconv'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'setup binary merge conflict' '
diff --git a/t/t4209-log-pickaxe.sh b/t/t4209-log-pickaxe.sh
index 64e1623733..b42fdc54fc 100755
--- a/t/t4209-log-pickaxe.sh
+++ b/t/t4209-log-pickaxe.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='log --grep/--author/--regexp-ignore-case/-S/-G'
+
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_log () {
diff --git a/userdiff.c b/userdiff.c
index c4ebb9ff73..989629149f 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -399,8 +399,11 @@ static struct userdiff_driver *userdiff_find_by_namelen(const char *name, size_t
 static int parse_funcname(struct userdiff_funcname *f, const char *k,
 		const char *v, int cflags)
 {
-	if (git_config_string((char **) &f->pattern, k, v) < 0)
+	f->pattern = NULL;
+	FREE_AND_NULL(f->pattern_owned);
+	if (git_config_string(&f->pattern_owned, k, v) < 0)
 		return -1;
+	f->pattern = f->pattern_owned;
 	f->cflags = cflags;
 	return 0;
 }
@@ -444,20 +447,37 @@ int userdiff_config(const char *k, const char *v)
 		return parse_funcname(&drv->funcname, k, v, REG_EXTENDED);
 	if (!strcmp(type, "binary"))
 		return parse_tristate(&drv->binary, k, v);
-	if (!strcmp(type, "command"))
-		return git_config_string((char **) &drv->external.cmd, k, v);
+	if (!strcmp(type, "command")) {
+		FREE_AND_NULL(drv->external.cmd);
+		return git_config_string(&drv->external.cmd, k, v);
+	}
 	if (!strcmp(type, "trustexitcode")) {
 		drv->external.trust_exit_code = git_config_bool(k, v);
 		return 0;
 	}
-	if (!strcmp(type, "textconv"))
-		return git_config_string((char **) &drv->textconv, k, v);
+	if (!strcmp(type, "textconv")) {
+		int ret;
+		FREE_AND_NULL(drv->textconv_owned);
+		ret = git_config_string(&drv->textconv_owned, k, v);
+		drv->textconv = drv->textconv_owned;
+		return ret;
+	}
 	if (!strcmp(type, "cachetextconv"))
 		return parse_bool(&drv->textconv_want_cache, k, v);
-	if (!strcmp(type, "wordregex"))
-		return git_config_string((char **) &drv->word_regex, k, v);
-	if (!strcmp(type, "algorithm"))
-		return git_config_string((char **) &drv->algorithm, k, v);
+	if (!strcmp(type, "wordregex")) {
+		int ret;
+		FREE_AND_NULL(drv->word_regex_owned);
+		ret = git_config_string(&drv->word_regex_owned, k, v);
+		drv->word_regex = drv->word_regex_owned;
+		return ret;
+	}
+	if (!strcmp(type, "algorithm")) {
+		int ret;
+		FREE_AND_NULL(drv->algorithm_owned);
+		ret = git_config_string(&drv->algorithm_owned, k, v);
+		drv->algorithm = drv->algorithm_owned;
+		return ret;
+	}
 
 	return 0;
 }
diff --git a/userdiff.h b/userdiff.h
index 7565930337..827361b0bc 100644
--- a/userdiff.h
+++ b/userdiff.h
@@ -8,6 +8,7 @@ struct repository;
 
 struct userdiff_funcname {
 	const char *pattern;
+	char *pattern_owned;
 	int cflags;
 };
 
@@ -20,11 +21,14 @@ struct userdiff_driver {
 	const char *name;
 	struct external_diff external;
 	const char *algorithm;
+	char *algorithm_owned;
 	int binary;
 	struct userdiff_funcname funcname;
 	const char *word_regex;
+	char *word_regex_owned;
 	const char *word_regex_multi_byte;
 	const char *textconv;
+	char *textconv_owned;
 	struct notes_cache *textconv_cache;
 	int textconv_want_cache;
 };
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH v4 20/22] builtin/log: fix leak when showing converted blob contents
  2024-08-14  6:51 ` [PATCH v4 " Patrick Steinhardt
                     ` (18 preceding siblings ...)
  2024-08-14  6:52   ` [PATCH v4 19/22] userdiff: fix leaking memory for configured diff drivers Patrick Steinhardt
@ 2024-08-14  6:52   ` Patrick Steinhardt
  2024-08-14  6:52   ` [PATCH v4 21/22] diff: free state populated via options Patrick Steinhardt
  2024-08-14  6:52   ` [PATCH v4 22/22] builtin/diff: free symmetric diff members Patrick Steinhardt
  21 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-14  6:52 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

In `show_blob_object()`, we proactively call `textconv_object()`. In
case we have a textconv driver for this blob we will end up showing the
converted contents, otherwise we'll show the un-converted contents of it
instead.

When the object has been converted we never free the buffer containing
the converted contents. Fix this to plug this memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/log.c            | 1 +
 t/t4030-diff-textconv.sh | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/builtin/log.c b/builtin/log.c
index f5cb00c643..36769bab3b 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -707,6 +707,7 @@ static int show_blob_object(const struct object_id *oid, struct rev_info *rev, c
 
 	write_or_die(1, buf, size);
 	object_context_release(&obj_context);
+	free(buf);
 	return 0;
 }
 
diff --git a/t/t4030-diff-textconv.sh b/t/t4030-diff-textconv.sh
index a39a626664..29f6d610c2 100755
--- a/t/t4030-diff-textconv.sh
+++ b/t/t4030-diff-textconv.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='diff.*.textconv tests'
+
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 find_diff() {
-- 
2.46.0.46.g406f326d27.dirty
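
The ownership rule behind this fix can be sketched in isolation: a conversion helper hands a freshly allocated buffer to its caller, and the caller has to free it once the contents have been written out. The code below is only an illustration under that assumption; `demo_textconv()` and `demo_show_blob()` are hypothetical names, and upper-casing stands in for a real textconv driver.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical converter standing in for a textconv driver. When `enabled`,
 * stores a newly allocated, upper-cased copy of `in` in *out and returns 1.
 * Otherwise returns 0 and leaves *out untouched.
 */
static int demo_textconv(int enabled, const char *in, char **out, size_t *size)
{
	size_t len = strlen(in);
	char *buf;

	if (!enabled)
		return 0;
	buf = malloc(len + 1);
	if (!buf)
		return 0;
	for (size_t i = 0; i < len; i++)
		buf[i] = (char)toupper((unsigned char)in[i]);
	buf[len] = '\0';
	*out = buf;
	*size = len;
	return 1;
}

/*
 * Copies either the converted or the original contents into `sink`
 * (sink_size must be at least 1) and returns the number of bytes copied.
 */
static size_t demo_show_blob(int enabled, const char *contents,
			     char *sink, size_t sink_size)
{
	char *buf = NULL;
	size_t size = 0;
	const char *shown = contents;
	size_t shown_size = strlen(contents);

	if (demo_textconv(enabled, contents, &buf, &size)) {
		shown = buf;
		shown_size = size;
	}
	if (shown_size >= sink_size)
		shown_size = sink_size - 1;
	memcpy(sink, shown, shown_size);
	sink[shown_size] = '\0';

	/* buf is NULL when nothing was converted; free(NULL) is a no-op. */
	free(buf);
	return shown_size;
}
```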



* [PATCH v4 21/22] diff: free state populated via options
  2024-08-14  6:51 ` [PATCH v4 " Patrick Steinhardt
                     ` (19 preceding siblings ...)
  2024-08-14  6:52   ` [PATCH v4 20/22] builtin/log: fix leak when showing converted blob contents Patrick Steinhardt
@ 2024-08-14  6:52   ` Patrick Steinhardt
  2024-08-14  6:52   ` [PATCH v4 22/22] builtin/diff: free symmetric diff members Patrick Steinhardt
  21 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-14  6:52 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

The `objfind` and `anchors` members of `struct diff_options` are
populated via option parsing, but are never freed in `diff_free()`. Fix
this to plug those memory leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 diff.c                   | 10 ++++++++++
 t/t4064-diff-oidfind.sh  |  2 ++
 t/t4065-diff-anchored.sh |  1 +
 t/t4069-remerge-diff.sh  |  1 +
 4 files changed, 14 insertions(+)

diff --git a/diff.c b/diff.c
index 9251c47b72..4035a9374d 100644
--- a/diff.c
+++ b/diff.c
@@ -6717,6 +6717,16 @@ void diff_free(struct diff_options *options)
 	if (options->no_free)
 		return;
 
+	if (options->objfind) {
+		oidset_clear(options->objfind);
+		FREE_AND_NULL(options->objfind);
+	}
+
+	for (size_t i = 0; i < options->anchors_nr; i++)
+		free(options->anchors[i]);
+	FREE_AND_NULL(options->anchors);
+	options->anchors_nr = options->anchors_alloc = 0;
+
 	diff_free_file(options);
 	diff_free_ignore_regex(options);
 	clear_pathspec(&options->pathspec);
diff --git a/t/t4064-diff-oidfind.sh b/t/t4064-diff-oidfind.sh
index 6d8c8986fc..846f285f77 100755
--- a/t/t4064-diff-oidfind.sh
+++ b/t/t4064-diff-oidfind.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='test finding specific blobs in the revision walking'
+
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'setup ' '
diff --git a/t/t4065-diff-anchored.sh b/t/t4065-diff-anchored.sh
index b3f510f040..647537c12e 100755
--- a/t/t4065-diff-anchored.sh
+++ b/t/t4065-diff-anchored.sh
@@ -2,6 +2,7 @@
 
 test_description='anchored diff algorithm'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success '--anchored' '
diff --git a/t/t4069-remerge-diff.sh b/t/t4069-remerge-diff.sh
index 07323ebafe..888714bbd3 100755
--- a/t/t4069-remerge-diff.sh
+++ b/t/t4069-remerge-diff.sh
@@ -2,6 +2,7 @@
 
 test_description='remerge-diff handling'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 # This test is ort-specific
-- 
2.46.0.46.g406f326d27.dirty
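
For readers unfamiliar with the `nr`/`alloc` array convention, the anchors cleanup in `diff_free()` can be approximated as follows. This is a minimal sketch with hypothetical names (`demo_options`, `demo_add_anchor()`, `demo_options_free()`); it uses plain `realloc()` and `free()` rather than Git's own allocation helpers.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the anchors fields of struct diff_options. */
struct demo_options {
	char **anchors;
	size_t anchors_nr, anchors_alloc;
};

/* Append a heap copy of `anchor`, growing the array when it is full. */
static int demo_add_anchor(struct demo_options *opts, const char *anchor)
{
	size_t len = strlen(anchor) + 1;
	char *copy;

	if (opts->anchors_nr == opts->anchors_alloc) {
		size_t alloc = opts->anchors_alloc ? 2 * opts->anchors_alloc : 4;
		char **grown = realloc(opts->anchors, alloc * sizeof(*grown));

		if (!grown)
			return -1;
		opts->anchors = grown;
		opts->anchors_alloc = alloc;
	}
	copy = malloc(len);
	if (!copy)
		return -1;
	memcpy(copy, anchor, len);
	opts->anchors[opts->anchors_nr++] = copy;
	return 0;
}

/* Free every element, then the array itself, and reset the counters. */
static void demo_options_free(struct demo_options *opts)
{
	for (size_t i = 0; i < opts->anchors_nr; i++)
		free(opts->anchors[i]);
	free(opts->anchors);
	opts->anchors = NULL;
	opts->anchors_nr = opts->anchors_alloc = 0;
}
```

Resetting the pointer and both counters leaves the struct in a state where a second call to the free function is harmless.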



* [PATCH v4 22/22] builtin/diff: free symmetric diff members
  2024-08-14  6:51 ` [PATCH v4 " Patrick Steinhardt
                     ` (20 preceding siblings ...)
  2024-08-14  6:52   ` [PATCH v4 21/22] diff: free state populated via options Patrick Steinhardt
@ 2024-08-14  6:52   ` Patrick Steinhardt
  21 siblings, 0 replies; 146+ messages in thread
From: Patrick Steinhardt @ 2024-08-14  6:52 UTC (permalink / raw)
  To: git; +Cc: James Liu, karthik nayak, Phillip Wood, Junio C Hamano,
	Taylor Blau

We populate a `struct symdiff` in case the user has requested a
symmetric diff. Part of this is to populate a `skip` bitmap that
indicates which commits shall be ignored in the diff. But while this
bitmap is dynamically allocated, we never free it.

Fix this by introducing and calling a new `symdiff_release()` function
that does this for us.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/diff.c                       | 6 ++++++
 t/t4068-diff-symmetric-merge-base.sh | 1 +
 t/t4108-apply-threeway.sh            | 1 +
 3 files changed, 8 insertions(+)

diff --git a/builtin/diff.c b/builtin/diff.c
index 9b6cdabe15..6eac445579 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -388,6 +388,11 @@ static void symdiff_prepare(struct rev_info *rev, struct symdiff *sym)
 	sym->skip = map;
 }
 
+static void symdiff_release(struct symdiff *sdiff)
+{
+	bitmap_free(sdiff->skip);
+}
+
 int cmd_diff(int argc, const char **argv, const char *prefix)
 {
 	int i;
@@ -619,6 +624,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
 		refresh_index_quietly();
 	release_revisions(&rev);
 	object_array_clear(&ent);
+	symdiff_release(&sdiff);
 	UNLEAK(blob);
 	return result;
 }
diff --git a/t/t4068-diff-symmetric-merge-base.sh b/t/t4068-diff-symmetric-merge-base.sh
index eff63c16b0..4d6565e728 100755
--- a/t/t4068-diff-symmetric-merge-base.sh
+++ b/t/t4068-diff-symmetric-merge-base.sh
@@ -5,6 +5,7 @@ test_description='behavior of diff with symmetric-diff setups and --merge-base'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 # build these situations:
diff --git a/t/t4108-apply-threeway.sh b/t/t4108-apply-threeway.sh
index c558282bc0..3211e1e65f 100755
--- a/t/t4108-apply-threeway.sh
+++ b/t/t4108-apply-threeway.sh
@@ -5,6 +5,7 @@ test_description='git apply --3way'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 print_sanitized_conflicted_diff () {
-- 
2.46.0.46.g406f326d27.dirty
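
The prepare/release pairing that this patch introduces can be sketched outside of Git as well. Everything below is hypothetical: `demo_bitmap` is a simplified bitmap and not Git's `struct bitmap`, and `demo_symdiff` only mirrors the single `skip` member discussed in the commit message.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical simplified bitmap; Git's own struct bitmap differs. */
struct demo_bitmap {
	uint64_t *words;
	size_t word_alloc;
};

static struct demo_bitmap *demo_bitmap_new(size_t nbits)
{
	struct demo_bitmap *map = calloc(1, sizeof(*map));

	if (!map)
		return NULL;
	map->word_alloc = (nbits + 63) / 64;
	map->words = calloc(map->word_alloc ? map->word_alloc : 1,
			    sizeof(*map->words));
	if (!map->words) {
		free(map);
		return NULL;
	}
	return map;
}

/* Callers must keep pos below the nbits passed to demo_bitmap_new(). */
static void demo_bitmap_set(struct demo_bitmap *map, size_t pos)
{
	map->words[pos / 64] |= (uint64_t)1 << (pos % 64);
}

static int demo_bitmap_get(const struct demo_bitmap *map, size_t pos)
{
	return (int)((map->words[pos / 64] >> (pos % 64)) & 1);
}

/* Accepts NULL and silently does nothing in that case. */
static void demo_bitmap_free(struct demo_bitmap *map)
{
	if (!map)
		return;
	free(map->words);
	free(map);
}

struct demo_symdiff {
	struct demo_bitmap *skip;
};

static void demo_symdiff_prepare(struct demo_symdiff *sym, size_t ncommits)
{
	sym->skip = demo_bitmap_new(ncommits);
}

/* Counterpart to prepare: frees the members, not the struct itself. */
static void demo_symdiff_release(struct demo_symdiff *sym)
{
	demo_bitmap_free(sym->skip);
	sym->skip = NULL;
}
```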



* Re: [PATCH v3 22/22] builtin/diff: free symmetric diff members
  2024-08-14  5:01       ` Patrick Steinhardt
@ 2024-08-14 15:28         ` Junio C Hamano
  0 siblings, 0 replies; 146+ messages in thread
From: Junio C Hamano @ 2024-08-14 15:28 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, James Liu, karthik nayak, Phillip Wood, Taylor Blau

Patrick Steinhardt <ps@pks.im> writes:

> Good point. It does make sense for `_free()` functions to handle NULL
> pointers, but doesn't quite for `_release()` ones.

I agree that foo_free() should accept NULL and silently become a
no-op.  I do not care deeply whether foo_release() did the same, or
not, as long as all *_release()s behave the same way.  Maybe it is
more convenient if they ignored NULL, as I have a hunch that feeding
a NULL pointer to foo_release() is unlikely to be a bug.

Since we documented our aspiration to use these (and foo_clear())
consistently, we may want to (#leftoverbits) document the calling
convention as well.

>> And symdiff_prepare() at least clears its .skip member to NULL, so
>> this pre-initialization is probably not needed.  If we are preparing
>> ourselves for future changes of the flow in this function (e.g.
>> goto's that jump to the clean-up label from which symdiff_release()
>> is always called, even when we did not call symdiff_prepare() on
>> this thing), this is probably not sufficient to convey that
>> intention (instead I'd use an explicit ".skip = NULL" to say "we
>> might not even call _prepare() but this one is prepared to be passed
>> to _release() even in such a case").
>> 
>> Given that no such goto exists, and that _prepare() always
>> sets up the .skip member appropriately, I wonder if we are much
>> better off leaving sdiff uninitialized at the declaration site here.
>> If we add such a goto that bypasses _prepare() in the future, the
>> compiler will notice that we are passing an uninitialized sdiff to
>> _release(), no?
>
> You'd hope it does, but it certainly depends on your compiler flags.
> Various hardening flags for example implicitly initialize variables, and
> I have a feeling that this also causes them to not emit any warnings
> anymore. At least I only spot such warnings in CI.

Yeah, that is a sad fact in the real world X-<.  To be defensive, I
think an explicit "{ .skip = NULL }" or "{ 0 }" would not be too bad
and may even serve as a good reminder for developers who may want to
jump over the call to _prepare() in the future.

The explicit ".skip = NULL" says "we know it is safe to call
_release() with a struct that hasn't gone through _prepare(), as
long as its .skip member is cleared", but the story "{ 0 }" tells us
is not much more than "we clear just like everybody else", and that
is why I suggested the former (iow, I know both mean the same thing
to the C compiler---I just care more about what it tells the human
readers).

Thanks.
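
The scenario discussed above, a future `goto` that reaches the clean-up label without having called `_prepare()`, can be illustrated with a small sketch. All names here (`demo_symdiff`, `demo_prepare()`, `demo_release()`, `demo_cmd()`) are hypothetical, and the `bail_out_early` parameter exists only to simulate such a jump.

```c
#include <stdlib.h>

/* Hypothetical struct with a single heap-allocated member. */
struct demo_symdiff {
	int *skip;
};

static int demo_prepare(struct demo_symdiff *sym)
{
	sym->skip = calloc(8, sizeof(*sym->skip));
	return sym->skip ? 0 : -1;
}

static void demo_release(struct demo_symdiff *sym)
{
	free(sym->skip); /* free(NULL) is a no-op */
	sym->skip = NULL;
}

static int demo_cmd(int bail_out_early)
{
	/*
	 * The explicit initializer documents that demo_release() is safe
	 * even when demo_prepare() was never called on this struct.
	 */
	struct demo_symdiff sdiff = { .skip = NULL };
	int result = -1;

	if (bail_out_early)
		goto cleanup; /* jumps over demo_prepare() */
	if (demo_prepare(&sdiff) < 0)
		goto cleanup;
	result = 0;

cleanup:
	demo_release(&sdiff);
	return result;
}
```

With `{ 0 }` the compiler would generate the same code; the designated initializer only makes the intent visible to human readers, which is the point argued in the message above.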



end of thread, other threads:[~2024-08-14 15:28 UTC | newest]

Thread overview: 146+ messages
2024-08-06  8:59 [PATCH 00/22] Memory leak fixes (pt.4) Patrick Steinhardt
2024-08-06  8:59 ` [PATCH 01/22] remote: plug memory leak when aliasing URLs Patrick Steinhardt
2024-08-06  8:59 ` [PATCH 02/22] git: fix leaking system paths Patrick Steinhardt
2024-08-07  4:02   ` James Liu
2024-08-06  8:59 ` [PATCH 03/22] object-file: fix memory leak when reading corrupted headers Patrick Steinhardt
2024-08-06  8:59 ` [PATCH 04/22] object-name: fix leaking symlink paths in object context Patrick Steinhardt
2024-08-06  8:59 ` [PATCH 05/22] bulk-checkin: fix leaking state TODO Patrick Steinhardt
2024-08-06  9:00 ` [PATCH 06/22] read-cache: fix leaking hashfile when writing index fails Patrick Steinhardt
2024-08-07  7:01   ` James Liu
2024-08-08  5:04     ` Patrick Steinhardt
2024-08-06  9:00 ` [PATCH 07/22] submodule-config: fix leaking name enrty when traversing submodules Patrick Steinhardt
2024-08-06  9:00 ` [PATCH 08/22] config: fix leaking comment character config Patrick Steinhardt
2024-08-07  7:11   ` James Liu
2024-08-08  5:04     ` Patrick Steinhardt
2024-08-08 15:54       ` Junio C Hamano
2024-08-06  9:00 ` [PATCH 09/22] builtin/rebase: fix leaking `commit.gpgsign` value Patrick Steinhardt
2024-08-07  7:32   ` James Liu
2024-08-08  5:05     ` Patrick Steinhardt
2024-08-08 10:07   ` Phillip Wood
2024-08-08 12:58     ` Patrick Steinhardt
2024-08-06  9:00 ` [PATCH 10/22] builtin/notes: fix leaking `struct notes_tree` when merging notes Patrick Steinhardt
2024-08-06  9:00 ` [PATCH 11/22] builtin/fast-import: plug trivial memory leaks Patrick Steinhardt
2024-08-06  9:00 ` [PATCH 12/22] builtin/fast-export: fix leaking diff options Patrick Steinhardt
2024-08-06  9:00 ` [PATCH 13/22] builtin/fast-export: plug leaking tag names Patrick Steinhardt
2024-08-07  8:31   ` James Liu
2024-08-08  5:05     ` Patrick Steinhardt
2024-08-06  9:00 ` [PATCH 14/22] merge-ort: unconditionally release attributes index Patrick Steinhardt
2024-08-06  9:00 ` [PATCH 15/22] sequencer: release todo list on error paths Patrick Steinhardt
2024-08-08 10:08   ` Phillip Wood
2024-08-08 16:31     ` Junio C Hamano
2024-08-06  9:00 ` [PATCH 16/22] unpack-trees: clear index when not propagating it Patrick Steinhardt
2024-08-06  9:00 ` [PATCH 17/22] diff: fix leak when parsing invalid ignore regex option Patrick Steinhardt
2024-08-06  9:00 ` [PATCH 18/22] builtin/format-patch: fix various trivial memory leaks Patrick Steinhardt
2024-08-07  8:51   ` James Liu
2024-08-08  5:05     ` Patrick Steinhardt
2024-08-06  9:01 ` [PATCH 19/22] userdiff: fix leaking memory for configured diff drivers Patrick Steinhardt
2024-08-07  9:25   ` James Liu
2024-08-08  5:05     ` Patrick Steinhardt
2024-08-08 16:05       ` Junio C Hamano
2024-08-06  9:01 ` [PATCH 20/22] builtin/log: fix leak when showing converted blob contents Patrick Steinhardt
2024-08-06  9:01 ` [PATCH 21/22] diff: free state populated via options Patrick Steinhardt
2024-08-06  9:01 ` [PATCH 22/22] builtin/diff: free symmetric diff members Patrick Steinhardt
2024-08-07  9:27 ` [PATCH 00/22] Memory leak fixes (pt.4) James Liu
2024-08-08  5:05   ` Patrick Steinhardt
2024-08-08  6:00     ` James Liu
2024-08-07 16:59 ` Junio C Hamano
2024-08-07 17:03   ` Patrick Steinhardt
2024-08-08  0:32     ` Junio C Hamano
2024-08-08 13:04 ` [PATCH v2 " Patrick Steinhardt
2024-08-08 13:04   ` [PATCH v2 01/22] remote: plug memory leak when aliasing URLs Patrick Steinhardt
2024-08-12  8:27     ` karthik nayak
2024-08-12 14:08     ` Taylor Blau
2024-08-12 14:37     ` Jeff King
2024-08-13  6:34       ` Patrick Steinhardt
2024-08-08 13:04   ` [PATCH v2 02/22] git: fix leaking system paths Patrick Steinhardt
2024-08-12 14:11     ` Taylor Blau
2024-08-13  6:30       ` Patrick Steinhardt
2024-08-13 16:02         ` Junio C Hamano
2024-08-08 13:04   ` [PATCH v2 03/22] object-file: fix memory leak when reading corrupted headers Patrick Steinhardt
2024-08-12  8:43     ` karthik nayak
2024-08-08 13:04   ` [PATCH v2 04/22] object-name: fix leaking symlink paths in object context Patrick Steinhardt
2024-08-08 13:04   ` [PATCH v2 05/22] bulk-checkin: fix leaking state TODO Patrick Steinhardt
2024-08-08 13:04   ` [PATCH v2 06/22] read-cache: fix leaking hashfile when writing index fails Patrick Steinhardt
2024-08-08 13:05   ` [PATCH v2 07/22] submodule-config: fix leaking name enrty when traversing submodules Patrick Steinhardt
2024-08-08 13:05   ` [PATCH v2 08/22] config: fix leaking comment character config Patrick Steinhardt
2024-08-08 17:12     ` Junio C Hamano
2024-08-12  7:45       ` Patrick Steinhardt
2024-08-12 20:32         ` Junio C Hamano
2024-08-13  6:54           ` Patrick Steinhardt
2024-08-08 13:05   ` [PATCH v2 09/22] builtin/rebase: fix leaking `commit.gpgsign` value Patrick Steinhardt
2024-08-08 13:05   ` [PATCH v2 10/22] builtin/notes: fix leaking `struct notes_tree` when merging notes Patrick Steinhardt
2024-08-08 13:05   ` [PATCH v2 11/22] builtin/fast-import: plug trivial memory leaks Patrick Steinhardt
2024-08-08 13:05   ` [PATCH v2 12/22] builtin/fast-export: fix leaking diff options Patrick Steinhardt
2024-08-12  9:05     ` karthik nayak
2024-08-08 13:05   ` [PATCH v2 13/22] builtin/fast-export: plug leaking tag names Patrick Steinhardt
2024-08-08 13:05   ` [PATCH v2 14/22] merge-ort: unconditionally release attributes index Patrick Steinhardt
2024-08-08 13:05   ` [PATCH v2 15/22] sequencer: release todo list on error paths Patrick Steinhardt
2024-08-08 13:05   ` [PATCH v2 16/22] unpack-trees: clear index when not propagating it Patrick Steinhardt
2024-08-08 13:05   ` [PATCH v2 17/22] diff: fix leak when parsing invalid ignore regex option Patrick Steinhardt
2024-08-08 13:05   ` [PATCH v2 18/22] builtin/format-patch: fix various trivial memory leaks Patrick Steinhardt
2024-08-08 13:05   ` [PATCH v2 19/22] userdiff: fix leaking memory for configured diff drivers Patrick Steinhardt
2024-08-08 13:06   ` [PATCH v2 20/22] builtin/log: fix leak when showing converted blob contents Patrick Steinhardt
2024-08-08 13:06   ` [PATCH v2 21/22] diff: free state populated via options Patrick Steinhardt
2024-08-08 13:06   ` [PATCH v2 22/22] builtin/diff: free symmetric diff members Patrick Steinhardt
2024-08-12  9:12     ` karthik nayak
2024-08-12  9:13   ` [PATCH v2 00/22] Memory leak fixes (pt.4) karthik nayak
2024-08-12 15:49     ` Junio C Hamano
2024-08-13  6:27       ` Patrick Steinhardt
2024-08-12 14:01   ` Phillip Wood
2024-08-12 15:50     ` Junio C Hamano
2024-08-13  9:31 ` [PATCH v3 " Patrick Steinhardt
2024-08-13  9:31   ` [PATCH v3 01/22] remote: plug memory leak when aliasing URLs Patrick Steinhardt
2024-08-13  9:31   ` [PATCH v3 02/22] git: fix leaking system paths Patrick Steinhardt
2024-08-13  9:31   ` [PATCH v3 03/22] object-file: fix memory leak when reading corrupted headers Patrick Steinhardt
2024-08-13  9:31   ` [PATCH v3 04/22] object-name: fix leaking symlink paths in object context Patrick Steinhardt
2024-08-13  9:31   ` [PATCH v3 05/22] bulk-checkin: fix leaking state TODO Patrick Steinhardt
2024-08-13  9:31   ` [PATCH v3 06/22] read-cache: fix leaking hashfile when writing index fails Patrick Steinhardt
2024-08-13  9:31   ` [PATCH v3 07/22] submodule-config: fix leaking name entry when traversing submodules Patrick Steinhardt
2024-08-13  9:31   ` [PATCH v3 08/22] config: fix leaking comment character config Patrick Steinhardt
2024-08-13  9:31   ` [PATCH v3 09/22] builtin/rebase: fix leaking `commit.gpgsign` value Patrick Steinhardt
2024-08-13  9:31   ` [PATCH v3 10/22] builtin/notes: fix leaking `struct notes_tree` when merging notes Patrick Steinhardt
2024-08-13  9:31   ` [PATCH v3 11/22] builtin/fast-import: plug trivial memory leaks Patrick Steinhardt
2024-08-13  9:31   ` [PATCH v3 12/22] builtin/fast-export: fix leaking diff options Patrick Steinhardt
2024-08-13 16:34     ` Junio C Hamano
2024-08-14  4:49       ` Patrick Steinhardt
2024-08-13  9:31   ` [PATCH v3 13/22] builtin/fast-export: plug leaking tag names Patrick Steinhardt
2024-08-13  9:31   ` [PATCH v3 14/22] merge-ort: unconditionally release attributes index Patrick Steinhardt
2024-08-13  9:31   ` [PATCH v3 15/22] sequencer: release todo list on error paths Patrick Steinhardt
2024-08-13  9:31   ` [PATCH v3 16/22] unpack-trees: clear index when not propagating it Patrick Steinhardt
2024-08-13  9:31   ` [PATCH v3 17/22] diff: fix leak when parsing invalid ignore regex option Patrick Steinhardt
2024-08-13  9:32   ` [PATCH v3 18/22] builtin/format-patch: fix various trivial memory leaks Patrick Steinhardt
2024-08-13 16:55     ` Junio C Hamano
2024-08-14  4:56       ` Patrick Steinhardt
2024-08-13 16:55     ` Junio C Hamano
2024-08-13  9:32   ` [PATCH v3 19/22] userdiff: fix leaking memory for configured diff drivers Patrick Steinhardt
2024-08-13  9:32   ` [PATCH v3 20/22] builtin/log: fix leak when showing converted blob contents Patrick Steinhardt
2024-08-13  9:32   ` [PATCH v3 21/22] diff: free state populated via options Patrick Steinhardt
2024-08-13 16:31     ` Junio C Hamano
2024-08-13  9:32   ` [PATCH v3 22/22] builtin/diff: free symmetric diff members Patrick Steinhardt
2024-08-13 16:25     ` Junio C Hamano
2024-08-14  5:01       ` Patrick Steinhardt
2024-08-14 15:28         ` Junio C Hamano
2024-08-13 16:58   ` [PATCH v3 00/22] Memory leak fixes (pt.4) Junio C Hamano
2024-08-14  6:51 ` [PATCH v4 " Patrick Steinhardt
2024-08-14  6:51   ` [PATCH v4 01/22] remote: plug memory leak when aliasing URLs Patrick Steinhardt
2024-08-14  6:51   ` [PATCH v4 02/22] git: fix leaking system paths Patrick Steinhardt
2024-08-14  6:51   ` [PATCH v4 03/22] object-file: fix memory leak when reading corrupted headers Patrick Steinhardt
2024-08-14  6:52   ` [PATCH v4 04/22] object-name: fix leaking symlink paths in object context Patrick Steinhardt
2024-08-14  6:52   ` [PATCH v4 05/22] bulk-checkin: fix leaking state TODO Patrick Steinhardt
2024-08-14  6:52   ` [PATCH v4 06/22] read-cache: fix leaking hashfile when writing index fails Patrick Steinhardt
2024-08-14  6:52   ` [PATCH v4 07/22] submodule-config: fix leaking name entry when traversing submodules Patrick Steinhardt
2024-08-14  6:52   ` [PATCH v4 08/22] config: fix leaking comment character config Patrick Steinhardt
2024-08-14  6:52   ` [PATCH v4 09/22] builtin/rebase: fix leaking `commit.gpgsign` value Patrick Steinhardt
2024-08-14  6:52   ` [PATCH v4 10/22] builtin/notes: fix leaking `struct notes_tree` when merging notes Patrick Steinhardt
2024-08-14  6:52   ` [PATCH v4 11/22] builtin/fast-import: plug trivial memory leaks Patrick Steinhardt
2024-08-14  6:52   ` [PATCH v4 12/22] builtin/fast-export: fix leaking diff options Patrick Steinhardt
2024-08-14  6:52   ` [PATCH v4 13/22] builtin/fast-export: plug leaking tag names Patrick Steinhardt
2024-08-14  6:52   ` [PATCH v4 14/22] merge-ort: unconditionally release attributes index Patrick Steinhardt
2024-08-14  6:52   ` [PATCH v4 15/22] sequencer: release todo list on error paths Patrick Steinhardt
2024-08-14  6:52   ` [PATCH v4 16/22] unpack-trees: clear index when not propagating it Patrick Steinhardt
2024-08-14  6:52   ` [PATCH v4 17/22] diff: fix leak when parsing invalid ignore regex option Patrick Steinhardt
2024-08-14  6:52   ` [PATCH v4 18/22] builtin/format-patch: fix various trivial memory leaks Patrick Steinhardt
2024-08-14  6:52   ` [PATCH v4 19/22] userdiff: fix leaking memory for configured diff drivers Patrick Steinhardt
2024-08-14  6:52   ` [PATCH v4 20/22] builtin/log: fix leak when showing converted blob contents Patrick Steinhardt
2024-08-14  6:52   ` [PATCH v4 21/22] diff: free state populated via options Patrick Steinhardt
2024-08-14  6:52   ` [PATCH v4 22/22] builtin/diff: free symmetric diff members Patrick Steinhardt
