* [PATCH 00/23] Memory leak fixes (pt.8)
@ 2024-09-30 9:13 Patrick Steinhardt
2024-09-30 9:13 ` [PATCH 01/23] builtin/annotate: fix leaking args vector Patrick Steinhardt
` (22 more replies)
0 siblings, 23 replies; 33+ messages in thread
From: Patrick Steinhardt @ 2024-09-30 9:13 UTC (permalink / raw)
To: git
Hi,
this is the 8th series of memory leak fixes. As usual, it fixes memory
leaks all over the place. Once applied, we're down to 35 test suites
that do not pass with the memory leak sanitizer.
This series is built on top of 3857aae53f (Git 2.47-rc0, 2024-09-25)
with the following two topisc merged into it:
- ps/leakfixes-part-7 at 12dfc2475c (diffcore-break: fix leaking
filespecs when merging broken pairs, 2024-09-26).
- jk/http-leakfixes at f4c768c639 (http-push: clean up local_refs at
exit, 2024-09-24).
Both of these topics are part of `next`.
Thanks!
Patrick
Patrick Steinhardt (23):
builtin/annotate: fix leaking args vector
read-cache: fix leaking hash context in `do_write_index()`
scalar: fix leaking repositories
shell: fix leaking strings
wt-status: fix leaking buffer with sparse directories
submodule: fix leaking submodule entry list
builtin/stash: fix leaking `pathspec_from_file`
builtin/pack-redundant: fix various memory leaks
builtin/clone: fix leaking repo state when cloning with bundle URIs
t/helper: fix leaking repository in partial-clone helper
builtin/revert: fix leaking `gpg_sign` and `strategy` config
diff: improve lifecycle management of diff queues
line-log: fix several memory leaks
pseudo-merge: fix various memory leaks
pseudo-merge: fix leaking strmap keys
pack-bitmap-write: fix leaking OID array
midx-write: fix leaking buffer
revision: fix memory leaks when rewriting parents
revision: fix leaking saved parents
pack-write: fix return parameter of `write_rev_file_order()`
t/helper: fix leaks in proc-receive helper
remote: fix leaking push reports
builtin/send-pack: fix leaking list of push options
bloom.c | 8 +--
branch.c | 8 ++-
builtin/annotate.c | 20 +++++--
builtin/clone.c | 27 +++++++++
builtin/index-pack.c | 7 +--
builtin/pack-redundant.c | 40 +++++++++++--
builtin/receive-pack.c | 5 +-
builtin/revert.c | 17 ++++--
builtin/send-pack.c | 1 +
builtin/stash.c | 4 +-
diff.c | 22 +++----
diffcore-break.c | 8 +--
diffcore-pickaxe.c | 4 +-
diffcore-rename.c | 3 +-
diffcore-rotate.c | 3 +-
diffcore.h | 10 ++--
line-log.c | 69 +++++++++++++---------
log-tree.c | 4 +-
merge-ort.c | 2 +-
midx-write.c | 5 +-
pack-bitmap-write.c | 9 +++
pack-bitmap.c | 4 +-
pack-write.c | 42 +++++++------
pack.h | 4 +-
pseudo-merge.c | 23 +++++++-
pseudo-merge.h | 2 +
read-cache.c | 1 +
remote.c | 15 +++++
remote.h | 4 +-
revision.c | 14 ++++-
scalar.c | 1 +
shell.c | 6 +-
submodule-config.c | 15 ++++-
submodule-config.h | 3 +
t/helper/test-partial-clone.c | 2 +
t/helper/test-proc-receive.c | 7 +++
t/t0410-partial-clone.sh | 1 +
t/t1092-sparse-checkout-compatibility.sh | 1 +
t/t3207-branch-submodule.sh | 1 +
t/t3427-rebase-subtree.sh | 1 +
t/t3514-cherry-pick-revert-gpg.sh | 1 +
t/t3909-stash-pathspec-file.sh | 1 +
t/t5323-pack-redundant.sh | 1 +
t/t5327-multi-pack-bitmaps-rev.sh | 1 +
t/t5333-pseudo-merge-bitmaps.sh | 1 +
t/t5334-incremental-multi-pack-index.sh | 2 +
t/t5411-proc-receive-hook.sh | 1 +
t/t5545-push-options.sh | 1 +
t/t5730-protocol-v2-bundle-uri-file.sh | 1 +
t/t5731-protocol-v2-bundle-uri-git.sh | 1 +
t/t5732-protocol-v2-bundle-uri-http.sh | 1 +
t/t6012-rev-list-simplify.sh | 1 +
t/t6016-rev-list-graph-simplify-history.sh | 1 +
t/t7003-filter-branch.sh | 1 +
t/t8001-annotate.sh | 1 +
t/t9210-scalar.sh | 1 +
t/t9211-scalar-clone.sh | 1 +
t/t9350-fast-export.sh | 1 +
t/t9400-git-cvsserver-server.sh | 1 +
t/t9402-git-cvsserver-refs.sh | 1 +
t/t9850-shell.sh | 2 +
wt-status.c | 6 +-
62 files changed, 329 insertions(+), 123 deletions(-)
--
2.46.2.852.g229c0bf0e5.dirty
^ permalink raw reply [flat|nested] 33+ messages in thread
* [PATCH 01/23] builtin/annotate: fix leaking args vector
2024-09-30 9:13 [PATCH 00/23] Memory leak fixes (pt.8) Patrick Steinhardt
@ 2024-09-30 9:13 ` Patrick Steinhardt
2024-09-30 9:13 ` [PATCH 02/23] read-cache: fix leaking hash context in `do_write_index()` Patrick Steinhardt
` (21 subsequent siblings)
22 siblings, 0 replies; 33+ messages in thread
From: Patrick Steinhardt @ 2024-09-30 9:13 UTC (permalink / raw)
To: git
We're leaking the args vector in git-annotate(1) because we never clear
it. Fixing it isn't as easy as calling `strvec_clear()` though because
calling `cmd_blame()` will cause the underlying array to be modified.
Instead, we also need to pass a shallow copy of the argv array to the
function.
Do so to plug the memory leaks.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/annotate.c | 20 +++++++++++++++-----
t/t8001-annotate.sh | 1 +
2 files changed, 16 insertions(+), 5 deletions(-)
diff --git a/builtin/annotate.c b/builtin/annotate.c
index a99179fe4d..03413c7df8 100644
--- a/builtin/annotate.c
+++ b/builtin/annotate.c
@@ -15,13 +15,23 @@ int cmd_annotate(int argc,
struct repository *repo UNUSED)
{
struct strvec args = STRVEC_INIT;
- int i;
+ const char **args_copy;
+ int ret;
strvec_pushl(&args, "annotate", "-c", NULL);
-
- for (i = 1; i < argc; i++) {
+ for (int i = 1; i < argc; i++)
strvec_push(&args, argv[i]);
- }
- return cmd_blame(args.nr, args.v, prefix, the_repository);
+ /*
+ * `cmd_blame()` ends up modifying the array, which causes memory leaks
+ * if we didn't copy the array here.
+ */
+ CALLOC_ARRAY(args_copy, args.nr + 1);
+ COPY_ARRAY(args_copy, args.v, args.nr);
+
+ ret = cmd_blame(args.nr, args_copy, prefix, the_repository);
+
+ strvec_clear(&args);
+ free(args_copy);
+ return ret;
}
diff --git a/t/t8001-annotate.sh b/t/t8001-annotate.sh
index d7167f5539..d434698919 100755
--- a/t/t8001-annotate.sh
+++ b/t/t8001-annotate.sh
@@ -5,6 +5,7 @@ GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
TEST_CREATE_REPO_NO_TEMPLATE=1
+TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
PROG='git annotate'
--
2.46.2.852.g229c0bf0e5.dirty
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH 02/23] read-cache: fix leaking hash context in `do_write_index()`
2024-09-30 9:13 [PATCH 00/23] Memory leak fixes (pt.8) Patrick Steinhardt
2024-09-30 9:13 ` [PATCH 01/23] builtin/annotate: fix leaking args vector Patrick Steinhardt
@ 2024-09-30 9:13 ` Patrick Steinhardt
2024-09-30 14:05 ` Derrick Stolee
2024-09-30 9:13 ` [PATCH 03/23] scalar: fix leaking repositories Patrick Steinhardt
` (20 subsequent siblings)
22 siblings, 1 reply; 33+ messages in thread
From: Patrick Steinhardt @ 2024-09-30 9:13 UTC (permalink / raw)
To: git
When writing an index with the EOIE extension we allocate a separate
hash context. We never free that context though, causing a memory leak.
Plug it.
This leak is exposed by t9210, but plugging it alone does not make the
whole test suite pass.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
read-cache.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/read-cache.c b/read-cache.c
index 764fdfec46..0fb5e0d372 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -3124,6 +3124,7 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
if (f)
free_hashfile(f);
strbuf_release(&sb);
+ free(eoie_c);
free(ieot);
return ret;
}
--
2.46.2.852.g229c0bf0e5.dirty
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH 03/23] scalar: fix leaking repositories
2024-09-30 9:13 [PATCH 00/23] Memory leak fixes (pt.8) Patrick Steinhardt
2024-09-30 9:13 ` [PATCH 01/23] builtin/annotate: fix leaking args vector Patrick Steinhardt
2024-09-30 9:13 ` [PATCH 02/23] read-cache: fix leaking hash context in `do_write_index()` Patrick Steinhardt
@ 2024-09-30 9:13 ` Patrick Steinhardt
2024-09-30 9:13 ` [PATCH 04/23] shell: fix leaking strings Patrick Steinhardt
` (19 subsequent siblings)
22 siblings, 0 replies; 33+ messages in thread
From: Patrick Steinhardt @ 2024-09-30 9:13 UTC (permalink / raw)
To: git
In the scalar code we iterate through multiple repositories,
initializing each of them. We never clear them though, causing memory
leaks. Plug them.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
scalar.c | 1 +
t/t9210-scalar.sh | 1 +
t/t9211-scalar-clone.sh | 1 +
3 files changed, 3 insertions(+)
diff --git a/scalar.c b/scalar.c
index 09560aeab5..ede616ad4f 100644
--- a/scalar.c
+++ b/scalar.c
@@ -732,6 +732,7 @@ static int cmd_reconfigure(int argc, const char **argv)
succeeded = 1;
the_repository = old_repo;
+ repo_clear(&r);
loop_end:
if (!succeeded) {
diff --git a/t/t9210-scalar.sh b/t/t9210-scalar.sh
index e8613990e1..a131a6c029 100755
--- a/t/t9210-scalar.sh
+++ b/t/t9210-scalar.sh
@@ -2,6 +2,7 @@
test_description='test the `scalar` command'
+TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
GIT_TEST_MAINT_SCHEDULER="crontab:test-tool crontab cron.txt,launchctl:true,schtasks:true"
diff --git a/t/t9211-scalar-clone.sh b/t/t9211-scalar-clone.sh
index 7869f45ee6..c16ea67c1d 100755
--- a/t/t9211-scalar-clone.sh
+++ b/t/t9211-scalar-clone.sh
@@ -2,6 +2,7 @@
test_description='test the `scalar clone` subcommand'
+TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
. "${TEST_DIRECTORY}/lib-terminal.sh"
--
2.46.2.852.g229c0bf0e5.dirty
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH 04/23] shell: fix leaking strings
2024-09-30 9:13 [PATCH 00/23] Memory leak fixes (pt.8) Patrick Steinhardt
` (2 preceding siblings ...)
2024-09-30 9:13 ` [PATCH 03/23] scalar: fix leaking repositories Patrick Steinhardt
@ 2024-09-30 9:13 ` Patrick Steinhardt
2024-09-30 9:13 ` [PATCH 05/23] wt-status: fix leaking buffer with sparse directories Patrick Steinhardt
` (18 subsequent siblings)
22 siblings, 0 replies; 33+ messages in thread
From: Patrick Steinhardt @ 2024-09-30 9:13 UTC (permalink / raw)
To: git
There are two memory leaks in "shell.c". The first one in `run_shell()`
is trivial and fixed without further explanation. The second one in
`cmd_main()` happens because we overwrite the `prog` variable, which
contains an allocated string. In fact though, the memory pointed to by
that variable is still in use because we use `split_cmdline()`, which
may create pointers into the middle of that string. But as we do not
have a direct pointer to the head of the allocated string anymore, we
get a complaint by the leak checker.
Address this by not overwriting the `prog` pointer.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
shell.c | 6 +++---
t/t9400-git-cvsserver-server.sh | 1 +
t/t9850-shell.sh | 2 ++
3 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/shell.c b/shell.c
index 2ece8b16e2..76333c8068 100644
--- a/shell.c
+++ b/shell.c
@@ -143,6 +143,7 @@ static void run_shell(void)
}
free(argv);
+ free(split_args);
free(rawargs);
} while (!done);
}
@@ -216,9 +217,8 @@ int cmd_main(int argc, const char **argv)
count = split_cmdline(prog, &user_argv);
if (count >= 0) {
if (is_valid_cmd_name(user_argv[0])) {
- prog = make_cmd(user_argv[0]);
- user_argv[0] = prog;
- execv(user_argv[0], (char *const *) user_argv);
+ char *cmd = make_cmd(user_argv[0]);
+ execv(cmd, (char *const *) user_argv);
}
free(prog);
free(user_argv);
diff --git a/t/t9400-git-cvsserver-server.sh b/t/t9400-git-cvsserver-server.sh
index e499c7f955..6da7440e73 100755
--- a/t/t9400-git-cvsserver-server.sh
+++ b/t/t9400-git-cvsserver-server.sh
@@ -11,6 +11,7 @@ cvs CLI client via git-cvsserver server'
GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
if ! test_have_prereq PERL; then
diff --git a/t/t9850-shell.sh b/t/t9850-shell.sh
index cfc71c3bd4..f503f16d1b 100755
--- a/t/t9850-shell.sh
+++ b/t/t9850-shell.sh
@@ -1,6 +1,8 @@
#!/bin/sh
test_description='git shell tests'
+
+TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
test_expect_success 'shell allows upload-pack' '
--
2.46.2.852.g229c0bf0e5.dirty
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH 05/23] wt-status: fix leaking buffer with sparse directories
2024-09-30 9:13 [PATCH 00/23] Memory leak fixes (pt.8) Patrick Steinhardt
` (3 preceding siblings ...)
2024-09-30 9:13 ` [PATCH 04/23] shell: fix leaking strings Patrick Steinhardt
@ 2024-09-30 9:13 ` Patrick Steinhardt
2024-09-30 9:13 ` [PATCH 06/23] submodule: fix leaking submodule entry list Patrick Steinhardt
` (17 subsequent siblings)
22 siblings, 0 replies; 33+ messages in thread
From: Patrick Steinhardt @ 2024-09-30 9:13 UTC (permalink / raw)
To: git
When hitting a sparse directory in `wt_status_collect_changes_initial()`
we use a `struct strbuf` to assemble the directory's name. We never free
that buffer though, causing a memory leak.
Fix the leak by releasing the buffer. While at it, move the buffer
outside of the loop and reset it to save on some wasteful allocations.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
t/t1092-sparse-checkout-compatibility.sh | 1 +
wt-status.c | 6 +++++-
2 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index eb32da2a7f..55efafe4e0 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -5,6 +5,7 @@ test_description='compare full workdir to sparse workdir'
GIT_TEST_SPLIT_INDEX=0
GIT_TEST_SPARSE_INDEX=
+TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
test_expect_success 'setup' '
diff --git a/wt-status.c b/wt-status.c
index 6a6397ca8f..6a8c05d1cf 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -717,6 +717,7 @@ static int add_file_to_list(const struct object_id *oid,
static void wt_status_collect_changes_initial(struct wt_status *s)
{
struct index_state *istate = s->repo->index;
+ struct strbuf base = STRBUF_INIT;
int i;
for (i = 0; i < istate->cache_nr; i++) {
@@ -735,7 +736,6 @@ static void wt_status_collect_changes_initial(struct wt_status *s)
* expanding the trees to find the elements that are new in this
* tree and marking them with DIFF_STATUS_ADDED.
*/
- struct strbuf base = STRBUF_INIT;
struct pathspec ps = { 0 };
struct tree *tree = lookup_tree(istate->repo, &ce->oid);
@@ -743,9 +743,11 @@ static void wt_status_collect_changes_initial(struct wt_status *s)
ps.has_wildcard = 1;
ps.max_depth = -1;
+ strbuf_reset(&base);
strbuf_add(&base, ce->name, ce->ce_namelen);
read_tree_at(istate->repo, tree, &base, 0, &ps,
add_file_to_list, s);
+
continue;
}
@@ -772,6 +774,8 @@ static void wt_status_collect_changes_initial(struct wt_status *s)
s->committable = 1;
}
}
+
+ strbuf_release(&base);
}
static void wt_status_collect_untracked(struct wt_status *s)
--
2.46.2.852.g229c0bf0e5.dirty
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH 06/23] submodule: fix leaking submodule entry list
2024-09-30 9:13 [PATCH 00/23] Memory leak fixes (pt.8) Patrick Steinhardt
` (4 preceding siblings ...)
2024-09-30 9:13 ` [PATCH 05/23] wt-status: fix leaking buffer with sparse directories Patrick Steinhardt
@ 2024-09-30 9:13 ` Patrick Steinhardt
2024-09-30 9:13 ` [PATCH 07/23] builtin/stash: fix leaking `pathspec_from_file` Patrick Steinhardt
` (16 subsequent siblings)
22 siblings, 0 replies; 33+ messages in thread
From: Patrick Steinhardt @ 2024-09-30 9:13 UTC (permalink / raw)
To: git
The submodule entry list returned by `submodules_of_tree()` is never
completely free'd by its only caller. Introduce a new function that
free's the list for us and call it.
While at it, also fix the leaking `branch_point` string.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
branch.c | 8 ++++++--
submodule-config.c | 15 ++++++++++++++-
submodule-config.h | 3 +++
t/t3207-branch-submodule.sh | 1 +
4 files changed, 24 insertions(+), 3 deletions(-)
diff --git a/branch.c b/branch.c
index 08fa4094d2..44977ad0aa 100644
--- a/branch.c
+++ b/branch.c
@@ -738,6 +738,7 @@ static int submodule_create_branch(struct repository *r,
strbuf_release(&child_err);
strbuf_release(&out_buf);
+ free(out_prefix);
return ret;
}
@@ -794,7 +795,7 @@ void create_branches_recursively(struct repository *r, const char *name,
create_branch(r, name, start_committish, force, 0, reflog, quiet,
BRANCH_TRACK_NEVER, dry_run);
if (dry_run)
- return;
+ goto out;
/*
* NEEDSWORK If tracking was set up in the superproject but not the
* submodule, users might expect "git branch --recurse-submodules" to
@@ -815,8 +816,11 @@ void create_branches_recursively(struct repository *r, const char *name,
die(_("submodule '%s': cannot create branch '%s'"),
submodule_entry_list.entries[i].submodule->name,
name);
- repo_clear(submodule_entry_list.entries[i].repo);
}
+
+out:
+ submodule_entry_list_release(&submodule_entry_list);
+ free(branch_point);
}
void remove_merge_branch_state(struct repository *r)
diff --git a/submodule-config.c b/submodule-config.c
index 471637a725..9c8c37b259 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -901,8 +901,9 @@ static void traverse_tree_submodules(struct repository *r,
struct submodule_tree_entry *st_entry;
struct name_entry name_entry;
char *tree_path = NULL;
+ char *tree_buf;
- fill_tree_descriptor(r, &tree, treeish_name);
+ tree_buf = fill_tree_descriptor(r, &tree, treeish_name);
while (tree_entry(&tree, &name_entry)) {
if (prefix)
tree_path =
@@ -930,6 +931,8 @@ static void traverse_tree_submodules(struct repository *r,
&name_entry.oid, out);
free(tree_path);
}
+
+ free(tree_buf);
}
void submodules_of_tree(struct repository *r,
@@ -943,6 +946,16 @@ void submodules_of_tree(struct repository *r,
traverse_tree_submodules(r, treeish_name, NULL, treeish_name, out);
}
+void submodule_entry_list_release(struct submodule_entry_list *list)
+{
+ for (size_t i = 0; i < list->entry_nr; i++) {
+ free(list->entries[i].name_entry);
+ repo_clear(list->entries[i].repo);
+ free(list->entries[i].repo);
+ }
+ free(list->entries);
+}
+
void submodule_free(struct repository *r)
{
if (r->submodule_cache)
diff --git a/submodule-config.h b/submodule-config.h
index b6133af71b..f55d4e3b61 100644
--- a/submodule-config.h
+++ b/submodule-config.h
@@ -136,4 +136,7 @@ struct submodule_entry_list {
void submodules_of_tree(struct repository *r,
const struct object_id *treeish_name,
struct submodule_entry_list *ret);
+
+void submodule_entry_list_release(struct submodule_entry_list *list);
+
#endif /* SUBMODULE_CONFIG_H */
diff --git a/t/t3207-branch-submodule.sh b/t/t3207-branch-submodule.sh
index fe72b24716..904eea7df5 100755
--- a/t/t3207-branch-submodule.sh
+++ b/t/t3207-branch-submodule.sh
@@ -5,6 +5,7 @@ test_description='git branch submodule tests'
GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
. "$TEST_DIRECTORY"/lib-rebase.sh
--
2.46.2.852.g229c0bf0e5.dirty
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH 07/23] builtin/stash: fix leaking `pathspec_from_file`
2024-09-30 9:13 [PATCH 00/23] Memory leak fixes (pt.8) Patrick Steinhardt
` (5 preceding siblings ...)
2024-09-30 9:13 ` [PATCH 06/23] submodule: fix leaking submodule entry list Patrick Steinhardt
@ 2024-09-30 9:13 ` Patrick Steinhardt
2024-09-30 9:13 ` [PATCH 08/23] builtin/pack-redundant: fix various memory leaks Patrick Steinhardt
` (15 subsequent siblings)
22 siblings, 0 replies; 33+ messages in thread
From: Patrick Steinhardt @ 2024-09-30 9:13 UTC (permalink / raw)
To: git
The `OPT_PATHSPEC_FROM_FILE()` option maps to `OPT_FILENAME()`, which we
know will always allocate memory when passed. We never free the memory
though, causing a memory leak. Plug it.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/stash.c | 4 +++-
t/t3909-stash-pathspec-file.sh | 1 +
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/builtin/stash.c b/builtin/stash.c
index f1acc918d0..1399a1bbe2 100644
--- a/builtin/stash.c
+++ b/builtin/stash.c
@@ -1759,7 +1759,7 @@ static int push_stash(int argc, const char **argv, const char *prefix,
int quiet = 0;
int pathspec_file_nul = 0;
const char *stash_msg = NULL;
- const char *pathspec_from_file = NULL;
+ char *pathspec_from_file = NULL;
struct pathspec ps;
struct option options[] = {
OPT_BOOL('k', "keep-index", &keep_index,
@@ -1821,7 +1821,9 @@ static int push_stash(int argc, const char **argv, const char *prefix,
ret = do_push_stash(&ps, stash_msg, quiet, keep_index, patch_mode,
include_untracked, only_staged);
+
clear_pathspec(&ps);
+ free(pathspec_from_file);
return ret;
}
diff --git a/t/t3909-stash-pathspec-file.sh b/t/t3909-stash-pathspec-file.sh
index 73f2dbdeb0..83269d0eb4 100755
--- a/t/t3909-stash-pathspec-file.sh
+++ b/t/t3909-stash-pathspec-file.sh
@@ -2,6 +2,7 @@
test_description='stash --pathspec-from-file'
+TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
test_tick
--
2.46.2.852.g229c0bf0e5.dirty
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH 08/23] builtin/pack-redundant: fix various memory leaks
2024-09-30 9:13 [PATCH 00/23] Memory leak fixes (pt.8) Patrick Steinhardt
` (6 preceding siblings ...)
2024-09-30 9:13 ` [PATCH 07/23] builtin/stash: fix leaking `pathspec_from_file` Patrick Steinhardt
@ 2024-09-30 9:13 ` Patrick Steinhardt
2024-09-30 9:13 ` [PATCH 09/23] builtin/clone: fix leaking repo state when cloning with bundle URIs Patrick Steinhardt
` (14 subsequent siblings)
22 siblings, 0 replies; 33+ messages in thread
From: Patrick Steinhardt @ 2024-09-30 9:13 UTC (permalink / raw)
To: git
There are various different memory leaks in git-pack-redundant(1),
mostly caused by not even trying to free allocated memory. Fix them.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/pack-redundant.c | 40 +++++++++++++++++++++++++++++++++------
t/t5323-pack-redundant.sh | 1 +
2 files changed, 35 insertions(+), 6 deletions(-)
diff --git a/builtin/pack-redundant.c b/builtin/pack-redundant.c
index 81f4494d46..5809613002 100644
--- a/builtin/pack-redundant.c
+++ b/builtin/pack-redundant.c
@@ -69,6 +69,15 @@ static inline void llist_init(struct llist **list)
(*list)->size = 0;
}
+static void llist_free(struct llist *list)
+{
+ for (struct llist_item *i = list->front, *next; i; i = next) {
+ next = i->next;
+ llist_item_put(i);
+ }
+ free(list);
+}
+
static struct llist * llist_copy(struct llist *list)
{
struct llist *ret;
@@ -206,6 +215,14 @@ static inline struct pack_list * pack_list_insert(struct pack_list **pl,
return p;
}
+static void pack_list_free(struct pack_list *pl)
+{
+ for (struct pack_list *next; pl; pl = next) {
+ next = pl->next;
+ free(pl);
+ }
+}
+
static inline size_t pack_list_size(struct pack_list *pl)
{
size_t ret = 0;
@@ -419,7 +436,8 @@ static void minimize(struct pack_list **min)
/* return if there are no objects missing from the unique set */
if (missing->size == 0) {
- free(missing);
+ llist_free(missing);
+ pack_list_free(non_unique);
return;
}
@@ -434,6 +452,8 @@ static void minimize(struct pack_list **min)
}
while (non_unique) {
+ struct pack_list *next;
+
/* sort the non_unique packs, greater size of remaining_objects first */
sort_pack_list(&non_unique);
if (non_unique->remaining_objects->size == 0)
@@ -444,8 +464,14 @@ static void minimize(struct pack_list **min)
for (pl = non_unique->next; pl && pl->remaining_objects->size > 0; pl = pl->next)
llist_sorted_difference_inplace(pl->remaining_objects, non_unique->remaining_objects);
- non_unique = non_unique->next;
+ next = non_unique->next;
+ free(non_unique);
+ non_unique = next;
}
+
+ pack_list_free(non_unique);
+ llist_free(unique_pack_objects);
+ llist_free(missing);
}
static void load_all_objects(void)
@@ -565,7 +591,6 @@ static void load_all(void)
int cmd_pack_redundant(int argc, const char **argv, const char *prefix UNUSED, struct repository *repo UNUSED) {
int i; int i_still_use_this = 0; struct pack_list *min = NULL, *red, *pl;
struct llist *ignore;
- struct object_id *oid;
char buf[GIT_MAX_HEXSZ + 2]; /* hex hash + \n + \0 */
if (argc == 2 && !strcmp(argv[1], "-h"))
@@ -625,11 +650,11 @@ int cmd_pack_redundant(int argc, const char **argv, const char *prefix UNUSED, s
/* ignore objects given on stdin */
llist_init(&ignore);
if (!isatty(0)) {
+ struct object_id oid;
while (fgets(buf, sizeof(buf), stdin)) {
- oid = xmalloc(sizeof(*oid));
- if (get_oid_hex(buf, oid))
+ if (get_oid_hex(buf, &oid))
die("Bad object ID on stdin: %s", buf);
- llist_insert_sorted_unique(ignore, oid, NULL);
+ llist_insert_sorted_unique(ignore, &oid, NULL);
}
}
llist_sorted_difference_inplace(all_objects, ignore);
@@ -671,5 +696,8 @@ int cmd_pack_redundant(int argc, const char **argv, const char *prefix UNUSED, s
fprintf(stderr, "%luMB of redundant packs in total.\n",
(unsigned long)pack_set_bytecount(red)/(1024*1024));
+ pack_list_free(red);
+ pack_list_free(min);
+ llist_free(ignore);
return 0;
}
diff --git a/t/t5323-pack-redundant.sh b/t/t5323-pack-redundant.sh
index 8dbbcc5e51..4e18f5490a 100755
--- a/t/t5323-pack-redundant.sh
+++ b/t/t5323-pack-redundant.sh
@@ -34,6 +34,7 @@ relationship between packs and objects is as follows:
Px2 | s s s x x x
'
+TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
main_repo=main.git
--
2.46.2.852.g229c0bf0e5.dirty
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH 09/23] builtin/clone: fix leaking repo state when cloning with bundle URIs
2024-09-30 9:13 [PATCH 00/23] Memory leak fixes (pt.8) Patrick Steinhardt
` (7 preceding siblings ...)
2024-09-30 9:13 ` [PATCH 08/23] builtin/pack-redundant: fix various memory leaks Patrick Steinhardt
@ 2024-09-30 9:13 ` Patrick Steinhardt
2024-09-30 9:13 ` [PATCH 10/23] t/helper: fix leaking repository in partial-clone helper Patrick Steinhardt
` (13 subsequent siblings)
22 siblings, 0 replies; 33+ messages in thread
From: Patrick Steinhardt @ 2024-09-30 9:13 UTC (permalink / raw)
To: git
When cloning with bundle URIs we re-initialize `the_repository` after
having fetched the bundle. This causes a bunch of memory leaks though
because we do not release its previous state.
These leaks can be plugged by calling `repo_clear()` before we call
`repo_init()`. But this causes another issue because the remote that we
used is tied to the lifetime of the repository's remote state, which
would also get released. We thus have to make sure that it does not get
free'd under our feet.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/clone.c | 27 ++++++++++++++++++++++++++
t/t5730-protocol-v2-bundle-uri-file.sh | 1 +
t/t5731-protocol-v2-bundle-uri-git.sh | 1 +
t/t5732-protocol-v2-bundle-uri-http.sh | 1 +
4 files changed, 30 insertions(+)
diff --git a/builtin/clone.c b/builtin/clone.c
index e77339c847..59fcb317a6 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1403,8 +1403,17 @@ int cmd_clone(int argc,
* data from the --bundle-uri option.
*/
if (bundle_uri) {
+ struct remote_state *state;
int has_heuristic = 0;
+ /*
+ * We need to save the remote state as our remote's lifetime is
+ * tied to it.
+ */
+ state = the_repository->remote_state;
+ the_repository->remote_state = NULL;
+ repo_clear(the_repository);
+
/* At this point, we need the_repository to match the cloned repo. */
if (repo_init(the_repository, git_dir, work_tree))
warning(_("failed to initialize the repo, skipping bundle URI"));
@@ -1413,6 +1422,10 @@ int cmd_clone(int argc,
bundle_uri);
else if (has_heuristic)
git_config_set_gently("fetch.bundleuri", bundle_uri);
+
+ remote_state_clear(the_repository->remote_state);
+ free(the_repository->remote_state);
+ the_repository->remote_state = state;
} else {
/*
* Populate transport->got_remote_bundle_uri and
@@ -1422,12 +1435,26 @@ int cmd_clone(int argc,
if (transport->bundles &&
hashmap_get_size(&transport->bundles->bundles)) {
+ struct remote_state *state;
+
+ /*
+ * We need to save the remote state as our remote's
+ * lifetime is tied to it.
+ */
+ state = the_repository->remote_state;
+ the_repository->remote_state = NULL;
+ repo_clear(the_repository);
+
/* At this point, we need the_repository to match the cloned repo. */
if (repo_init(the_repository, git_dir, work_tree))
warning(_("failed to initialize the repo, skipping bundle URI"));
else if (fetch_bundle_list(the_repository,
transport->bundles))
warning(_("failed to fetch advertised bundles"));
+
+ remote_state_clear(the_repository->remote_state);
+ free(the_repository->remote_state);
+ the_repository->remote_state = state;
} else {
clear_bundle_list(transport->bundles);
FREE_AND_NULL(transport->bundles);
diff --git a/t/t5730-protocol-v2-bundle-uri-file.sh b/t/t5730-protocol-v2-bundle-uri-file.sh
index 37bdb725bc..38396df95b 100755
--- a/t/t5730-protocol-v2-bundle-uri-file.sh
+++ b/t/t5730-protocol-v2-bundle-uri-file.sh
@@ -7,6 +7,7 @@ TEST_NO_CREATE_REPO=1
GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
# Test protocol v2 with 'file://' transport
diff --git a/t/t5731-protocol-v2-bundle-uri-git.sh b/t/t5731-protocol-v2-bundle-uri-git.sh
index 8add1b37ab..c199e955fe 100755
--- a/t/t5731-protocol-v2-bundle-uri-git.sh
+++ b/t/t5731-protocol-v2-bundle-uri-git.sh
@@ -7,6 +7,7 @@ TEST_NO_CREATE_REPO=1
GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
# Test protocol v2 with 'git://' transport
diff --git a/t/t5732-protocol-v2-bundle-uri-http.sh b/t/t5732-protocol-v2-bundle-uri-http.sh
index 129daa0226..a9403e94c6 100755
--- a/t/t5732-protocol-v2-bundle-uri-http.sh
+++ b/t/t5732-protocol-v2-bundle-uri-http.sh
@@ -7,6 +7,7 @@ TEST_NO_CREATE_REPO=1
GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
# Test protocol v2 with 'http://' transport
--
2.46.2.852.g229c0bf0e5.dirty
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH 10/23] t/helper: fix leaking repository in partial-clone helper
2024-09-30 9:13 [PATCH 00/23] Memory leak fixes (pt.8) Patrick Steinhardt
` (8 preceding siblings ...)
2024-09-30 9:13 ` [PATCH 09/23] builtin/clone: fix leaking repo state when cloning with bundle URIs Patrick Steinhardt
@ 2024-09-30 9:13 ` Patrick Steinhardt
2024-09-30 9:13 ` [PATCH 11/23] builtin/revert: fix leaking `gpg_sign` and `strategy` config Patrick Steinhardt
` (12 subsequent siblings)
22 siblings, 0 replies; 33+ messages in thread
From: Patrick Steinhardt @ 2024-09-30 9:13 UTC (permalink / raw)
To: git
We initialize but never clear a repository in the partial-clone test
helper. Plug this leak.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
t/helper/test-partial-clone.c | 2 ++
t/t0410-partial-clone.sh | 1 +
2 files changed, 3 insertions(+)
diff --git a/t/helper/test-partial-clone.c b/t/helper/test-partial-clone.c
index 0ead529167..a1af9710c3 100644
--- a/t/helper/test-partial-clone.c
+++ b/t/helper/test-partial-clone.c
@@ -26,6 +26,8 @@ static void object_info(const char *gitdir, const char *oid_hex)
if (oid_object_info_extended(&r, &oid, &oi, 0))
die("could not obtain object info");
printf("%d\n", (int) size);
+
+ repo_clear(&r);
}
int cmd__partial_clone(int argc, const char **argv)
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index 34bdb3ab1f..818700fbec 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -2,6 +2,7 @@
test_description='partial clone'
+TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
. "$TEST_DIRECTORY"/lib-terminal.sh
--
2.46.2.852.g229c0bf0e5.dirty
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH 11/23] builtin/revert: fix leaking `gpg_sign` and `strategy` config
2024-09-30 9:13 [PATCH 00/23] Memory leak fixes (pt.8) Patrick Steinhardt
` (9 preceding siblings ...)
2024-09-30 9:13 ` [PATCH 10/23] t/helper: fix leaking repository in partial-clone helper Patrick Steinhardt
@ 2024-09-30 9:13 ` Patrick Steinhardt
2024-09-30 9:13 ` [PATCH 12/23] diff: improve lifecycle management of diff queues Patrick Steinhardt
` (11 subsequent siblings)
22 siblings, 0 replies; 33+ messages in thread
From: Patrick Steinhardt @ 2024-09-30 9:13 UTC (permalink / raw)
To: git
We leak the config values when `gpg_sign` or `strategy` options are
being overridden via the command line. To fix this we need to free the
old value, which requires us to figure out whether the value was changed
via an option in the first place. The easy way to do this, which is to
initialize local variables with `NULL`, doesn't work because we cannot
tell the case where the user has passed e.g. `--no-gpg-sign`. Instead,
we use a sentinel value for both values that we can compare against to
check whether the user has passed the option.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/revert.c | 17 +++++++++++++----
t/t3514-cherry-pick-revert-gpg.sh | 1 +
2 files changed, 14 insertions(+), 4 deletions(-)
diff --git a/builtin/revert.c b/builtin/revert.c
index 55ba1092c5..b7917dddd3 100644
--- a/builtin/revert.c
+++ b/builtin/revert.c
@@ -110,6 +110,9 @@ static int run_sequencer(int argc, const char **argv, const char *prefix,
const char * const * usage_str = revert_or_cherry_pick_usage(opts);
const char *me = action_name(opts);
const char *cleanup_arg = NULL;
+ const char sentinel_value;
+ const char *strategy = &sentinel_value;
+ const char *gpg_sign = &sentinel_value;
enum empty_action empty_opt = EMPTY_COMMIT_UNSPECIFIED;
int cmd = 0;
struct option base_options[] = {
@@ -125,10 +128,10 @@ static int run_sequencer(int argc, const char **argv, const char *prefix,
OPT_CALLBACK('m', "mainline", opts, N_("parent-number"),
N_("select mainline parent"), option_parse_m),
OPT_RERERE_AUTOUPDATE(&opts->allow_rerere_auto),
- OPT_STRING(0, "strategy", &opts->strategy, N_("strategy"), N_("merge strategy")),
+ OPT_STRING(0, "strategy", &strategy, N_("strategy"), N_("merge strategy")),
OPT_STRVEC('X', "strategy-option", &opts->xopts, N_("option"),
N_("option for merge strategy")),
- { OPTION_STRING, 'S', "gpg-sign", &opts->gpg_sign, N_("key-id"),
+ { OPTION_STRING, 'S', "gpg-sign", &gpg_sign, N_("key-id"),
N_("GPG sign commit"), PARSE_OPT_OPTARG, NULL, (intptr_t) "" },
OPT_END()
};
@@ -240,8 +243,14 @@ static int run_sequencer(int argc, const char **argv, const char *prefix,
usage_with_options(usage_str, options);
/* These option values will be free()d */
- opts->gpg_sign = xstrdup_or_null(opts->gpg_sign);
- opts->strategy = xstrdup_or_null(opts->strategy);
+ if (gpg_sign != &sentinel_value) {
+ free(opts->gpg_sign);
+ opts->gpg_sign = xstrdup_or_null(gpg_sign);
+ }
+ if (strategy != &sentinel_value) {
+ free(opts->strategy);
+ opts->strategy = xstrdup_or_null(strategy);
+ }
if (!opts->strategy && getenv("GIT_TEST_MERGE_ALGORITHM"))
opts->strategy = xstrdup(getenv("GIT_TEST_MERGE_ALGORITHM"));
free(options);
diff --git a/t/t3514-cherry-pick-revert-gpg.sh b/t/t3514-cherry-pick-revert-gpg.sh
index 5b2e250eaa..133dc07217 100755
--- a/t/t3514-cherry-pick-revert-gpg.sh
+++ b/t/t3514-cherry-pick-revert-gpg.sh
@@ -5,6 +5,7 @@
test_description='test {cherry-pick,revert} --[no-]gpg-sign'
+TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
. "$TEST_DIRECTORY/lib-gpg.sh"
--
2.46.2.852.g229c0bf0e5.dirty
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH 12/23] diff: improve lifecycle management of diff queues
2024-09-30 9:13 [PATCH 00/23] Memory leak fixes (pt.8) Patrick Steinhardt
` (10 preceding siblings ...)
2024-09-30 9:13 ` [PATCH 11/23] builtin/revert: fix leaking `gpg_sign` and `strategy` config Patrick Steinhardt
@ 2024-09-30 9:13 ` Patrick Steinhardt
2024-09-30 9:13 ` [PATCH 13/23] line-log: fix several memory leaks Patrick Steinhardt
` (10 subsequent siblings)
22 siblings, 0 replies; 33+ messages in thread
From: Patrick Steinhardt @ 2024-09-30 9:13 UTC (permalink / raw)
To: git
The lifecycle management of diff queues is somewhat confusing:
- For most of the part this can be attributed to `DIFF_QUEUE_CLEAR()`,
which does not release any memory but rather initializes the queue,
only. This is in contrast to our common naming schema, where
"clearing" means that we release underlying memory and then
re-initialize the data structure such that it is ready to use.
- A second offender is `diff_free_queue()`, which does not free the
queue structure itself. It is rather a release-style function.
Refactor the code to make things less confusing. `DIFF_QUEUE_CLEAR()` is
replaced by `DIFF_QUEUE_INIT` and `diff_queue_init()`, while
`diff_free_queue()` is replaced by `diff_queue_release()`. While on it,
adapt callsites where we call `DIFF_QUEUE_CLEAR()` with the intent to
release underlying memory to instead call `diff_queue_clear()` to fix
memory leaks.
This memory leak is exposed by t4211, but plugging it alone does not
make the whole test suite pass.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
bloom.c | 8 +-------
diff.c | 22 ++++++++++++----------
diffcore-break.c | 8 ++------
diffcore-pickaxe.c | 4 +---
diffcore-rename.c | 3 +--
diffcore-rotate.c | 3 +--
diffcore.h | 10 ++++------
line-log.c | 15 +++++++--------
log-tree.c | 4 ++--
merge-ort.c | 2 +-
10 files changed, 32 insertions(+), 47 deletions(-)
diff --git a/bloom.c b/bloom.c
index c915f8b1ba..c428634105 100644
--- a/bloom.c
+++ b/bloom.c
@@ -476,8 +476,6 @@ struct bloom_filter *get_or_compute_bloom_filter(struct repository *r,
*last_slash = '\0';
} while (*path);
-
- diff_free_filepair(diff_queued_diff.queue[i]);
}
if (hashmap_get_size(&pathmap) > settings->max_changed_paths) {
@@ -508,8 +506,6 @@ struct bloom_filter *get_or_compute_bloom_filter(struct repository *r,
cleanup:
hashmap_clear_and_free(&pathmap, struct pathmap_hash_entry, entry);
} else {
- for (i = 0; i < diff_queued_diff.nr; i++)
- diff_free_filepair(diff_queued_diff.queue[i]);
init_truncated_large_filter(filter, settings->hash_version);
if (computed)
@@ -519,9 +515,7 @@ struct bloom_filter *get_or_compute_bloom_filter(struct repository *r,
if (computed)
*computed |= BLOOM_COMPUTED;
- free(diff_queued_diff.queue);
- DIFF_QUEUE_CLEAR(&diff_queued_diff);
-
+ diff_queue_clear(&diff_queued_diff);
return filter;
}
diff --git a/diff.c b/diff.c
index 173cbe2bed..3e9137ffed 100644
--- a/diff.c
+++ b/diff.c
@@ -5983,11 +5983,18 @@ void diff_free_filepair(struct diff_filepair *p)
free(p);
}
-void diff_free_queue(struct diff_queue_struct *q)
+void diff_queue_init(struct diff_queue_struct *q)
+{
+ struct diff_queue_struct blank = DIFF_QUEUE_INIT;
+ memcpy(q, &blank, sizeof(*q));
+}
+
+void diff_queue_clear(struct diff_queue_struct *q)
{
for (int i = 0; i < q->nr; i++)
diff_free_filepair(q->queue[i]);
free(q->queue);
+ diff_queue_init(q);
}
const char *diff_aligned_abbrev(const struct object_id *oid, int len)
@@ -6551,8 +6558,7 @@ int diff_flush_patch_id(struct diff_options *options, struct object_id *oid, int
struct diff_queue_struct *q = &diff_queued_diff;
int result = diff_get_patch_id(options, oid, diff_header_only);
- diff_free_queue(q);
- DIFF_QUEUE_CLEAR(q);
+ diff_queue_clear(q);
return result;
}
@@ -6835,8 +6841,7 @@ void diff_flush(struct diff_options *options)
}
free_queue:
- diff_free_queue(q);
- DIFF_QUEUE_CLEAR(q);
+ diff_queue_clear(q);
diff_free(options);
/*
@@ -6867,9 +6872,7 @@ static void diffcore_apply_filter(struct diff_options *options)
{
int i;
struct diff_queue_struct *q = &diff_queued_diff;
- struct diff_queue_struct outq;
-
- DIFF_QUEUE_CLEAR(&outq);
+ struct diff_queue_struct outq = DIFF_QUEUE_INIT;
if (!options->filter)
return;
@@ -6962,8 +6965,7 @@ static void diffcore_skip_stat_unmatch(struct diff_options *diffopt)
{
int i;
struct diff_queue_struct *q = &diff_queued_diff;
- struct diff_queue_struct outq;
- DIFF_QUEUE_CLEAR(&outq);
+ struct diff_queue_struct outq = DIFF_QUEUE_INIT;
for (i = 0; i < q->nr; i++) {
struct diff_filepair *p = q->queue[i];
diff --git a/diffcore-break.c b/diffcore-break.c
index 02735f80c6..c4c2173f30 100644
--- a/diffcore-break.c
+++ b/diffcore-break.c
@@ -131,7 +131,7 @@ static int should_break(struct repository *r,
void diffcore_break(struct repository *r, int break_score)
{
struct diff_queue_struct *q = &diff_queued_diff;
- struct diff_queue_struct outq;
+ struct diff_queue_struct outq = DIFF_QUEUE_INIT;
/* When the filepair has this much edit (insert and delete),
* it is first considered to be a rewrite and broken into a
@@ -178,8 +178,6 @@ void diffcore_break(struct repository *r, int break_score)
if (!merge_score)
merge_score = DEFAULT_MERGE_SCORE;
- DIFF_QUEUE_CLEAR(&outq);
-
for (i = 0; i < q->nr; i++) {
struct diff_filepair *p = q->queue[i];
int score;
@@ -275,11 +273,9 @@ static void merge_broken(struct diff_filepair *p,
void diffcore_merge_broken(void)
{
struct diff_queue_struct *q = &diff_queued_diff;
- struct diff_queue_struct outq;
+ struct diff_queue_struct outq = DIFF_QUEUE_INIT;
int i, j;
- DIFF_QUEUE_CLEAR(&outq);
-
for (i = 0; i < q->nr; i++) {
struct diff_filepair *p = q->queue[i];
if (!p)
diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
index b195fa4eb3..43fef8e8ba 100644
--- a/diffcore-pickaxe.c
+++ b/diffcore-pickaxe.c
@@ -182,9 +182,7 @@ static void pickaxe(struct diff_queue_struct *q, struct diff_options *o,
regex_t *regexp, kwset_t kws, pickaxe_fn fn)
{
int i;
- struct diff_queue_struct outq;
-
- DIFF_QUEUE_CLEAR(&outq);
+ struct diff_queue_struct outq = DIFF_QUEUE_INIT;
if (o->pickaxe_opts & DIFF_PICKAXE_ALL) {
/* Showing the whole changeset if needle exists */
diff --git a/diffcore-rename.c b/diffcore-rename.c
index 3d6826baa3..1b1c1a6a1f 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -1388,7 +1388,7 @@ void diffcore_rename_extended(struct diff_options *options,
int detect_rename = options->detect_rename;
int minimum_score = options->rename_score;
struct diff_queue_struct *q = &diff_queued_diff;
- struct diff_queue_struct outq;
+ struct diff_queue_struct outq = DIFF_QUEUE_INIT;
struct diff_score *mx;
int i, j, rename_count, skip_unmodified = 0;
int num_destinations, dst_cnt;
@@ -1638,7 +1638,6 @@ void diffcore_rename_extended(struct diff_options *options,
* are recorded in rename_dst. The original list is still in *q.
*/
trace2_region_enter("diff", "write back to queue", options->repo);
- DIFF_QUEUE_CLEAR(&outq);
for (i = 0; i < q->nr; i++) {
struct diff_filepair *p = q->queue[i];
struct diff_filepair *pair_to_free = NULL;
diff --git a/diffcore-rotate.c b/diffcore-rotate.c
index 533986cf63..73ca20b331 100644
--- a/diffcore-rotate.c
+++ b/diffcore-rotate.c
@@ -10,7 +10,7 @@
void diffcore_rotate(struct diff_options *opt)
{
struct diff_queue_struct *q = &diff_queued_diff;
- struct diff_queue_struct outq;
+ struct diff_queue_struct outq = DIFF_QUEUE_INIT;
int rotate_to, i;
if (!q->nr)
@@ -31,7 +31,6 @@ void diffcore_rotate(struct diff_options *opt)
return;
}
- DIFF_QUEUE_CLEAR(&outq);
rotate_to = i;
for (i = rotate_to; i < q->nr; i++)
diff --git a/diffcore.h b/diffcore.h
index 1701ed50b9..2feb325031 100644
--- a/diffcore.h
+++ b/diffcore.h
@@ -153,18 +153,16 @@ struct diff_queue_struct {
int nr;
};
-#define DIFF_QUEUE_CLEAR(q) \
- do { \
- (q)->queue = NULL; \
- (q)->nr = (q)->alloc = 0; \
- } while (0)
+#define DIFF_QUEUE_INIT { 0 }
+
+void diff_queue_init(struct diff_queue_struct *q);
+void diff_queue_clear(struct diff_queue_struct *q);
extern struct diff_queue_struct diff_queued_diff;
struct diff_filepair *diff_queue(struct diff_queue_struct *,
struct diff_filespec *,
struct diff_filespec *);
void diff_q(struct diff_queue_struct *, struct diff_filepair *);
-void diff_free_queue(struct diff_queue_struct *q);
/* dir_rename_relevance: the reason we want rename information for a dir */
enum dir_rename_relevance {
diff --git a/line-log.c b/line-log.c
index 67c80b39a0..89e0ea4562 100644
--- a/line-log.c
+++ b/line-log.c
@@ -787,15 +787,14 @@ static void move_diff_queue(struct diff_queue_struct *dst,
struct diff_queue_struct *src)
{
assert(src != dst);
- memcpy(dst, src, sizeof(struct diff_queue_struct));
- DIFF_QUEUE_CLEAR(src);
+ memcpy(dst, src, sizeof(*dst));
+ diff_queue_init(src);
}
static void filter_diffs_for_paths(struct line_log_data *range, int keep_deletions)
{
int i;
- struct diff_queue_struct outq;
- DIFF_QUEUE_CLEAR(&outq);
+ struct diff_queue_struct outq = DIFF_QUEUE_INIT;
for (i = 0; i < diff_queued_diff.nr; i++) {
struct diff_filepair *p = diff_queued_diff.queue[i];
@@ -850,12 +849,12 @@ static void queue_diffs(struct line_log_data *range,
clear_pathspec(&opt->pathspec);
parse_pathspec_from_ranges(&opt->pathspec, range);
}
- DIFF_QUEUE_CLEAR(&diff_queued_diff);
+ diff_queue_clear(&diff_queued_diff);
diff_tree_oid(parent_tree_oid, tree_oid, "", opt);
if (opt->detect_rename && diff_might_be_rename()) {
/* must look at the full tree diff to detect renames */
clear_pathspec(&opt->pathspec);
- DIFF_QUEUE_CLEAR(&diff_queued_diff);
+ diff_queue_clear(&diff_queued_diff);
diff_tree_oid(parent_tree_oid, tree_oid, "", opt);
@@ -1097,7 +1096,7 @@ static struct diff_filepair *diff_filepair_dup(struct diff_filepair *pair)
static void free_diffqueues(int n, struct diff_queue_struct *dq)
{
for (int i = 0; i < n; i++)
- diff_free_queue(&dq[i]);
+ diff_queue_clear(&dq[i]);
free(dq);
}
@@ -1200,7 +1199,7 @@ static int process_ranges_ordinary_commit(struct rev_info *rev, struct commit *c
if (parent)
add_line_range(rev, parent, parent_range);
free_line_log_data(parent_range);
- diff_free_queue(&queue);
+ diff_queue_clear(&queue);
return changed;
}
diff --git a/log-tree.c b/log-tree.c
index 3758e0d3b8..60774c16b3 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -675,7 +675,7 @@ static void show_diff_of_diff(struct rev_info *opt)
struct diff_queue_struct dq;
memcpy(&dq, &diff_queued_diff, sizeof(diff_queued_diff));
- DIFF_QUEUE_CLEAR(&diff_queued_diff);
+ diff_queue_init(&diff_queued_diff);
fprintf_ln(opt->diffopt.file, "\n%s", opt->idiff_title);
show_interdiff(opt->idiff_oid1, opt->idiff_oid2, 2,
@@ -694,7 +694,7 @@ static void show_diff_of_diff(struct rev_info *opt)
};
memcpy(&dq, &diff_queued_diff, sizeof(diff_queued_diff));
- DIFF_QUEUE_CLEAR(&diff_queued_diff);
+ diff_queue_init(&diff_queued_diff);
fprintf_ln(opt->diffopt.file, "\n%s", opt->rdiff_title);
/*
diff --git a/merge-ort.c b/merge-ort.c
index 8b81153e8f..11029c10be 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -3536,7 +3536,7 @@ static int detect_and_process_renames(struct merge_options *opt)
/* Free memory for renames->pairs[] and combined */
for (s = MERGE_SIDE1; s <= MERGE_SIDE2; s++) {
free(renames->pairs[s].queue);
- DIFF_QUEUE_CLEAR(&renames->pairs[s]);
+ diff_queue_init(&renames->pairs[s]);
}
for (i = 0; i < combined.nr; i++)
pool_diff_free_filepair(&opt->priv->pool, combined.queue[i]);
--
2.46.2.852.g229c0bf0e5.dirty
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH 13/23] line-log: fix several memory leaks
2024-09-30 9:13 [PATCH 00/23] Memory leak fixes (pt.8) Patrick Steinhardt
` (11 preceding siblings ...)
2024-09-30 9:13 ` [PATCH 12/23] diff: improve lifecycle management of diff queues Patrick Steinhardt
@ 2024-09-30 9:13 ` Patrick Steinhardt
2024-09-30 9:13 ` [PATCH 14/23] pseudo-merge: fix various " Patrick Steinhardt
` (9 subsequent siblings)
22 siblings, 0 replies; 33+ messages in thread
From: Patrick Steinhardt @ 2024-09-30 9:13 UTC (permalink / raw)
To: git
As described in "line-log.c" itself, the code is "leaking like a sieve".
These leaks are all of rather trivial nature, so this commit plugs them
without going too much into details for each of those leaks.
The leaks are hit by t4211, but plugging them alone does not make the
full test suite pass. The remaining leaks are unrelated to the line-log
subsystem.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
line-log.c | 54 +++++++++++++++++++++++++++++++++++-------------------
1 file changed, 35 insertions(+), 19 deletions(-)
diff --git a/line-log.c b/line-log.c
index 89e0ea4562..ee48988c66 100644
--- a/line-log.c
+++ b/line-log.c
@@ -248,8 +248,10 @@ static void line_log_data_init(struct line_log_data *r)
static void line_log_data_clear(struct line_log_data *r)
{
range_set_release(&r->ranges);
+ free(r->path);
if (r->pair)
diff_free_filepair(r->pair);
+ diff_ranges_release(&r->diff);
}
static void free_line_log_data(struct line_log_data *r)
@@ -571,7 +573,8 @@ parse_lines(struct repository *r, struct commit *commit,
struct line_log_data *p;
for_each_string_list_item(item, args) {
- const char *name_part, *range_part;
+ const char *name_part;
+ char *range_part;
char *full_name;
struct diff_filespec *spec;
long begin = 0, end = 0;
@@ -615,6 +618,7 @@ parse_lines(struct repository *r, struct commit *commit,
free_filespec(spec);
FREE_AND_NULL(ends);
+ free(range_part);
}
for (p = ranges; p; p = p->next)
@@ -760,15 +764,13 @@ static void parse_pathspec_from_ranges(struct pathspec *pathspec,
{
struct line_log_data *r;
struct strvec array = STRVEC_INIT;
- const char **paths;
for (r = range; r; r = r->next)
strvec_push(&array, r->path);
- paths = strvec_detach(&array);
- parse_pathspec(pathspec, 0, PATHSPEC_PREFER_FULL, "", paths);
- /* strings are now owned by pathspec */
- free(paths);
+ parse_pathspec(pathspec, 0, PATHSPEC_PREFER_FULL, "", array.v);
+
+ strvec_clear(&array);
}
void line_log_init(struct rev_info *rev, const char *prefix, struct string_list *args)
@@ -781,6 +783,8 @@ void line_log_init(struct rev_info *rev, const char *prefix, struct string_list
add_line_range(rev, commit, range);
parse_pathspec_from_ranges(&rev->diffopt.pathspec, range);
+
+ free_line_log_data(range);
}
static void move_diff_queue(struct diff_queue_struct *dst,
@@ -1131,10 +1135,18 @@ static int process_all_files(struct line_log_data **range_out,
while (rg && strcmp(rg->path, pair->two->path))
rg = rg->next;
assert(rg);
+ if (rg->pair)
+ diff_free_filepair(rg->pair);
rg->pair = diff_filepair_dup(queue->queue[i]);
+ diff_ranges_release(&rg->diff);
memcpy(&rg->diff, pairdiff, sizeof(struct diff_ranges));
+ FREE_AND_NULL(pairdiff);
+ }
+
+ if (pairdiff) {
+ diff_ranges_release(pairdiff);
+ free(pairdiff);
}
- free(pairdiff);
}
return changed;
@@ -1212,12 +1224,13 @@ static int process_ranges_merge_commit(struct rev_info *rev, struct commit *comm
struct commit_list *p;
int i;
int nparents = commit_list_count(commit->parents);
+ int ret;
if (nparents > 1 && rev->first_parent_only)
nparents = 1;
ALLOC_ARRAY(diffqueues, nparents);
- ALLOC_ARRAY(cand, nparents);
+ CALLOC_ARRAY(cand, nparents);
ALLOC_ARRAY(parents, nparents);
p = commit->parents;
@@ -1229,7 +1242,6 @@ static int process_ranges_merge_commit(struct rev_info *rev, struct commit *comm
for (i = 0; i < nparents; i++) {
int changed;
- cand[i] = NULL;
changed = process_all_files(&cand[i], rev, &diffqueues[i], range);
if (!changed) {
/*
@@ -1237,13 +1249,10 @@ static int process_ranges_merge_commit(struct rev_info *rev, struct commit *comm
* don't follow any other path in history
*/
add_line_range(rev, parents[i], cand[i]);
- clear_commit_line_range(rev, commit);
commit_list_append(parents[i], &commit->parents);
- free(parents);
- free(cand);
- free_diffqueues(nparents, diffqueues);
- /* NEEDSWORK leaking like a sieve */
- return 0;
+
+ ret = 0;
+ goto out;
}
}
@@ -1251,18 +1260,25 @@ static int process_ranges_merge_commit(struct rev_info *rev, struct commit *comm
* No single parent took the blame. We add the candidates
* from the above loop to the parents.
*/
- for (i = 0; i < nparents; i++) {
+ for (i = 0; i < nparents; i++)
add_line_range(rev, parents[i], cand[i]);
- }
+ ret = 1;
+
+out:
clear_commit_line_range(rev, commit);
free(parents);
+ for (i = 0; i < nparents; i++) {
+ if (!cand[i])
+ continue;
+ line_log_data_clear(cand[i]);
+ free(cand[i]);
+ }
free(cand);
free_diffqueues(nparents, diffqueues);
- return 1;
+ return ret;
/* NEEDSWORK evil merge detection stuff */
- /* NEEDSWORK leaking like a sieve */
}
int line_log_process_ranges_arbitrary_commit(struct rev_info *rev, struct commit *commit)
--
2.46.2.852.g229c0bf0e5.dirty
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH 14/23] pseudo-merge: fix various memory leaks
2024-09-30 9:13 [PATCH 00/23] Memory leak fixes (pt.8) Patrick Steinhardt
` (12 preceding siblings ...)
2024-09-30 9:13 ` [PATCH 13/23] line-log: fix several memory leaks Patrick Steinhardt
@ 2024-09-30 9:13 ` Patrick Steinhardt
2024-09-30 16:21 ` Taylor Blau
2024-09-30 9:13 ` [PATCH 15/23] pseudo-merge: fix leaking strmap keys Patrick Steinhardt
` (8 subsequent siblings)
22 siblings, 1 reply; 33+ messages in thread
From: Patrick Steinhardt @ 2024-09-30 9:13 UTC (permalink / raw)
To: git
Fix various memory leaks hit by the pseudo-merge machinery. These leaks
are exposed by t5333, but plugging them does not yet make the whole test
suite pass.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
pack-bitmap-write.c | 8 ++++++++
pack-bitmap.c | 4 ++--
pseudo-merge.c | 19 +++++++++++++++++++
pseudo-merge.h | 2 ++
4 files changed, 31 insertions(+), 2 deletions(-)
diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 4dc0fe8e40..6413dd1731 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -64,6 +64,12 @@ static void free_pseudo_merge_commit_idx(struct pseudo_merge_commit_idx *idx)
free(idx);
}
+static void pseudo_merge_group_release_cb(void *payload, const char *name UNUSED)
+{
+ pseudo_merge_group_release(payload);
+ free(payload);
+}
+
void bitmap_writer_free(struct bitmap_writer *writer)
{
uint32_t i;
@@ -82,6 +88,8 @@ void bitmap_writer_free(struct bitmap_writer *writer)
kh_foreach_value(writer->pseudo_merge_commits, idx,
free_pseudo_merge_commit_idx(idx));
kh_destroy_oid_map(writer->pseudo_merge_commits);
+ string_list_clear_func(&writer->pseudo_merge_groups,
+ pseudo_merge_group_release_cb);
for (i = 0; i < writer->selected_nr; i++) {
struct bitmapped_commit *bc = &writer->selected[i];
diff --git a/pack-bitmap.c b/pack-bitmap.c
index 9d9b8c4bfb..32b222a7af 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -1390,8 +1390,8 @@ static struct bitmap *find_objects(struct bitmap_index *bitmap_git,
}
base = bitmap_new();
- if (!cascade_pseudo_merges_1(bitmap_git, base, roots_bitmap))
- bitmap_free(roots_bitmap);
+ cascade_pseudo_merges_1(bitmap_git, base, roots_bitmap);
+ bitmap_free(roots_bitmap);
}
/*
diff --git a/pseudo-merge.c b/pseudo-merge.c
index 10ebd9a4e9..28782a31c6 100644
--- a/pseudo-merge.c
+++ b/pseudo-merge.c
@@ -97,6 +97,25 @@ static void pseudo_merge_group_init(struct pseudo_merge_group *group)
group->stable_size = DEFAULT_PSEUDO_MERGE_STABLE_SIZE;
}
+void pseudo_merge_group_release(struct pseudo_merge_group *group)
+{
+ struct hashmap_iter iter;
+ struct strmap_entry *e;
+
+ regfree(group->pattern);
+ free(group->pattern);
+
+ strmap_for_each_entry(&group->matches, &iter, e) {
+ struct pseudo_merge_matches *matches = e->value;
+ free(matches->stable);
+ free(matches->unstable);
+ free(matches);
+ }
+ strmap_clear(&group->matches, 0);
+
+ free(group->merges);
+}
+
static int pseudo_merge_config(const char *var, const char *value,
const struct config_context *ctx,
void *cb_data)
diff --git a/pseudo-merge.h b/pseudo-merge.h
index 4b5febaa63..29df8a32ec 100644
--- a/pseudo-merge.h
+++ b/pseudo-merge.h
@@ -51,6 +51,8 @@ struct pseudo_merge_group {
timestamp_t stable_threshold;
};
+void pseudo_merge_group_release(struct pseudo_merge_group *group);
+
struct pseudo_merge_matches {
struct commit **stable;
struct commit **unstable;
--
2.46.2.852.g229c0bf0e5.dirty
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH 15/23] pseudo-merge: fix leaking strmap keys
2024-09-30 9:13 [PATCH 00/23] Memory leak fixes (pt.8) Patrick Steinhardt
` (13 preceding siblings ...)
2024-09-30 9:13 ` [PATCH 14/23] pseudo-merge: fix various " Patrick Steinhardt
@ 2024-09-30 9:13 ` Patrick Steinhardt
2024-09-30 21:22 ` Taylor Blau
2024-09-30 9:13 ` [PATCH 16/23] pack-bitmap-write: fix leaking OID array Patrick Steinhardt
` (7 subsequent siblings)
22 siblings, 1 reply; 33+ messages in thread
From: Patrick Steinhardt @ 2024-09-30 9:13 UTC (permalink / raw)
To: git
When creating a new pseudo-merge group we collect a set of matchnig
commits and put them into a string map. This strmap is initialized such
that it does not allocate its keys, and instead we try to pass ownership
of the keys to it via `strmap_put()`. This isn't how it works though:
the strmap will never try to release these keys, and consequently they
end up leaking.
Fix this leak by initializing the strmap as duplicating its keys and not
trying to hand over ownership.
The leak is exposed by t5333, but plugging it does not yet make the full
test suite pass.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
pseudo-merge.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/pseudo-merge.c b/pseudo-merge.c
index 28782a31c6..bb59965ed2 100644
--- a/pseudo-merge.c
+++ b/pseudo-merge.c
@@ -87,7 +87,7 @@ static void pseudo_merge_group_init(struct pseudo_merge_group *group)
{
memset(group, 0, sizeof(struct pseudo_merge_group));
- strmap_init_with_options(&group->matches, NULL, 0);
+ strmap_init_with_options(&group->matches, NULL, 1);
group->decay = DEFAULT_PSEUDO_MERGE_DECAY;
group->max_merges = DEFAULT_PSEUDO_MERGE_MAX_MERGES;
@@ -275,7 +275,7 @@ static int find_pseudo_merge_group_for_ref(const char *refname,
matches = strmap_get(&group->matches, group_name.buf);
if (!matches) {
matches = xcalloc(1, sizeof(*matches));
- strmap_put(&group->matches, strbuf_detach(&group_name, NULL),
+ strmap_put(&group->matches, group_name.buf,
matches);
}
--
2.46.2.852.g229c0bf0e5.dirty
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH 16/23] pack-bitmap-write: fix leaking OID array
2024-09-30 9:13 [PATCH 00/23] Memory leak fixes (pt.8) Patrick Steinhardt
` (14 preceding siblings ...)
2024-09-30 9:13 ` [PATCH 15/23] pseudo-merge: fix leaking strmap keys Patrick Steinhardt
@ 2024-09-30 9:13 ` Patrick Steinhardt
2024-09-30 21:22 ` Taylor Blau
2024-09-30 9:14 ` [PATCH 17/23] midx-write: fix leaking buffer Patrick Steinhardt
` (6 subsequent siblings)
22 siblings, 1 reply; 33+ messages in thread
From: Patrick Steinhardt @ 2024-09-30 9:13 UTC (permalink / raw)
To: git
Fix a leaking OID array in `write_pseudo_merges()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
pack-bitmap-write.c | 1 +
t/t5333-pseudo-merge-bitmaps.sh | 1 +
2 files changed, 2 insertions(+)
diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 6413dd1731..49758e2525 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -913,6 +913,7 @@ static void write_pseudo_merges(struct bitmap_writer *writer,
for (i = 0; i < writer->pseudo_merges_nr; i++)
bitmap_free(commits_bitmap[i]);
+ oid_array_clear(&commits);
free(pseudo_merge_ofs);
free(commits_bitmap);
}
diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh
index 1dd6284756..eca4a1eb8c 100755
--- a/t/t5333-pseudo-merge-bitmaps.sh
+++ b/t/t5333-pseudo-merge-bitmaps.sh
@@ -4,6 +4,7 @@ test_description='pseudo-merge bitmaps'
GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0
+TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
test_pseudo_merges () {
--
2.46.2.852.g229c0bf0e5.dirty
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH 17/23] midx-write: fix leaking buffer
2024-09-30 9:13 [PATCH 00/23] Memory leak fixes (pt.8) Patrick Steinhardt
` (15 preceding siblings ...)
2024-09-30 9:13 ` [PATCH 16/23] pack-bitmap-write: fix leaking OID array Patrick Steinhardt
@ 2024-09-30 9:14 ` Patrick Steinhardt
2024-09-30 21:27 ` Taylor Blau
2024-09-30 9:14 ` [PATCH 18/23] revision: fix memory leaks when rewriting parents Patrick Steinhardt
` (5 subsequent siblings)
22 siblings, 1 reply; 33+ messages in thread
From: Patrick Steinhardt @ 2024-09-30 9:14 UTC (permalink / raw)
To: git
The buffer used to compute the final MIDX name is never released. Plug
this memory leak.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
midx-write.c | 2 ++
t/t5334-incremental-multi-pack-index.sh | 2 ++
2 files changed, 4 insertions(+)
diff --git a/midx-write.c b/midx-write.c
index 1ef62c4f4b..625c7d3137 100644
--- a/midx-write.c
+++ b/midx-write.c
@@ -1445,6 +1445,8 @@ static int write_midx_internal(const char *object_dir,
return -1;
}
+ strbuf_release(&final_midx_name);
+
keep_hashes[ctx.num_multi_pack_indexes_before] =
xstrdup(hash_to_hex(midx_hash));
diff --git a/t/t5334-incremental-multi-pack-index.sh b/t/t5334-incremental-multi-pack-index.sh
index c3b08acc73..471994c4bc 100755
--- a/t/t5334-incremental-multi-pack-index.sh
+++ b/t/t5334-incremental-multi-pack-index.sh
@@ -1,6 +1,8 @@
#!/bin/sh
test_description='incremental multi-pack-index'
+
+TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
. "$TEST_DIRECTORY"/lib-midx.sh
--
2.46.2.852.g229c0bf0e5.dirty
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH 18/23] revision: fix memory leaks when rewriting parents
2024-09-30 9:13 [PATCH 00/23] Memory leak fixes (pt.8) Patrick Steinhardt
` (16 preceding siblings ...)
2024-09-30 9:14 ` [PATCH 17/23] midx-write: fix leaking buffer Patrick Steinhardt
@ 2024-09-30 9:14 ` Patrick Steinhardt
2024-09-30 9:14 ` [PATCH 19/23] revision: fix leaking saved parents Patrick Steinhardt
` (4 subsequent siblings)
22 siblings, 0 replies; 33+ messages in thread
From: Patrick Steinhardt @ 2024-09-30 9:14 UTC (permalink / raw)
To: git
Both `rewrite_parents()` and `remove_duplicate_parents()` may end up
dropping some parents from a commit without freeing the respective
`struct commit_list` items. This causes a bunch of memory leaks. Plug
these.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
revision.c | 2 ++
t/t3427-rebase-subtree.sh | 1 +
t/t6016-rev-list-graph-simplify-history.sh | 1 +
t/t7003-filter-branch.sh | 1 +
t/t9350-fast-export.sh | 1 +
t/t9402-git-cvsserver-refs.sh | 1 +
6 files changed, 7 insertions(+)
diff --git a/revision.c b/revision.c
index e79f39e555..6b452ea182 100644
--- a/revision.c
+++ b/revision.c
@@ -3250,6 +3250,7 @@ static int remove_duplicate_parents(struct rev_info *revs, struct commit *commit
struct commit *parent = p->item;
if (parent->object.flags & TMP_MARK) {
*pp = p->next;
+ free(p);
if (ts)
compact_treesame(revs, commit, surviving_parents);
continue;
@@ -4005,6 +4006,7 @@ int rewrite_parents(struct rev_info *revs, struct commit *commit,
break;
case rewrite_one_noparents:
*pp = parent->next;
+ free(parent);
continue;
case rewrite_one_error:
return -1;
diff --git a/t/t3427-rebase-subtree.sh b/t/t3427-rebase-subtree.sh
index 1b3e97c875..5e9046e3df 100755
--- a/t/t3427-rebase-subtree.sh
+++ b/t/t3427-rebase-subtree.sh
@@ -7,6 +7,7 @@ This test runs git rebase and tests the subtree strategy.
GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
. "$TEST_DIRECTORY"/lib-rebase.sh
diff --git a/t/t6016-rev-list-graph-simplify-history.sh b/t/t6016-rev-list-graph-simplify-history.sh
index 54b0a6f5f8..2656d6a6bc 100755
--- a/t/t6016-rev-list-graph-simplify-history.sh
+++ b/t/t6016-rev-list-graph-simplify-history.sh
@@ -10,6 +10,7 @@ test_description='--graph and simplified history'
GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
. "$TEST_DIRECTORY"/lib-log-graph.sh
diff --git a/t/t7003-filter-branch.sh b/t/t7003-filter-branch.sh
index 5ab4d41ee7..bf3e3f0b67 100755
--- a/t/t7003-filter-branch.sh
+++ b/t/t7003-filter-branch.sh
@@ -4,6 +4,7 @@ test_description='git filter-branch'
GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
. "$TEST_DIRECTORY/lib-gpg.sh"
diff --git a/t/t9350-fast-export.sh b/t/t9350-fast-export.sh
index 1eb035ee4c..2bdc02b459 100755
--- a/t/t9350-fast-export.sh
+++ b/t/t9350-fast-export.sh
@@ -7,6 +7,7 @@ test_description='git fast-export'
GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
test_expect_success 'setup' '
diff --git a/t/t9402-git-cvsserver-refs.sh b/t/t9402-git-cvsserver-refs.sh
index 2ee41f9443..c847120d52 100755
--- a/t/t9402-git-cvsserver-refs.sh
+++ b/t/t9402-git-cvsserver-refs.sh
@@ -8,6 +8,7 @@ tags, branches and other git refspecs'
GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
#########
--
2.46.2.852.g229c0bf0e5.dirty
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH 19/23] revision: fix leaking saved parents
2024-09-30 9:13 [PATCH 00/23] Memory leak fixes (pt.8) Patrick Steinhardt
` (17 preceding siblings ...)
2024-09-30 9:14 ` [PATCH 18/23] revision: fix memory leaks when rewriting parents Patrick Steinhardt
@ 2024-09-30 9:14 ` Patrick Steinhardt
2024-09-30 9:14 ` [PATCH 20/23] pack-write: fix return parameter of `write_rev_file_order()` Patrick Steinhardt
` (3 subsequent siblings)
22 siblings, 0 replies; 33+ messages in thread
From: Patrick Steinhardt @ 2024-09-30 9:14 UTC (permalink / raw)
To: git
The `saved_parents` slab is used by `--full-diff` to save parents of a
commit which we are about to rewrite. We do not release its contents
once it's not used anymore, causing a memory leak. Plug it.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
revision.c | 12 ++++++++++--
t/t6012-rev-list-simplify.sh | 1 +
2 files changed, 11 insertions(+), 2 deletions(-)
diff --git a/revision.c b/revision.c
index 6b452ea182..f5f5b84f2b 100644
--- a/revision.c
+++ b/revision.c
@@ -4207,10 +4207,18 @@ static void save_parents(struct rev_info *revs, struct commit *commit)
*pp = EMPTY_PARENT_LIST;
}
+static void free_saved_parent(struct commit_list **parents)
+{
+ if (*parents != EMPTY_PARENT_LIST)
+ free_commit_list(*parents);
+}
+
static void free_saved_parents(struct rev_info *revs)
{
- if (revs->saved_parents_slab)
- clear_saved_parents(revs->saved_parents_slab);
+ if (!revs->saved_parents_slab)
+ return;
+ deep_clear_saved_parents(revs->saved_parents_slab, free_saved_parent);
+ FREE_AND_NULL(revs->saved_parents_slab);
}
struct commit_list *get_saved_parents(struct rev_info *revs, const struct commit *commit)
diff --git a/t/t6012-rev-list-simplify.sh b/t/t6012-rev-list-simplify.sh
index de1e87f162..8ed1a215da 100755
--- a/t/t6012-rev-list-simplify.sh
+++ b/t/t6012-rev-list-simplify.sh
@@ -5,6 +5,7 @@ test_description='merge simplification'
GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
note () {
--
2.46.2.852.g229c0bf0e5.dirty
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH 20/23] pack-write: fix return parameter of `write_rev_file_order()`
2024-09-30 9:13 [PATCH 00/23] Memory leak fixes (pt.8) Patrick Steinhardt
` (18 preceding siblings ...)
2024-09-30 9:14 ` [PATCH 19/23] revision: fix leaking saved parents Patrick Steinhardt
@ 2024-09-30 9:14 ` Patrick Steinhardt
2024-09-30 9:14 ` [PATCH 21/23] t/helper: fix leaks in proc-receive helper Patrick Steinhardt
` (2 subsequent siblings)
22 siblings, 0 replies; 33+ messages in thread
From: Patrick Steinhardt @ 2024-09-30 9:14 UTC (permalink / raw)
To: git
While the return parameter of `write_rev_file_order()` is a string
constant, the function may indeed return an allocated string when its
first parameter is a `NULL` pointer. This makes for a confusing calling
convention, where callers need to be aware of these intricate ownership
rules and cast away the constness to free the string in some cases.
Adapt the function and its caller `write_rev_file()` to always return an
allocated string and adapt callers to always free the return value.
Note that this requires us to also adapt `rename_tmp_packfile()`, which
compares the pointers to packfile data with each other. Now that the
path of the reverse index file gets allocated unconditionally the check
will always fail. This is fixed by using strcmp(3P) instead, which also
feels way less fragile.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/index-pack.c | 7 +++---
midx-write.c | 3 ++-
pack-write.c | 42 +++++++++++++++++--------------
pack.h | 4 +--
t/t5327-multi-pack-bitmaps-rev.sh | 1 +
5 files changed, 31 insertions(+), 26 deletions(-)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index e228c56ff2..9d23b41b3a 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1505,7 +1505,7 @@ static void rename_tmp_packfile(const char **final_name,
struct strbuf *name, unsigned char *hash,
const char *ext, int make_read_only_if_same)
{
- if (*final_name != curr_name) {
+ if (!*final_name || strcmp(*final_name, curr_name)) {
if (!*final_name)
*final_name = odb_pack_name(name, hash, ext);
if (finalize_object_file(curr_name, *final_name))
@@ -1726,7 +1726,7 @@ int cmd_index_pack(int argc,
{
int i, fix_thin_pack = 0, verify = 0, stat_only = 0, rev_index;
const char *curr_index;
- const char *curr_rev_index = NULL;
+ char *curr_rev_index = NULL;
const char *index_name = NULL, *pack_name = NULL, *rev_index_name = NULL;
const char *keep_msg = NULL;
const char *promisor_msg = NULL;
@@ -1968,8 +1968,7 @@ int cmd_index_pack(int argc,
free((void *) curr_pack);
if (!index_name)
free((void *) curr_index);
- if (!rev_index_name)
- free((void *) curr_rev_index);
+ free(curr_rev_index);
/*
* Let the caller know this pack is not self contained
diff --git a/midx-write.c b/midx-write.c
index 625c7d3137..b3a5f6c516 100644
--- a/midx-write.c
+++ b/midx-write.c
@@ -649,7 +649,7 @@ static void write_midx_reverse_index(char *midx_name, unsigned char *midx_hash,
struct write_midx_context *ctx)
{
struct strbuf buf = STRBUF_INIT;
- const char *tmp_file;
+ char *tmp_file;
trace2_region_enter("midx", "write_midx_reverse_index", the_repository);
@@ -662,6 +662,7 @@ static void write_midx_reverse_index(char *midx_name, unsigned char *midx_hash,
die(_("cannot store reverse index file"));
strbuf_release(&buf);
+ free(tmp_file);
trace2_region_leave("midx", "write_midx_reverse_index", the_repository);
}
diff --git a/pack-write.c b/pack-write.c
index 27965672f1..9961149e65 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -212,15 +212,15 @@ static void write_rev_trailer(struct hashfile *f, const unsigned char *hash)
hashwrite(f, hash, the_hash_algo->rawsz);
}
-const char *write_rev_file(const char *rev_name,
- struct pack_idx_entry **objects,
- uint32_t nr_objects,
- const unsigned char *hash,
- unsigned flags)
+char *write_rev_file(const char *rev_name,
+ struct pack_idx_entry **objects,
+ uint32_t nr_objects,
+ const unsigned char *hash,
+ unsigned flags)
{
uint32_t *pack_order;
uint32_t i;
- const char *ret;
+ char *ret;
if (!(flags & WRITE_REV) && !(flags & WRITE_REV_VERIFY))
return NULL;
@@ -238,13 +238,14 @@ const char *write_rev_file(const char *rev_name,
return ret;
}
-const char *write_rev_file_order(const char *rev_name,
- uint32_t *pack_order,
- uint32_t nr_objects,
- const unsigned char *hash,
- unsigned flags)
+char *write_rev_file_order(const char *rev_name,
+ uint32_t *pack_order,
+ uint32_t nr_objects,
+ const unsigned char *hash,
+ unsigned flags)
{
struct hashfile *f;
+ char *path;
int fd;
if ((flags & WRITE_REV) && (flags & WRITE_REV_VERIFY))
@@ -254,12 +255,13 @@ const char *write_rev_file_order(const char *rev_name,
if (!rev_name) {
struct strbuf tmp_file = STRBUF_INIT;
fd = odb_mkstemp(&tmp_file, "pack/tmp_rev_XXXXXX");
- rev_name = strbuf_detach(&tmp_file, NULL);
+ path = strbuf_detach(&tmp_file, NULL);
} else {
unlink(rev_name);
fd = xopen(rev_name, O_CREAT|O_EXCL|O_WRONLY, 0600);
+ path = xstrdup(rev_name);
}
- f = hashfd(fd, rev_name);
+ f = hashfd(fd, path);
} else if (flags & WRITE_REV_VERIFY) {
struct stat statbuf;
if (stat(rev_name, &statbuf)) {
@@ -270,22 +272,24 @@ const char *write_rev_file_order(const char *rev_name,
die_errno(_("could not stat: %s"), rev_name);
}
f = hashfd_check(rev_name);
- } else
+ path = xstrdup(rev_name);
+ } else {
return NULL;
+ }
write_rev_header(f);
write_rev_index_positions(f, pack_order, nr_objects);
write_rev_trailer(f, hash);
- if (rev_name && adjust_shared_perm(rev_name) < 0)
- die(_("failed to make %s readable"), rev_name);
+ if (adjust_shared_perm(path) < 0)
+ die(_("failed to make %s readable"), path);
finalize_hashfile(f, NULL, FSYNC_COMPONENT_PACK_METADATA,
CSUM_HASH_IN_STREAM | CSUM_CLOSE |
((flags & WRITE_IDX_VERIFY) ? 0 : CSUM_FSYNC));
- return rev_name;
+ return path;
}
static void write_mtimes_header(struct hashfile *f)
@@ -549,7 +553,7 @@ void stage_tmp_packfiles(struct strbuf *name_buffer,
unsigned char hash[],
char **idx_tmp_name)
{
- const char *rev_tmp_name = NULL;
+ char *rev_tmp_name = NULL;
char *mtimes_tmp_name = NULL;
if (adjust_shared_perm(pack_tmp_name))
@@ -575,7 +579,7 @@ void stage_tmp_packfiles(struct strbuf *name_buffer,
if (mtimes_tmp_name)
rename_tmp_packfile(name_buffer, mtimes_tmp_name, "mtimes");
- free((char *)rev_tmp_name);
+ free(rev_tmp_name);
free(mtimes_tmp_name);
}
diff --git a/pack.h b/pack.h
index 3ab9e3f60c..02bbdfb19c 100644
--- a/pack.h
+++ b/pack.h
@@ -96,8 +96,8 @@ struct ref;
void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_sought);
-const char *write_rev_file(const char *rev_name, struct pack_idx_entry **objects, uint32_t nr_objects, const unsigned char *hash, unsigned flags);
-const char *write_rev_file_order(const char *rev_name, uint32_t *pack_order, uint32_t nr_objects, const unsigned char *hash, unsigned flags);
+char *write_rev_file(const char *rev_name, struct pack_idx_entry **objects, uint32_t nr_objects, const unsigned char *hash, unsigned flags);
+char *write_rev_file_order(const char *rev_name, uint32_t *pack_order, uint32_t nr_objects, const unsigned char *hash, unsigned flags);
/*
* The "hdr" output buffer should be at least this big, which will handle sizes
diff --git a/t/t5327-multi-pack-bitmaps-rev.sh b/t/t5327-multi-pack-bitmaps-rev.sh
index 9cac03a94b..994a8e6be4 100755
--- a/t/t5327-multi-pack-bitmaps-rev.sh
+++ b/t/t5327-multi-pack-bitmaps-rev.sh
@@ -2,6 +2,7 @@
test_description='exercise basic multi-pack bitmap functionality (.rev files)'
+TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
. "${TEST_DIRECTORY}/lib-bitmap.sh"
--
2.46.2.852.g229c0bf0e5.dirty
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH 21/23] t/helper: fix leaks in proc-receive helper
2024-09-30 9:13 [PATCH 00/23] Memory leak fixes (pt.8) Patrick Steinhardt
` (19 preceding siblings ...)
2024-09-30 9:14 ` [PATCH 20/23] pack-write: fix return parameter of `write_rev_file_order()` Patrick Steinhardt
@ 2024-09-30 9:14 ` Patrick Steinhardt
2024-09-30 9:14 ` [PATCH 22/23] remote: fix leaking push reports Patrick Steinhardt
2024-09-30 9:14 ` [PATCH 23/23] builtin/send-pack: fix leaking list of push options Patrick Steinhardt
22 siblings, 0 replies; 33+ messages in thread
From: Patrick Steinhardt @ 2024-09-30 9:14 UTC (permalink / raw)
To: git
Fix trivial leaks in the proc-receive helpe.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
t/helper/test-proc-receive.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/t/helper/test-proc-receive.c b/t/helper/test-proc-receive.c
index 29361c7aab..3703f734f3 100644
--- a/t/helper/test-proc-receive.c
+++ b/t/helper/test-proc-receive.c
@@ -196,5 +196,12 @@ int cmd__proc_receive(int argc, const char **argv)
packet_flush(1);
sigchain_pop(SIGPIPE);
+ while (commands) {
+ struct command *next = commands->next;
+ free(commands);
+ commands = next;
+ }
+ string_list_clear(&push_options, 0);
+
return 0;
}
--
2.46.2.852.g229c0bf0e5.dirty
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH 22/23] remote: fix leaking push reports
2024-09-30 9:13 [PATCH 00/23] Memory leak fixes (pt.8) Patrick Steinhardt
` (20 preceding siblings ...)
2024-09-30 9:14 ` [PATCH 21/23] t/helper: fix leaks in proc-receive helper Patrick Steinhardt
@ 2024-09-30 9:14 ` Patrick Steinhardt
2024-09-30 9:14 ` [PATCH 23/23] builtin/send-pack: fix leaking list of push options Patrick Steinhardt
22 siblings, 0 replies; 33+ messages in thread
From: Patrick Steinhardt @ 2024-09-30 9:14 UTC (permalink / raw)
To: git
The push reports that report failures to the user when pushing a
reference leak in several places. Plug these leaks by introducing a new
function `ref_push_report_free()` that frees the list of reports and
call it as required. While at it, fix a trivially leaking error string
in the vicinity.
These leaks get hit in t5411, but plugging them does not make the whole
test suite pass.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/receive-pack.c | 5 ++++-
remote.c | 15 +++++++++++++++
remote.h | 4 +++-
3 files changed, 22 insertions(+), 2 deletions(-)
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 536d22761d..ab5b20e39c 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -374,6 +374,7 @@ static void write_head_info(void)
struct command {
struct command *next;
const char *error_string;
+ char *error_string_owned;
struct ref_push_report *report;
unsigned int skip_update:1,
did_not_exist:1,
@@ -1083,7 +1084,7 @@ static int read_proc_receive_report(struct packet_reader *reader,
hint->run_proc_receive |= RUN_PROC_RECEIVE_RETURNED;
if (!strcmp(head, "ng")) {
if (p)
- hint->error_string = xstrdup(p);
+ hint->error_string = hint->error_string_owned = xstrdup(p);
else
hint->error_string = "failed";
code = -1;
@@ -2054,6 +2055,8 @@ static void free_commands(struct command *commands)
while (commands) {
struct command *next = commands->next;
+ ref_push_report_free(commands->report);
+ free(commands->error_string_owned);
free(commands);
commands = next;
}
diff --git a/remote.c b/remote.c
index e291e8ff5c..cd2091ac0c 100644
--- a/remote.c
+++ b/remote.c
@@ -868,6 +868,20 @@ struct strvec *push_url_of_remote(struct remote *remote)
return remote->pushurl.nr ? &remote->pushurl : &remote->url;
}
+void ref_push_report_free(struct ref_push_report *report)
+{
+ while (report) {
+ struct ref_push_report *next = report->next;
+
+ free(report->ref_name);
+ free(report->old_oid);
+ free(report->new_oid);
+ free(report);
+
+ report = next;
+ }
+}
+
static int match_name_with_pattern(const char *key, const char *name,
const char *value, char **result)
{
@@ -1122,6 +1136,7 @@ void free_one_ref(struct ref *ref)
if (!ref)
return;
free_one_ref(ref->peer_ref);
+ ref_push_report_free(ref->report);
free(ref->remote_status);
free(ref->tracking_ref);
free(ref->symref);
diff --git a/remote.h b/remote.h
index ad4513f639..c5e79eb40e 100644
--- a/remote.h
+++ b/remote.h
@@ -126,13 +126,15 @@ int remote_has_url(struct remote *remote, const char *url);
struct strvec *push_url_of_remote(struct remote *remote);
struct ref_push_report {
- const char *ref_name;
+ char *ref_name;
struct object_id *old_oid;
struct object_id *new_oid;
unsigned int forced_update:1;
struct ref_push_report *next;
};
+void ref_push_report_free(struct ref_push_report *);
+
struct ref {
struct ref *next;
struct object_id old_oid;
--
2.46.2.852.g229c0bf0e5.dirty
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH 23/23] builtin/send-pack: fix leaking list of push options
2024-09-30 9:13 [PATCH 00/23] Memory leak fixes (pt.8) Patrick Steinhardt
` (21 preceding siblings ...)
2024-09-30 9:14 ` [PATCH 22/23] remote: fix leaking push reports Patrick Steinhardt
@ 2024-09-30 9:14 ` Patrick Steinhardt
22 siblings, 0 replies; 33+ messages in thread
From: Patrick Steinhardt @ 2024-09-30 9:14 UTC (permalink / raw)
To: git
The list of push options is leaking. Plug the leak.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/send-pack.c | 1 +
t/t5411-proc-receive-hook.sh | 1 +
t/t5545-push-options.sh | 1 +
3 files changed, 3 insertions(+)
diff --git a/builtin/send-pack.c b/builtin/send-pack.c
index 8b1d46e79a..59b626aae8 100644
--- a/builtin/send-pack.c
+++ b/builtin/send-pack.c
@@ -340,6 +340,7 @@ int cmd_send_pack(int argc,
/* stable plumbing output; do not modify or localize */
fprintf(stderr, "Everything up-to-date\n");
+ string_list_clear(&push_options, 0);
free_refs(remote_refs);
free_refs(local_refs);
refspec_clear(&rs);
diff --git a/t/t5411-proc-receive-hook.sh b/t/t5411-proc-receive-hook.sh
index 92cf52c6d4..13d2d310a9 100755
--- a/t/t5411-proc-receive-hook.sh
+++ b/t/t5411-proc-receive-hook.sh
@@ -8,6 +8,7 @@ test_description='Test proc-receive hook'
GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
. "$TEST_DIRECTORY"/t5411/common-functions.sh
diff --git a/t/t5545-push-options.sh b/t/t5545-push-options.sh
index fb13549da7..64ce56d3aa 100755
--- a/t/t5545-push-options.sh
+++ b/t/t5545-push-options.sh
@@ -8,6 +8,7 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=1
export GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB
+TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
mk_repo_pair () {
--
2.46.2.852.g229c0bf0e5.dirty
^ permalink raw reply related [flat|nested] 33+ messages in thread
* Re: [PATCH 02/23] read-cache: fix leaking hash context in `do_write_index()`
2024-09-30 9:13 ` [PATCH 02/23] read-cache: fix leaking hash context in `do_write_index()` Patrick Steinhardt
@ 2024-09-30 14:05 ` Derrick Stolee
0 siblings, 0 replies; 33+ messages in thread
From: Derrick Stolee @ 2024-09-30 14:05 UTC (permalink / raw)
To: Patrick Steinhardt, git
On 9/30/24 5:13 AM, Patrick Steinhardt wrote:
> When writing an index with the EOIE extension we allocate a separate
> hash context. We never free that context though, causing a memory leak.
> Plug it.
>
> This leak is exposed by t9210, but plugging it alone does not make the
> whole test suite pass.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> read-cache.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/read-cache.c b/read-cache.c
> index 764fdfec46..0fb5e0d372 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -3124,6 +3124,7 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
> if (f)
> free_hashfile(f);
> strbuf_release(&sb);
> + free(eoie_c);
> free(ieot);
> return ret;
> }
Thank you for finding and fixing this! This version should be
used instead of the similar patch I sent today.
Thanks,
-Stolee
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 14/23] pseudo-merge: fix various memory leaks
2024-09-30 9:13 ` [PATCH 14/23] pseudo-merge: fix various " Patrick Steinhardt
@ 2024-09-30 16:21 ` Taylor Blau
0 siblings, 0 replies; 33+ messages in thread
From: Taylor Blau @ 2024-09-30 16:21 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
On Mon, Sep 30, 2024 at 11:13:51AM +0200, Patrick Steinhardt wrote:
> diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
> index 4dc0fe8e40..6413dd1731 100644
> --- a/pack-bitmap-write.c
> +++ b/pack-bitmap-write.c
> @@ -64,6 +64,12 @@ static void free_pseudo_merge_commit_idx(struct pseudo_merge_commit_idx *idx)
> free(idx);
> }
>
> +static void pseudo_merge_group_release_cb(void *payload, const char *name UNUSED)
> +{
> + pseudo_merge_group_release(payload);
> + free(payload);
> +}
> +
> void bitmap_writer_free(struct bitmap_writer *writer)
> {
> uint32_t i;
> @@ -82,6 +88,8 @@ void bitmap_writer_free(struct bitmap_writer *writer)
> kh_foreach_value(writer->pseudo_merge_commits, idx,
> free_pseudo_merge_commit_idx(idx));
> kh_destroy_oid_map(writer->pseudo_merge_commits);
> + string_list_clear_func(&writer->pseudo_merge_groups,
> + pseudo_merge_group_release_cb);
Looking good, and this is the right spot to free the pseudo-merge
groups. Note for other readers: pseudo-merge "groups" are a utility
structure that is only used while writing pseudo-merge bitmaps, so we
don't expect to see any corresponding pseudo_merge_group_release() calls
sprinkled throughout, e.g., pack-bitmap.c.
> diff --git a/pack-bitmap.c b/pack-bitmap.c
> index 9d9b8c4bfb..32b222a7af 100644
> --- a/pack-bitmap.c
> +++ b/pack-bitmap.c
> @@ -1390,8 +1390,8 @@ static struct bitmap *find_objects(struct bitmap_index *bitmap_git,
> }
>
> base = bitmap_new();
> - if (!cascade_pseudo_merges_1(bitmap_git, base, roots_bitmap))
> - bitmap_free(roots_bitmap);
> + cascade_pseudo_merges_1(bitmap_git, base, roots_bitmap);
> + bitmap_free(roots_bitmap);
I probably would have pulled this leakfix into its own separate patch,
since it's not immediately obvious how it's related to other changes in
the same patch.
But the change here is definitely correct. We initialize the
roots_bitmap field to just the tips of our traversal. I wrote some
details about why in 11d45a6e6a (pack-bitmap.c: use pseudo-merges during
traversal, 2024-05-23), but the gist is as follows: We want to avoid
accidentally thinking that roots which aren't part of some satisfied
pseudo-merge or existing bitmap are part of the reachability closure,
leaving those bits as dangling and leading to incorrect results.
In any event, we definitely do not need the roots_bitmap outside of that
block (regardless of whether or not we successfully cascaded any
pseudo-merges), so free-ing it here is the right thing to do.
> diff --git a/pseudo-merge.c b/pseudo-merge.c
> index 10ebd9a4e9..28782a31c6 100644
> --- a/pseudo-merge.c
> +++ b/pseudo-merge.c
> @@ -97,6 +97,25 @@ static void pseudo_merge_group_init(struct pseudo_merge_group *group)
> group->stable_size = DEFAULT_PSEUDO_MERGE_STABLE_SIZE;
> }
>
> +void pseudo_merge_group_release(struct pseudo_merge_group *group)
> +{
> + struct hashmap_iter iter;
> + struct strmap_entry *e;
> +
> + regfree(group->pattern);
> + free(group->pattern);
> +
> + strmap_for_each_entry(&group->matches, &iter, e) {
> + struct pseudo_merge_matches *matches = e->value;
> + free(matches->stable);
> + free(matches->unstable);
> + free(matches);
> + }
> + strmap_clear(&group->matches, 0);
> +
> + free(group->merges);
> +}
> +
All looks good here, I didn't see any fields that were missed or
over-eagerly free'd here.
Thanks,
Taylor
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 15/23] pseudo-merge: fix leaking strmap keys
2024-09-30 9:13 ` [PATCH 15/23] pseudo-merge: fix leaking strmap keys Patrick Steinhardt
@ 2024-09-30 21:22 ` Taylor Blau
2024-10-07 9:41 ` Patrick Steinhardt
0 siblings, 1 reply; 33+ messages in thread
From: Taylor Blau @ 2024-09-30 21:22 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
On Mon, Sep 30, 2024 at 11:13:53AM +0200, Patrick Steinhardt wrote:
> When creating a new pseudo-merge group we collect a set of matchnig
s/matchnig/matching/
> commits and put them into a string map. This strmap is initialized such
> that it does not allocate its keys, and instead we try to pass ownership
> of the keys to it via `strmap_put()`. This isn't how it works though:
> the strmap will never try to release these keys, and consequently they
> end up leaking.
>
> Fix this leak by initializing the strmap as duplicating its keys and not
> trying to hand over ownership.
Hmm. I think I am probably splitting hairs, since a few xstrdup() calls
between friends is unlikely to matter here, but... ;-)
It does seem wasteful if we can avoid it. We already allocated heap
space for the matches via the underlying strbuf, and we really do want
to hand ownership over if we can.
Would doing something like this on top of your previous patch do the
trick?
--- >8 ---
diff --git a/pseudo-merge.c b/pseudo-merge.c
index 28782a31c6..6b6605d3dc 100644
--- a/pseudo-merge.c
+++ b/pseudo-merge.c
@@ -110,6 +110,7 @@ void pseudo_merge_group_release(struct pseudo_merge_group *group)
free(matches->stable);
free(matches->unstable);
free(matches);
+ free((char*)e->key);
}
strmap_clear(&group->matches, 0);
--- 8< ---
That introduces an ugly const-cast, but I think it's tolerable (and may
be moreso with a comment explaining why it's safe).
> diff --git a/pseudo-merge.c b/pseudo-merge.c
> index 28782a31c6..bb59965ed2 100644
> --- a/pseudo-merge.c
> +++ b/pseudo-merge.c
> @@ -87,7 +87,7 @@ static void pseudo_merge_group_init(struct pseudo_merge_group *group)
> {
> memset(group, 0, sizeof(struct pseudo_merge_group));
>
> - strmap_init_with_options(&group->matches, NULL, 0);
> + strmap_init_with_options(&group->matches, NULL, 1);
>
> group->decay = DEFAULT_PSEUDO_MERGE_DECAY;
> group->max_merges = DEFAULT_PSEUDO_MERGE_MAX_MERGES;
> @@ -275,7 +275,7 @@ static int find_pseudo_merge_group_for_ref(const char *refname,
> matches = strmap_get(&group->matches, group_name.buf);
> if (!matches) {
> matches = xcalloc(1, sizeof(*matches));
> - strmap_put(&group->matches, strbuf_detach(&group_name, NULL),
> + strmap_put(&group->matches, group_name.buf,
> matches);
> }
Otherwise, I think what's written here looks good to me.
Thanks,
Taylor
^ permalink raw reply related [flat|nested] 33+ messages in thread
* Re: [PATCH 16/23] pack-bitmap-write: fix leaking OID array
2024-09-30 9:13 ` [PATCH 16/23] pack-bitmap-write: fix leaking OID array Patrick Steinhardt
@ 2024-09-30 21:22 ` Taylor Blau
0 siblings, 0 replies; 33+ messages in thread
From: Taylor Blau @ 2024-09-30 21:22 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
On Mon, Sep 30, 2024 at 11:13:58AM +0200, Patrick Steinhardt wrote:
> diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh
> index 1dd6284756..eca4a1eb8c 100755
> --- a/t/t5333-pseudo-merge-bitmaps.sh
> +++ b/t/t5333-pseudo-merge-bitmaps.sh
> @@ -4,6 +4,7 @@ test_description='pseudo-merge bitmaps'
>
> GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0
>
> +TEST_PASSES_SANITIZE_LEAK=true
> . ./test-lib.sh
Very nicely done :-).
Thanks,
Taylor
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 17/23] midx-write: fix leaking buffer
2024-09-30 9:14 ` [PATCH 17/23] midx-write: fix leaking buffer Patrick Steinhardt
@ 2024-09-30 21:27 ` Taylor Blau
2024-10-07 9:41 ` Patrick Steinhardt
0 siblings, 1 reply; 33+ messages in thread
From: Taylor Blau @ 2024-09-30 21:27 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
On Mon, Sep 30, 2024 at 11:14:01AM +0200, Patrick Steinhardt wrote:
> The buffer used to compute the final MIDX name is never released. Plug
> this memory leak.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> midx-write.c | 2 ++
> t/t5334-incremental-multi-pack-index.sh | 2 ++
> 2 files changed, 4 insertions(+)
>
> diff --git a/midx-write.c b/midx-write.c
> index 1ef62c4f4b..625c7d3137 100644
> --- a/midx-write.c
> +++ b/midx-write.c
> @@ -1445,6 +1445,8 @@ static int write_midx_internal(const char *object_dir,
> return -1;
Would you also want to strbuf_release() the final_midx_name buffer here
as well?
I guess the point is moot since there are a number of other finalization
steps that we likewise skip here by returning -1 directly instead of
jumping to the cleanup sub-routine.
In the above sense I'm OK with it as-is, but it would be nice to verify
that this portion of the code is leak-free even when we return early
(e.g., because we couldn't rename() the tempfile into place).
Of course, because final_midx_name is local to the body of this
conditional, I think you'd need to do something like:
if (ctx.incremntal) {
struct strbuf final_midx_name = STRBUF_INIT;
/* ... */
get_split_midx_filename_ext(&final_midx_name, object_dir,
midx_hash, MIDX_EXT_MIDX);
if (rename_tempfile(&incr, final_midx_name.buf) < 0) {
result = error_errno(_("unable to rename new multi-pack-index layer"));
strbuf_release(&final_midx_name); /* <- here */
goto cleanup;
}
strbuf_release(&final_midx_name); /* ... <- and here */
keep_hashes[ctx.num_multi_pack_indexes_before] =
xstrdup(hash_to_hex(midx_hash));
}
Thanks,
Taylor
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 15/23] pseudo-merge: fix leaking strmap keys
2024-09-30 21:22 ` Taylor Blau
@ 2024-10-07 9:41 ` Patrick Steinhardt
2024-10-08 8:54 ` Jeff King
0 siblings, 1 reply; 33+ messages in thread
From: Patrick Steinhardt @ 2024-10-07 9:41 UTC (permalink / raw)
To: Taylor Blau; +Cc: git
On Mon, Sep 30, 2024 at 05:22:28PM -0400, Taylor Blau wrote:
> On Mon, Sep 30, 2024 at 11:13:53AM +0200, Patrick Steinhardt wrote:
> > When creating a new pseudo-merge group we collect a set of matchnig
>
> s/matchnig/matching/
>
> > commits and put them into a string map. This strmap is initialized such
> > that it does not allocate its keys, and instead we try to pass ownership
> > of the keys to it via `strmap_put()`. This isn't how it works though:
> > the strmap will never try to release these keys, and consequently they
> > end up leaking.
> >
> > Fix this leak by initializing the strmap as duplicating its keys and not
> > trying to hand over ownership.
>
> Hmm. I think I am probably splitting hairs, since a few xstrdup() calls
> between friends is unlikely to matter here, but... ;-)
>
> It does seem wasteful if we can avoid it. We already allocated heap
> space for the matches via the underlying strbuf, and we really do want
> to hand ownership over if we can.
>
> Would doing something like this on top of your previous patch do the
> trick?
>
> --- >8 ---
> diff --git a/pseudo-merge.c b/pseudo-merge.c
> index 28782a31c6..6b6605d3dc 100644
> --- a/pseudo-merge.c
> +++ b/pseudo-merge.c
> @@ -110,6 +110,7 @@ void pseudo_merge_group_release(struct pseudo_merge_group *group)
> free(matches->stable);
> free(matches->unstable);
> free(matches);
> + free((char*)e->key);
> }
> strmap_clear(&group->matches, 0);
> --- 8< ---
>
> That introduces an ugly const-cast, but I think it's tolerable (and may
> be moreso with a comment explaining why it's safe).
Well, to me this is a tradeoff between complexity and performance. If
the performance benefit outweighs the additional complexity that this
shared ownership of keys brings along with it then I'm okay with the
code being intimate with each others lifetime requirements.
But as far as I can see the number of entries is determined by the group
pattern, and I expect that in most cases it's going to be quite limited.
So it does not feel like this should lead to all that many extra
allocations, and thus I tend to prefer the simpler solution.
But please let me know in case you disagree with that assessment.
Patrick
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 17/23] midx-write: fix leaking buffer
2024-09-30 21:27 ` Taylor Blau
@ 2024-10-07 9:41 ` Patrick Steinhardt
0 siblings, 0 replies; 33+ messages in thread
From: Patrick Steinhardt @ 2024-10-07 9:41 UTC (permalink / raw)
To: Taylor Blau; +Cc: git
On Mon, Sep 30, 2024 at 05:27:09PM -0400, Taylor Blau wrote:
> On Mon, Sep 30, 2024 at 11:14:01AM +0200, Patrick Steinhardt wrote:
> > The buffer used to compute the final MIDX name is never released. Plug
> > this memory leak.
> >
> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
> > ---
> > midx-write.c | 2 ++
> > t/t5334-incremental-multi-pack-index.sh | 2 ++
> > 2 files changed, 4 insertions(+)
> >
> > diff --git a/midx-write.c b/midx-write.c
> > index 1ef62c4f4b..625c7d3137 100644
> > --- a/midx-write.c
> > +++ b/midx-write.c
> > @@ -1445,6 +1445,8 @@ static int write_midx_internal(const char *object_dir,
> > return -1;
>
> Would you also want to strbuf_release() the final_midx_name buffer here
> as well?
>
> I guess the point is moot since there are a number of other finalization
> steps that we likewise skip here by returning -1 directly instead of
> jumping to the cleanup sub-routine.
>
> In the above sense I'm OK with it as-is, but it would be nice to verify
> that this portion of the code is leak-free even when we return early
> (e.g., because we couldn't rename() the tempfile into place).
>
> Of course, because final_midx_name is local to the body of this
> conditional, I think you'd need to do something like:
>
> if (ctx.incremntal) {
> struct strbuf final_midx_name = STRBUF_INIT;
>
> /* ... */
>
> get_split_midx_filename_ext(&final_midx_name, object_dir,
> midx_hash, MIDX_EXT_MIDX);
>
> if (rename_tempfile(&incr, final_midx_name.buf) < 0) {
> result = error_errno(_("unable to rename new multi-pack-index layer"));
> strbuf_release(&final_midx_name); /* <- here */
> goto cleanup;
> }
>
> strbuf_release(&final_midx_name); /* ... <- and here */
>
> keep_hashes[ctx.num_multi_pack_indexes_before] =
> xstrdup(hash_to_hex(midx_hash));
> }
Potentially. Until now we never really cared all that much about
releasing resources in error paths, and for all I can see this leak
isn't hit in our test suite. But it is an easy fix to make, so I can
include this in case I'll have to reroll this series due to other
reasons.
Patrick
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 15/23] pseudo-merge: fix leaking strmap keys
2024-10-07 9:41 ` Patrick Steinhardt
@ 2024-10-08 8:54 ` Jeff King
2024-10-09 7:06 ` Patrick Steinhardt
0 siblings, 1 reply; 33+ messages in thread
From: Jeff King @ 2024-10-08 8:54 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: Taylor Blau, git
On Mon, Oct 07, 2024 at 11:41:16AM +0200, Patrick Steinhardt wrote:
> > Hmm. I think I am probably splitting hairs, since a few xstrdup() calls
> > between friends is unlikely to matter here, but... ;-)
> >
> > It does seem wasteful if we can avoid it. We already allocated heap
> > space for the matches via the underlying strbuf, and we really do want
> > to hand ownership over if we can.
> >
> > Would doing something like this on top of your previous patch do the
> > trick?
> >
> > --- >8 ---
> > diff --git a/pseudo-merge.c b/pseudo-merge.c
> > index 28782a31c6..6b6605d3dc 100644
> > --- a/pseudo-merge.c
> > +++ b/pseudo-merge.c
> > @@ -110,6 +110,7 @@ void pseudo_merge_group_release(struct pseudo_merge_group *group)
> > free(matches->stable);
> > free(matches->unstable);
> > free(matches);
> > + free((char*)e->key);
> > }
> > strmap_clear(&group->matches, 0);
> > --- 8< ---
> >
> > That introduces an ugly const-cast, but I think it's tolerable (and may
> > be moreso with a comment explaining why it's safe).
>
> Well, to me this is a tradeoff between complexity and performance. If
> the performance benefit outweighs the additional complexity that this
> shared ownership of keys brings along with it then I'm okay with the
> code being intimate with each others lifetime requirements.
>
> But as far as I can see the number of entries is determined by the group
> pattern, and I expect that in most cases it's going to be quite limited.
> So it does not feel like this should lead to all that many extra
> allocations, and thus I tend to prefer the simpler solution.
I'd tend to agree with you that the allocations aren't a big deal here.
But I think we've run into this kind of strbuf-detaching thing before,
and there's another pattern that is efficient without getting too
intimate between modules. Like this (plus your change to set the
strmap's strdup_strings mode):
diff --git a/pseudo-merge.c b/pseudo-merge.c
index 10ebd9a4e9..fb1164edfa 100644
--- a/pseudo-merge.c
+++ b/pseudo-merge.c
@@ -210,6 +210,7 @@ static int find_pseudo_merge_group_for_ref(const char *refname,
struct bitmap_writer *writer = _data;
struct object_id peeled;
struct commit *c;
+ struct strbuf group_name = STRBUF_INIT;
uint32_t i;
int has_bitmap;
@@ -227,10 +228,11 @@ static int find_pseudo_merge_group_for_ref(const char *refname,
for (i = 0; i < writer->pseudo_merge_groups.nr; i++) {
struct pseudo_merge_group *group;
struct pseudo_merge_matches *matches;
- struct strbuf group_name = STRBUF_INIT;
regmatch_t captures[16];
size_t j;
+ strbuf_reset(&group_name);
+
group = writer->pseudo_merge_groups.items[i].util;
if (regexec(group->pattern, refname, ARRAY_SIZE(captures),
captures, 0))
@@ -256,8 +258,7 @@ static int find_pseudo_merge_group_for_ref(const char *refname,
matches = strmap_get(&group->matches, group_name.buf);
if (!matches) {
matches = xcalloc(1, sizeof(*matches));
- strmap_put(&group->matches, strbuf_detach(&group_name, NULL),
- matches);
+ strmap_put(&group->matches, group_name.buf, matches);
}
if (c->date <= group->stable_threshold) {
@@ -270,9 +271,9 @@ static int find_pseudo_merge_group_for_ref(const char *refname,
matches->unstable[matches->unstable_nr++] = c;
}
- strbuf_release(&group_name);
}
+ strbuf_release(&group_name);
return 0;
}
Now we skip the duplicate allocations because we are reusing the
group_name scratch buffer in the loop over and over. But wait, there's
more! We're actually _more_ efficient than the original which did one
allocation per entry, because:
1. We can allocate the correct number of bytes for each name, rather
than using the over-estimated buffer made by strbuf.
2. In strdup_strings mode, strmap is smart enough to use a FLEXPTR to
stick the name inside the struct. So we've actually reduced the
number of allocations per entry by 1.
Now possibly it is not even worth doing this optimization, because this
is not really a hot path. But it doesn't violate any module boundaries,
and I think the "loop over a reusable strbuf" trick is a general idiom
for solving these kinds of allocation problems. So a good thing to keep
in our toolbox.
-Peff
^ permalink raw reply related [flat|nested] 33+ messages in thread
* Re: [PATCH 15/23] pseudo-merge: fix leaking strmap keys
2024-10-08 8:54 ` Jeff King
@ 2024-10-09 7:06 ` Patrick Steinhardt
0 siblings, 0 replies; 33+ messages in thread
From: Patrick Steinhardt @ 2024-10-09 7:06 UTC (permalink / raw)
To: Jeff King; +Cc: Taylor Blau, git, Junio C Hamano
On Tue, Oct 08, 2024 at 04:54:23AM -0400, Jeff King wrote:
> On Mon, Oct 07, 2024 at 11:41:16AM +0200, Patrick Steinhardt wrote:
>
> > > Hmm. I think I am probably splitting hairs, since a few xstrdup() calls
> > > between friends is unlikely to matter here, but... ;-)
> > >
> > > It does seem wasteful if we can avoid it. We already allocated heap
> > > space for the matches via the underlying strbuf, and we really do want
> > > to hand ownership over if we can.
> > >
> > > Would doing something like this on top of your previous patch do the
> > > trick?
> > >
> > > --- >8 ---
> > > diff --git a/pseudo-merge.c b/pseudo-merge.c
> > > index 28782a31c6..6b6605d3dc 100644
> > > --- a/pseudo-merge.c
> > > +++ b/pseudo-merge.c
> > > @@ -110,6 +110,7 @@ void pseudo_merge_group_release(struct pseudo_merge_group *group)
> > > free(matches->stable);
> > > free(matches->unstable);
> > > free(matches);
> > > + free((char*)e->key);
> > > }
> > > strmap_clear(&group->matches, 0);
> > > --- 8< ---
> > >
> > > That introduces an ugly const-cast, but I think it's tolerable (and may
> > > be moreso with a comment explaining why it's safe).
> >
> > Well, to me this is a tradeoff between complexity and performance. If
> > the performance benefit outweighs the additional complexity that this
> > shared ownership of keys brings along with it then I'm okay with the
> > code being intimate with each others lifetime requirements.
> >
> > But as far as I can see the number of entries is determined by the group
> > pattern, and I expect that in most cases it's going to be quite limited.
> > So it does not feel like this should lead to all that many extra
> > allocations, and thus I tend to prefer the simpler solution.
>
> I'd tend to agree with you that the allocations aren't a big deal here.
> But I think we've run into this kind of strbuf-detaching thing before,
> and there's another pattern that is efficient without getting too
> intimate between modules. Like this (plus your change to set the
> strmap's strdup_strings mode):
>
> diff --git a/pseudo-merge.c b/pseudo-merge.c
> index 10ebd9a4e9..fb1164edfa 100644
> --- a/pseudo-merge.c
> +++ b/pseudo-merge.c
> @@ -210,6 +210,7 @@ static int find_pseudo_merge_group_for_ref(const char *refname,
> struct bitmap_writer *writer = _data;
> struct object_id peeled;
> struct commit *c;
> + struct strbuf group_name = STRBUF_INIT;
> uint32_t i;
> int has_bitmap;
>
> @@ -227,10 +228,11 @@ static int find_pseudo_merge_group_for_ref(const char *refname,
> for (i = 0; i < writer->pseudo_merge_groups.nr; i++) {
> struct pseudo_merge_group *group;
> struct pseudo_merge_matches *matches;
> - struct strbuf group_name = STRBUF_INIT;
> regmatch_t captures[16];
> size_t j;
>
> + strbuf_reset(&group_name);
> +
> group = writer->pseudo_merge_groups.items[i].util;
> if (regexec(group->pattern, refname, ARRAY_SIZE(captures),
> captures, 0))
> @@ -256,8 +258,7 @@ static int find_pseudo_merge_group_for_ref(const char *refname,
> matches = strmap_get(&group->matches, group_name.buf);
> if (!matches) {
> matches = xcalloc(1, sizeof(*matches));
> - strmap_put(&group->matches, strbuf_detach(&group_name, NULL),
> - matches);
> + strmap_put(&group->matches, group_name.buf, matches);
> }
>
> if (c->date <= group->stable_threshold) {
> @@ -270,9 +271,9 @@ static int find_pseudo_merge_group_for_ref(const char *refname,
> matches->unstable[matches->unstable_nr++] = c;
> }
>
> - strbuf_release(&group_name);
> }
>
> + strbuf_release(&group_name);
> return 0;
> }
>
>
> Now we skip the duplicate allocations because we are reusing the
> group_name scratch buffer in the loop over and over. But wait, there's
> more! We're actually _more_ efficient than the original which did one
> allocation per entry, because:
>
> 1. We can allocate the correct number of bytes for each name, rather
> than using the over-estimated buffer made by strbuf.
>
> 2. In strdup_strings mode, strmap is smart enough to use a FLEXPTR to
> stick the name inside the struct. So we've actually reduced the
> number of allocations per entry by 1.
>
> Now possibly it is not even worth doing this optimization, because this
> is not really a hot path. But it doesn't violate any module boundaries,
> and I think the "loop over a reusable strbuf" trick is a general idiom
> for solving these kinds of allocation problems. So a good thing to keep
> in our toolbox.
Nice, thanks for digging! I see that Junio has already merged the topic
to `next` though -- is this a mere "Let this cook before the next cycle
starts" or will it stay in next?
If the latter then I'll leave this as-is, otherwise I'll send a revised
version to make this a bit more efficient.
Thanks!
Patrick
^ permalink raw reply [flat|nested] 33+ messages in thread
end of thread, other threads:[~2024-10-09 7:06 UTC | newest]
Thread overview: 33+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-30 9:13 [PATCH 00/23] Memory leak fixes (pt.8) Patrick Steinhardt
2024-09-30 9:13 ` [PATCH 01/23] builtin/annotate: fix leaking args vector Patrick Steinhardt
2024-09-30 9:13 ` [PATCH 02/23] read-cache: fix leaking hash context in `do_write_index()` Patrick Steinhardt
2024-09-30 14:05 ` Derrick Stolee
2024-09-30 9:13 ` [PATCH 03/23] scalar: fix leaking repositories Patrick Steinhardt
2024-09-30 9:13 ` [PATCH 04/23] shell: fix leaking strings Patrick Steinhardt
2024-09-30 9:13 ` [PATCH 05/23] wt-status: fix leaking buffer with sparse directories Patrick Steinhardt
2024-09-30 9:13 ` [PATCH 06/23] submodule: fix leaking submodule entry list Patrick Steinhardt
2024-09-30 9:13 ` [PATCH 07/23] builtin/stash: fix leaking `pathspec_from_file` Patrick Steinhardt
2024-09-30 9:13 ` [PATCH 08/23] builtin/pack-redundant: fix various memory leaks Patrick Steinhardt
2024-09-30 9:13 ` [PATCH 09/23] builtin/clone: fix leaking repo state when cloning with bundle URIs Patrick Steinhardt
2024-09-30 9:13 ` [PATCH 10/23] t/helper: fix leaking repository in partial-clone helper Patrick Steinhardt
2024-09-30 9:13 ` [PATCH 11/23] builtin/revert: fix leaking `gpg_sign` and `strategy` config Patrick Steinhardt
2024-09-30 9:13 ` [PATCH 12/23] diff: improve lifecycle management of diff queues Patrick Steinhardt
2024-09-30 9:13 ` [PATCH 13/23] line-log: fix several memory leaks Patrick Steinhardt
2024-09-30 9:13 ` [PATCH 14/23] pseudo-merge: fix various " Patrick Steinhardt
2024-09-30 16:21 ` Taylor Blau
2024-09-30 9:13 ` [PATCH 15/23] pseudo-merge: fix leaking strmap keys Patrick Steinhardt
2024-09-30 21:22 ` Taylor Blau
2024-10-07 9:41 ` Patrick Steinhardt
2024-10-08 8:54 ` Jeff King
2024-10-09 7:06 ` Patrick Steinhardt
2024-09-30 9:13 ` [PATCH 16/23] pack-bitmap-write: fix leaking OID array Patrick Steinhardt
2024-09-30 21:22 ` Taylor Blau
2024-09-30 9:14 ` [PATCH 17/23] midx-write: fix leaking buffer Patrick Steinhardt
2024-09-30 21:27 ` Taylor Blau
2024-10-07 9:41 ` Patrick Steinhardt
2024-09-30 9:14 ` [PATCH 18/23] revision: fix memory leaks when rewriting parents Patrick Steinhardt
2024-09-30 9:14 ` [PATCH 19/23] revision: fix leaking saved parents Patrick Steinhardt
2024-09-30 9:14 ` [PATCH 20/23] pack-write: fix return parameter of `write_rev_file_order()` Patrick Steinhardt
2024-09-30 9:14 ` [PATCH 21/23] t/helper: fix leaks in proc-receive helper Patrick Steinhardt
2024-09-30 9:14 ` [PATCH 22/23] remote: fix leaking push reports Patrick Steinhardt
2024-09-30 9:14 ` [PATCH 23/23] builtin/send-pack: fix leaking list of push options Patrick Steinhardt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).