From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: avarab@gmail.com, dstolee@microsoft.com, peff@peff.net
Subject: [PATCH v2 0/9] midx: clean up t5319 under 'SANITIZE=leak'
Date: Tue, 26 Oct 2021 17:01:00 -0400 [thread overview]
Message-ID: <cover.1635282024.git.me@ttaylorr.com> (raw)
Here is a small-ish update to my series which makes t5319 leak-free (i.e., it
passes even when Git is compiled with SANITIZE=leak). It is based on
ab/only-single-progress-at-once (so I dropped the cherry-picked patch towards
the end from Ævar).
Most of the fixes are straightforward from review, but the biggest change is in
7/10, which converts get_midx_filename() to write its result to a strbuf per
Junio's helpful suggestion.
Thinking ahead, there are a few things I noted while reading through replies for
future improvements to the MIDX and pack bitmap code. From my ~/notes, they are:
- load_multi_pack_index() should not die when missing the packfile names chunk,
instead should pretend that no MIDX exists (maybe warn?)
- double-close when reaching the cleanup_fail codepath after xmmap returns
- static `verify_midx_error` is ugly
- load_bitmap() does not clean up after itself very well (e.g.,
bitmap_git->trees and others)
I'm pretty happy with the state of this topic, and I'll plan on taking on the
above in separate series in the future.
Taylor Blau (9):
midx.c: clean up chunkfile after reading the MIDX
midx.c: don't leak MIDX from verify_midx_file
t/helper/test-read-midx.c: free MIDX within read_midx_file()
builtin/pack-objects.c: don't leak memory via arguments
builtin/repack.c: avoid leaking child arguments
builtin/multi-pack-index.c: don't leak concatenated options
midx.c: write MIDX filenames to strbuf
pack-bitmap.c: don't leak type-level bitmaps
pack-bitmap.c: more aggressively free in free_bitmap_index()
builtin/multi-pack-index.c | 4 +++
builtin/pack-objects.c | 11 ++++---
builtin/repack.c | 23 +++++++++----
midx.c | 66 ++++++++++++++++++++++----------------
midx.h | 4 +--
pack-bitmap.c | 29 ++++++++++++++---
pack-revindex.c | 8 ++---
t/helper/test-read-midx.c | 3 +-
8 files changed, 97 insertions(+), 51 deletions(-)
Range-diff against v1:
1: 30f6f23daf = 1: dcc5998072 midx.c: clean up chunkfile after reading the MIDX
2: b0c79904ab ! 2: 258a9e2e57 midx.c: don't leak MIDX from verify_midx_file
@@ Metadata
## Commit message ##
midx.c: don't leak MIDX from verify_midx_file
- The function midx.c:verify_midx_file() allocate a MIDX struct by calling
- load_multi_pack_index(). But when cleaning up, it calls free() without
- freeing any resources associated with the MIDX.
+ The function midx.c:verify_midx_file() allocates a MIDX struct by
+ calling load_multi_pack_index(). But when cleaning up, it calls free()
+ without freeing any resources associated with the MIDX.
Call the more appropriate close_midx() which does free those resources,
which causes t5319.3 to pass when Git is compiled with SANITIZE=leak.
3: 5157edb41e = 3: 84859d5b53 t/helper/test-read-midx.c: free MIDX within read_midx_file()
4: dd3b9a949e = 4: aedb1713b4 builtin/pack-objects.c: don't leak memory via arguments
5: a68c77c006 ! 5: bcd12ecab8 builtin/repack.c: avoid leaking child arguments
@@ Commit message
In none of these cases do we bother to call child_process_clear(), which
frees the memory associated with each child's arguments and environment.
- In order to do so, tweak each function that spawns a child process to
- have a `cleanup` label that we always visit before returning from each
- function. Then, make sure that we call child_process_clear() as a part
- of that label.
+ Make sure that we call child_process_clear() in any functions which
+ initialize a struct child_process before returning along a path which
+ did not call finish_command().
+
+ In cmd_repack(), take a slightly different approach to use a cleanup
+ label to clear the child_process, unless finish_command() was called.
+ This allows us to free other memory allocated during the lifetime of
+ that function. But it avoids calling child_process_clear() twice (the
+ other call coming from inside of finish_command()) to avoid assuming the
+ function's implementation is idempotent.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
## builtin/repack.c ##
@@ builtin/repack.c: static void repack_promisor_objects(const struct pack_objects_args *args,
- struct child_process cmd = CHILD_PROCESS_INIT;
- FILE *out;
- struct strbuf line = STRBUF_INIT;
-+ int ret = 0;
+ for_each_packed_object(write_oid, &cmd,
+ FOR_EACH_OBJECT_PROMISOR_ONLY);
- prepare_pack_objects(&cmd, args);
- cmd.in = -1;
-@@ builtin/repack.c: static void repack_promisor_objects(const struct pack_objects_args *args,
-
- if (cmd.in == -1)
+- if (cmd.in == -1)
++ if (cmd.in == -1) {
/* No packed objects; cmd was never started */
-- return;
-+ goto cleanup;
++ child_process_clear(&cmd);
+ return;
++ }
close(cmd.in);
-@@ builtin/repack.c: static void repack_promisor_objects(const struct pack_objects_args *args,
- free(promisor_name);
- }
- fclose(out);
-- if (finish_command(&cmd))
-+ ret = finish_command(&cmd);
-+
-+cleanup:
-+ child_process_clear(&cmd);
-+
-+ if (ret)
- die(_("could not finish pack-objects to repack promisor objects"));
- }
-
-@@ builtin/repack.c: static int write_midx_included_packs(struct string_list *include,
- struct string_list_item *item;
- struct packed_git *largest = get_largest_active_pack(geometry);
- FILE *in;
-- int ret;
-+ int ret = 0;
-
- if (!include->nr)
-- return 0;
-+ goto cleanup;
-
- cmd.in = -1;
- cmd.git_cmd = 1;
@@ builtin/repack.c: static int write_midx_included_packs(struct string_list *include,
+ strvec_pushf(&cmd.args, "--refs-snapshot=%s", refs_snapshot);
ret = start_command(&cmd);
- if (ret)
-- return ret;
-+ goto cleanup;
+- if (ret)
++ if (ret) {
++ child_process_clear(&cmd);
+ return ret;
++ }
in = xfdopen(cmd.in, "w");
for_each_string_list_item(item, include)
- fprintf(in, "%s\n", item->string);
- fclose(in);
-
-- return finish_command(&cmd);
-+ ret = finish_command(&cmd);
-+
-+cleanup:
-+ child_process_clear(&cmd);
-+
-+ return ret;
- }
-
- int cmd_repack(int argc, const char **argv, const char *prefix)
@@ builtin/repack.c: int cmd_repack(int argc, const char **argv, const char *prefix)
struct pack_geometry *geometry = NULL;
struct strbuf line = STRBUF_INIT;
@@ builtin/repack.c: int cmd_repack(int argc, const char **argv, const char *prefix
+ int i, ext, ret = 0;
FILE *out;
int show_progress = isatty(2);
++ int cmd_cleared = 0;
+ /* variables to be filled by option parsing */
+ int pack_everything = 0;
@@ builtin/repack.c: int cmd_repack(int argc, const char **argv, const char *prefix)
ret = start_command(&cmd);
@@ builtin/repack.c: int cmd_repack(int argc, const char **argv, const char *prefix
if (geometry) {
FILE *in = xfdopen(cmd.in, "w");
@@ builtin/repack.c: int cmd_repack(int argc, const char **argv, const char *prefix)
+ }
fclose(out);
ret = finish_command(&cmd);
++ cmd_cleared = 1;
if (ret)
- return ret;
+ goto cleanup;
@@ builtin/repack.c: int cmd_repack(int argc, const char **argv, const char *prefix
string_list_clear(&existing_kept_packs, 0);
clear_pack_geometry(geometry);
strbuf_release(&line);
-+ child_process_clear(&cmd);
++ if (!cmd_cleared)
++ child_process_clear(&cmd);
- return 0;
+ return ret;
6: ffded80c7d = 6: 0d252cd323 builtin/multi-pack-index.c: don't leak concatenated options
7: f3897c3afc < -: ---------- pack-bitmap.c: avoid leaking via midx_bitmap_filename()
-: ---------- > 7: 0f293ab638 midx.c: write MIDX filenames to strbuf
8: 29920e7735 = 8: 77a4454632 pack-bitmap.c: don't leak type-level bitmaps
9: e65ac7deb5 ! 9: c1e7e6cc92 pack-bitmap.c: more aggressively free in free_bitmap_index()
@@ Commit message
The function free_bitmap_index() is somewhat lax in what it frees. There
are two notable examples:
- - While it does call kh_destroy_oid_map on the "bitmaps" map (which
- maps) commit OIDs to their corresponding bitmaps, the bitmaps
+ - While it does call kh_destroy_oid_map on the "bitmaps" map, which
+ maps commit OIDs to their corresponding bitmaps, the bitmaps
themselves are not freed. Note here that we recycle already-freed
ewah_bitmaps into a pool, but these are handled correctly by
ewah_pool_free().
10: cb30aa67c0 < -: ---------- pack-bitmap-write.c: don't return without stop_progress()
11: f1bb8b73ff < -: ---------- t5319: UNLEAK() the remaining leaks
--
2.33.0.96.g73915697e6
next reply other threads:[~2021-10-26 21:01 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-10-26 21:01 Taylor Blau [this message]
2021-10-26 21:01 ` [PATCH v2 1/9] midx.c: clean up chunkfile after reading the MIDX Taylor Blau
2021-10-26 21:01 ` [PATCH v2 2/9] midx.c: don't leak MIDX from verify_midx_file Taylor Blau
2021-10-26 21:01 ` [PATCH v2 3/9] t/helper/test-read-midx.c: free MIDX within read_midx_file() Taylor Blau
2021-10-26 21:01 ` [PATCH v2 4/9] builtin/pack-objects.c: don't leak memory via arguments Taylor Blau
2021-10-26 21:01 ` [PATCH v2 5/9] builtin/repack.c: avoid leaking child arguments Taylor Blau
2021-10-27 23:44 ` Junio C Hamano
2021-10-28 20:25 ` Taylor Blau
2021-10-26 21:01 ` [PATCH v2 6/9] builtin/multi-pack-index.c: don't leak concatenated options Taylor Blau
2021-10-26 21:01 ` [PATCH v2 7/9] midx.c: write MIDX filenames to strbuf Taylor Blau
2021-10-26 21:01 ` [PATCH v2 8/9] pack-bitmap.c: don't leak type-level bitmaps Taylor Blau
2021-10-26 21:01 ` [PATCH v2 9/9] pack-bitmap.c: more aggressively free in free_bitmap_index() Taylor Blau
2021-10-27 23:49 ` [PATCH v2 0/9] midx: clean up t5319 under 'SANITIZE=leak' Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cover.1635282024.git.me@ttaylorr.com \
--to=me@ttaylorr.com \
--cc=avarab@gmail.com \
--cc=dstolee@microsoft.com \
--cc=git@vger.kernel.org \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).