All of lore.kernel.org
 help / color / mirror / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: avarab@gmail.com, dstolee@microsoft.com, peff@peff.net
Subject: [PATCH v2 0/9] midx: clean up t5319 under 'SANITIZE=leak'
Date: Tue, 26 Oct 2021 17:01:00 -0400	[thread overview]
Message-ID: <cover.1635282024.git.me@ttaylorr.com> (raw)

Here is a small-ish update to my series which makes t5319 leak-free (i.e., it
passes even when Git is compiled with SANITIZE=leak). It is based on
ab/only-single-progress-at-once (so I dropped the cherry-picked patch towards
the end from Ævar).

Most of the fixes are straightforward from review, but the biggest change is in
7/10, which converts get_midx_filename() to write its result to a strbuf per
Junio's helpful suggestion.

Thinking ahead, there are a few things I noted while reading through replies for
future improvements to the MIDX and pack bitmap code. From my ~/notes, they are:

  - load_multi_pack_index() should not die when missing the packfile names chunk,
    instead should pretend that no MIDX exists (maybe warn?)
  - double-close when reaching the cleanup_fail codepath after xmmap returns
  - static `verify_midx_error` is ugly
  - load_bitmap() does not clean up after itself very well (e.g.,
    bitmap_git->trees and others)

I'm pretty happy with the state of this topic, and I'll plan on taking on the
above in separate series in the future.

Taylor Blau (9):
  midx.c: clean up chunkfile after reading the MIDX
  midx.c: don't leak MIDX from verify_midx_file
  t/helper/test-read-midx.c: free MIDX within read_midx_file()
  builtin/pack-objects.c: don't leak memory via arguments
  builtin/repack.c: avoid leaking child arguments
  builtin/multi-pack-index.c: don't leak concatenated options
  midx.c: write MIDX filenames to strbuf
  pack-bitmap.c: don't leak type-level bitmaps
  pack-bitmap.c: more aggressively free in free_bitmap_index()

 builtin/multi-pack-index.c |  4 +++
 builtin/pack-objects.c     | 11 ++++---
 builtin/repack.c           | 23 +++++++++----
 midx.c                     | 66 ++++++++++++++++++++++----------------
 midx.h                     |  4 +--
 pack-bitmap.c              | 29 ++++++++++++++---
 pack-revindex.c            |  8 ++---
 t/helper/test-read-midx.c  |  3 +-
 8 files changed, 97 insertions(+), 51 deletions(-)

Range-diff against v1:
 1:  30f6f23daf =  1:  dcc5998072 midx.c: clean up chunkfile after reading the MIDX
 2:  b0c79904ab !  2:  258a9e2e57 midx.c: don't leak MIDX from verify_midx_file
    @@ Metadata
      ## Commit message ##
         midx.c: don't leak MIDX from verify_midx_file
     
    -    The function midx.c:verify_midx_file() allocate a MIDX struct by calling
    -    load_multi_pack_index(). But when cleaning up, it calls free() without
    -    freeing any resources associated with the MIDX.
    +    The function midx.c:verify_midx_file() allocates a MIDX struct by
    +    calling load_multi_pack_index(). But when cleaning up, it calls free()
    +    without freeing any resources associated with the MIDX.
     
         Call the more appropriate close_midx() which does free those resources,
         which causes t5319.3 to pass when Git is compiled with SANITIZE=leak.
 3:  5157edb41e =  3:  84859d5b53 t/helper/test-read-midx.c: free MIDX within read_midx_file()
 4:  dd3b9a949e =  4:  aedb1713b4 builtin/pack-objects.c: don't leak memory via arguments
 5:  a68c77c006 !  5:  bcd12ecab8 builtin/repack.c: avoid leaking child arguments
    @@ Commit message
         In none of these cases do we bother to call child_process_clear(), which
         frees the memory associated with each child's arguments and environment.
     
    -    In order to do so, tweak each function that spawns a child process to
    -    have a `cleanup` label that we always visit before returning from each
    -    function. Then, make sure that we call child_process_clear() as a part
    -    of that label.
    +    Make sure that we call child_process_clear() in any functions which
    +    initialize a struct child_process before returning along a path which
    +    did not call finish_command().
    +
    +    In cmd_repack(), take a slightly different approach to use a cleanup
    +    label to clear the child_process, unless finish_command() was called.
    +    This allows us to free other memory allocated during the lifetime of
    +    that function. But it avoids calling child_process_clear() twice (the
    +    other call coming from inside of finish_command()) to avoid assuming the
    +    function's implementation is idempotent.
     
         Signed-off-by: Taylor Blau <me@ttaylorr.com>
     
      ## builtin/repack.c ##
     @@ builtin/repack.c: static void repack_promisor_objects(const struct pack_objects_args *args,
    - 	struct child_process cmd = CHILD_PROCESS_INIT;
    - 	FILE *out;
    - 	struct strbuf line = STRBUF_INIT;
    -+	int ret = 0;
    + 	for_each_packed_object(write_oid, &cmd,
    + 			       FOR_EACH_OBJECT_PROMISOR_ONLY);
      
    - 	prepare_pack_objects(&cmd, args);
    - 	cmd.in = -1;
    -@@ builtin/repack.c: static void repack_promisor_objects(const struct pack_objects_args *args,
    - 
    - 	if (cmd.in == -1)
    +-	if (cmd.in == -1)
    ++	if (cmd.in == -1) {
      		/* No packed objects; cmd was never started */
    --		return;
    -+		goto cleanup;
    ++		child_process_clear(&cmd);
    + 		return;
    ++	}
      
      	close(cmd.in);
      
    -@@ builtin/repack.c: static void repack_promisor_objects(const struct pack_objects_args *args,
    - 		free(promisor_name);
    - 	}
    - 	fclose(out);
    --	if (finish_command(&cmd))
    -+	ret = finish_command(&cmd);
    -+
    -+cleanup:
    -+	child_process_clear(&cmd);
    -+
    -+	if (ret)
    - 		die(_("could not finish pack-objects to repack promisor objects"));
    - }
    - 
    -@@ builtin/repack.c: static int write_midx_included_packs(struct string_list *include,
    - 	struct string_list_item *item;
    - 	struct packed_git *largest = get_largest_active_pack(geometry);
    - 	FILE *in;
    --	int ret;
    -+	int ret = 0;
    - 
    - 	if (!include->nr)
    --		return 0;
    -+		goto cleanup;
    - 
    - 	cmd.in = -1;
    - 	cmd.git_cmd = 1;
     @@ builtin/repack.c: static int write_midx_included_packs(struct string_list *include,
    + 		strvec_pushf(&cmd.args, "--refs-snapshot=%s", refs_snapshot);
      
      	ret = start_command(&cmd);
    - 	if (ret)
    --		return ret;
    -+		goto cleanup;
    +-	if (ret)
    ++	if (ret) {
    ++		child_process_clear(&cmd);
    + 		return ret;
    ++	}
      
      	in = xfdopen(cmd.in, "w");
      	for_each_string_list_item(item, include)
    - 		fprintf(in, "%s\n", item->string);
    - 	fclose(in);
    - 
    --	return finish_command(&cmd);
    -+	ret = finish_command(&cmd);
    -+
    -+cleanup:
    -+	child_process_clear(&cmd);
    -+
    -+	return ret;
    - }
    - 
    - int cmd_repack(int argc, const char **argv, const char *prefix)
     @@ builtin/repack.c: int cmd_repack(int argc, const char **argv, const char *prefix)
      	struct pack_geometry *geometry = NULL;
      	struct strbuf line = STRBUF_INIT;
    @@ builtin/repack.c: int cmd_repack(int argc, const char **argv, const char *prefix
     +	int i, ext, ret = 0;
      	FILE *out;
      	int show_progress = isatty(2);
    ++	int cmd_cleared = 0;
      
    + 	/* variables to be filled by option parsing */
    + 	int pack_everything = 0;
     @@ builtin/repack.c: int cmd_repack(int argc, const char **argv, const char *prefix)
      
      	ret = start_command(&cmd);
    @@ builtin/repack.c: int cmd_repack(int argc, const char **argv, const char *prefix
      	if (geometry) {
      		FILE *in = xfdopen(cmd.in, "w");
     @@ builtin/repack.c: int cmd_repack(int argc, const char **argv, const char *prefix)
    + 	}
      	fclose(out);
      	ret = finish_command(&cmd);
    ++	cmd_cleared = 1;
      	if (ret)
     -		return ret;
     +		goto cleanup;
    @@ builtin/repack.c: int cmd_repack(int argc, const char **argv, const char *prefix
      	string_list_clear(&existing_kept_packs, 0);
      	clear_pack_geometry(geometry);
      	strbuf_release(&line);
    -+	child_process_clear(&cmd);
    ++	if (!cmd_cleared)
    ++		child_process_clear(&cmd);
      
     -	return 0;
     +	return ret;
 6:  ffded80c7d =  6:  0d252cd323 builtin/multi-pack-index.c: don't leak concatenated options
 7:  f3897c3afc <  -:  ---------- pack-bitmap.c: avoid leaking via midx_bitmap_filename()
 -:  ---------- >  7:  0f293ab638 midx.c: write MIDX filenames to strbuf
 8:  29920e7735 =  8:  77a4454632 pack-bitmap.c: don't leak type-level bitmaps
 9:  e65ac7deb5 !  9:  c1e7e6cc92 pack-bitmap.c: more aggressively free in free_bitmap_index()
    @@ Commit message
         The function free_bitmap_index() is somewhat lax in what it frees. There
         are two notable examples:
     
    -      - While it does call kh_destroy_oid_map on the "bitmaps" map (which
    -        maps) commit OIDs to their corresponding bitmaps, the bitmaps
    +      - While it does call kh_destroy_oid_map on the "bitmaps" map, which
    +        maps commit OIDs to their corresponding bitmaps, the bitmaps
             themselves are not freed. Note here that we recycle already-freed
             ewah_bitmaps into a pool, but these are handled correctly by
             ewah_pool_free().
10:  cb30aa67c0 <  -:  ---------- pack-bitmap-write.c: don't return without stop_progress()
11:  f1bb8b73ff <  -:  ---------- t5319: UNLEAK() the remaining leaks
-- 
2.33.0.96.g73915697e6

             reply	other threads:[~2021-10-26 21:01 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-26 21:01 Taylor Blau [this message]
2021-10-26 21:01 ` [PATCH v2 1/9] midx.c: clean up chunkfile after reading the MIDX Taylor Blau
2021-10-26 21:01 ` [PATCH v2 2/9] midx.c: don't leak MIDX from verify_midx_file Taylor Blau
2021-10-26 21:01 ` [PATCH v2 3/9] t/helper/test-read-midx.c: free MIDX within read_midx_file() Taylor Blau
2021-10-26 21:01 ` [PATCH v2 4/9] builtin/pack-objects.c: don't leak memory via arguments Taylor Blau
2021-10-26 21:01 ` [PATCH v2 5/9] builtin/repack.c: avoid leaking child arguments Taylor Blau
2021-10-27 23:44   ` Junio C Hamano
2021-10-28 20:25     ` Taylor Blau
2021-10-26 21:01 ` [PATCH v2 6/9] builtin/multi-pack-index.c: don't leak concatenated options Taylor Blau
2021-10-26 21:01 ` [PATCH v2 7/9] midx.c: write MIDX filenames to strbuf Taylor Blau
2021-10-26 21:01 ` [PATCH v2 8/9] pack-bitmap.c: don't leak type-level bitmaps Taylor Blau
2021-10-26 21:01 ` [PATCH v2 9/9] pack-bitmap.c: more aggressively free in free_bitmap_index() Taylor Blau
2021-10-27 23:49 ` [PATCH v2 0/9] midx: clean up t5319 under 'SANITIZE=leak' Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cover.1635282024.git.me@ttaylorr.com \
    --to=me@ttaylorr.com \
    --cc=avarab@gmail.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.