From: Taylor Blau <me@ttaylorr.com>
To: Lidong Yan via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, Patrick Steinhardt <ps@pks.im>,
Eric Sunshine <sunshine@sunshineco.com>,
Lidong Yan <502024330056@smail.nju.edu.cn>
Subject: Re: [PATCH v4] pack-bitmap: remove checks before bitmap_free
Date: Tue, 3 Jun 2025 18:09:26 -0400 [thread overview]
Message-ID: <aD9ylkFDWqapFjey@nand.local> (raw)
In-Reply-To: <pull.1977.v4.git.git.1748931650166.gitgitgadget@gmail.com>
On Tue, Jun 03, 2025 at 06:20:49AM +0000, Lidong Yan via GitGitGadget wrote:
> From: Lidong Yan <502024330056@smail.nju.edu.cn>
>
> In pack-bitmap.c:find_boundary_objects(), the roots_bitmap is only freed
> if cascade_pseudo_merges_1() fails. Since cascade_pseudo_merges_1() only
> use roots_bitmap as a mutable reference but not takes roots_bitmap's
> ownership. Once cascade_pseudo_merges_1 succeed(), roots_bitmap leaks.
> And this leak currently lacks a dedicated test to detect it.
>
> To fix this leak, remove if cascade_pseudo_merges_1() succeed check and
> always calling bitmap_free(roots_bitmap);
This sentence might be more clear if it were written as:
To fix this leak, unconditionally free the roots_bitmap regardless
of whether or not cascade_pseudo_merges_1() succeeds.
> To trigger this leak, we need a pseudo-merge whose size is equal to
> or smaller than roots_bitmap (which corresponds to the set of "haves"
> commits in prepare_bitmap_walk()). To do this, we can create two
> commits: A and B. Add A to the pseudo-merge list and perform a traversal
> over the range A..B. In this scenario, the "haves" set will be {A},
> and cascade_pseudo_merges_1() will succeed, thereby exposing the leak
> due to the missing roots_bitmap cleanup.
I don't think this is quite right. Calling cascade_pseudo_merges_1()
succeeds (and returns a non-zero value) when one or more pseudo-merges
are satisfied. A pseudo-merge is satisfied here when its parents bitmap
is a *subset* of the roots_bitmap, not when it has a smaller size.
The precise definition of one bitmap being a subset of another can be
found in ewah/bitmap.c::ewah_bitamp_is_subset(). But in general one
bitmap is a subset of the other if the set of bit positions with value
"1" from one is a subset of the same set from the other bitmap.
I think that's what you meant by "smaller", but I think it's worth
clarifying here.
> diff --git a/pack-bitmap.c b/pack-bitmap.c
> index ac6d62b980c..8727f316de9 100644
> --- a/pack-bitmap.c
> +++ b/pack-bitmap.c
> @@ -1363,8 +1363,8 @@ static struct bitmap *find_boundary_objects(struct bitmap_index *bitmap_git,
> bitmap_set(roots_bitmap, pos);
> }
>
> - if (!cascade_pseudo_merges_1(bitmap_git, cb.base, roots_bitmap))
> - bitmap_free(roots_bitmap);
> + cascade_pseudo_merges_1(bitmap_git, cb.base, roots_bitmap);
> + bitmap_free(roots_bitmap);
Makes sense.
> diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh
> index 56674db562f..e665001a410 100755
> --- a/t/t5333-pseudo-merge-bitmaps.sh
> +++ b/t/t5333-pseudo-merge-bitmaps.sh
> @@ -445,4 +445,24 @@ test_expect_success 'pseudo-merge closure' '
> )
> '
>
> +test_expect_success 'use pseudo-merge in boundary traversal' '
> + git init pseudo-merge-boundary-traversal &&
> + (
> + cd pseudo-merge-boundary-traversal &&
> +
> + git config bitmapPseudoMerge.test.pattern refs/ &&
> + git config bitmapPseudoMerge.test.threshold now &&
Setting the unstable threshold here should be unnecessary, since the
unstable portion of the group only includes matching commits beyond the
threshold that *don't* already have a bitmap. Since "A" is the only
commit at the time you write the bitmap below, it will always be
selected, and thus never appear in the unstable portion of a
pseudo-merge group.
> + git config bitmapPseudoMerge.test.stableThreshold now &&
This one is technically unnecessary, but only because test_commit starts
at the $test_tick value, which is very far in the past (beyond the
default value of 1.month.ago).
> + test_commit A &&
> + git repack -adb &&
> + test_commit B &&
> +
> + echo '1' >expect &&
Please do not use single-quotes in a test script. It happens to work in
this instance, but it is easy to break.
> + GIT_TEST_PACK_USE_BITMAP_BOUNDARY_TRAVERSAL=1 \
> + git rev-list --count --use-bitmap-index HEAD~1..HEAD >actual &&
This test needs to use the boundary-based bitmap traversal routines, but
I'm unclear on why you're using the GIT_TEST_-environment variable to
enable them.
Is there a reason that we can't rely on the usual repository
configuration here? I would have expected something like this (which
should apply cleanly on top of your patch):
--- 8< ---
diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh
index e665001a41..491ef404ea 100755
--- a/t/t5333-pseudo-merge-bitmaps.sh
+++ b/t/t5333-pseudo-merge-bitmaps.sh
@@ -453,14 +453,14 @@ test_expect_success 'use pseudo-merge in boundary traversal' '
git config bitmapPseudoMerge.test.pattern refs/ &&
git config bitmapPseudoMerge.test.threshold now &&
git config bitmapPseudoMerge.test.stableThreshold now &&
+ git config pack.useBitmapBoundaryTraversal true &&
test_commit A &&
git repack -adb &&
test_commit B &&
- echo '1' >expect &&
- GIT_TEST_PACK_USE_BITMAP_BOUNDARY_TRAVERSAL=1 \
- git rev-list --count --use-bitmap-index HEAD~1..HEAD >actual &&
+ echo 1 >expect &&
+ git rev-list --count --use-bitmap-index HEAD~1..HEAD >actual &&
test_cmp expect actual
)
'
--- >8 ---
> + test_cmp expect actual
Hmm. I suppose, although it feels a little clunky to me to write
something like "echo 1 >expect". I would imagine that you'd do something
like:
test 1 -eq $(git rev-list --count --use-bitmap-index HEAD~1..HEAD)
instead. Or if you wanted to split them off into separate lines, you
could do:
nr=$(git rev-list --count --use-bitmap-index HEAD~1..HEAD) &&
test 1 -eq "$nr"
Thanks,
Taylor
next prev parent reply other threads:[~2025-06-03 22:09 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-25 5:09 [PATCH] pack-bitmap: remove checks before bitmap_free Lidong Yan via GitGitGadget
2025-05-26 6:49 ` Patrick Steinhardt
2025-05-26 16:05 ` lidongyan
2025-05-30 18:14 ` [PATCH v2 0/2] " Lidong Yan via GitGitGadget
2025-05-30 18:14 ` [PATCH v2 1/2] " Lidong Yan via GitGitGadget
2025-05-30 18:14 ` [PATCH v2 2/2] t5333: test memory leak when use pseudo-merge in boundary traversal Lidong Yan via GitGitGadget
2025-05-30 21:42 ` Junio C Hamano
2025-05-30 21:50 ` Eric Sunshine
2025-05-31 3:18 ` lidongyan
2025-05-30 21:06 ` [PATCH v2 0/2] pack-bitmap: remove checks before bitmap_free Junio C Hamano
2025-06-03 1:46 ` [PATCH v3] " Lidong Yan via GitGitGadget
2025-06-03 6:12 ` Junio C Hamano
2025-06-03 6:22 ` lidongyan
2025-06-03 15:14 ` Junio C Hamano
2025-06-03 15:32 ` lidongyan
2025-06-04 12:32 ` Junio C Hamano
2025-06-04 12:43 ` lidongyan
2025-06-04 14:49 ` Junio C Hamano
2025-06-03 6:20 ` [PATCH v4] " Lidong Yan via GitGitGadget
2025-06-03 22:09 ` Taylor Blau [this message]
2025-06-04 2:50 ` lidongyan
2025-06-05 6:24 ` [PATCH v5] " Lidong Yan via GitGitGadget
2025-06-05 15:29 ` Junio C Hamano
2025-06-10 5:58 ` lidongyan
2025-06-05 15:53 ` [PATCH v6] " Lidong Yan via GitGitGadget
2025-06-06 1:28 ` Junio C Hamano
2025-06-06 5:49 ` lidongyan
2025-06-09 8:18 ` [PATCH v7] " Lidong Yan via GitGitGadget
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aD9ylkFDWqapFjey@nand.local \
--to=me@ttaylorr.com \
--cc=502024330056@smail.nju.edu.cn \
--cc=git@vger.kernel.org \
--cc=gitgitgadget@gmail.com \
--cc=ps@pks.im \
--cc=sunshine@sunshineco.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.