Git development
 help / color / mirror / Atom feed
From: Lorenzo Pegorari <lorenzo.pegorari2002@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Patrick Steinhardt <ps@pks.im>,
	git@vger.kernel.org, Taylor Blau <me@ttaylorr.com>,
	Derrick Stolee <stolee@gmail.com>, Tian Yuchen <cat@malon.dev>,
	Eric Sunshine <sunshine@sunshineco.com>,
	Elijah Newren <newren@gmail.com>
Subject: Re: [GSoC PATCH v5 2/6] repack-promisor add helper to fill promisor file after repack
Date: Sat, 11 Apr 2026 03:59:09 +0200	[thread overview]
Message-ID: <admq7TA9sEPjskO5@lorenzo-VM> (raw)
In-Reply-To: <xmqqo6jqmlzz.fsf@gitster.g>

On Fri, Apr 10, 2026 at 04:30:40PM -0700, Junio C Hamano wrote:
> LorenzoPegorari <lorenzo.pegorari2002@gmail.com> writes:
> 
> > +	dest_pack = parse_pack_index(repo, dest_oid.hash, dest_idx_name);
> 
> parse_pack_index() has this comment:
> 
>         /*
>          * Parse the pack idx file found at idx_path and create a packed_git struct
>          * which can be used with find_pack_entry_one().
>          *
>          * You probably don't want to use this function! It skips most of the normal
>          * sanity checks (including whether we even have the matching .pack file),
>          * and does not add the resulting packed_git struct to the internal list of
>          * packs. You probably want add_packed_git() instead.
>          */
>         struct packed_git *parse_pack_index(struct repository *r, unsigned char *sha1,
>                                             const char *idx_path);
> 
> The function can return NULL, but this caller does not seem to be
> prepared for it to return NULL (i.e., the loop introduced by the
> repo_for_each_pack() macro we see below, nobody assumes dest_pack
> could be NULL).

That's true. The function is created assuming that the promisor files
are correctly formed, but this is unrelated to the promisor files
content, so it should be checked to avoid any issue. I will add a
`die()` if `parse_pack_index()` returns `NULL`, since this means that
something bad happened considering that we literally just created the
pack that we want to retrieve.

> But what pack index file are we parsing here?

`parse_pack_index()` is used to create a temporary `packed_git` for the
just created pack, so that we can then check if <oid> (for each line of
each ".promisor" file) appears in it (and then append that line to the
".promisor" file of the just created pack).

> Isn't it already part
> of the running system that we should be able to find on the list of
> packfiles in the packfile store?

No, the new pack's `packed_git` doesn't appear in the
`repo_for_each_pack` macro loop, if that is what you are asking.

> Is this because we lack "find a
> packfile on this packfile store by name" API, because what we want
> to find if each of <oid>s we have appear in the particular packfile
> or not, and packfile_list_find_oid() is not sufficiently precise (i.e.
> "the object appears in one of the packfile on the list" is not what
> we want to know, "the object appears in this particular packfile" is)?

Yes, it's pretty much this.

> 	Patrick CC'ed primarily because this part of the API and the
> 	data structures have been reshuffled to add quite a lot of
> 	abstraction since I last looked at the area.
> 
> As close_pack_index(dest_pack) does not release resources held by
> dest_pack itself (even though the region of mmaped memory that is
> pointed at by its index_data member is unmapped), I think that is
> where the memory leak is breaking the CI jobs (see my other message).
>
> But I am not sure if the use of parse_pack_index() - close_pack_index()
> API is the right thing to use here.

I struggled a lot to understand how to deal with a pack opened with
`parse_pack_index()`, considering that this function is only used once
in the codebase. In the end I decided to simply use `close_pack_index()`
as it is done in this other instance where `parse_pack_index()` is used.

Having done more research now, I realized that the issue is that the
code is missing a `free(dest_pack)`. With this line, no leak happens,
and all GitHub Actions-based CI tests are green.

Also, I think that `close_pack_index()` is indeed the correct way to
close a `packed_git` opened with `parse_pack_index()` (instead of using
the generic `close_pack()`, or not closing it at all).


Finally, having looked more deeply at that other instance where
`parse_pack_index()` is used, I believe there actually is a possible
memory leak that might rarely ever happen. Isn't there a `free()`
missing, like this:

---
 http.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/http.c b/http.c
index 8ea1b9d1f6..e765852071 100644
--- a/http.c
+++ b/http.c
@@ -2446,8 +2446,10 @@ static int fetch_and_setup_pack_index(struct packfile_list *packs,
 	if (!ret)
 		close_pack_index(new_pack);
 	free(tmp_idx);
-	if (ret)
+	if (ret) {
+		free(new_pack);
 		return -1;
+	}

 	packfile_list_prepend(packs, new_pack);
 	return 0;
--
 
> > +	/* Open the .promisor dest file, and fill dest_content with its content */
> > +	dest_promisor_name = mkpathdup("%s-%s.promisor", packtmp, dest_hex);
> > +	dest = xfopen(dest_promisor_name, "r+");
> > +	while (strbuf_getline(&line, dest) != EOF)
> > +		strset_add(&dest_content, line.buf);
> > +
> > +	repo_for_each_pack(repo, p) {
> > +		FILE *source;
> > +		struct stat source_stat;
> > +
> > +		if (!p->pack_promisor)
> > +			continue;
> > +
> > +		if (not_repacked_basenames &&
> > +			strset_contains(not_repacked_basenames, pack_basename(p)))
> > +			continue;
> > +

Thanks Junio,

Lorenzo

  reply	other threads:[~2026-04-11  1:59 UTC|newest]

Thread overview: 78+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-21 21:28 [GSoC PATCH 0/3] preserve promisor files content after repack LorenzoPegorari
2026-03-21 21:28 ` [GSoC PATCH 1/3] pack-write: add explanation to promisor file content LorenzoPegorari
2026-03-21 21:28 ` [GSoC PATCH 2/3] pack-write: add helper to fill promisor file after repack LorenzoPegorari
2026-03-22  2:04   ` Eric Sunshine
2026-03-22 18:50     ` Lorenzo Pegorari
2026-03-21 21:29 ` [GSoC PATCH 3/3] repack-promisor: preserve content of promisor files " LorenzoPegorari
2026-03-22 19:16 ` [GSoC PATCH v2 0/4] preserve promisor files content " LorenzoPegorari
2026-03-22 19:16   ` [GSoC PATCH v2 1/4] pack-write: add explanation to promisor file content LorenzoPegorari
2026-03-23 21:07     ` Junio C Hamano
2026-03-25 21:33       ` Lorenzo Pegorari
2026-03-22 19:18   ` [GSoC PATCH v2 2/4] pack-write: add helper to fill promisor file after repack LorenzoPegorari
2026-03-23 20:27     ` Eric Sunshine
2026-03-26 16:15       ` Lorenzo Pegorari
2026-03-23 21:30     ` Junio C Hamano
2026-03-26  2:01       ` Lorenzo Pegorari
2026-03-22 19:18   ` [GSoC PATCH v2 3/4] repack-promisor: preserve content of promisor files " LorenzoPegorari
2026-03-23 21:48     ` Junio C Hamano
2026-03-26  2:12       ` Lorenzo Pegorari
2026-03-22 19:18   ` [GSoC PATCH v2 4/4] t7700: test for promisor file content " LorenzoPegorari
2026-04-06  0:23   ` [GSoC PATCH v3 0/5] preserve promisor files " LorenzoPegorari
2026-04-06  0:24     ` [GSoC PATCH v3 1/5] pack-write: add explanation to promisor file content LorenzoPegorari
2026-04-06  0:24     ` [GSoC PATCH v3 2/5] pack-write: add helper to fill promisor file after repack LorenzoPegorari
2026-04-06 17:22       ` Tian Yuchen
2026-04-06 18:40         ` Lorenzo Pegorari
2026-04-06 21:17           ` Junio C Hamano
2026-04-07 21:46             ` Lorenzo Pegorari
2026-04-07  2:01           ` Junio C Hamano
2026-04-07 21:52             ` Lorenzo Pegorari
2026-04-07 22:03               ` Junio C Hamano
2026-04-06 21:34       ` Junio C Hamano
2026-04-07 22:07         ` Lorenzo Pegorari
2026-04-06  0:25     ` [GSoC PATCH v3 3/5] repack-promisor: preserve content of promisor files " LorenzoPegorari
2026-04-06  0:25     ` [GSoC PATCH v3 4/5] t7700: test for promisor file content " LorenzoPegorari
2026-04-06 22:05       ` Junio C Hamano
2026-04-07 23:28         ` Lorenzo Pegorari
2026-04-07 18:10       ` Junio C Hamano
2026-04-07 23:11         ` Lorenzo Pegorari
2026-04-08  0:38           ` Lorenzo Pegorari
2026-04-06  0:25     ` [GSoC PATCH v3 5/5] t7703: test for promisor file content after geometric repack LorenzoPegorari
2026-04-10 15:01     ` [GSoC PATCH v4 0/5] preserve promisor files content after repack LorenzoPegorari
2026-04-10 15:02       ` [GSoC PATCH v4 1/5] pack-write: add explanation to promisor file content LorenzoPegorari
2026-04-10 15:02       ` [GSoC PATCH v4 2/5] pack-write: add helper to fill promisor file after repack LorenzoPegorari
2026-04-10 16:01         ` Junio C Hamano
2026-04-10 16:34           ` Lorenzo Pegorari
2026-04-10 18:10           ` [PATCH] CodingGuidelines: st_mtimespec vs st_mtim vs st_mtime Junio C Hamano
2026-04-16 23:46             ` Elijah Newren
2026-04-17  4:25               ` Junio C Hamano
2026-04-10 15:03       ` [GSoC PATCH v4 3/5] repack-promisor: preserve content of promisor files after repack LorenzoPegorari
2026-04-10 15:04       ` [GSoC PATCH v4 4/5] t7700: test for promisor file content " LorenzoPegorari
2026-04-10 15:04       ` [GSoC PATCH v4 5/5] t7703: test for promisor file content after geometric repack LorenzoPegorari
2026-04-10 15:47       ` [GSoC PATCH v4 0/5] preserve promisor files content after repack Junio C Hamano
2026-04-10 16:44         ` Lorenzo Pegorari
2026-04-10 22:54       ` [GSoC PATCH v5 0/6] " LorenzoPegorari
2026-04-10 22:54         ` [GSoC PATCH v5 1/6] pack-write: add explanation to promisor file content LorenzoPegorari
2026-04-10 22:55         ` [GSoC PATCH v5 2/6] repack-promisor add helper to fill promisor file after repack LorenzoPegorari
2026-04-10 23:30           ` Junio C Hamano
2026-04-11  1:59             ` Lorenzo Pegorari [this message]
2026-04-12  6:27           ` Junio C Hamano
2026-04-17  0:30             ` Lorenzo Pegorari
2026-04-10 22:55         ` [GSoC PATCH v5 3/6] repack-promisor: preserve content of promisor files " LorenzoPegorari
2026-04-11 18:25           ` Tian Yuchen
2026-04-17  0:34             ` Lorenzo Pegorari
2026-04-10 22:55         ` [GSoC PATCH v5 4/6] t7700: test for promisor file content " LorenzoPegorari
2026-04-10 22:56         ` [GSoC PATCH v5 5/6] t7703: test for promisor file content after geometric repack LorenzoPegorari
2026-04-11 18:49           ` Tian Yuchen
2026-04-17  0:46             ` Lorenzo Pegorari
2026-04-10 22:56         ` [GSoC PATCH v5 6/6] repack-promisor: add missing headers LorenzoPegorari
2026-04-18 14:16         ` [GSoC PATCH v6 0/6] preserve promisor files content after repack LorenzoPegorari
2026-04-18 14:16           ` [GSoC PATCH v6 1/6] pack-write: add explanation to promisor file content LorenzoPegorari
2026-04-18 14:17           ` [GSoC PATCH v6 2/6] repack-promisor add helper to fill promisor file after repack LorenzoPegorari
2026-04-18 14:17           ` [GSoC PATCH v6 3/6] repack-promisor: preserve content of promisor files " LorenzoPegorari
2026-04-18 14:17           ` [GSoC PATCH v6 4/6] t7700: test for promisor file content " LorenzoPegorari
2026-04-18 14:17           ` [GSoC PATCH v6 5/6] t7703: test for promisor file content after geometric repack LorenzoPegorari
2026-04-18 14:17           ` [GSoC PATCH v6 6/6] repack-promisor: add missing headers LorenzoPegorari
2026-05-12  6:49           ` [GSoC PATCH v6 0/6] preserve promisor files content after repack Junio C Hamano
2026-04-10 23:05       ` [GSoC PATCH v4 0/5] " Junio C Hamano
2026-04-11  2:02         ` Junio C Hamano
2026-04-11 14:05           ` Lorenzo Pegorari

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=admq7TA9sEPjskO5@lorenzo-VM \
    --to=lorenzo.pegorari2002@gmail.com \
    --cc=cat@malon.dev \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=me@ttaylorr.com \
    --cc=newren@gmail.com \
    --cc=ps@pks.im \
    --cc=stolee@gmail.com \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox