Git development
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: LorenzoPegorari <lorenzo.pegorari2002@gmail.com>,
	Patrick Steinhardt <ps@pks.im>
Cc: git@vger.kernel.org,  Taylor Blau <me@ttaylorr.com>,
	 Derrick Stolee <stolee@gmail.com>,
	 Patrick Steinhardt <ps@pks.im>,  Tian Yuchen <cat@malon.dev>,
	 Eric Sunshine <sunshine@sunshineco.com>,
	 Elijah Newren <newren@gmail.com>
Subject: Re: [GSoC PATCH v5 2/6] repack-promisor add helper to fill promisor file after repack
Date: Fri, 10 Apr 2026 16:30:40 -0700	[thread overview]
Message-ID: <xmqqo6jqmlzz.fsf@gitster.g> (raw)
In-Reply-To: <3558bb38956b522c91057598db645eb42ffb48b2.1775861047.git.lorenzo.pegorari2002@gmail.com> (LorenzoPegorari's message of "Sat, 11 Apr 2026 00:55:11 +0200")

LorenzoPegorari <lorenzo.pegorari2002@gmail.com> writes:

> +	dest_pack = parse_pack_index(repo, dest_oid.hash, dest_idx_name);

parse_pack_index() has this comment:

        /*
         * Parse the pack idx file found at idx_path and create a packed_git struct
         * which can be used with find_pack_entry_one().
         *
         * You probably don't want to use this function! It skips most of the normal
         * sanity checks (including whether we even have the matching .pack file),
         * and does not add the resulting packed_git struct to the internal list of
         * packs. You probably want add_packed_git() instead.
         */
        struct packed_git *parse_pack_index(struct repository *r, unsigned char *sha1,
                                            const char *idx_path);

The function can return NULL, but this caller does not seem to be
prepared for it to return NULL (i.e., the loop introduced by the
repo_for_each_pack() macro we see below, nobody assumes dest_pack
could be NULL).

But what pack index file are we parsing here?  Isn't it already part
of the running system that we should be able to find on the list of
packfiles in the packfile store?  Is this because we lack "find a
packfile on this packfile store by name" API, because what we want
to find if each of <oid>s we have appear in the particular packfile
or not, and packfile_list_find_oid() is not sufficiently precise (i.e.
"the object appears in one of the packfile on the list" is not what
we want to know, "the object appears in this particular packfile" is)?

	Patrick CC'ed primarily because this part of the API and the
	data structures have been reshuffled to add quite a lot of
	abstraction since I last looked at the area.

As close_pack_index(dest_pack) does not release resources held by
dest_pack itself (even though the region of mmaped memory that is
pointed at by its index_data member is unmapped), I think that is
where the memory leak is breaking the CI jobs (see my other message).

But I am not sure if the use of parse_pack_index() - close_pack_index()
API is the right thing to use here.



> +	/* Open the .promisor dest file, and fill dest_content with its content */
> +	dest_promisor_name = mkpathdup("%s-%s.promisor", packtmp, dest_hex);
> +	dest = xfopen(dest_promisor_name, "r+");
> +	while (strbuf_getline(&line, dest) != EOF)
> +		strset_add(&dest_content, line.buf);
> +
> +	repo_for_each_pack(repo, p) {
> +		FILE *source;
> +		struct stat source_stat;
> +
> +		if (!p->pack_promisor)
> +			continue;
> +
> +		if (not_repacked_basenames &&
> +			strset_contains(not_repacked_basenames, pack_basename(p)))
> +			continue;
> +
> +		strbuf_reset(&source_promisor_name);
> +		strbuf_addstr(&source_promisor_name, p->pack_name);
> +		strbuf_strip_suffix(&source_promisor_name, ".pack");
> +		strbuf_addstr(&source_promisor_name, ".promisor");
> +
> +		if (stat(source_promisor_name.buf, &source_stat))
> +			die(_("File not found: %s"), source_promisor_name.buf);
> +
> +		source = xfopen(source_promisor_name.buf, "r");
> +
> +		while (strbuf_getline(&line, source) != EOF) {
> +			struct string_list line_sections = STRING_LIST_INIT_DUP;
> +			struct object_id oid;
> +
> +			/* Split line into <oid>, <ref> and <time> (if <time> exists) */
> +			string_list_split(&line_sections, line.buf, " ", 3);
> +
> +			/* Ignore the lines where <oid> doesn't appear in the dest_pack */
> +			get_oid_hex_algop(line_sections.items[0].string, &oid, repo->hash_algo);
> +			if (!find_pack_entry_one(&oid, dest_pack)) {
> +				string_list_clear(&line_sections, 0);
> +				continue;
> +			}
> +
> +			/* If <time> doesn't exist, retrieve it and add it to line */
> +			if (line_sections.nr < 3)
> +				strbuf_addf(&line, " %" PRItime, (timestamp_t)source_stat.st_mtime);
> +
> +			/*
> +			 * Add the finalized line to dest_to_write and dest_content if it
> +			 * wasn't already present inside dest_content
> +			 */
> +			if (strset_add(&dest_content, line.buf)) {
> +				strbuf_addbuf(&dest_to_write, &line);
> +				strbuf_addch(&dest_to_write, '\n');
> +			}
> +
> +			string_list_clear(&line_sections, 0);
> +		}
> +
> +		err = ferror(source);
> +		err |= fclose(source);
> +		if (err)
> +			die(_("Could not read '%s' promisor file"), source_promisor_name.buf);
> +	}
> +
> +	/* If dest_to_write is not empty, then there are new lines to append */
> +	if (dest_to_write.len) {
> +		if (fseek(dest, 0L, SEEK_END))
> +			die_errno(_("fseek failed"));
> +		fprintf(dest, "%s", dest_to_write.buf);
> +	}
> +
> +	err = ferror(dest);
> +	err |= fclose(dest);
> +	if (err)
> +		die(_("Could not write '%s' promisor file"), dest_promisor_name);
> +
> +	close_pack_index(dest_pack);
> +	free(dest_idx_name);
> +	free(dest_promisor_name);
> +	strset_clear(&dest_content);
> +	strbuf_release(&dest_to_write);
> +	strbuf_release(&source_promisor_name);
> +	strbuf_release(&line);
> +}
> +
>  static void finish_repacking_promisor_objects(struct repository *repo,
>  					      struct child_process *cmd,
>  					      struct string_list *names,

  reply	other threads:[~2026-04-10 23:30 UTC|newest]

Thread overview: 78+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-21 21:28 [GSoC PATCH 0/3] preserve promisor files content after repack LorenzoPegorari
2026-03-21 21:28 ` [GSoC PATCH 1/3] pack-write: add explanation to promisor file content LorenzoPegorari
2026-03-21 21:28 ` [GSoC PATCH 2/3] pack-write: add helper to fill promisor file after repack LorenzoPegorari
2026-03-22  2:04   ` Eric Sunshine
2026-03-22 18:50     ` Lorenzo Pegorari
2026-03-21 21:29 ` [GSoC PATCH 3/3] repack-promisor: preserve content of promisor files " LorenzoPegorari
2026-03-22 19:16 ` [GSoC PATCH v2 0/4] preserve promisor files content " LorenzoPegorari
2026-03-22 19:16   ` [GSoC PATCH v2 1/4] pack-write: add explanation to promisor file content LorenzoPegorari
2026-03-23 21:07     ` Junio C Hamano
2026-03-25 21:33       ` Lorenzo Pegorari
2026-03-22 19:18   ` [GSoC PATCH v2 2/4] pack-write: add helper to fill promisor file after repack LorenzoPegorari
2026-03-23 20:27     ` Eric Sunshine
2026-03-26 16:15       ` Lorenzo Pegorari
2026-03-23 21:30     ` Junio C Hamano
2026-03-26  2:01       ` Lorenzo Pegorari
2026-03-22 19:18   ` [GSoC PATCH v2 3/4] repack-promisor: preserve content of promisor files " LorenzoPegorari
2026-03-23 21:48     ` Junio C Hamano
2026-03-26  2:12       ` Lorenzo Pegorari
2026-03-22 19:18   ` [GSoC PATCH v2 4/4] t7700: test for promisor file content " LorenzoPegorari
2026-04-06  0:23   ` [GSoC PATCH v3 0/5] preserve promisor files " LorenzoPegorari
2026-04-06  0:24     ` [GSoC PATCH v3 1/5] pack-write: add explanation to promisor file content LorenzoPegorari
2026-04-06  0:24     ` [GSoC PATCH v3 2/5] pack-write: add helper to fill promisor file after repack LorenzoPegorari
2026-04-06 17:22       ` Tian Yuchen
2026-04-06 18:40         ` Lorenzo Pegorari
2026-04-06 21:17           ` Junio C Hamano
2026-04-07 21:46             ` Lorenzo Pegorari
2026-04-07  2:01           ` Junio C Hamano
2026-04-07 21:52             ` Lorenzo Pegorari
2026-04-07 22:03               ` Junio C Hamano
2026-04-06 21:34       ` Junio C Hamano
2026-04-07 22:07         ` Lorenzo Pegorari
2026-04-06  0:25     ` [GSoC PATCH v3 3/5] repack-promisor: preserve content of promisor files " LorenzoPegorari
2026-04-06  0:25     ` [GSoC PATCH v3 4/5] t7700: test for promisor file content " LorenzoPegorari
2026-04-06 22:05       ` Junio C Hamano
2026-04-07 23:28         ` Lorenzo Pegorari
2026-04-07 18:10       ` Junio C Hamano
2026-04-07 23:11         ` Lorenzo Pegorari
2026-04-08  0:38           ` Lorenzo Pegorari
2026-04-06  0:25     ` [GSoC PATCH v3 5/5] t7703: test for promisor file content after geometric repack LorenzoPegorari
2026-04-10 15:01     ` [GSoC PATCH v4 0/5] preserve promisor files content after repack LorenzoPegorari
2026-04-10 15:02       ` [GSoC PATCH v4 1/5] pack-write: add explanation to promisor file content LorenzoPegorari
2026-04-10 15:02       ` [GSoC PATCH v4 2/5] pack-write: add helper to fill promisor file after repack LorenzoPegorari
2026-04-10 16:01         ` Junio C Hamano
2026-04-10 16:34           ` Lorenzo Pegorari
2026-04-10 18:10           ` [PATCH] CodingGuidelines: st_mtimespec vs st_mtim vs st_mtime Junio C Hamano
2026-04-16 23:46             ` Elijah Newren
2026-04-17  4:25               ` Junio C Hamano
2026-04-10 15:03       ` [GSoC PATCH v4 3/5] repack-promisor: preserve content of promisor files after repack LorenzoPegorari
2026-04-10 15:04       ` [GSoC PATCH v4 4/5] t7700: test for promisor file content " LorenzoPegorari
2026-04-10 15:04       ` [GSoC PATCH v4 5/5] t7703: test for promisor file content after geometric repack LorenzoPegorari
2026-04-10 15:47       ` [GSoC PATCH v4 0/5] preserve promisor files content after repack Junio C Hamano
2026-04-10 16:44         ` Lorenzo Pegorari
2026-04-10 22:54       ` [GSoC PATCH v5 0/6] " LorenzoPegorari
2026-04-10 22:54         ` [GSoC PATCH v5 1/6] pack-write: add explanation to promisor file content LorenzoPegorari
2026-04-10 22:55         ` [GSoC PATCH v5 2/6] repack-promisor add helper to fill promisor file after repack LorenzoPegorari
2026-04-10 23:30           ` Junio C Hamano [this message]
2026-04-11  1:59             ` Lorenzo Pegorari
2026-04-12  6:27           ` Junio C Hamano
2026-04-17  0:30             ` Lorenzo Pegorari
2026-04-10 22:55         ` [GSoC PATCH v5 3/6] repack-promisor: preserve content of promisor files " LorenzoPegorari
2026-04-11 18:25           ` Tian Yuchen
2026-04-17  0:34             ` Lorenzo Pegorari
2026-04-10 22:55         ` [GSoC PATCH v5 4/6] t7700: test for promisor file content " LorenzoPegorari
2026-04-10 22:56         ` [GSoC PATCH v5 5/6] t7703: test for promisor file content after geometric repack LorenzoPegorari
2026-04-11 18:49           ` Tian Yuchen
2026-04-17  0:46             ` Lorenzo Pegorari
2026-04-10 22:56         ` [GSoC PATCH v5 6/6] repack-promisor: add missing headers LorenzoPegorari
2026-04-18 14:16         ` [GSoC PATCH v6 0/6] preserve promisor files content after repack LorenzoPegorari
2026-04-18 14:16           ` [GSoC PATCH v6 1/6] pack-write: add explanation to promisor file content LorenzoPegorari
2026-04-18 14:17           ` [GSoC PATCH v6 2/6] repack-promisor add helper to fill promisor file after repack LorenzoPegorari
2026-04-18 14:17           ` [GSoC PATCH v6 3/6] repack-promisor: preserve content of promisor files " LorenzoPegorari
2026-04-18 14:17           ` [GSoC PATCH v6 4/6] t7700: test for promisor file content " LorenzoPegorari
2026-04-18 14:17           ` [GSoC PATCH v6 5/6] t7703: test for promisor file content after geometric repack LorenzoPegorari
2026-04-18 14:17           ` [GSoC PATCH v6 6/6] repack-promisor: add missing headers LorenzoPegorari
2026-05-12  6:49           ` [GSoC PATCH v6 0/6] preserve promisor files content after repack Junio C Hamano
2026-04-10 23:05       ` [GSoC PATCH v4 0/5] " Junio C Hamano
2026-04-11  2:02         ` Junio C Hamano
2026-04-11 14:05           ` Lorenzo Pegorari

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqo6jqmlzz.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=cat@malon.dev \
    --cc=git@vger.kernel.org \
    --cc=lorenzo.pegorari2002@gmail.com \
    --cc=me@ttaylorr.com \
    --cc=newren@gmail.com \
    --cc=ps@pks.im \
    --cc=stolee@gmail.com \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox