* [GSoC PATCH 1/3] pack-write: add explanation to promisor file content
2026-03-21 21:28 [GSoC PATCH 0/3] preserve promisor files content after repack LorenzoPegorari
@ 2026-03-21 21:28 ` LorenzoPegorari
2026-03-21 21:28 ` [GSoC PATCH 2/3] pack-write: add helper to fill promisor file after repack LorenzoPegorari
` (2 subsequent siblings)
3 siblings, 0 replies; 19+ messages in thread
From: LorenzoPegorari @ 2026-03-21 21:28 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Taylor Blau, Karthik Nayak, Junio C Hamano
In the entire codebase there is no explanation as to why the ".promisor"
files may contain the ref names (and their associated hashes) that were
fetched at the time the corresponding packfile was downloaded.
Add comment explaining that these pieces of information are used only for
debugging reasons, and how they can be used while debugging.
Signed-off-by: LorenzoPegorari <lorenzo.pegorari2002@gmail.com>
---
pack-write.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/pack-write.c b/pack-write.c
index 83eaf88541..6a2023327e 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -603,6 +603,15 @@ void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_
int i, err;
FILE *output = xfopen(promisor_name, "w");
+ /*
+ * Write in the .promisor file the ref names and associated hashes,
+ * obtained by fetch-pack, at the point of generation of the
+ * corresponding packfile. These pieces of info are only used to make
+ * it easier to debug issues with partial clones, as we can identify
+ * what refs (and their associated hashes) were fetched at the time
+ * the packfile was downloaded, and if necessary, compare those hashes
+ * against what the promisor remote reports now.
+ */
for (i = 0; i < nr_sought; i++)
fprintf(output, "%s %s\n", oid_to_hex(&sought[i]->old_oid),
sought[i]->name);
--
2.43.0
^ permalink raw reply related [flat|nested] 19+ messages in thread* [GSoC PATCH 2/3] pack-write: add helper to fill promisor file after repack
2026-03-21 21:28 [GSoC PATCH 0/3] preserve promisor files content after repack LorenzoPegorari
2026-03-21 21:28 ` [GSoC PATCH 1/3] pack-write: add explanation to promisor file content LorenzoPegorari
@ 2026-03-21 21:28 ` LorenzoPegorari
2026-03-22 2:04 ` Eric Sunshine
2026-03-21 21:29 ` [GSoC PATCH 3/3] repack-promisor: preserve content of promisor files " LorenzoPegorari
2026-03-22 19:16 ` [GSoC PATCH v2 0/4] preserve promisor files content " LorenzoPegorari
3 siblings, 1 reply; 19+ messages in thread
From: LorenzoPegorari @ 2026-03-21 21:28 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Taylor Blau, Karthik Nayak, Junio C Hamano
Create a `copy_all_promisor_files()` helper function used to copy the
contents of all ".promisor" files in a `repository` inside another
".promisor" file.
This function can be used to preserve the contents of all ".promisor"
files inside a new ".promisor" file, for example when a repack happens.
This function is written in such a way so that it will read all the
".promisor" files inside the given `repository` line by line, and copy
only the lines that are not already present in the destination file. This
is done to avoid copying the same lines multiple times that may come from
multiple (redundant) packfiles. A better way to achieve this might be (is
definitely) possible.
Signed-off-by: LorenzoPegorari <lorenzo.pegorari2002@gmail.com>
---
pack-write.c | 62 ++++++++++++++++++++++++++++++++++++++++++++++++++++
pack.h | 1 +
2 files changed, 63 insertions(+)
diff --git a/pack-write.c b/pack-write.c
index 6a2023327e..3620e6bd02 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -621,3 +621,65 @@ void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_
if (err)
die(_("could not write '%s' promisor file"), promisor_name);
}
+
+void copy_all_promisor_files(struct repository *repo, const char *promisor_name)
+{
+ struct strbuf promisor_source_name = STRBUF_INIT;
+ struct strbuf read_source = STRBUF_INIT, read_dest = STRBUF_INIT;
+ struct strbuf write_dest = STRBUF_INIT;
+ int err;
+
+ FILE *dest = xfopen(promisor_name, "r+");
+
+ struct packed_git *p;
+ repo_for_each_pack(repo, p) {
+ if (!p->pack_promisor)
+ continue;
+
+ strbuf_reset(&promisor_source_name);
+ strbuf_addstr(&promisor_source_name, p->pack_name);
+ strbuf_strip_suffix(&promisor_source_name, ".pack");
+ strbuf_addstr(&promisor_source_name, ".promisor");
+ FILE *source = xfopen(promisor_source_name.buf, "r");
+
+ /*
+ * For each line of the promisor source file, check if it already
+ * is in the promisor dest file. If not, add it to write_dest, so
+ * that it will be written in the dest file.
+ */
+ while (strbuf_getline(&read_source, source) != EOF) {
+ if (fseek(dest, 0L, SEEK_SET))
+ die_errno(_("fseek failed"));
+ int is_source_in_dest = 0;
+ while (strbuf_getline(&read_dest, dest) != EOF) {
+ if (!strbuf_cmp(&read_source, &read_dest)) {
+ is_source_in_dest = 1;
+ break;
+ }
+ }
+ if (!is_source_in_dest) {
+ strbuf_addbuf(&write_dest, &read_source);
+ strbuf_addstr(&write_dest, "\n");
+ }
+ }
+
+ if (write_dest.len) {
+ strbuf_strip_suffix(&write_dest, "\n");
+ if (fseek(dest, 0L, SEEK_END))
+ die_errno(_("fseek failed"));
+ fprintf(dest, "%s\n", write_dest.buf);
+ fflush(dest);
+ strbuf_reset(&write_dest);
+ }
+
+ err = ferror(source);
+ err |= fclose(source);
+ if (err)
+ die(_("could not read '%s' promisor file"), promisor_source_name.buf);
+ }
+
+ err = ferror(dest);
+ err |= fclose(dest);
+ if (err)
+ die(_("could not write '%s' promisor file"), promisor_name);
+}
diff --git a/pack.h b/pack.h
index ec76472e49..509e90edba 100644
--- a/pack.h
+++ b/pack.h
@@ -105,6 +105,7 @@ char *index_pack_lockfile(struct repository *r, int fd, int *is_well_formed);
struct ref;
void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_sought);
+void copy_all_promisor_files(struct repository *repo, const char *promisor_name);
char *write_rev_file(struct repository *repo,
const char *rev_name,
--
2.43.0
^ permalink raw reply related [flat|nested] 19+ messages in thread* Re: [GSoC PATCH 2/3] pack-write: add helper to fill promisor file after repack
2026-03-21 21:28 ` [GSoC PATCH 2/3] pack-write: add helper to fill promisor file after repack LorenzoPegorari
@ 2026-03-22 2:04 ` Eric Sunshine
2026-03-22 18:50 ` Lorenzo Pegorari
0 siblings, 1 reply; 19+ messages in thread
From: Eric Sunshine @ 2026-03-22 2:04 UTC (permalink / raw)
To: LorenzoPegorari
Cc: git, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Junio C Hamano
On Sat, Mar 21, 2026 at 5:29 PM LorenzoPegorari
<lorenzo.pegorari2002@gmail.com> wrote:
> Create a `copy_all_promisor_files()` helper function used to copy the
> contents of all ".promisor" files in a `repository` inside another
> ".promisor" file.
>
> This function can be used to preserve the contents of all ".promisor"
> files inside a new ".promisor" file, for example when a repack happens.
>
> This function is written in such a way so that it will read all the
> ".promisor" files inside the given `repository` line by line, and copy
> only the lines that are not already present in the destination file. This
> is done to avoid copying the same lines multiple times that may come from
> multiple (redundant) packfiles. A better way to achieve this might be (is
> definitely) possible.
>
> Signed-off-by: LorenzoPegorari <lorenzo.pegorari2002@gmail.com>
> ---
> diff --git a/pack-write.c b/pack-write.c
> @@ -621,3 +621,65 @@ void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_
> +void copy_all_promisor_files(struct repository *repo, const char *promisor_name)
> +{
> + struct strbuf promisor_source_name = STRBUF_INIT;
> + struct strbuf read_source = STRBUF_INIT, read_dest = STRBUF_INIT;
> + struct strbuf write_dest = STRBUF_INIT;
These strbufs don't seem to be released, thus are leaked.
> + int err;
> +
> + FILE *dest = xfopen(promisor_name, "r+");
> +
> + struct packed_git *p;
Style nit: Place all the variable declarations together (without blank
lines), followed by a blank line.
> + repo_for_each_pack(repo, p) {
> + if (!p->pack_promisor)
> + continue;
> +
> + strbuf_reset(&promisor_source_name);
> + strbuf_addstr(&promisor_source_name, p->pack_name);
> + strbuf_strip_suffix(&promisor_source_name, ".pack");
> + strbuf_addstr(&promisor_source_name, ".promisor");
> + FILE *source = xfopen(promisor_source_name.buf, "r");
This project still frowns upon variable declaration after code. You
will want to declare `FILE *source;` at the top of this loop body and
then assign `source = xfopen(...)` here.
> + /*
> + * For each line of the promisor source file, check if it already
> + * is in the promisor dest file. If not, add it to write_dest, so
> + * that it will be written in the dest file.
> + */
> + while (strbuf_getline(&read_source, source) != EOF) {
> + if (fseek(dest, 0L, SEEK_SET))
> + die_errno(_("fseek failed"));
> + int is_source_in_dest = 0;
Ditto regarding variable declaration following code.
> + while (strbuf_getline(&read_dest, dest) != EOF) {
> + if (!strbuf_cmp(&read_source, &read_dest)) {
> + is_source_in_dest = 1;
> + break;
> + }
> + }
> + if (!is_source_in_dest) {
> + strbuf_addbuf(&write_dest, &read_source);
> + strbuf_addstr(&write_dest, "\n");
> + }
The commit message talks about this, and it is indeed very ugly that
this re-reads `dest` from the beginning for *every* `source` line. Is
there a reason you can't simply read `dest` into a `strset` (see Git's
`strmap.h`) in its entirety before entering the repo_for_each_pack()
loop and then merely check the strset for existence using
strset_add()?
> + }
> +
> + if (write_dest.len) {
> + strbuf_strip_suffix(&write_dest, "\n");
> + if (fseek(dest, 0L, SEEK_END))
> + die_errno(_("fseek failed"));
> + fprintf(dest, "%s\n", write_dest.buf);
> + fflush(dest);
> + strbuf_reset(&write_dest);
> + }
> +
> + err = ferror(source);
> + err |= fclose(source);
> + if (err)
> + die(_("could not read '%s' promisor file"), promisor_source_name.buf);
> + }
> +
> + err = ferror(dest);
> + err |= fclose(dest);
> + if (err)
> + die(_("could not write '%s' promisor file"), promisor_name);
> +}
^ permalink raw reply [flat|nested] 19+ messages in thread* Re: [GSoC PATCH 2/3] pack-write: add helper to fill promisor file after repack
2026-03-22 2:04 ` Eric Sunshine
@ 2026-03-22 18:50 ` Lorenzo Pegorari
0 siblings, 0 replies; 19+ messages in thread
From: Lorenzo Pegorari @ 2026-03-22 18:50 UTC (permalink / raw)
To: Eric Sunshine
Cc: git, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Junio C Hamano
On Sat, Mar 21, 2026 at 10:04:01PM -0400, Eric Sunshine wrote:
> On Sat, Mar 21, 2026 at 5:29 PM LorenzoPegorari
> <lorenzo.pegorari2002@gmail.com> wrote:
> > Create a `copy_all_promisor_files()` helper function used to copy the
> > contents of all ".promisor" files in a `repository` inside another
> > ".promisor" file.
> >
> > This function can be used to preserve the contents of all ".promisor"
> > files inside a new ".promisor" file, for example when a repack happens.
> >
> > This function is written in such a way so that it will read all the
> > ".promisor" files inside the given `repository` line by line, and copy
> > only the lines that are not already present in the destination file. This
> > is done to avoid copying the same lines multiple times that may come from
> > multiple (redundant) packfiles. A better way to achieve this might be (is
> > definitely) possible.
> >
> > Signed-off-by: LorenzoPegorari <lorenzo.pegorari2002@gmail.com>
> > ---
> > diff --git a/pack-write.c b/pack-write.c
> > @@ -621,3 +621,65 @@ void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_
> > +void copy_all_promisor_files(struct repository *repo, const char *promisor_name)
> > +{
> > + struct strbuf promisor_source_name = STRBUF_INIT;
> > + struct strbuf read_source = STRBUF_INIT, read_dest = STRBUF_INIT;
> > + struct strbuf write_dest = STRBUF_INIT;
>
> These strbufs don't seem to be released, thus are leaked.
Of course... trivial mistake. Will fix it in v2.
> > + int err;
> > +
> > + FILE *dest = xfopen(promisor_name, "r+");
> > +
> > + struct packed_git *p;
>
> Style nit: Place all the variable declarations together (without blank
> lines), followed by a blank line.
Ack.
> > + repo_for_each_pack(repo, p) {
> > + if (!p->pack_promisor)
> > + continue;
> > +
> > + strbuf_reset(&promisor_source_name);
> > + strbuf_addstr(&promisor_source_name, p->pack_name);
> > + strbuf_strip_suffix(&promisor_source_name, ".pack");
> > + strbuf_addstr(&promisor_source_name, ".promisor");
> > + FILE *source = xfopen(promisor_source_name.buf, "r");
>
> This project still frowns upon variable declaration after code. You
> will want to declare `FILE *source;` at the top of this loop body and
> then assign `source = xfopen(...)` here.
Ack.
> > + /*
> > + * For each line of the promisor source file, check if it already
> > + * is in the promisor dest file. If not, add it to write_dest, so
> > + * that it will be written in the dest file.
> > + */
> > + while (strbuf_getline(&read_source, source) != EOF) {
> > + if (fseek(dest, 0L, SEEK_SET))
> > + die_errno(_("fseek failed"));
> > + int is_source_in_dest = 0;
>
> Ditto regarding variable declaration following code.
Ack.
> > + while (strbuf_getline(&read_dest, dest) != EOF) {
> > + if (!strbuf_cmp(&read_source, &read_dest)) {
> > + is_source_in_dest = 1;
> > + break;
> > + }
> > + }
> > + if (!is_source_in_dest) {
> > + strbuf_addbuf(&write_dest, &read_source);
> > + strbuf_addstr(&write_dest, "\n");
> > + }
>
> The commit message talks about this, and it is indeed very ugly that
> this re-reads `dest` from the beginning for *every* `source` line. Is
> there a reason you can't simply read `dest` into a `strset` (see Git's
> `strmap.h`) in its entirety before entering the repo_for_each_pack()
> loop and then merely check the strset for existence using
> strset_add()?
No reason at all, except for me to knowing about `strset`! Thanks for
suggesting it to me. Will use it in v2.
> > + }
> > +
> > + if (write_dest.len) {
> > + strbuf_strip_suffix(&write_dest, "\n");
> > + if (fseek(dest, 0L, SEEK_END))
> > + die_errno(_("fseek failed"));
> > + fprintf(dest, "%s\n", write_dest.buf);
> > + fflush(dest);
> > + strbuf_reset(&write_dest);
> > + }
> > +
> > + err = ferror(source);
> > + err |= fclose(source);
> > + if (err)
> > + die(_("could not read '%s' promisor file"), promisor_source_name.buf);
> > + }
> > +
> > + err = ferror(dest);
> > + err |= fclose(dest);
> > + if (err)
> > + die(_("could not write '%s' promisor file"), promisor_name);
> > +}
^ permalink raw reply [flat|nested] 19+ messages in thread
* [GSoC PATCH 3/3] repack-promisor: preserve content of promisor files after repack
2026-03-21 21:28 [GSoC PATCH 0/3] preserve promisor files content after repack LorenzoPegorari
2026-03-21 21:28 ` [GSoC PATCH 1/3] pack-write: add explanation to promisor file content LorenzoPegorari
2026-03-21 21:28 ` [GSoC PATCH 2/3] pack-write: add helper to fill promisor file after repack LorenzoPegorari
@ 2026-03-21 21:29 ` LorenzoPegorari
2026-03-22 19:16 ` [GSoC PATCH v2 0/4] preserve promisor files content " LorenzoPegorari
3 siblings, 0 replies; 19+ messages in thread
From: LorenzoPegorari @ 2026-03-21 21:29 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Taylor Blau, Karthik Nayak, Junio C Hamano
When a repack involving promisor packfiles happens, the new ".promisor"
file is created empty, losing all the debug info that might be present
inside the ".promisor" files before the repack.
Use the "copy_all_promisor_files()" function created previously to
preserve the contents of all ".promisor" files inside the first
".promisor" file created by the repack.
Also, update the documentation accordingly.
Signed-off-by: LorenzoPegorari <lorenzo.pegorari2002@gmail.com>
---
Documentation/git-repack.adoc | 4 ++--
repack-promisor.c | 23 +++++++++++++++--------
2 files changed, 17 insertions(+), 10 deletions(-)
diff --git a/Documentation/git-repack.adoc b/Documentation/git-repack.adoc
index 673ce91083..33d3c8afbd 100644
--- a/Documentation/git-repack.adoc
+++ b/Documentation/git-repack.adoc
@@ -45,8 +45,8 @@ other objects in that pack they already have locally.
+
Promisor packfiles are repacked separately: if there are packfiles that
have an associated ".promisor" file, these packfiles will be repacked
-into another separate pack, and an empty ".promisor" file corresponding
-to the new separate pack will be written.
+into another separate pack, and a ".promisor" file corresponding to the
+new separate pack will be written (with arbitrary contents).
-A::
Same as `-a`, unless `-d` is used. Then any unreachable
diff --git a/repack-promisor.c b/repack-promisor.c
index 90318ce150..6670728669 100644
--- a/repack-promisor.c
+++ b/repack-promisor.c
@@ -40,6 +40,7 @@ static void finish_repacking_promisor_objects(struct repository *repo,
const char *packtmp)
{
struct strbuf line = STRBUF_INIT;
+ int is_first_promisor = 1;
FILE *out;
close(cmd->in);
@@ -55,19 +56,25 @@ static void finish_repacking_promisor_objects(struct repository *repo,
/*
* pack-objects creates the .pack and .idx files, but not the
- * .promisor file. Create the .promisor file, which is empty.
- *
- * NEEDSWORK: fetch-pack sometimes generates non-empty
- * .promisor files containing the ref names and associated
- * hashes at the point of generation of the corresponding
- * packfile, but this would not preserve their contents. Maybe
- * concatenate the contents of all .promisor files instead of
- * just creating a new empty file.
+ * .promisor file. Create the .promisor file.
*/
promisor_name = mkpathdup("%s-%s.promisor", packtmp,
line.buf);
write_promisor_file(promisor_name, NULL, 0);
+ /*
+ * Fetch-pack sometimes generates non-empty .promisor files
+ * containing the ref names and associated hashes at the point of
+ * generation of the corresponding packfile. These pieces of info
+ * are only used for debugging reasons. In order to preserve
+ * these, let's copy the contents of all .promisor files in the
+ * first promisor file created.
+ */
+ if (is_first_promisor) {
+ copy_all_promisor_files(repo, promisor_name);
+ is_first_promisor = 0;
+ }
+
item->util = generated_pack_populate(item->string, packtmp);
free(promisor_name);
--
2.43.0
^ permalink raw reply related [flat|nested] 19+ messages in thread* [GSoC PATCH v2 0/4] preserve promisor files content after repack
2026-03-21 21:28 [GSoC PATCH 0/3] preserve promisor files content after repack LorenzoPegorari
` (2 preceding siblings ...)
2026-03-21 21:29 ` [GSoC PATCH 3/3] repack-promisor: preserve content of promisor files " LorenzoPegorari
@ 2026-03-22 19:16 ` LorenzoPegorari
2026-03-22 19:16 ` [GSoC PATCH v2 1/4] pack-write: add explanation to promisor file content LorenzoPegorari
` (3 more replies)
3 siblings, 4 replies; 19+ messages in thread
From: LorenzoPegorari @ 2026-03-22 19:16 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Patrick Steinhardt, Junio C Hamano, Taylor Blau,
Eric Sunshine
The goal of this patch is to solve the NEEDSWORK comment added by
5374a290 (fetch-pack: write fetched refs to .promisor, 14/10/2019). This
is done by adding a helper function that takes the content of all
.promisor files in the `repository`, and copies it inside the first
.promisor file created by the repack.
Also, I added a comment explaining what is the purpose of the content of
the .promisor files, since this wasn't explained anywhere (I found
information regarding this only in the message of the previously cited
commit).
Finally, I added a test to "t7700-repack.sh" that checks if the content
of .promisor files are correctly copied into the first .promisor file
created by a repack.
V2 DIFF:
* changed how the `copy_all_promisor_files()` function works, so that it
reads `dest` into a `strset` in its entirety before entering the
`repo_for_each_pack()` loop, and then checks the `strset` for
existence using `strset_add()` (as suggested by Eric Sunshine)
* correctly release `strbuf`s
* added test
LorenzoPegorari (4):
pack-write: add explanation to promisor file content
pack-write: add helper to fill promisor file after repack
repack-promisor: preserve content of promisor files after repack
t7700: test for promisor file content after repack
Documentation/git-repack.adoc | 4 +-
pack-write.c | 70 +++++++++++++++++++++++++++++++++++
pack.h | 1 +
repack-promisor.c | 23 ++++++++----
t/t7700-repack.sh | 12 ++++++
5 files changed, 100 insertions(+), 10 deletions(-)
Range-diff against v1:
1: 9bba49563e = 1: fec0c24897 pack-write: add explanation to promisor file content
2: 3c0702f81b < -: ---------- pack-write: add helper to fill promisor file after repack
-: ---------- > 2: 0bb031e744 pack-write: add helper to fill promisor file after repack
3: 6967066fe3 = 3: 3dab969a39 repack-promisor: preserve content of promisor files after repack
-: ---------- > 4: cb642d8225 t7700: test for promisor file content after repack
--
2.43.0
^ permalink raw reply [flat|nested] 19+ messages in thread* [GSoC PATCH v2 1/4] pack-write: add explanation to promisor file content
2026-03-22 19:16 ` [GSoC PATCH v2 0/4] preserve promisor files content " LorenzoPegorari
@ 2026-03-22 19:16 ` LorenzoPegorari
2026-03-23 21:07 ` Junio C Hamano
2026-03-22 19:18 ` [GSoC PATCH v2 2/4] pack-write: add helper to fill promisor file after repack LorenzoPegorari
` (2 subsequent siblings)
3 siblings, 1 reply; 19+ messages in thread
From: LorenzoPegorari @ 2026-03-22 19:16 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Patrick Steinhardt, Junio C Hamano, Taylor Blau,
Eric Sunshine
In the entire codebase there is no explanation as to why the ".promisor"
files may contain the ref names (and their associated hashes) that were
fetched at the time the corresponding packfile was downloaded.
Add comment explaining that these pieces of information are used only for
debugging reasons, and how they can be used while debugging.
Signed-off-by: LorenzoPegorari <lorenzo.pegorari2002@gmail.com>
---
pack-write.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/pack-write.c b/pack-write.c
index 83eaf88541..6a2023327e 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -603,6 +603,15 @@ void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_
int i, err;
FILE *output = xfopen(promisor_name, "w");
+ /*
+ * Write in the .promisor file the ref names and associated hashes,
+ * obtained by fetch-pack, at the point of generation of the
+ * corresponding packfile. These pieces of info are only used to make
+ * it easier to debug issues with partial clones, as we can identify
+ * what refs (and their associated hashes) were fetched at the time
+ * the packfile was downloaded, and if necessary, compare those hashes
+ * against what the promisor remote reports now.
+ */
for (i = 0; i < nr_sought; i++)
fprintf(output, "%s %s\n", oid_to_hex(&sought[i]->old_oid),
sought[i]->name);
--
2.43.0
^ permalink raw reply related [flat|nested] 19+ messages in thread* Re: [GSoC PATCH v2 1/4] pack-write: add explanation to promisor file content
2026-03-22 19:16 ` [GSoC PATCH v2 1/4] pack-write: add explanation to promisor file content LorenzoPegorari
@ 2026-03-23 21:07 ` Junio C Hamano
2026-03-25 21:33 ` Lorenzo Pegorari
0 siblings, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2026-03-23 21:07 UTC (permalink / raw)
To: LorenzoPegorari
Cc: git, Elijah Newren, Patrick Steinhardt, Taylor Blau,
Eric Sunshine
LorenzoPegorari <lorenzo.pegorari2002@gmail.com> writes:
> In the entire codebase there is no explanation as to why the ".promisor"
> files may contain the ref names (and their associated hashes) that were
> fetched at the time the corresponding packfile was downloaded.
>
> Add comment explaining that these pieces of information are used only for
> debugging reasons, and how they can be used while debugging.
>
> Signed-off-by: LorenzoPegorari <lorenzo.pegorari2002@gmail.com>
A natural question any reader of the above (and below) would be
asking is: Who told you that these are only to aid debugging?
Please refer to the commit that brought in the reasoning behind the
comment to make it more convincing.
Something like this replacing the second paragraph,
As explained in the log message of the commit 5374a290
(fetch-pack: write fetched refs to .promisor, 2019-10-14), where
this loop originally came from, these ref values are not
actually used for anything in the production, but are solely
there to help debugging. Explain it in a new comment.
perhaps?
> + /*
> + * Write in the .promisor file the ref names and associated hashes,
> + * obtained by fetch-pack, at the point of generation of the
> + * corresponding packfile. These pieces of info are only used to make
> + * it easier to debug issues with partial clones, as we can identify
> + * what refs (and their associated hashes) were fetched at the time
> + * the packfile was downloaded, and if necessary, compare those hashes
> + * against what the promisor remote reports now.
> + */
I do not want to sound too pedantic, but we align '*' asterisks in
our multi-line comments, assuming tabwidth=8 and monospace:
/*
* Write in the .promisor ...
...
* against what the promisor remote reports now.
*/
Your second and subsequent lines lack a single whitespace after the
leading tab used for indent.
> for (i = 0; i < nr_sought; i++)
> fprintf(output, "%s %s\n", oid_to_hex(&sought[i]->old_oid),
> sought[i]->name);
^ permalink raw reply [flat|nested] 19+ messages in thread* Re: [GSoC PATCH v2 1/4] pack-write: add explanation to promisor file content
2026-03-23 21:07 ` Junio C Hamano
@ 2026-03-25 21:33 ` Lorenzo Pegorari
0 siblings, 0 replies; 19+ messages in thread
From: Lorenzo Pegorari @ 2026-03-25 21:33 UTC (permalink / raw)
To: Junio C Hamano
Cc: git, Elijah Newren, Patrick Steinhardt, Taylor Blau,
Eric Sunshine
On Mon, Mar 23, 2026 at 02:07:31PM -0700, Junio C Hamano wrote:
> LorenzoPegorari <lorenzo.pegorari2002@gmail.com> writes:
>
> > In the entire codebase there is no explanation as to why the ".promisor"
> > files may contain the ref names (and their associated hashes) that were
> > fetched at the time the corresponding packfile was downloaded.
> >
> > Add comment explaining that these pieces of information are used only for
> > debugging reasons, and how they can be used while debugging.
> >
> > Signed-off-by: LorenzoPegorari <lorenzo.pegorari2002@gmail.com>
>
> A natural question any reader of the above (and below) would be
> asking is: Who told you that these are only to aid debugging?
>
> Please refer to the commit that brought in the reasoning behind the
> comment to make it more convincing.
>
> Something like this replacing the second paragraph,
>
> As explained in the log message of the commit 5374a290
> (fetch-pack: write fetched refs to .promisor, 2019-10-14), where
> this loop originally came from, these ref values are not
> actually used for anything in the production, but are solely
> there to help debugging. Explain it in a new comment.
>
> perhaps?
Makes perfect sense. I should have done this from the start. Thanks for
pointing that out.
> > + /*
> > + * Write in the .promisor file the ref names and associated hashes,
> > + * obtained by fetch-pack, at the point of generation of the
> > + * corresponding packfile. These pieces of info are only used to make
> > + * it easier to debug issues with partial clones, as we can identify
> > + * what refs (and their associated hashes) were fetched at the time
> > + * the packfile was downloaded, and if necessary, compare those hashes
> > + * against what the promisor remote reports now.
> > + */
>
> I do not want to sound too pedantic, but we align '*' asterisks in
> our multi-line comments, assuming tabwidth=8 and monospace:
>
> /*
> * Write in the .promisor ...
> ...
> * against what the promisor remote reports now.
> */
>
> Your second and subsequent lines lack a single whitespace after the
> leading tab used for indent.
You are not too pedantic! Ack.
> > for (i = 0; i < nr_sought; i++)
> > fprintf(output, "%s %s\n", oid_to_hex(&sought[i]->old_oid),
> > sought[i]->name);
^ permalink raw reply [flat|nested] 19+ messages in thread
* [GSoC PATCH v2 2/4] pack-write: add helper to fill promisor file after repack
2026-03-22 19:16 ` [GSoC PATCH v2 0/4] preserve promisor files content " LorenzoPegorari
2026-03-22 19:16 ` [GSoC PATCH v2 1/4] pack-write: add explanation to promisor file content LorenzoPegorari
@ 2026-03-22 19:18 ` LorenzoPegorari
2026-03-23 20:27 ` Eric Sunshine
2026-03-23 21:30 ` Junio C Hamano
2026-03-22 19:18 ` [GSoC PATCH v2 3/4] repack-promisor: preserve content of promisor files " LorenzoPegorari
2026-03-22 19:18 ` [GSoC PATCH v2 4/4] t7700: test for promisor file content " LorenzoPegorari
3 siblings, 2 replies; 19+ messages in thread
From: LorenzoPegorari @ 2026-03-22 19:18 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Patrick Steinhardt, Junio C Hamano, Taylor Blau,
Eric Sunshine
Create a `copy_all_promisor_files()` helper function used to copy the
contents of all ".promisor" files in a `repository` inside another
".promisor" file.
This function can be used to preserve the contents of all ".promisor"
files inside a new ".promisor" file, for example when a repack happens.
This function is written in such a way so that it will read all the
".promisor" files inside the given `repository` line by line, and copy
only the lines that are not already present in the destination file. This
is done to avoid copying the same lines multiple times that may come from
multiple (redundant) packfiles. There might be another better/cleaner way
to achieve this.
Signed-off-by: LorenzoPegorari <lorenzo.pegorari2002@gmail.com>
---
pack-write.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++++++++
pack.h | 1 +
2 files changed, 62 insertions(+)
diff --git a/pack-write.c b/pack-write.c
index 6a2023327e..583e40b423 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -13,6 +13,7 @@
#include "path.h"
#include "repository.h"
#include "strbuf.h"
+#include "strmap.h"
void reset_pack_idx_option(struct pack_idx_option *opts)
{
@@ -621,3 +622,63 @@ void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_
if (err)
die(_("could not write '%s' promisor file"), promisor_name);
}
+
+void copy_all_promisor_files(struct repository *repo, const char *promisor_name)
+{
+ struct strset dest_content = STRSET_INIT;
+ struct strbuf read_line = STRBUF_INIT;
+ struct strbuf promisor_source_name = STRBUF_INIT;
+ struct strbuf write_dest = STRBUF_INIT;
+ FILE *dest, *source;
+ struct packed_git *p;
+ int err;
+
+ dest = xfopen(promisor_name, "r+");
+ while (strbuf_getline(&read_line, dest) != EOF)
+ strset_add(&dest_content, read_line.buf);
+
+ repo_for_each_pack(repo, p) {
+ if (!p->pack_promisor)
+ continue;
+
+ strbuf_reset(&promisor_source_name);
+ strbuf_addstr(&promisor_source_name, p->pack_name);
+ strbuf_strip_suffix(&promisor_source_name, ".pack");
+ strbuf_addstr(&promisor_source_name, ".promisor");
+ source = xfopen(promisor_source_name.buf, "r");
+
+ /*
+ * For each line of the promisor source file, check if it already
+ * is in the promisor dest file. If not, add it to write_dest, so
+ * that it will be written in the dest file.
+ */
+ while (strbuf_getline(&read_line, source) != EOF) {
+ if (strset_add(&dest_content, read_line.buf)) {
+ strbuf_addbuf(&write_dest, &read_line);
+ strbuf_addstr(&write_dest, "\n");
+ }
+ }
+
+ err = ferror(source);
+ err |= fclose(source);
+ if (err)
+ die(_("could not read '%s' promisor file"), promisor_source_name.buf);
+ }
+
+ if (write_dest.len) {
+ strbuf_strip_suffix(&write_dest, "\n");
+ if (fseek(dest, 0L, SEEK_END))
+ die_errno(_("fseek failed"));
+ fprintf(dest, "%s\n", write_dest.buf);
+ }
+
+ err = ferror(dest);
+ err |= fclose(dest);
+ if (err)
+ die(_("could not write '%s' promisor file"), promisor_name);
+
+ strbuf_release(&read_line);
+ strbuf_release(&promisor_source_name);
+ strbuf_release(&write_dest);
+ strset_clear(&dest_content);
+}
diff --git a/pack.h b/pack.h
index ec76472e49..509e90edba 100644
--- a/pack.h
+++ b/pack.h
@@ -105,6 +105,7 @@ char *index_pack_lockfile(struct repository *r, int fd, int *is_well_formed);
struct ref;
void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_sought);
+void copy_all_promisor_files(struct repository *repo, const char *promisor_name);
char *write_rev_file(struct repository *repo,
const char *rev_name,
--
2.43.0
^ permalink raw reply related [flat|nested] 19+ messages in thread* Re: [GSoC PATCH v2 2/4] pack-write: add helper to fill promisor file after repack
2026-03-22 19:18 ` [GSoC PATCH v2 2/4] pack-write: add helper to fill promisor file after repack LorenzoPegorari
@ 2026-03-23 20:27 ` Eric Sunshine
2026-03-26 16:15 ` Lorenzo Pegorari
2026-03-23 21:30 ` Junio C Hamano
1 sibling, 1 reply; 19+ messages in thread
From: Eric Sunshine @ 2026-03-23 20:27 UTC (permalink / raw)
To: LorenzoPegorari
Cc: git, Elijah Newren, Patrick Steinhardt, Junio C Hamano,
Taylor Blau
On Sun, Mar 22, 2026 at 3:18 PM LorenzoPegorari
<lorenzo.pegorari2002@gmail.com> wrote:
> Create a `copy_all_promisor_files()` helper function used to copy the
> contents of all ".promisor" files in a `repository` inside another
> ".promisor" file.
>
> This function can be used to preserve the contents of all ".promisor"
> files inside a new ".promisor" file, for example when a repack happens.
>
> This function is written in such a way so that it will read all the
> ".promisor" files inside the given `repository` line by line, and copy
> only the lines that are not already present in the destination file. This
> is done to avoid copying the same lines multiple times that may come from
> multiple (redundant) packfiles. There might be another better/cleaner way
> to achieve this.
>
> Signed-off-by: LorenzoPegorari <lorenzo.pegorari2002@gmail.com>
> ---
Thanks, I think this version addresses all my review comments[*] and
looks much better overall. Use of `strset` makes a big difference over
the previous attempt. A couple minor comments below...
[*]: https://lore.kernel.org/git/CAPig+cQSsMfvHJnwuXGQ1Je8ekz=Rqbaibn-3shbya5y-5xTKg@mail.gmail.com/
> diff --git a/pack-write.c b/pack-write.c
> @@ -621,3 +622,63 @@ void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_
> +void copy_all_promisor_files(struct repository *repo, const char *promisor_name)
> +{
> + struct strset dest_content = STRSET_INIT;
> + struct strbuf read_line = STRBUF_INIT;
> + struct strbuf promisor_source_name = STRBUF_INIT;
> + struct strbuf write_dest = STRBUF_INIT;
> + FILE *dest, *source;
> + struct packed_git *p;
> + int err;
Nit: I probably would have declared `FILE *dest` within the scope of
the repo_for_each_pack() loop as suggested in the review, but it's not
worth a reroll.
> + dest = xfopen(promisor_name, "r+");
> + while (strbuf_getline(&read_line, dest) != EOF)
> + strset_add(&dest_content, read_line.buf);
> +
> + repo_for_each_pack(repo, p) {
> + if (!p->pack_promisor)
> + continue;
> +
> + strbuf_reset(&promisor_source_name);
> + strbuf_addstr(&promisor_source_name, p->pack_name);
> + strbuf_strip_suffix(&promisor_source_name, ".pack");
> + strbuf_addstr(&promisor_source_name, ".promisor");
> + source = xfopen(promisor_source_name.buf, "r");
> +
> + /*
> + * For each line of the promisor source file, check if it already
> + * is in the promisor dest file. If not, add it to write_dest, so
> + * that it will be written in the dest file.
> + */
> + while (strbuf_getline(&read_line, source) != EOF) {
> + if (strset_add(&dest_content, read_line.buf)) {
> + strbuf_addbuf(&write_dest, &read_line);
> + strbuf_addstr(&write_dest, "\n");
Not worth a reroll, but this could also be:
strbuf_addch(&write_dest, '\n');
> + }
> + }
> +
> + err = ferror(source);
> + err |= fclose(source);
> + if (err)
> + die(_("could not read '%s' promisor file"), promisor_source_name.buf);
> + }
> +
> + if (write_dest.len) {
> + strbuf_strip_suffix(&write_dest, "\n");
> + if (fseek(dest, 0L, SEEK_END))
> + die_errno(_("fseek failed"));
> + fprintf(dest, "%s\n", write_dest.buf);
> + }
Can you explain why you strip "\n" and then re-add it via fprintf()?
The reason is not immediately obvious.
> + err = ferror(dest);
> + err |= fclose(dest);
> + if (err)
> + die(_("could not write '%s' promisor file"), promisor_name);
> +
> + strbuf_release(&read_line);
> + strbuf_release(&promisor_source_name);
> + strbuf_release(&write_dest);
> + strset_clear(&dest_content);
> +}
Everything appears to be released. Good.
^ permalink raw reply [flat|nested] 19+ messages in thread* Re: [GSoC PATCH v2 2/4] pack-write: add helper to fill promisor file after repack
2026-03-23 20:27 ` Eric Sunshine
@ 2026-03-26 16:15 ` Lorenzo Pegorari
0 siblings, 0 replies; 19+ messages in thread
From: Lorenzo Pegorari @ 2026-03-26 16:15 UTC (permalink / raw)
To: Eric Sunshine
Cc: git, Elijah Newren, Patrick Steinhardt, Junio C Hamano,
Taylor Blau
On Mon, Mar 23, 2026 at 04:27:44PM -0400, Eric Sunshine wrote:
> On Sun, Mar 22, 2026 at 3:18 PM LorenzoPegorari
> <lorenzo.pegorari2002@gmail.com> wrote:
> > Create a `copy_all_promisor_files()` helper function used to copy the
> > contents of all ".promisor" files in a `repository` inside another
> > ".promisor" file.
> >
> > This function can be used to preserve the contents of all ".promisor"
> > files inside a new ".promisor" file, for example when a repack happens.
> >
> > This function is written in such a way so that it will read all the
> > ".promisor" files inside the given `repository` line by line, and copy
> > only the lines that are not already present in the destination file. This
> > is done to avoid copying the same lines multiple times that may come from
> > multiple (redundant) packfiles. There might be another better/cleaner way
> > to achieve this.
> >
> > Signed-off-by: LorenzoPegorari <lorenzo.pegorari2002@gmail.com>
> > ---
>
> Thanks, I think this version addresses all my review comments[*] and
> looks much better overall. Use of `strset` makes a big difference over
> the previous attempt. A couple minor comments below...
>
> [*]: https://lore.kernel.org/git/CAPig+cQSsMfvHJnwuXGQ1Je8ekz=Rqbaibn-3shbya5y-5xTKg@mail.gmail.com/
>
> > diff --git a/pack-write.c b/pack-write.c
> > @@ -621,3 +622,63 @@ void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_
> > +void copy_all_promisor_files(struct repository *repo, const char *promisor_name)
> > +{
> > + struct strset dest_content = STRSET_INIT;
> > + struct strbuf read_line = STRBUF_INIT;
> > + struct strbuf promisor_source_name = STRBUF_INIT;
> > + struct strbuf write_dest = STRBUF_INIT;
> > + FILE *dest, *source;
> > + struct packed_git *p;
> > + int err;
>
> Nit: I probably would have declared `FILE *dest` within the scope of
> the repo_for_each_pack() loop as suggested in the review, but it's not
> worth a reroll.
Ack. I will fix it. Thanks!
> > + dest = xfopen(promisor_name, "r+");
> > + while (strbuf_getline(&read_line, dest) != EOF)
> > + strset_add(&dest_content, read_line.buf);
> > +
> > + repo_for_each_pack(repo, p) {
> > + if (!p->pack_promisor)
> > + continue;
> > +
> > + strbuf_reset(&promisor_source_name);
> > + strbuf_addstr(&promisor_source_name, p->pack_name);
> > + strbuf_strip_suffix(&promisor_source_name, ".pack");
> > + strbuf_addstr(&promisor_source_name, ".promisor");
> > + source = xfopen(promisor_source_name.buf, "r");
> > +
> > + /*
> > + * For each line of the promisor source file, check if it already
> > + * is in the promisor dest file. If not, add it to write_dest, so
> > + * that it will be written in the dest file.
> > + */
> > + while (strbuf_getline(&read_line, source) != EOF) {
> > + if (strset_add(&dest_content, read_line.buf)) {
> > + strbuf_addbuf(&write_dest, &read_line);
> > + strbuf_addstr(&write_dest, "\n");
>
> Not worth a reroll, but this could also be:
>
> strbuf_addch(&write_dest, '\n');
Ack.
> > + }
> > + }
> > +
> > + err = ferror(source);
> > + err |= fclose(source);
> > + if (err)
> > + die(_("could not read '%s' promisor file"), promisor_source_name.buf);
> > + }
> > +
> > + if (write_dest.len) {
> > + strbuf_strip_suffix(&write_dest, "\n");
> > + if (fseek(dest, 0L, SEEK_END))
> > + die_errno(_("fseek failed"));
> > + fprintf(dest, "%s\n", write_dest.buf);
> > + }
>
> Can you explain why you strip "\n" and then re-add it via fprintf()?
> The reason is not immediately obvious.
Stripping it and then adding it again is actually not necessary. I think
it was necessary in a previous iteration. Thanks for noticing, will fix!
> > + err = ferror(dest);
> > + err |= fclose(dest);
> > + if (err)
> > + die(_("could not write '%s' promisor file"), promisor_name);
> > +
> > + strbuf_release(&read_line);
> > + strbuf_release(&promisor_source_name);
> > + strbuf_release(&write_dest);
> > + strset_clear(&dest_content);
> > +}
>
> Everything appears to be released. Good.
Thank you so much for your help Eric,
Lorenzo
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [GSoC PATCH v2 2/4] pack-write: add helper to fill promisor file after repack
2026-03-22 19:18 ` [GSoC PATCH v2 2/4] pack-write: add helper to fill promisor file after repack LorenzoPegorari
2026-03-23 20:27 ` Eric Sunshine
@ 2026-03-23 21:30 ` Junio C Hamano
2026-03-26 2:01 ` Lorenzo Pegorari
1 sibling, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2026-03-23 21:30 UTC (permalink / raw)
To: LorenzoPegorari
Cc: git, Elijah Newren, Patrick Steinhardt, Taylor Blau,
Eric Sunshine
LorenzoPegorari <lorenzo.pegorari2002@gmail.com> writes:
> Create a `copy_all_promisor_files()` helper function used to copy the
> contents of all ".promisor" files in a `repository` inside another
> ".promisor" file.
>
> This function can be used to preserve the contents of all ".promisor"
> files inside a new ".promisor" file, for example when a repack happens.
>
> This function is written in such a way so that it will read all the
> ".promisor" files inside the given `repository` line by line, and copy
> only the lines that are not already present in the destination file. This
> is done to avoid copying the same lines multiple times that may come from
> multiple (redundant) packfiles. There might be another better/cleaner way
> to achieve this.
In the previous step, we extablished that these "back then their ref
X used to point at object Y" records are there so that we can
identify which refs were fetched at the time the packfile was
downloaded to help debugging. When repacking, losing these records
certainly would lose information.
But would concatenating all into a single file help preserve the
useful information? Don't we need do better than that?
A NEEDSWORK comment, as was discussed in another thread or two in
the recent past, is not necessarily a well thought out fully
finished specification of an additional piece of work. "We know
this has a problem, we may need to do something about it, like
concatenating to save the contents, perhaps? We do not know the
answer, and we do not bother thinking it through right at this
moment. It is left to the future developers to figure it out" is
what a NEEDSWORK comment is about.
Your first response to such a comment may be "yeah, I agree that it
is bad to lose information we added to help debugging", but the
second one should be to wonder if the "like concatenating..." is the
best approach going forward.
In other words, we should take a NEEDSWORK comment as a mere
starting point, and what NEEDS your work begins at thinking what
needs to be done about the problem raised there.
By mixing them up all into a single list, you no longer can tell
when their ref X was observed to be pointing at object Y anymore.
You may have two packs originally, with a record for "ref X pointing
at object Y" in each of them, but by deduping them, you lose the
information that you cloned at one time, and made an additional
fetch on another day, and the fact the ref X was pointing at the
same value at both times. I am not sure if it is a good
implementation if the objective of this topic is to preserve
information that is useful for debugging.
I wonder if it helps to append to each line the file timestamp of
the .promisor file we took the record originally? For the sake of
completeness, we could consider adding the filename as well, but we
can quickly dismiss it as not so useful ;-)
If repacking already repacked promisor packfile, the records would
already contain such a timestamp at the end, so the code to copy
existing records must be prepared to see if the records are the
<ref, oid> tuple, or <ref, oid, timestamp> tuple, and act
accordingly.
I am *not* saying that without such a "preserve timestamp" column in
the record, copying existing records to a new .promisor file is
useless. But we do not see any explanation why the author thinks
that it is sufficient to copy existing records while silently
deduping. We can implement only one choice backed by series of
decisions like "timestamp might help" and "original filenames would
probably not help", and the design should describe what was
considered and rejected (as opposed to "we didn't think things
through---we just did what the original NEEDSWORK comment suggested
doing").
Thanks.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [GSoC PATCH v2 2/4] pack-write: add helper to fill promisor file after repack
2026-03-23 21:30 ` Junio C Hamano
@ 2026-03-26 2:01 ` Lorenzo Pegorari
0 siblings, 0 replies; 19+ messages in thread
From: Lorenzo Pegorari @ 2026-03-26 2:01 UTC (permalink / raw)
To: Junio C Hamano
Cc: git, Elijah Newren, Patrick Steinhardt, Taylor Blau,
Eric Sunshine
On Mon, Mar 23, 2026 at 02:30:26PM -0700, Junio C Hamano wrote:
> LorenzoPegorari <lorenzo.pegorari2002@gmail.com> writes:
>
> > Create a `copy_all_promisor_files()` helper function used to copy the
> > contents of all ".promisor" files in a `repository` inside another
> > ".promisor" file.
> >
> > This function can be used to preserve the contents of all ".promisor"
> > files inside a new ".promisor" file, for example when a repack happens.
> >
> > This function is written in such a way so that it will read all the
> > ".promisor" files inside the given `repository` line by line, and copy
> > only the lines that are not already present in the destination file. This
> > is done to avoid copying the same lines multiple times that may come from
> > multiple (redundant) packfiles. There might be another better/cleaner way
> > to achieve this.
>
> In the previous step, we extablished that these "back then their ref
> X used to point at object Y" records are there so that we can
> identify which refs were fetched at the time the packfile was
> downloaded to help debugging. When repacking, losing these records
> certainly would lose information.
>
> But would concatenating all into a single file help preserve the
> useful information? Don't we need do better than that?
Yeah, we absolutely need to! That's why I said that I was not satisfied
at all with the patch (in cover letter of v1). I really needed some
feedback, because I knew that I was doing things wrong.
> A NEEDSWORK comment, as was discussed in another thread or two in
> the recent past, is not necessarily a well thought out fully
> finished specification of an additional piece of work. "We know
> this has a problem, we may need to do something about it, like
> concatenating to save the contents, perhaps? We do not know the
> answer, and we do not bother thinking it through right at this
> moment. It is left to the future developers to figure it out" is
> what a NEEDSWORK comment is about.
>
> Your first response to such a comment may be "yeah, I agree that it
> is bad to lose information we added to help debugging", but the
> second one should be to wonder if the "like concatenating..." is the
> best approach going forward.
>
> In other words, we should take a NEEDSWORK comment as a mere
> starting point, and what NEEDS your work begins at thinking what
> needs to be done about the problem raised there.
I fully understand this! Honestly, my biggest weakness that I've
discovered about myself as a dev (through past open-source experience,
e.g. GSoC'25) is that I get hesitant when I have to work on and submit
a patch if I don't have a lot of experience with the codebase. This
happens particularly when I have to take a decision, and not only
complete a task.
In fact, I decided to work on this specific NEEDSWORK issue to get more
experience on promisor remotes (the feature that I want to improve in my
GSoC proposal) before the GSoC coding period... if I get selected, of
course :-).
I will try my best to improve!
> By mixing them up all into a single list, you no longer can tell
> when their ref X was observed to be pointing at object Y anymore.
> You may have two packs originally, with a record for "ref X pointing
> at object Y" in each of them, but by deduping them, you lose the
> information that you cloned at one time, and made an additional
> fetch on another day, and the fact the ref X was pointing at the
> same value at both times. I am not sure if it is a good
> implementation if the objective of this topic is to preserve
> information that is useful for debugging.
My reasoning was based on the (wrong) assumption that it was impossible
for the same record fo "ref X is pointing at object Y" to appear
multiple times. Obviously then, deduping them is the wrong solution, as
it will lose some debugging information.
> I wonder if it helps to append to each line the file timestamp of
> the .promisor file we took the record originally? For the sake of
> completeness, we could consider adding the filename as well, but we
> can quickly dismiss it as not so useful ;-)
Related to what I said before about getting hesitant... I actually
thought about (pretty much) exactly this! My original idea was to add
a timestamp of the current time when the repack happened. I discarded it
because I didn't want to add any new information (for no particular
reason tbh) and because I didn't want ".promisor" file content to
potentially become too long if many repacks happen.
Your solution is much cleaner compared to what I originally thought of.
> If repacking already repacked promisor packfile, the records would
> already contain such a timestamp at the end, so the code to copy
> existing records must be prepared to see if the records are the
> <ref, oid> tuple, or <ref, oid, timestamp> tuple, and act
> accordingly.
Makes perfect sense.
> I am *not* saying that without such a "preserve timestamp" column in
> the record, copying existing records to a new .promisor file is
> useless. But we do not see any explanation why the author thinks
> that it is sufficient to copy existing records while silently
> deduping. We can implement only one choice backed by series of
> decisions like "timestamp might help" and "original filenames would
> probably not help", and the design should describe what was
> considered and rejected (as opposed to "we didn't think things
> through---we just did what the original NEEDSWORK comment suggested
> doing").
I 100% agree.
> Thanks.
Thank you Junio for your time and feedback,
Lorenzo
^ permalink raw reply [flat|nested] 19+ messages in thread
* [GSoC PATCH v2 3/4] repack-promisor: preserve content of promisor files after repack
2026-03-22 19:16 ` [GSoC PATCH v2 0/4] preserve promisor files content " LorenzoPegorari
2026-03-22 19:16 ` [GSoC PATCH v2 1/4] pack-write: add explanation to promisor file content LorenzoPegorari
2026-03-22 19:18 ` [GSoC PATCH v2 2/4] pack-write: add helper to fill promisor file after repack LorenzoPegorari
@ 2026-03-22 19:18 ` LorenzoPegorari
2026-03-23 21:48 ` Junio C Hamano
2026-03-22 19:18 ` [GSoC PATCH v2 4/4] t7700: test for promisor file content " LorenzoPegorari
3 siblings, 1 reply; 19+ messages in thread
From: LorenzoPegorari @ 2026-03-22 19:18 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Patrick Steinhardt, Junio C Hamano, Taylor Blau,
Eric Sunshine
When a repack involving promisor packfiles happens, the new ".promisor"
file is created empty, losing all the debug info that might be present
inside the ".promisor" files before the repack.
Use the "copy_all_promisor_files()" function created previously to
preserve the contents of all ".promisor" files inside the first
".promisor" file created by the repack.
Also, update the documentation accordingly.
Signed-off-by: LorenzoPegorari <lorenzo.pegorari2002@gmail.com>
---
Documentation/git-repack.adoc | 4 ++--
repack-promisor.c | 23 +++++++++++++++--------
2 files changed, 17 insertions(+), 10 deletions(-)
diff --git a/Documentation/git-repack.adoc b/Documentation/git-repack.adoc
index 673ce91083..33d3c8afbd 100644
--- a/Documentation/git-repack.adoc
+++ b/Documentation/git-repack.adoc
@@ -45,8 +45,8 @@ other objects in that pack they already have locally.
+
Promisor packfiles are repacked separately: if there are packfiles that
have an associated ".promisor" file, these packfiles will be repacked
-into another separate pack, and an empty ".promisor" file corresponding
-to the new separate pack will be written.
+into another separate pack, and a ".promisor" file corresponding to the
+new separate pack will be written (with arbitrary contents).
-A::
Same as `-a`, unless `-d` is used. Then any unreachable
diff --git a/repack-promisor.c b/repack-promisor.c
index 90318ce150..6670728669 100644
--- a/repack-promisor.c
+++ b/repack-promisor.c
@@ -40,6 +40,7 @@ static void finish_repacking_promisor_objects(struct repository *repo,
const char *packtmp)
{
struct strbuf line = STRBUF_INIT;
+ int is_first_promisor = 1;
FILE *out;
close(cmd->in);
@@ -55,19 +56,25 @@ static void finish_repacking_promisor_objects(struct repository *repo,
/*
* pack-objects creates the .pack and .idx files, but not the
- * .promisor file. Create the .promisor file, which is empty.
- *
- * NEEDSWORK: fetch-pack sometimes generates non-empty
- * .promisor files containing the ref names and associated
- * hashes at the point of generation of the corresponding
- * packfile, but this would not preserve their contents. Maybe
- * concatenate the contents of all .promisor files instead of
- * just creating a new empty file.
+ * .promisor file. Create the .promisor file.
*/
promisor_name = mkpathdup("%s-%s.promisor", packtmp,
line.buf);
write_promisor_file(promisor_name, NULL, 0);
+ /*
+ * Fetch-pack sometimes generates non-empty .promisor files
+ * containing the ref names and associated hashes at the point of
+ * generation of the corresponding packfile. These pieces of info
+ * are only used for debugging reasons. In order to preserve
+ * these, let's copy the contents of all .promisor files in the
+ * first promisor file created.
+ */
+ if (is_first_promisor) {
+ copy_all_promisor_files(repo, promisor_name);
+ is_first_promisor = 0;
+ }
+
item->util = generated_pack_populate(item->string, packtmp);
free(promisor_name);
--
2.43.0
^ permalink raw reply related [flat|nested] 19+ messages in thread* Re: [GSoC PATCH v2 3/4] repack-promisor: preserve content of promisor files after repack
2026-03-22 19:18 ` [GSoC PATCH v2 3/4] repack-promisor: preserve content of promisor files " LorenzoPegorari
@ 2026-03-23 21:48 ` Junio C Hamano
2026-03-26 2:12 ` Lorenzo Pegorari
0 siblings, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2026-03-23 21:48 UTC (permalink / raw)
To: LorenzoPegorari
Cc: git, Elijah Newren, Patrick Steinhardt, Taylor Blau,
Eric Sunshine
LorenzoPegorari <lorenzo.pegorari2002@gmail.com> writes:
> @@ -40,6 +40,7 @@ static void finish_repacking_promisor_objects(struct repository *repo,
> const char *packtmp)
> {
> struct strbuf line = STRBUF_INIT;
> + int is_first_promisor = 1;
> FILE *out;
> ...
> + /*
> + * Fetch-pack sometimes generates non-empty .promisor files
> + * containing the ref names and associated hashes at the point of
> + * generation of the corresponding packfile. These pieces of info
> + * are only used for debugging reasons. In order to preserve
> + * these, let's copy the contents of all .promisor files in the
> + * first promisor file created.
> + */
> + if (is_first_promisor) {
> + copy_all_promisor_files(repo, promisor_name);
> + is_first_promisor = 0;
> + }
> +
Here the underlying assumption seems to be that whichever one of the
two potential callers of this function, repack_promisor_objects()
and pack_geometry_repack_promisors(), would handle all the existing
packs with corresponding .promisor file so it is safe to coalesce
all the debugging comments from all the existing .promisor files
into one?
Is it really true, though? Especially with geometry repacking
enabled, wouldn't a regular repack coalesce only the smallish ones
into a single pack while leaving an already largeish ones intact, or
something?
Thanks.
^ permalink raw reply [flat|nested] 19+ messages in thread* Re: [GSoC PATCH v2 3/4] repack-promisor: preserve content of promisor files after repack
2026-03-23 21:48 ` Junio C Hamano
@ 2026-03-26 2:12 ` Lorenzo Pegorari
0 siblings, 0 replies; 19+ messages in thread
From: Lorenzo Pegorari @ 2026-03-26 2:12 UTC (permalink / raw)
To: Junio C Hamano
Cc: git, Elijah Newren, Patrick Steinhardt, Taylor Blau,
Eric Sunshine
On Mon, Mar 23, 2026 at 02:48:21PM -0700, Junio C Hamano wrote:
> LorenzoPegorari <lorenzo.pegorari2002@gmail.com> writes:
>
> > @@ -40,6 +40,7 @@ static void finish_repacking_promisor_objects(struct repository *repo,
> > const char *packtmp)
> > {
> > struct strbuf line = STRBUF_INIT;
> > + int is_first_promisor = 1;
> > FILE *out;
> > ...
> > + /*
> > + * Fetch-pack sometimes generates non-empty .promisor files
> > + * containing the ref names and associated hashes at the point of
> > + * generation of the corresponding packfile. These pieces of info
> > + * are only used for debugging reasons. In order to preserve
> > + * these, let's copy the contents of all .promisor files in the
> > + * first promisor file created.
> > + */
> > + if (is_first_promisor) {
> > + copy_all_promisor_files(repo, promisor_name);
> > + is_first_promisor = 0;
> > + }
> > +
>
> Here the underlying assumption seems to be that whichever one of the
> two potential callers of this function, repack_promisor_objects()
> and pack_geometry_repack_promisors(), would handle all the existing
> packs with corresponding .promisor file so it is safe to coalesce
> all the debugging comments from all the existing .promisor files
> into one?
>
> Is it really true, though? Especially with geometry repacking
> enabled, wouldn't a regular repack coalesce only the smallish ones
> into a single pack while leaving an already largeish ones intact, or
> something?
>
> Thanks.
I will look into this. I'm going to drastically rework this patch
series, so that the next version will be much better and better
explained.
Thank you so much for the time,
Lorenzo
^ permalink raw reply [flat|nested] 19+ messages in thread
* [GSoC PATCH v2 4/4] t7700: test for promisor file content after repack
2026-03-22 19:16 ` [GSoC PATCH v2 0/4] preserve promisor files content " LorenzoPegorari
` (2 preceding siblings ...)
2026-03-22 19:18 ` [GSoC PATCH v2 3/4] repack-promisor: preserve content of promisor files " LorenzoPegorari
@ 2026-03-22 19:18 ` LorenzoPegorari
3 siblings, 0 replies; 19+ messages in thread
From: LorenzoPegorari @ 2026-03-22 19:18 UTC (permalink / raw)
To: git
Cc: Elijah Newren, Patrick Steinhardt, Junio C Hamano, Taylor Blau,
Eric Sunshine
Add test that checks if the content of all ".promisor" files are copied
inside the first ".promisor" file created by a repack.
Signed-off-by: LorenzoPegorari <lorenzo.pegorari2002@gmail.com>
---
t/t7700-repack.sh | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
index 63ef63fc50..10187d5954 100755
--- a/t/t7700-repack.sh
+++ b/t/t7700-repack.sh
@@ -904,4 +904,16 @@ test_expect_success 'pending objects are repacked appropriately' '
)
'
+test_expect_success 'check .promisor file content after repack' '
+ git init prom_test1 &&
+ test_commit -C prom_test1 temp &&
+ git clone prom_test1 prom_test2 --filter=blob:none --no-local &&
+
+ cp $(ls prom_test2/.git/objects/pack/pack-*.promisor) prom_content_before &&
+ git -C prom_test2 repack -a -d &&
+ cp $(ls prom_test2/.git/objects/pack/pack-*.promisor) prom_content_after &&
+
+ test_cmp prom_content_before prom_content_after
+'
+
test_done
--
2.43.0
^ permalink raw reply related [flat|nested] 19+ messages in thread