All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lorenzo Pegorari <lorenzo.pegorari2002@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, Elijah Newren <newren@gmail.com>,
	Patrick Steinhardt <ps@pks.im>, Taylor Blau <me@ttaylorr.com>,
	Eric Sunshine <sunshine@sunshineco.com>
Subject: Re: [GSoC PATCH v2 2/4] pack-write: add helper to fill promisor file after repack
Date: Thu, 26 Mar 2026 03:01:07 +0100	[thread overview]
Message-ID: <acSTY_i4zAueN9jD@lorenzo-VM> (raw)
In-Reply-To: <xmqqfr5q44jx.fsf@gitster.g>

On Mon, Mar 23, 2026 at 02:30:26PM -0700, Junio C Hamano wrote:
> LorenzoPegorari <lorenzo.pegorari2002@gmail.com> writes:
> 
> > Create a `copy_all_promisor_files()` helper function used to copy the
> > contents of all ".promisor" files in a `repository` inside another
> > ".promisor" file.
> >
> > This function can be used to preserve the contents of all ".promisor"
> > files inside a new ".promisor" file, for example when a repack happens.
> >
> > This function is written in such a way so that it will read all the
> > ".promisor" files inside the given `repository` line by line, and copy
> > only the lines that are not already present in the destination file. This
> > is done to avoid copying the same lines multiple times that may come from
> > multiple (redundant) packfiles. There might be another better/cleaner way
> > to achieve this.
> 
> In the previous step, we extablished that these "back then their ref
> X used to point at object Y" records are there so that we can
> identify which refs were fetched at the time the packfile was
> downloaded to help debugging.  When repacking, losing these records
> certainly would lose information.
> 
> But would concatenating all into a single file help preserve the
> useful information?  Don't we need do better than that?

Yeah, we absolutely need to! That's why I said that I was not satisfied
at all with the patch (in cover letter of v1). I really needed some
feedback, because I knew that I was doing things wrong.

> A NEEDSWORK comment, as was discussed in another thread or two in
> the recent past, is not necessarily a well thought out fully
> finished specification of an additional piece of work.  "We know
> this has a problem, we may need to do something about it, like
> concatenating to save the contents, perhaps?  We do not know the
> answer, and we do not bother thinking it through right at this
> moment.  It is left to the future developers to figure it out" is
> what a NEEDSWORK comment is about.
> 
> Your first response to such a comment may be "yeah, I agree that it
> is bad to lose information we added to help debugging", but the
> second one should be to wonder if the "like concatenating..." is the
> best approach going forward.
> 
> In other words, we should take a NEEDSWORK comment as a mere
> starting point, and what NEEDS your work begins at thinking what
> needs to be done about the problem raised there.

I fully understand this! Honestly, my biggest weakness that I've
discovered about myself as a dev (through past open-source experience,
e.g. GSoC'25) is that I get hesitant when I have to work on and submit
a patch if I don't have a lot of experience with the codebase. This
happens particularly when I have to take a decision, and not only
complete a task.

In fact, I decided to work on this specific NEEDSWORK issue to get more
experience on promisor remotes (the feature that I want to improve in my
GSoC proposal) before the GSoC coding period... if I get selected, of
course :-).

I will try my best to improve!

> By mixing them up all into a single list, you no longer can tell
> when their ref X was observed to be pointing at object Y anymore.
> You may have two packs originally, with a record for "ref X pointing
> at object Y" in each of them, but by deduping them, you lose the
> information that you cloned at one time, and made an additional
> fetch on another day, and the fact the ref X was pointing at the
> same value at both times.  I am not sure if it is a good
> implementation if the objective of this topic is to preserve
> information that is useful for debugging.

My reasoning was based on the (wrong) assumption that it was impossible
for the same record fo "ref X is pointing at object Y" to appear
multiple times. Obviously then, deduping them is the wrong solution, as
it will lose some debugging information.

> I wonder if it helps to append to each line the file timestamp of
> the .promisor file we took the record originally?  For the sake of
> completeness, we could consider adding the filename as well, but we
> can quickly dismiss it as not so useful ;-)

Related to what I said before about getting hesitant... I actually
thought about (pretty much) exactly this! My original idea was to add
a timestamp of the current time when the repack happened. I discarded it
because I didn't want to add any new information (for no particular
reason tbh) and because I didn't want ".promisor" file content to
potentially become too long if many repacks happen.

Your solution is much cleaner compared to what I originally thought of.

> If repacking already repacked promisor packfile, the records would
> already contain such a timestamp at the end, so the code to copy
> existing records must be prepared to see if the records are the
> <ref, oid> tuple, or <ref, oid, timestamp> tuple, and act
> accordingly.

Makes perfect sense.

> I am *not* saying that without such a "preserve timestamp" column in
> the record, copying existing records to a new .promisor file is
> useless.  But we do not see any explanation why the author thinks
> that it is sufficient to copy existing records while silently
> deduping.  We can implement only one choice backed by series of
> decisions like "timestamp might help" and "original filenames would
> probably not help", and the design should describe what was
> considered and rejected (as opposed to "we didn't think things
> through---we just did what the original NEEDSWORK comment suggested
> doing").

I 100% agree.

> Thanks.

Thank you Junio for your time and feedback,

Lorenzo

  reply	other threads:[~2026-03-26  2:01 UTC|newest]

Thread overview: 78+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-21 21:28 [GSoC PATCH 0/3] preserve promisor files content after repack LorenzoPegorari
2026-03-21 21:28 ` [GSoC PATCH 1/3] pack-write: add explanation to promisor file content LorenzoPegorari
2026-03-21 21:28 ` [GSoC PATCH 2/3] pack-write: add helper to fill promisor file after repack LorenzoPegorari
2026-03-22  2:04   ` Eric Sunshine
2026-03-22 18:50     ` Lorenzo Pegorari
2026-03-21 21:29 ` [GSoC PATCH 3/3] repack-promisor: preserve content of promisor files " LorenzoPegorari
2026-03-22 19:16 ` [GSoC PATCH v2 0/4] preserve promisor files content " LorenzoPegorari
2026-03-22 19:16   ` [GSoC PATCH v2 1/4] pack-write: add explanation to promisor file content LorenzoPegorari
2026-03-23 21:07     ` Junio C Hamano
2026-03-25 21:33       ` Lorenzo Pegorari
2026-03-22 19:18   ` [GSoC PATCH v2 2/4] pack-write: add helper to fill promisor file after repack LorenzoPegorari
2026-03-23 20:27     ` Eric Sunshine
2026-03-26 16:15       ` Lorenzo Pegorari
2026-03-23 21:30     ` Junio C Hamano
2026-03-26  2:01       ` Lorenzo Pegorari [this message]
2026-03-22 19:18   ` [GSoC PATCH v2 3/4] repack-promisor: preserve content of promisor files " LorenzoPegorari
2026-03-23 21:48     ` Junio C Hamano
2026-03-26  2:12       ` Lorenzo Pegorari
2026-03-22 19:18   ` [GSoC PATCH v2 4/4] t7700: test for promisor file content " LorenzoPegorari
2026-04-06  0:23   ` [GSoC PATCH v3 0/5] preserve promisor files " LorenzoPegorari
2026-04-06  0:24     ` [GSoC PATCH v3 1/5] pack-write: add explanation to promisor file content LorenzoPegorari
2026-04-06  0:24     ` [GSoC PATCH v3 2/5] pack-write: add helper to fill promisor file after repack LorenzoPegorari
2026-04-06 17:22       ` Tian Yuchen
2026-04-06 18:40         ` Lorenzo Pegorari
2026-04-06 21:17           ` Junio C Hamano
2026-04-07 21:46             ` Lorenzo Pegorari
2026-04-07  2:01           ` Junio C Hamano
2026-04-07 21:52             ` Lorenzo Pegorari
2026-04-07 22:03               ` Junio C Hamano
2026-04-06 21:34       ` Junio C Hamano
2026-04-07 22:07         ` Lorenzo Pegorari
2026-04-06  0:25     ` [GSoC PATCH v3 3/5] repack-promisor: preserve content of promisor files " LorenzoPegorari
2026-04-06  0:25     ` [GSoC PATCH v3 4/5] t7700: test for promisor file content " LorenzoPegorari
2026-04-06 22:05       ` Junio C Hamano
2026-04-07 23:28         ` Lorenzo Pegorari
2026-04-07 18:10       ` Junio C Hamano
2026-04-07 23:11         ` Lorenzo Pegorari
2026-04-08  0:38           ` Lorenzo Pegorari
2026-04-06  0:25     ` [GSoC PATCH v3 5/5] t7703: test for promisor file content after geometric repack LorenzoPegorari
2026-04-10 15:01     ` [GSoC PATCH v4 0/5] preserve promisor files content after repack LorenzoPegorari
2026-04-10 15:02       ` [GSoC PATCH v4 1/5] pack-write: add explanation to promisor file content LorenzoPegorari
2026-04-10 15:02       ` [GSoC PATCH v4 2/5] pack-write: add helper to fill promisor file after repack LorenzoPegorari
2026-04-10 16:01         ` Junio C Hamano
2026-04-10 16:34           ` Lorenzo Pegorari
2026-04-10 18:10           ` [PATCH] CodingGuidelines: st_mtimespec vs st_mtim vs st_mtime Junio C Hamano
2026-04-16 23:46             ` Elijah Newren
2026-04-17  4:25               ` Junio C Hamano
2026-04-10 15:03       ` [GSoC PATCH v4 3/5] repack-promisor: preserve content of promisor files after repack LorenzoPegorari
2026-04-10 15:04       ` [GSoC PATCH v4 4/5] t7700: test for promisor file content " LorenzoPegorari
2026-04-10 15:04       ` [GSoC PATCH v4 5/5] t7703: test for promisor file content after geometric repack LorenzoPegorari
2026-04-10 15:47       ` [GSoC PATCH v4 0/5] preserve promisor files content after repack Junio C Hamano
2026-04-10 16:44         ` Lorenzo Pegorari
2026-04-10 22:54       ` [GSoC PATCH v5 0/6] " LorenzoPegorari
2026-04-10 22:54         ` [GSoC PATCH v5 1/6] pack-write: add explanation to promisor file content LorenzoPegorari
2026-04-10 22:55         ` [GSoC PATCH v5 2/6] repack-promisor add helper to fill promisor file after repack LorenzoPegorari
2026-04-10 23:30           ` Junio C Hamano
2026-04-11  1:59             ` Lorenzo Pegorari
2026-04-12  6:27           ` Junio C Hamano
2026-04-17  0:30             ` Lorenzo Pegorari
2026-04-10 22:55         ` [GSoC PATCH v5 3/6] repack-promisor: preserve content of promisor files " LorenzoPegorari
2026-04-11 18:25           ` Tian Yuchen
2026-04-17  0:34             ` Lorenzo Pegorari
2026-04-10 22:55         ` [GSoC PATCH v5 4/6] t7700: test for promisor file content " LorenzoPegorari
2026-04-10 22:56         ` [GSoC PATCH v5 5/6] t7703: test for promisor file content after geometric repack LorenzoPegorari
2026-04-11 18:49           ` Tian Yuchen
2026-04-17  0:46             ` Lorenzo Pegorari
2026-04-10 22:56         ` [GSoC PATCH v5 6/6] repack-promisor: add missing headers LorenzoPegorari
2026-04-18 14:16         ` [GSoC PATCH v6 0/6] preserve promisor files content after repack LorenzoPegorari
2026-04-18 14:16           ` [GSoC PATCH v6 1/6] pack-write: add explanation to promisor file content LorenzoPegorari
2026-04-18 14:17           ` [GSoC PATCH v6 2/6] repack-promisor add helper to fill promisor file after repack LorenzoPegorari
2026-04-18 14:17           ` [GSoC PATCH v6 3/6] repack-promisor: preserve content of promisor files " LorenzoPegorari
2026-04-18 14:17           ` [GSoC PATCH v6 4/6] t7700: test for promisor file content " LorenzoPegorari
2026-04-18 14:17           ` [GSoC PATCH v6 5/6] t7703: test for promisor file content after geometric repack LorenzoPegorari
2026-04-18 14:17           ` [GSoC PATCH v6 6/6] repack-promisor: add missing headers LorenzoPegorari
2026-05-12  6:49           ` [GSoC PATCH v6 0/6] preserve promisor files content after repack Junio C Hamano
2026-04-10 23:05       ` [GSoC PATCH v4 0/5] " Junio C Hamano
2026-04-11  2:02         ` Junio C Hamano
2026-04-11 14:05           ` Lorenzo Pegorari

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=acSTY_i4zAueN9jD@lorenzo-VM \
    --to=lorenzo.pegorari2002@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=me@ttaylorr.com \
    --cc=newren@gmail.com \
    --cc=ps@pks.im \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.