All of lore.kernel.org
 help / color / mirror / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: Taylor Blau <ttaylorr@github.com>, Sun Chao <16657101987@163.com>,
	Sun Chao via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org
Subject: Re: [PATCH v2] packfile: freshen the mtime of packfile by configuration
Date: Wed, 14 Jul 2021 15:52:26 -0400	[thread overview]
Message-ID: <YO9AeudYPmWRnRNb@nand.local> (raw)
In-Reply-To: <87y2a8zntw.fsf@evledraar.gmail.com>

On Wed, Jul 14, 2021 at 09:32:26PM +0200, Ævar Arnfjörð Bjarmason wrote:
> > But if that isn't possible, then I find introducing a new file to
> > redefine the pack's mtime just to accommodate a backup system that
> > doesn't know better to be a poor justification for adding this
> > complexity. Especially since we agree that rsync-ing live Git
> > repositories is a bad idea in the first place ;).
> >
> > If it were me, I would probably stop here and avoid pursuing this
> > further. But an OK middle ground might be core.freshenPackfiles=<bool>
> > to indicate whether or not packs can be freshened, or the objects
> > contained within them should just be rewritten loose.
> >
> > Sun could then set this configuration to "false", implying:
> >
> >   - That they would have more random loose objects, leading to some
> >     redundant work by their backup system.
> >   - But they wouldn't have to resync their huge packfiles.
> >
> > ...and we wouldn't have to introduce any new formats/file types to do
> > it. To me, that seems like a net-positive outcome.
>
> This approach is getting quite close to my core.checkCollisions patch,
> to the point of perhaps being indistinguishable in practice:
> https://lore.kernel.org/git/20181028225023.26427-5-avarab@gmail.com/

Hmm, I'm not sure if I understand. That collision check is only done
during index-pack, and reading builtin/index-pack.c:check_collision(),
it looks like we only do it for large blobs anyway.

> I.e. if you're happy to re-write out duplicate objects then you're going
> to be ignoring the collision check and don't need to do it. It's not the
> same in that you might skip writing objects you know are reachable, and
> with the collisions check off and not-so-thin packs you will/might get
> more redundancy than you asked for.

We may be talking about different things, but if users are concerned
about SHA-1 collisions, then they should still be able to build with
DC_SHA1=YesPlease to catch shattered-style collisions.

Anyway, I think we may be a little in the weeds for what we are trying
to accomplish here. I'm thinking something along the lines of the
following (sans documentation and tests, of course ;)).

--- >8 ---

diff --git a/object-file.c b/object-file.c
index f233b440b2..87c9238365 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1971,9 +1971,22 @@ static int freshen_loose_object(const struct object_id *oid)
 	return check_and_freshen(oid, 1);
 }

+static int can_freshen_packs = -1;
+static int get_can_freshen_packs(void)
+{
+	 if (can_freshen_packs < 0) {
+		if (git_config_get_bool("core.freshenpackfiles",
+					&can_freshen_packs))
+			can_freshen_packs = 1;
+	 }
+	 return can_freshen_packs;
+}
+
 static int freshen_packed_object(const struct object_id *oid)
 {
 	struct pack_entry e;
+	if (!get_can_freshen_packs())
+		return 0;
 	if (!find_pack_entry(the_repository, oid, &e))
 		return 0;
 	if (e.p->freshened)

  reply	other threads:[~2021-07-14 19:59 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-10 19:01 [PATCH] packfile: enhance the mtime of packfile by idx file Sun Chao via GitGitGadget
2021-07-11 23:44 ` Ævar Arnfjörð Bjarmason
2021-07-12 16:17   ` Sun Chao
2021-07-14  1:28 ` [PATCH v2] packfile: freshen the mtime of packfile by configuration Sun Chao via GitGitGadget
2021-07-14  1:39   ` Ævar Arnfjörð Bjarmason
2021-07-14  2:52     ` Taylor Blau
2021-07-14 16:46       ` Sun Chao
2021-07-14 17:04         ` Taylor Blau
2021-07-14 18:19           ` Ævar Arnfjörð Bjarmason
2021-07-14 19:11             ` Martin Fick
2021-07-14 19:41               ` Ævar Arnfjörð Bjarmason
2021-07-14 20:20                 ` Martin Fick
2021-07-20  6:32                   ` Ævar Arnfjörð Bjarmason
2021-07-15  8:23                 ` Son Luong Ngoc
2021-07-20  6:29                   ` Ævar Arnfjörð Bjarmason
2021-07-14 19:30             ` Taylor Blau
2021-07-14 19:32               ` Ævar Arnfjörð Bjarmason
2021-07-14 19:52                 ` Taylor Blau [this message]
2021-07-14 21:40               ` Junio C Hamano
2021-07-15 16:30           ` Sun Chao
2021-07-15 16:42             ` Taylor Blau
2021-07-15 16:48               ` Sun Chao
2021-07-14 16:11     ` Sun Chao
2021-07-19 19:53   ` [PATCH v3] " Sun Chao via GitGitGadget
2021-07-19 20:51     ` Taylor Blau
2021-07-20  0:07       ` Junio C Hamano
2021-07-20 15:07         ` Sun Chao
2021-07-20  6:19       ` Ævar Arnfjörð Bjarmason
2021-07-20 15:34         ` Sun Chao
2021-07-20 15:00       ` Sun Chao
2021-07-20 16:53         ` Taylor Blau
2021-08-15 17:08     ` [PATCH v4 0/2] " Sun Chao via GitGitGadget
2021-08-15 17:08       ` [PATCH v4 1/2] packfile: rename `derive_filename()` to `derive_pack_filename()` Sun Chao via GitGitGadget
2021-08-15 17:08       ` [PATCH v4 2/2] packfile: freshen the mtime of packfile by bump file Sun Chao via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YO9AeudYPmWRnRNb@nand.local \
    --to=me@ttaylorr.com \
    --cc=16657101987@163.com \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=ttaylorr@github.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.