From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Robert Coup via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, Jonathan Tan <jonathantanmy@google.com>,
John Cai <johncai86@gmail.com>,
Jeff Hostetler <git@jeffhostetler.com>,
Junio C Hamano <gitster@pobox.com>,
Derrick Stolee <derrickstolee@github.com>,
Robert Coup <robert@coup.net.nz>
Subject: Re: [PATCH v2 0/8] fetch: add repair: full refetch without negotiation (was: "refiltering")
Date: Mon, 28 Feb 2022 17:43:28 +0100
Message-ID: <220228.86r17n2aqd.gmgdl@evledraar.gmail.com>
In-Reply-To: <pull.1138.v2.git.1645719218.gitgitgadget@gmail.com>
On Thu, Feb 24 2022, Robert Coup via GitGitGadget wrote:
> [...] While a key use case is described
> above for partial clones, a user could also use --repair to fix a corrupted
> object database by performing a refetch of objects that should already be
> present, establishing a better workflow than deleting the local repository
> and re-cloning.
>
> * Using --repair will produce duplicated objects between the existing and
> newly fetched packs, but maintenance will clean them up when it runs
> automatically post-fetch (if enabled).
> * If a user fetches with --repair applying a more restrictive partial clone
> filter than previously (eg: blob:limit=1m then blob:limit=1k) the
> eventual state is a no-op, since any referenced object already in the
> local repository is never removed. More advanced repacking which could
> improve this scenario is currently proposed at [2].
I realize this was probably based on feedback on v1 (I didn't go back
and re-read it, sorry).
But I feel strongly that we really should name this something other than
--repair. I don't care much if it isn't that :) But maybe
--expand-filters, --fleshen-partial or something like that?
So first (and partially as an aside): Is a "noop" negotiator really
what we want at all? Don't we instead want to be discovering those parts
of our history that are closed under reachability (if any) and say we
HAVE those things during negotiation?
I haven't tested, and maybe that's just more complex. E.g. if you have a
filter that's excluding >500MB blobs or whatever, you might have the full
history already, or maybe that 500MB blob was added last week, so you
have almost all of it.
But wouldn't that be a lot kinder to server resources + network at the
expense of some (presumably rare) extra local computation?
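To make the reachability idea above concrete, here is a minimal toy sketch in Python. It is not git's negotiator code: the object graph, the sample history, and the names `closed_haves`, `graph` and `present` are all made up for illustration. It assumes we can walk a map of "object id -> ids it references" and only reports a commit as a candidate HAVE if everything reachable from it is present locally.

```python
def closed_haves(graph, present, commits):
    """Return the commits whose whole reachable closure is locally present.

    graph:   hypothetical map of object id -> ids it references
    present: set of object ids available locally
    commits: ids in `graph` that are commits (candidate HAVE lines)
    """
    memo = {}

    def is_closed(oid):
        if oid not in memo:
            # Closed means: we have the object and everything it references.
            memo[oid] = oid in present and all(
                is_closed(child) for child in graph.get(oid, ())
            )
        return memo[oid]

    return sorted(c for c in commits if is_closed(c))


# Toy history: c1 <- c2 <- c3, where c2 introduced a large blob (b2)
# that a blob:limit filter left out of the local repository.
graph = {
    "c3": ["c2", "t3"], "c2": ["c1", "t2"], "c1": ["t1"],
    "t3": ["b3"], "t2": ["b2"], "t1": ["b1"],
    "b1": [], "b2": [], "b3": [],
}
present = set(graph) - {"b2"}

print(closed_haves(graph, present, ["c1", "c2", "c3"]))
```

In this toy case only c1 is reported, since c2 (and therefore c3) reaches the missing blob; everything from c2 onwards would still need to be sent. The recursive walk is fine for a sketch, but a real history would need an iterative traversal.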
But secondly, on the "--repair" name: The reason I mentioned that is
that I'd really like us to actually have a "my repo is screwed, please
repair it" option.
But (and I haven't tested, but I'm pretty sure), this patch series isn't
going to give you that. The reasons are elaborated on in [1]; basically
we try really hard to re-use local data, and due to that & the collision
detection we will often just hard die early in object walking.
But maybe I'm wrong, have you actually tested this with *broken* objects
as opposed to just missing ones with repo filters + promisors in play?
Our t/*fsck* and t/*corrupt*/ etc. tests have some of those.
And maybe I'm making a big deal out of nothing, but I fear that by
naming it --repair and giving it these semantics that we'd be closing
the door on things that are actually needed for some of the trickier
edge cases when it comes to repairing a bad repository.
Including but not limited to having a loose BAD_OBJ, and needing to
replace it with another loose object (due to the unpack limit), the
branch we're updating can't be read locally, but is at an OID that's
(re-)included in the incoming pack and is hopefully about to repair our
repository.
Or even that we have a SHA-1 collision, but we intentionally want to
override the collision detection because we know our local repo is bad,
but the remote can be trusted.
All of which are much more involved than just the "fleshen partial data"
you're aiming for here...
1. https://lore.kernel.org/git/87czo7haha.fsf@evledraar.gmail.com/