From: Junio C Hamano <gitster@pobox.com>
To: "Robert Coup via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, Jonathan Tan <jonathantanmy@google.com>,
Derrick Stolee <stolee@gmail.com>, Taylor Blau <me@ttaylorr.com>,
Christian Couder <christian.couder@gmail.com>,
John Cai <johncai86@gmail.com>, Robert Coup <robert@coup.net.nz>
Subject: Re: [PATCH 0/6] [RFC] partial-clone: add ability to refetch with expanded filter
Date: Tue, 01 Feb 2022 12:13:38 -0800
Message-ID: <xmqqk0eecpl9.fsf@gitster.g>
In-Reply-To: <pull.1138.git.1643730593.gitgitgadget@gmail.com> (Robert Coup via GitGitGadget's message of "Tue, 01 Feb 2022 15:49:47 +0000")
"Robert Coup via GitGitGadget" <gitgitgadget@gmail.com> writes:
> If a filter is changed on a partial clone repository, for example from
> blob:none to blob:limit=1m, there is currently no straightforward way to
> bulk-refetch the objects that match the new filter for existing local
> commits. This is because the client will report commits as "have" during
> negotiation and any dependent objects won't be included in the transferred
> pack.
It sounds like a useful thing to have such a "refetch things"
option.
A lazy/partial clone is narrower than the full tree in the width
dimension, while a shallow clone is shallower than the full history
in the time dimension. The latter already has the "--deepen"
support to say "the commits listed in my shallow boundary list may
claim that I already have these, but I actually don't have them;
please stop lying to the other side and refetch what I should have
fetched earlier". I understand that this works in the other
dimension to "--widen" things?
Makes me wonder how well these two features work together (or if
they are mutually exclusive, that is fine as well as a starting
point).
If you update the filter specification to make it narrower (e.g. you
go from blob:limit=1m down to blob:limit=512k), would we transfer
nothing (which would be ideal), or would we end up refetching
everything that is smaller than 512k?
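
To make the two outcomes in that question concrete, here is a minimal toy
model. It is only a sketch, not git's implementation: the blob names and
sizes are hypothetical, and it assumes that blob:limit=<n> keeps blobs
strictly smaller than n bytes, with k/m/g as 1024-based suffixes.

```python
# Toy model of the "narrower filter" question; not git's implementation.
# Assumption: blob:limit=<n> keeps blobs strictly smaller than n bytes.

def parse_limit(spec):
    """Convert a size such as '512k' or '1m' to bytes (1024-based units)."""
    units = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
    suffix = spec[-1].lower()
    if suffix in units:
        return int(spec[:-1]) * units[suffix]
    return int(spec)

def wanted(blobs, limit):
    """Names of the blobs that a blob:limit=<limit> filter lets through."""
    n = parse_limit(limit)
    return {name for name, size in blobs.items() if size < n}

# Hypothetical blobs reachable from the commits we already have.
blobs = {"README": 2_000, "logo.png": 600_000, "video.mp4": 5_000_000}

old = wanted(blobs, "1m")    # what the existing partial clone holds
new = wanted(blobs, "512k")  # what the narrower filter asks for

ideal_transfer = new - old   # a sender that knew the old filter sends only this
naive_transfer = new         # a refetch that advertises no "have" gets all of it

print("ideal:", sorted(ideal_transfer))
print("naive:", sorted(naive_transfer))
```

In this model the ideal transfer is empty, while a refetch that pretends
to have nothing in common with the remote receives every blob under the
new limit again.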
> This patch series proposes adding a --refilter option to fetch & fetch-pack
> to enable doing a full fetch with a different filter, as if the local has no
> commits in common with the remote. It builds upon cbe566a071
> ("negotiator/noop: add noop fetch negotiator", 2020-08-18).
I guess the answer to the last question is ...
> To note:
>
> 1. This will produce duplicated objects between the existing and newly
> fetched packs, but gc will clean them up.
... it is not smart enough to tell them to exclude what we _ought_
to have by telling them what the _old_ filter spec was. That's OK
for a starting point, I guess. Hopefully, at the end of this
operation, we should garbage collect the duplicated objects by
default (with an option to turn it off)?
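
The duplication being discussed can be pictured with a small toy model.
This is a sketch under stated assumptions, not git's pack format or gc
logic: each pack is treated as a plain set of hypothetical object IDs, and
"repacking" is modelled as taking the union of the two sets.

```python
# Toy model: a pack is a set of object IDs. Not git's on-disk format.
# All object IDs below are hypothetical.

existing_pack = {"commit1", "tree1", "blob_small"}
# A full refetch under a wider filter resends everything reachable,
# including the objects we already hold.
refetched_pack = {"commit1", "tree1", "blob_small", "blob_large"}

duplicates = existing_pack & refetched_pack

# Before any repack, both packs sit side by side, duplicates included.
stored_before_repack = len(existing_pack) + len(refetched_pack)

# Modelling a repack as a union: each object is kept exactly once.
stored_after_repack = len(existing_pack | refetched_pack)

print("duplicated objects:", sorted(duplicates))
print("stored before repack:", stored_before_repack)
print("stored after repack:", stored_after_repack)
```

Here everything except the newly obtained large blob is stored twice
until the packs are consolidated, which is why running gc by default
after such a fetch is attractive.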
> 2. This series doesn't check that there's a new filter in any way, whether
> configured via config or passed via --filter=. Personally I think that's
> fine.
In other words, a repository that used to be a partial clone can
become a full clone by using the option _and_ not giving any filter.
I think that is intuitive enough behaviour, and a natural
consequence of taking the feature to its extreme. Compared to
making a fresh full "git clone", fetching from the old local (and
narrow) repository into it, and then discarding the old one, it would
not have any performance or storage advantage, but it is probably more
convenient.
> 3. If a user fetches with --refilter applying a more restrictive filter
> than previously (eg: blob:limit=1m then blob:limit=1k) the eventual
> state is a no-op, since any referenced object already in the local
> repository is never removed. Potentially this could be improved in
> future by more advanced gc, possibly along the lines discussed at [2].
OK. That matches my reaction to 1. above.