From: phillip.wood123@gmail.com
To: Nikolay Edigaryev <edigaryev@gmail.com>, phillip.wood@dunelm.org.uk
Cc: Nikolay Edigaryev via GitGitGadget <gitgitgadget@gmail.com>,
git@vger.kernel.org, Derrick Stolee <stolee@gmail.com>
Subject: Re: [PATCH] clone: support cloning of filtered bundles
Date: Mon, 15 Jan 2024 10:35:49 +0000 [thread overview]
Message-ID: <56522074-47b4-44d7-91da-8cf1149e85c7@gmail.com> (raw)
In-Reply-To: <CAFX5hXQ291gR4831dtRTdvtffWefCNCFp13ADvOkU7s3SVPczQ@mail.gmail.com>
Hi Nikolay
On 14/01/2024 21:26, Nikolay Edigaryev wrote:
>> Note that currently, when you clone a normal non-filtered bundle with a
>> '--filter' argument specified, no filtering will take place and no error
>> will be thrown. "promisor = true" and "partialclonefilter = ..." options
>> will be set in the repo config, but no .promisor file will be created.
>> This is even more confusing IMO, but that's how it currently on
>> Git 2.43.0.
>
> I was wrong about this one: the .promisor pack file actually gets created.
>
> And the filtering is not being done because the "uploadpack.allowFilter"
> global option is not enabled by default.
That makes sense - if you try to make a partial clone from a remote that
does not support object filtering we fallback to a full clone and print
the warning you mention below.
> Unfortunately the only way to figure this out is to run Git with
> GIT_TRACE=2, which shows:
That seems strange, you should see the warning without having to set
GIT_TRACE. I've certainly seen the warning in the past when trying to
create a partial clone from a remote did not support them without me
setting any special environment variables.
Best Wishes
Phillip
> warning: filtering not recognized by server, ignoring
>
> So please feel free to skip this part from the consideration.
>
>
> On Sun, Jan 14, 2024 at 11:39 PM Nikolay Edigaryev <edigaryev@gmail.com> wrote:
>>
>> Hello Phillip,
>>
>>> As I understand it if you're cloning from a bundle file then then there
>>> is no remote so how can we set a remote-specific config?
>>
>> There is a remote, for more details see df61c88979 (clone: also
>> configure url for bare clones, 2010-03-29), which has the following
>> code:
>>
>> strbuf_addf(&key, "remote.%s.url", remote_name);
>> git_config_set(key.buf, repo);
>> strbuf_reset(&key);
>>
>> You can verify this by creating a bundle on Git 2.43.0 with "git create
>> bundle bundle.bundle --all" and then cloning it with "git clone
>> --bare /path/to/bundle.bundle", in my case the following repo-wide
>> configuration file was created:
>>
>> [core]
>> repositoryformatversion = 0
>> filemode = true
>> bare = true
>> ignorecase = true
>> precomposeunicode = true
>> [remote "origin"]
>> url = /Users/edi/src/cirrus-cli/cli.bundle
>>
>>> I'm surprised that the proposed change does not require the user to pass
>>> "--filter" to "git clone" as I expected that we'd want to check that the
>>> filter on the command line was compatible with the filter used to create
>>> the bundle. Allowing "git clone" to create a partial clone without the
>>> user asking for it by passing the "--filter" option feels like is going
>>> to end up confusing users.
>>
>> Note that currently, when you clone a normal non-filtered bundle with a
>> '--filter' argument specified, no filtering will take place and no error
>> will be thrown. "promisor = true" and "partialclonefilter = ..." options
>> will be set in the repo config, but no .promisor file will be created.
>> This is even more confusing IMO, but that's how it currently on
>> Git 2.43.0.
>>
>> You have a good point, but I feel like completely preventing cloning of
>> filtered bundles and requiring a '--filter' argument is very taxing. If
>> you've already specified a '--filter' when creating a bundle (and thus
>> your intent to use partially cloned data), why do it multiple times?
>>
>> What I propose as an alternative here is to act based on the user's
>> intent when cloning:
>>
>> * when the user specifies no '--filter' argument, do nothing special,
>> allow cloning both types of bundles: normal and filtered (with the
>> logic from this patch)
>>
>> * when the user does specify a '--filter' argument, either:
>> * throw an error explaining that filtering of filtered bundles is not
>> supported
>> * or compare the user's filter specification and the one that is
>> in the bundle and only throw an error if they mismatch
>>
>> Let me know what you think about this (and perhaps you have a more
>> concrete example in mind where this will have negative consequences)
>> and I'll be happy to do a next iteration.
>>
>>
>> On Sun, Jan 14, 2024 at 10:00 PM Phillip Wood <phillip.wood123@gmail.com> wrote:
>>>
>>> Hi Nikolay
>>>
>>> On 14/01/2024 11:16, Nikolay Edigaryev via GitGitGadget wrote:
>>>> From: Nikolay Edigaryev <edigaryev@gmail.com>
>>>>
>>>> f18b512bbb (bundle: create filtered bundles, 2022-03-09) introduced an
>>>> incredibly useful ability to create filtered bundles, which advances
>>>> the partial clone/promisor support in Git and allows for archiving
>>>> large repositories to object storages like S3 in bundles that are:
>>>>
>>>> * easy to manage
>>>> * bundle is just a single file, it's easier to guarantee atomic
>>>> replacements in object storages like S3 and they are faster to
>>>> fetch than a bare repository since there's only a single GET
>>>> request involved
>>>> * incredibly tiny
>>>> * no indexes (which may be more than 10 MB for some repositories)
>>>> and other fluff, compared to cloning a bare repository
>>>> * bundle can be filtered to only contain the tips of refs neccessary
>>>> for e.g. code-analysis purposes
>>>>
>>>> However, in 86fdd94d72 (clone: fail gracefully when cloning filtered
>>>> bundle, 2022-03-09) the cloning of such bundles was disabled, with a
>>>> note that this behavior is not desired, and it the long-term this
>>>> should be possible.
>>>>
>>>> The commit above states that it's not possible to have this at the
>>>> moment due to lack of remote and a repository-global config that
>>>> specifies an object filter, yet it's unclear why a remote-specific
>>>> config can't be used instead, which is what this change does.
>>>
>>> As I understand it if you're cloning from a bundle file then then there
>>> is no remote so how can we set a remote-specific config?
>>>
>>> I'm surprised that the proposed change does not require the user to pass
>>> "--filter" to "git clone" as I expected that we'd want to check that the
>>> filter on the command line was compatible with the filter used to create
>>> the bundle. Allowing "git clone" to create a partial clone without the
>>> user asking for it by passing the "--filter" option feels like is going
>>> to end up confusing users.
>>>
>>>> +test_expect_success 'cloning from filtered bundle works' '
>>>> + git bundle create partial.bdl --all --filter=blob:none &&
>>>> + git clone --bare partial.bdl partial 2>err
>>>
>>> The redirection hides any error message which will make debugging test
>>> failures harder. It would be nice to see this test check any config set
>>> when cloning and that git commands can run successfully in the repository.
>>>
>>> Best Wishes
>>>
>>> Phillip
next prev parent reply other threads:[~2024-01-15 10:35 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-14 11:16 [PATCH] clone: support cloning of filtered bundles Nikolay Edigaryev via GitGitGadget
2024-01-14 18:00 ` Phillip Wood
2024-01-14 19:39 ` Nikolay Edigaryev
2024-01-14 21:26 ` Nikolay Edigaryev
2024-01-15 10:35 ` phillip.wood123 [this message]
2024-01-15 10:18 ` phillip.wood123
2024-01-15 1:13 ` Junio C Hamano
2024-01-15 2:09 ` Junio C Hamano
2024-01-16 20:06 ` Nikolay Edigaryev
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=56522074-47b4-44d7-91da-8cf1149e85c7@gmail.com \
--to=phillip.wood123@gmail.com \
--cc=edigaryev@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitgitgadget@gmail.com \
--cc=phillip.wood@dunelm.org.uk \
--cc=stolee@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).