* [PATCH] fetch, clone: add fetch.blobSizeLimit config
@ 2026-03-01 16:44 Alan Braithwaite via GitGitGadget
2026-03-02 11:53 ` Patrick Steinhardt
2026-03-05 0:57 ` [PATCH v2] clone: add clone.<url>.defaultObjectFilter config Alan Braithwaite via GitGitGadget
0 siblings, 2 replies; 28+ messages in thread
From: Alan Braithwaite via GitGitGadget @ 2026-03-01 16:44 UTC (permalink / raw)
To: git
Cc: ps, christian.couder, jonathantanmy, me, gitster,
Alan Braithwaite
From: Alan Braithwaite <alan@braithwaite.dev>
External tools like git-lfs and git-fat use the filter clean/smudge
mechanism to manage large binary objects, but this requires pointer
files, a separate storage backend, and careful coordination. Git's
partial clone infrastructure provides a more native approach: large
blobs can be excluded at the protocol level during fetch and lazily
retrieved on demand. However, enabling this requires passing
`--filter=blob:limit=<size>` on every clone, which is not
discoverable and cannot be set as a global default.
Add a new `fetch.blobSizeLimit` configuration option that enables
size-based partial clone behavior globally. When set, both `git
clone` and `git fetch` automatically apply a `blob:limit=<size>`
filter. Blobs larger than the threshold that are not needed for the
current worktree are excluded from the transfer and lazily fetched
on demand when needed (e.g., during checkout, diff, or merge).
This makes it easy to work with repositories that have accumulated
large binary files in their history, without downloading all of
them upfront.
The precedence order is:
1. Explicit `--filter=` on the command line (highest)
2. Existing `remote.<name>.partialclonefilter`
3. `fetch.blobSizeLimit` (new, lowest)
Once a clone or fetch applies this setting, the remote is registered
as a promisor remote with the corresponding filter spec, so
subsequent fetches inherit it automatically. If the server does not
support object filtering, the setting is silently ignored.
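To illustrate the intended workflow (a sketch; the `fetch.blobSizeLimit` key only exists with this patch applied):

```ini
# ~/.gitconfig
[fetch]
	blobSizeLimit = 1m
```

With that set, a plain `git clone <url>` behaves as if `--filter=blob:limit=1m` had been passed, and the remote is registered as a promisor remote as described above.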
Signed-off-by: Alan Braithwaite <alan@braithwaite.dev>
---
fetch, clone: add fetch.blobSizeLimit config
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2058%2Fabraithwaite%2Falan%2Ffetch-blob-size-limit-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2058/abraithwaite/alan/fetch-blob-size-limit-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/2058
Documentation/config/fetch.adoc | 19 +++++++++++
builtin/clone.c | 13 +++++++
builtin/fetch.c | 45 +++++++++++++++++++------
t/t5616-partial-clone.sh | 60 +++++++++++++++++++++++++++++++++
4 files changed, 127 insertions(+), 10 deletions(-)
diff --git a/Documentation/config/fetch.adoc b/Documentation/config/fetch.adoc
index cd40db0cad..4165354dd9 100644
--- a/Documentation/config/fetch.adoc
+++ b/Documentation/config/fetch.adoc
@@ -103,6 +103,25 @@ config setting.
file helps performance of many Git commands, including `git merge-base`,
`git push -f`, and `git log --graph`. Defaults to `false`.
+`fetch.blobSizeLimit`::
+ When set to a size value (e.g., `1m`, `100k`, `1g`), both
+ linkgit:git-clone[1] and linkgit:git-fetch[1] will automatically
+ use `--filter=blob:limit=<value>` to enable partial clone
+ behavior. Blobs larger than this threshold are excluded from the
+ initial transfer and lazily fetched on demand when needed (e.g.,
+ during checkout).
++
+This provides a convenient way to enable size-based partial clones
+globally without passing `--filter` on every command. Once a clone or
+fetch applies this setting, the remote is registered as a promisor
+remote with the corresponding filter, so subsequent fetches inherit
+the filter automatically.
++
+An explicit `--filter` option on the command line takes precedence over
+this config. An existing `remote.<name>.partialclonefilter` also takes
+precedence. If the server does not support object filtering, the
+setting is silently ignored.
+
`fetch.bundleURI`::
This value stores a URI for downloading Git object data from a bundle
URI before performing an incremental fetch from the origin Git server.
diff --git a/builtin/clone.c b/builtin/clone.c
index 45d8fa0eed..1e3261b623 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -78,6 +78,7 @@ static struct string_list option_optional_reference = STRING_LIST_INIT_NODUP;
static int max_jobs = -1;
static struct string_list option_recurse_submodules = STRING_LIST_INIT_NODUP;
static int config_filter_submodules = -1; /* unspecified */
+static char *config_blob_size_limit;
static int option_remote_submodules;
static int recurse_submodules_cb(const struct option *opt,
@@ -753,6 +754,10 @@ static int git_clone_config(const char *k, const char *v,
config_reject_shallow = git_config_bool(k, v);
if (!strcmp(k, "clone.filtersubmodules"))
config_filter_submodules = git_config_bool(k, v);
+ if (!strcmp(k, "fetch.blobsizelimit")) {
+ free(config_blob_size_limit);
+ git_config_string(&config_blob_size_limit, k, v);
+ }
return git_default_config(k, v, ctx, cb);
}
@@ -1010,6 +1015,13 @@ int cmd_clone(int argc,
argc = parse_options(argc, argv, prefix, builtin_clone_options,
builtin_clone_usage, 0);
+ if (!filter_options.choice && config_blob_size_limit) {
+ struct strbuf buf = STRBUF_INIT;
+ strbuf_addf(&buf, "blob:limit=%s", config_blob_size_limit);
+ parse_list_objects_filter(&filter_options, buf.buf);
+ strbuf_release(&buf);
+ }
+
if (argc > 2)
usage_msg_opt(_("Too many arguments."),
builtin_clone_usage, builtin_clone_options);
@@ -1634,6 +1646,7 @@ int cmd_clone(int argc,
ref_storage_format);
list_objects_filter_release(&filter_options);
+ free(config_blob_size_limit);
string_list_clear(&option_not, 0);
string_list_clear(&option_config, 0);
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 573c295241..ff898cb6f4 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -109,6 +109,7 @@ struct fetch_config {
int recurse_submodules;
int parallel;
int submodule_fetch_jobs;
+ char *blob_size_limit;
};
static int git_fetch_config(const char *k, const char *v,
@@ -160,6 +161,9 @@ static int git_fetch_config(const char *k, const char *v,
return 0;
}
+ if (!strcmp(k, "fetch.blobsizelimit"))
+ return git_config_string(&fetch_config->blob_size_limit, k, v);
+
if (!strcmp(k, "fetch.output")) {
if (!v)
return config_error_nonbool(k);
@@ -2342,7 +2346,8 @@ static int fetch_multiple(struct string_list *list, int max_children,
* or inherit the default filter-spec from the config.
*/
static inline void fetch_one_setup_partial(struct remote *remote,
- struct list_objects_filter_options *filter_options)
+ struct list_objects_filter_options *filter_options,
+ const struct fetch_config *config)
{
/*
* Explicit --no-filter argument overrides everything, regardless
@@ -2352,10 +2357,12 @@ static inline void fetch_one_setup_partial(struct remote *remote,
return;
/*
- * If no prior partial clone/fetch and the current fetch DID NOT
- * request a partial-fetch, do a normal fetch.
+ * If no prior partial clone/fetch, the current fetch did not
+ * request a partial-fetch, and no global blob size limit is
+ * configured, do a normal fetch.
*/
- if (!repo_has_promisor_remote(the_repository) && !filter_options->choice)
+ if (!repo_has_promisor_remote(the_repository) &&
+ !filter_options->choice && !config->blob_size_limit)
return;
/*
@@ -2372,11 +2379,27 @@ static inline void fetch_one_setup_partial(struct remote *remote,
/*
* Do a partial-fetch from the promisor remote using either the
* explicitly given filter-spec or inherit the filter-spec from
- * the config.
+ * the per-remote config.
+ */
+ if (repo_has_promisor_remote(the_repository)) {
+ partial_clone_get_default_filter_spec(filter_options,
+ remote->name);
+ if (filter_options->choice)
+ return;
+ }
+
+ /*
+ * Fall back to the global fetch.blobSizeLimit config. This
+ * enables partial clone behavior without requiring --filter
+ * on the command line or a pre-existing promisor remote.
*/
- if (!filter_options->choice)
- partial_clone_get_default_filter_spec(filter_options, remote->name);
- return;
+ if (!filter_options->choice && config->blob_size_limit) {
+ struct strbuf buf = STRBUF_INIT;
+ strbuf_addf(&buf, "blob:limit=%s", config->blob_size_limit);
+ parse_list_objects_filter(filter_options, buf.buf);
+ strbuf_release(&buf);
+ partial_clone_register(remote->name, filter_options);
+ }
}
static int fetch_one(struct remote *remote, int argc, const char **argv,
@@ -2762,9 +2785,10 @@ int cmd_fetch(int argc,
oidset_clear(&acked_commits);
trace2_region_leave("fetch", "negotiate-only", the_repository);
} else if (remote) {
- if (filter_options.choice || repo_has_promisor_remote(the_repository)) {
+ if (filter_options.choice || repo_has_promisor_remote(the_repository) ||
+ config.blob_size_limit) {
trace2_region_enter("fetch", "setup-partial", the_repository);
- fetch_one_setup_partial(remote, &filter_options);
+ fetch_one_setup_partial(remote, &filter_options, &config);
trace2_region_leave("fetch", "setup-partial", the_repository);
}
trace2_region_enter("fetch", "fetch-one", the_repository);
@@ -2876,5 +2900,6 @@ int cmd_fetch(int argc,
cleanup:
string_list_clear(&list, 0);
list_objects_filter_release(&filter_options);
+ free(config.blob_size_limit);
return result;
}
diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index 1e354e057f..44b41f315f 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -722,6 +722,66 @@ test_expect_success 'after fetching descendants of non-promisor commits, gc work
git -C partial gc --prune=now
'
+# Test fetch.blobSizeLimit config
+
+test_expect_success 'setup for fetch.blobSizeLimit tests' '
+ git init blob-limit-src &&
+ echo "small" >blob-limit-src/small.txt &&
+ dd if=/dev/zero of=blob-limit-src/large.bin bs=1024 count=100 2>/dev/null &&
+ git -C blob-limit-src add . &&
+ git -C blob-limit-src commit -m "initial" &&
+
+ git clone --bare "file://$(pwd)/blob-limit-src" blob-limit-srv.bare &&
+ git -C blob-limit-srv.bare config --local uploadpack.allowfilter 1 &&
+ git -C blob-limit-srv.bare config --local uploadpack.allowanysha1inwant 1
+'
+
+test_expect_success 'clone with fetch.blobSizeLimit config applies filter' '
+ git -c fetch.blobSizeLimit=1k clone \
+ "file://$(pwd)/blob-limit-srv.bare" blob-limit-clone &&
+
+ test "$(git -C blob-limit-clone config --local remote.origin.promisor)" = "true" &&
+ test "$(git -C blob-limit-clone config --local remote.origin.partialclonefilter)" = "blob:limit=1024"
+'
+
+test_expect_success 'clone with --filter overrides fetch.blobSizeLimit' '
+ git -c fetch.blobSizeLimit=1k clone --filter=blob:none \
+ "file://$(pwd)/blob-limit-srv.bare" blob-limit-override &&
+
+ test "$(git -C blob-limit-override config --local remote.origin.partialclonefilter)" = "blob:none"
+'
+
+test_expect_success 'fetch with fetch.blobSizeLimit registers promisor remote' '
+ git clone --no-checkout "file://$(pwd)/blob-limit-srv.bare" blob-limit-fetch &&
+
+ # Sanity: not yet a partial clone
+ test_must_fail git -C blob-limit-fetch config --local remote.origin.promisor &&
+
+ # Add a new commit to the server
+ echo "new-small" >blob-limit-src/new-small.txt &&
+ dd if=/dev/zero of=blob-limit-src/new-large.bin bs=1024 count=100 2>/dev/null &&
+ git -C blob-limit-src add . &&
+ git -C blob-limit-src commit -m "second" &&
+ git -C blob-limit-src push "file://$(pwd)/blob-limit-srv.bare" main &&
+
+ # Fetch with the config set
+ git -C blob-limit-fetch -c fetch.blobSizeLimit=1k fetch origin &&
+
+ test "$(git -C blob-limit-fetch config --local remote.origin.promisor)" = "true" &&
+ test "$(git -C blob-limit-fetch config --local remote.origin.partialclonefilter)" = "blob:limit=1024"
+'
+
+test_expect_success 'fetch.blobSizeLimit does not override existing partialclonefilter' '
+ git clone --filter=blob:none \
+ "file://$(pwd)/blob-limit-srv.bare" blob-limit-existing &&
+
+ test "$(git -C blob-limit-existing config --local remote.origin.partialclonefilter)" = "blob:none" &&
+
+ # Fetch with a different blobSizeLimit; existing filter should win
+ git -C blob-limit-existing -c fetch.blobSizeLimit=1k fetch origin &&
+
+ test "$(git -C blob-limit-existing config --local remote.origin.partialclonefilter)" = "blob:none"
+'
. "$TEST_DIRECTORY"/lib-httpd.sh
start_httpd
base-commit: 7b2bccb0d58d4f24705bf985de1f4612e4cf06e5
--
gitgitgadget
^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH] fetch, clone: add fetch.blobSizeLimit config
2026-03-01 16:44 [PATCH] fetch, clone: add fetch.blobSizeLimit config Alan Braithwaite via GitGitGadget
@ 2026-03-02 11:53 ` Patrick Steinhardt
2026-03-02 18:28 ` Jeff King
2026-03-02 18:57 ` Junio C Hamano
2026-03-05 0:57 ` [PATCH v2] clone: add clone.<url>.defaultObjectFilter config Alan Braithwaite via GitGitGadget
1 sibling, 2 replies; 28+ messages in thread
From: Patrick Steinhardt @ 2026-03-02 11:53 UTC (permalink / raw)
To: Alan Braithwaite via GitGitGadget
Cc: git, christian.couder, jonathantanmy, me, gitster,
Alan Braithwaite
On Sun, Mar 01, 2026 at 04:44:59PM +0000, Alan Braithwaite via GitGitGadget wrote:
> From: Alan Braithwaite <alan@braithwaite.dev>
>
> External tools like git-lfs and git-fat use the filter clean/smudge
> mechanism to manage large binary objects, but this requires pointer
> files, a separate storage backend, and careful coordination. Git's
> partial clone infrastructure provides a more native approach: large
> blobs can be excluded at the protocol level during fetch and lazily
> retrieved on demand. However, enabling this requires passing
> `--filter=blob:limit=<size>` on every clone, which is not
> discoverable and cannot be set as a global default.
I'm not sure that we should make blob size limiting the default. The
problem with specifying a limit is that this is comparatively expensive
to compute on the server side: we have to look up each blob so that we
can determine its size. Unfortunately, such requests cannot (currently)
be optimized via for example bitmaps, or any other cache that we have.
So if we want to make any filter the default, I'd propose that we should
rather think about filters that are computationally less expensive, like
for example `--filter=blob:none`. This can be computed efficiently via
bitmaps.
The downside is of course that in this case we have to do way more
backfill fetches compared to the case where we only leave out a couple
of blobs. But unless we figure out a way to serve the size limit filter
in a more efficient way I'm not sure about proper alternatives.
Another question to consider: is it really sensible to set this setting
globally? It is very much dependent on the forge that you're connecting
to, as forges may not even allow object filters at all, or only a subset
of them.
Thanks!
Patrick
* Re: [PATCH] fetch, clone: add fetch.blobSizeLimit config
2026-03-02 11:53 ` Patrick Steinhardt
@ 2026-03-02 18:28 ` Jeff King
2026-03-02 18:57 ` Junio C Hamano
1 sibling, 0 replies; 28+ messages in thread
From: Jeff King @ 2026-03-02 18:28 UTC (permalink / raw)
To: Patrick Steinhardt
Cc: Alan Braithwaite via GitGitGadget, git, christian.couder,
jonathantanmy, me, gitster, Alan Braithwaite
On Mon, Mar 02, 2026 at 12:53:32PM +0100, Patrick Steinhardt wrote:
> On Sun, Mar 01, 2026 at 04:44:59PM +0000, Alan Braithwaite via GitGitGadget wrote:
> > From: Alan Braithwaite <alan@braithwaite.dev>
> >
> > External tools like git-lfs and git-fat use the filter clean/smudge
> > mechanism to manage large binary objects, but this requires pointer
> > files, a separate storage backend, and careful coordination. Git's
> > partial clone infrastructure provides a more native approach: large
> > blobs can be excluded at the protocol level during fetch and lazily
> > retrieved on demand. However, enabling this requires passing
> > `--filter=blob:limit=<size>` on every clone, which is not
> > discoverable and cannot be set as a global default.
>
> I'm not sure that we should make blob size limiting the default. The
> problem with specifying a limit is that this is comparatively expensive
> to compute on the server side: we have to look up each blob so that we
> can determine its size. Unfortunately, such requests cannot (currently)
> be optimized via for example bitmaps, or any other cache that we have.
We actually can do blob:limit filters with bitmaps. See 84243da129
(pack-bitmap: implement BLOB_LIMIT filtering, 2020-02-14). It's more
expensive than blob:none, but not much. Once we have the list of blobs
we can get their sizes directly from the packfile. It's stuff like
path-limiting that is truly expensive, because it requires a traversal.
All that said, I'd be wary of turning on partial clones like this by
default. I feel like there are still a lot of performance gotchas
lurking (and possibly some correctness ones, too).
-Peff
* Re: [PATCH] fetch, clone: add fetch.blobSizeLimit config
2026-03-02 11:53 ` Patrick Steinhardt
2026-03-02 18:28 ` Jeff King
@ 2026-03-02 18:57 ` Junio C Hamano
2026-03-02 21:36 ` Alan Braithwaite
1 sibling, 1 reply; 28+ messages in thread
From: Junio C Hamano @ 2026-03-02 18:57 UTC (permalink / raw)
To: Patrick Steinhardt
Cc: Alan Braithwaite via GitGitGadget, git, christian.couder,
jonathantanmy, me, Alan Braithwaite
Patrick Steinhardt <ps@pks.im> writes:
> I'm not sure that we should make blob size limiting the default. The
> problem with specifying a limit is that this is comparatively expensive
> to compute on the server side: we have to look up each blob so that we
> can determine its size. Unfortunately, such requests cannot (currently)
> be optimized via for example bitmaps, or any other cache that we have.
> ...
> Another question to consider: is it really sensible to set this setting
> globally? It is very much dependent on the forge that you're connecting
> to, as forges may not even allow object filters at all, or only a subset
> of them.
Both are good questions, but to affect "clone" you'd need either
"git -c that.variable=setting clone" or have it in ~/.gitconfig no?
As to this extra variable, it can already be done with existing
remote.*.partialCloneFilter, it seems, so I do not know why we want
to add it.
* Re: [PATCH] fetch, clone: add fetch.blobSizeLimit config
2026-03-02 18:57 ` Junio C Hamano
@ 2026-03-02 21:36 ` Alan Braithwaite
2026-03-03 6:30 ` Patrick Steinhardt
2026-03-03 14:34 ` Jeff King
0 siblings, 2 replies; 28+ messages in thread
From: Alan Braithwaite @ 2026-03-02 21:36 UTC (permalink / raw)
To: Junio C Hamano, Patrick Steinhardt
Cc: Alan Braithwaite via GitGitGadget, git, christian.couder,
jonathantanmy, me
Patrick, Peff, Junio — thanks for taking the time to look at
this.
Patrick wrote:
> I'm not sure that we should make blob size limiting the
> default.
To clarify — this is a user-opt-in config, not a default. You
would only get partial clone behavior if you explicitly set
fetch.blobSizeLimit in your gitconfig.
Peff wrote:
> We actually can do blob:limit filters with bitmaps. See
> 84243da129 (pack-bitmap: implement BLOB_LIMIT filtering,
> 2020-02-14).
Good to know. I'm not positive, but my understanding is that
this patch only touches client code, and the server sees an
identical request to what `git clone --filter=blob:limit=1m`
already sends today. If that's correct, anyone can already
impose that cost — this patch just makes it easier to opt in.
> All that said, I'd be wary of turning on partial clones like
> this by default.
That's fair. I'm not attached to getting this merged — it was
more exploratory to start a discussion.
Junio wrote:
> As to this extra variable, it can already be done with
> existing remote.*.partialCloneFilter, it seems, so I do not
> know why we want to add it.
I may not understand the config as well as you do, but my
reading is that remote.*.partialCloneFilter requires a specific
remote name and only takes effect on subsequent fetches from an
already-registered promisor remote — not the initial clone. You
would also need remote.origin.promisor=true set globally, which
seems odd. If I'm understanding correctly, there is currently
no way to say "all new clones should use a blob size filter"
via config alone. But please correct me if I'm wrong.
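For reference, the existing workaround being discussed would look roughly like this in ~/.gitconfig (a sketch; it hard-codes the remote name, pre-declares the promisor globally, and only affects fetches from an already-registered promisor remote, not the initial clone):

```ini
[remote "origin"]
	promisor = true
	partialCloneFilter = blob:limit=1m
```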
Separately — is my understanding correct that partial clone
with blob:limit works today without server-side changes,
assuming uploadpack.allowFilter is enabled? If so, I'm happy
to maintain this as a local client patch for my own workflow.
Thanks again,
Alan
* Re: [PATCH] fetch, clone: add fetch.blobSizeLimit config
2026-03-02 21:36 ` Alan Braithwaite
@ 2026-03-03 6:30 ` Patrick Steinhardt
2026-03-03 14:00 ` Alan Braithwaite
2026-03-03 17:05 ` Junio C Hamano
2026-03-03 14:34 ` Jeff King
1 sibling, 2 replies; 28+ messages in thread
From: Patrick Steinhardt @ 2026-03-03 6:30 UTC (permalink / raw)
To: Alan Braithwaite
Cc: Junio C Hamano, Alan Braithwaite via GitGitGadget, git,
christian.couder, jonathantanmy, me
On Mon, Mar 02, 2026 at 01:36:40PM -0800, Alan Braithwaite wrote:
> Peff wrote:
> > We actually can do blob:limit filters with bitmaps. See
> > 84243da129 (pack-bitmap: implement BLOB_LIMIT filtering,
> > 2020-02-14).
>
> Good to know. I'm not positive, but my understanding is that
> this patch only touches client code, and the server sees an
> identical request to what `git clone --filter=blob:limit=1m`
> already sends today. If that's correct, anyone can already
> impose that cost — this patch just makes it easier to opt in.
Ah, right, that's something I forgot. I've seen too many performance
issues recently with blob:limit fetches, so I jumped the gun.
> Junio wrote:
> > As to this extra variable, it can already be done with
> > existing remote.*.partialCloneFilter, it seems, so I do not
> > know why we want to add it.
>
> I may not understand the config as well as you do, but my
> reading is that remote.*.partialCloneFilter requires a specific
> remote name and only takes effect on subsequent fetches from an
> already-registered promisor remote — not the initial clone. You
> would also need remote.origin.promisor=true set globally, which
> seems odd. If I'm understanding correctly, there is currently
> no way to say "all new clones should use a blob size filter"
> via config alone. But please correct me if I'm wrong.
No, you're right about this one, and I think this is a sensible thing to
want. But what I'd like to see is a bit more nuance, I guess:
- It should be possible to specify the configuration per URL. If you
know that git.example.com knows object filters you may want to turn
them on for that domain specifically. So the mechanism would work
similar to "url.<base>.insteadOf" or "http.<url>.*" settings.
- The infrastructure shouldn't cast any specific filter into stone.
Instead, it should be possible to specify a default filter.
I'd assume that these settings should only impact the initial clone to
use a default filter in case the cloned URL matches the configured URL.
For existing repositories it shouldn't have any impact, as we should
continue to respect the ".git/config" there when it comes to promisors
and filters.
> Separately — is my understanding correct that partial clone
> with blob:limit works today without server-side changes,
> assuming uploadpack.allowFilter is enabled? If so, I'm happy
> to maintain this as a local client patch for my own workflow.
Yes, blob:limit filters are supported by many forges nowadays.
Patrick
* Re: [PATCH] fetch, clone: add fetch.blobSizeLimit config
2026-03-03 6:30 ` Patrick Steinhardt
@ 2026-03-03 14:00 ` Alan Braithwaite
2026-03-03 15:08 ` Patrick Steinhardt
2026-03-03 17:05 ` Junio C Hamano
1 sibling, 1 reply; 28+ messages in thread
From: Alan Braithwaite @ 2026-03-03 14:00 UTC (permalink / raw)
To: Patrick Steinhardt
Cc: Junio C Hamano, Alan Braithwaite, git, christian.couder,
jonathantanmy, me
Patrick wrote:
> No, you're right about this one, and I think this is a
> sensible thing to want. But what I'd like to see is a bit
> more nuance, I guess:
>
> - It should be possible to specify the configuration per
> URL. If you know that git.example.com knows object
> filters you may want to turn them on for that domain
> specifically. So the mechanism would work similar to
> "url.<base>.insteadOf" or "http.<url>.*" settings.
>
> - The infrastructure shouldn't cast any specific filter
> into stone. Instead, it should be possible to specify a
> default filter.
Thanks, this is great feedback. I took a look at the existing
URL-based config patterns and I think the http.<url>.* model
is the right one to follow, since it already uses the
urlmatch_config_entry() infrastructure with proper URL
normalization, host globs, and longest-match specificity.
Here's what I'm thinking for a v2. I'd like to get feedback
on the design before implementing:
The config would use a new section that supports both a global
default and per-URL overrides, following the same pattern as
http.sslVerify vs http.<url>.sslVerify:
# Global default — applies to all clones/fetches
[fetch]
partialCloneFilter = blob:limit=1m
# Per-URL override — more specific match wins
[fetch "https://github.com/"]
partialCloneFilter = blob:limit=5m
[fetch "https://internal.corp.com/"]
partialCloneFilter = blob:none
Design points:
- Accepts any filter spec, not just blob:limit. This
addresses your point about not casting a specific filter
into stone.
- Uses fetch.<url>.partialCloneFilter, following the
http.<url>.* precedent. The urlmatch.c infrastructure
handles URL normalization, host globs (*.example.com),
default port stripping, and path-based specificity
ordering — so no new matching logic would be needed.
- A bare fetch.partialCloneFilter (no URL) acts as the
global default, the same way http.sslVerify is the
global default that http.<url>.sslVerify can override.
- Only applies to initial clone and to fetches where no
existing remote.<name>.partialCloneFilter is set. Existing
repos continue using their per-remote config.
- Explicit --filter on the command line still takes
precedence over everything.
- If the server does not support object filtering, the
setting is silently ignored (existing behavior).
I chose fetch.* rather than clone.* so that both git-clone
and git-fetch can use the same config. In practice this
mainly matters for the initial clone, since once the promisor
remote is registered, subsequent fetches inherit the filter
from remote.<name>.partialCloneFilter anyway.
Does this direction make sense? Happy to hear if there are
concerns before I start on a v2.
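To make the matching behavior concrete, here is a small demonstration of the existing urlmatch semantics (via the `http.<url>.*` support git has today) that the proposed `fetch.<url>.partialCloneFilter` would reuse. Hostname and header values are made up for illustration:

```shell
# Demonstration (assumes git is installed) of urlmatch longest-match
# specificity, using http.<url>.extraHeader which git supports today.
set -e
tmp=$(mktemp -d)
export HOME="$tmp" GIT_CONFIG_NOSYSTEM=1

git config --global 'http.https://example.com/.extraHeader' 'X-A: 1'
git config --global 'http.https://example.com/repo.extraHeader' 'X-B: 2'

# --get-urlmatch picks the value from the most specific matching URL:
result=$(git config --get-urlmatch http.extraHeader https://example.com/repo/x)
echo "$result"
```

If the proposal lands, the same longest-match rules would decide which `fetch.<url>.partialCloneFilter` applies to a given clone URL.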
Thanks,
- Alan
* Re: [PATCH] fetch, clone: add fetch.blobSizeLimit config
2026-03-02 21:36 ` Alan Braithwaite
2026-03-03 6:30 ` Patrick Steinhardt
@ 2026-03-03 14:34 ` Jeff King
1 sibling, 0 replies; 28+ messages in thread
From: Jeff King @ 2026-03-03 14:34 UTC (permalink / raw)
To: Alan Braithwaite
Cc: Junio C Hamano, Patrick Steinhardt,
Alan Braithwaite via GitGitGadget, git, christian.couder,
jonathantanmy, me
On Mon, Mar 02, 2026 at 01:36:40PM -0800, Alan Braithwaite wrote:
> Peff wrote:
> > We actually can do blob:limit filters with bitmaps. See
> > 84243da129 (pack-bitmap: implement BLOB_LIMIT filtering,
> > 2020-02-14).
>
> Good to know. I'm not positive, but my understanding is that
> this patch only touches client code, and the server sees an
> identical request to what `git clone --filter=blob:limit=1m`
> already sends today. If that's correct, anyone can already
> impose that cost — this patch just makes it easier to opt in.
Yes, that's correct. The server protects itself by refusing to support
certain filters that are too expensive. Usually by setting
uploadpackfilter.allow to "false", followed by enabling
uploadpackfilter.*.allow for particular filters.
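Spelled out as config on the serving repository (a sketch using the documented `uploadpackfilter` keys):

```ini
[uploadpackfilter]
	allow = false
[uploadpackfilter "blob:none"]
	allow = true
[uploadpackfilter "blob:limit"]
	allow = true
```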
When we added those, we left the defaults as-is (allowing everything).
That's OK for casual use amongst your own repositories, but terrible for
a hosting site. I don't know if it would be worth revisiting the
defaults.
But anyway, all orthogonal to the topic in this thread.
-Peff
* Re: [PATCH] fetch, clone: add fetch.blobSizeLimit config
2026-03-03 14:00 ` Alan Braithwaite
@ 2026-03-03 15:08 ` Patrick Steinhardt
2026-03-03 17:58 ` Junio C Hamano
0 siblings, 1 reply; 28+ messages in thread
From: Patrick Steinhardt @ 2026-03-03 15:08 UTC (permalink / raw)
To: Alan Braithwaite
Cc: Junio C Hamano, Alan Braithwaite, git, christian.couder,
jonathantanmy, me
On Tue, Mar 03, 2026 at 06:00:29AM -0800, Alan Braithwaite wrote:
> Patrick wrote:
> > No, you're right about this one, and I think this is a
> > sensible thing to want. But what I'd like to see is a bit
> > more nuance, I guess:
> >
> > - It should be possible to specify the configuration per
> > URL. If you know that git.example.com knows object
> > filters you may want to turn them on for that domain
> > specifically. So the mechanism would work similar to
> > "url.<base>.insteadOf" or "http.<url>.*" settings.
> >
> > - The infrastructure shouldn't cast any specific filter
> > into stone. Instead, it should be possible to specify a
> > default filter.
>
> Thanks, this is great feedback. I took a look at the existing
> URL-based config patterns and I think the http.<url>.* model
> is the right one to follow, since it already uses the
> urlmatch_config_entry() infrastructure with proper URL
> normalization, host globs, and longest-match specificity.
>
> Here's what I'm thinking for a v2. I'd like to get feedback
> on the design before implementing:
>
> The config would use a new section that supports both a global
> default and per-URL overrides, following the same pattern as
> http.sslVerify vs http.<url>.sslVerify:
>
> # Global default — applies to all clones/fetches
> [fetch]
> partialCloneFilter = blob:limit=1m
>
> # Per-URL override — more specific match wins
> [fetch "https://github.com/"]
> partialCloneFilter = blob:limit=5m
>
> [fetch "https://internal.corp.com/"]
> partialCloneFilter = blob:none
>
> Design points:
>
> - Accepts any filter spec, not just blob:limit. This
> addresses your point about not casting a specific filter
> into stone.
>
> - Uses fetch.<url>.partialCloneFilter, following the
> http.<url>.* precedent. The urlmatch.c infrastructure
> handles URL normalization, host globs (*.example.com),
> default port stripping, and path-based specificity
> ordering — so no new matching logic would be needed.
>
> - A bare fetch.partialCloneFilter (no URL) acts as the
> global default, the same way http.sslVerify is the
> global default that http.<url>.sslVerify can override.
>
> - Only applies to initial clone and to fetches where no
> existing remote.<name>.partialCloneFilter is set. Existing
> repos continue using their per-remote config.
>
> - Explicit --filter on the command line still takes
> precedence over everything.
>
> - If the server does not support object filtering, the
> setting is silently ignored (existing behavior).
>
> I chose fetch.* rather than clone.* so that both git-clone
> and git-fetch can use the same config. In practice this
> mainly matters for the initial clone, since once the promisor
> remote is registered, subsequent fetches inherit the filter
> from remote.<name>.partialCloneFilter anyway.
I think using something like "clone.<url>.defaultObjectFilter" would be
a more sensible design. The idea is that we'd only honor this filter on
the initial clone to basically be equivalent to `git clone --filter=`. I
don't think any subsequent fetches should be impacted at all, as turning
a full clone into a partial clone would need more consideration.
Patrick
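[Editor's illustration] The proposed config might look something like
this (hypothetical URL and filter value):

    [clone "https://git.example.com/"]
        defaultObjectFilter = blob:limit=1m

With that in place, "git clone https://git.example.com/team/repo.git"
would behave as if --filter=blob:limit=1m had been given, while fetches
into existing repositories and clones from other hosts would be
unaffected.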
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] fetch, clone: add fetch.blobSizeLimit config
2026-03-03 6:30 ` Patrick Steinhardt
2026-03-03 14:00 ` Alan Braithwaite
@ 2026-03-03 17:05 ` Junio C Hamano
1 sibling, 0 replies; 28+ messages in thread
From: Junio C Hamano @ 2026-03-03 17:05 UTC (permalink / raw)
To: Patrick Steinhardt
Cc: Alan Braithwaite, Alan Braithwaite via GitGitGadget, git,
christian.couder, jonathantanmy, me
Patrick Steinhardt <ps@pks.im> writes:
> No, you're right about this one, and I think this is a sensible thing to
> want. But what I'd like to see is a bit more nuance, I guess:
>
> - It should be possible to specify the configuration per URL. If you
> know that git.example.com knows object filters you may want to turn
> them on for that domain specifically. So the mechanism would work
> similar to "url.<base>.insteadOf" or "http.<url>.*" settings.
>
> - The infrastructure shouldn't cast any specific filter into stone.
> Instead, it should be possible to specify a default filter.
>
> I'd assume that these settings should only impact the initial clone to
> use a default filter in case the cloned URL matches the configured URL.
> For existing repositories it shouldn't have any impact, as we should
> continue to respect the ".git/config" there when it comes to promisors
> and filters.
Ahh, thanks for pointing out the flaw in my thinking, which forgot
that "remote.<name>.partialCloneFilter" would not work in the
initial state, where there is no <name> associated with the remote
repository you are trying to contact. I agree that something like
"remote.<url>.partialCloneFilter" is a more proper way forward.
* Re: [PATCH] fetch, clone: add fetch.blobSizeLimit config
2026-03-03 15:08 ` Patrick Steinhardt
@ 2026-03-03 17:58 ` Junio C Hamano
2026-03-04 5:07 ` Patrick Steinhardt
0 siblings, 1 reply; 28+ messages in thread
From: Junio C Hamano @ 2026-03-03 17:58 UTC (permalink / raw)
To: Patrick Steinhardt
Cc: Alan Braithwaite, Alan Braithwaite, git, christian.couder, me
Patrick Steinhardt <ps@pks.im> writes:
> I think using something like "clone.<url>.defaultObjectFilter" would be
> a more sensible design. The idea is that we'd only honor this filter on
> the initial clone to basically be equivalent to `git clone --filter=`. I
> don't think any subsequent fetches should be impacted at all, as turning
> a full clone into a partial clone would need more consideration.
Yup, I like this one. Should <url> give a repository fully, or
be some pattern that groups similar repositories together? You
would not be cloning exactly the same repository so many times
for a configuration variable to matter in general.
* Re: [PATCH] fetch, clone: add fetch.blobSizeLimit config
2026-03-03 17:58 ` Junio C Hamano
@ 2026-03-04 5:07 ` Patrick Steinhardt
0 siblings, 0 replies; 28+ messages in thread
From: Patrick Steinhardt @ 2026-03-04 5:07 UTC (permalink / raw)
To: Junio C Hamano
Cc: Alan Braithwaite, Alan Braithwaite, git, christian.couder, me
On Tue, Mar 03, 2026 at 09:58:09AM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> > I think using something like "clone.<url>.defaultObjectFilter" would be
> > a more sensible design. The idea is that we'd only honor this filter on
> > the initial clone to basically be equivalent to `git clone --filter=`. I
> > don't think any subsequent fetches should be impacted at all, as turning
> > a full clone into a partial clone would need more consideration.
>
> Yup, I like this one. Should <url> be giving a repository fully, or
> be some pattern that groups similar repositories together? You
> would not be wanting to clone exactly the same repository so many
> times for a configuration variable to matter in general.
I'd propose that it should work the same as our "http.<url>.*" config:
- You can enable partial clones for a complete domain, like for
example "github.com" or "gitlab.com".
- You can specify a namespace, like "gitlab.com/example", so that all
projects in there would be using the filter.
- You can specify a project, like "gitlab.com/example/project.git".
I'd say that this should be sufficient for most use cases.
Patrick
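[Editor's illustration] The three matching levels above can be sketched
in a few lines of Python. This is not Git's urlmatch.c (the real code
additionally handles host globs like *.example.com, default-port
stripping, and user-info in URLs); the URLs and filter values below are
made up:

```python
# Rough sketch of the "most specific URL match wins" rule behind
# http.<url>.* and the proposed clone.<url>.defaultObjectFilter.
from urllib.parse import urlparse

def best_filter(url, patterns):
    """Pick the value whose URL pattern matches `url` most specifically."""
    target = urlparse(url)
    best_key, best_value = None, None
    for pattern, value in patterns.items():
        p = urlparse(pattern)
        if p.scheme != target.scheme or p.hostname != target.hostname:
            continue
        # The pattern path must be a leading path prefix of the target path.
        prefix = p.path if p.path.endswith("/") else p.path + "/"
        if target.path != p.path and not target.path.startswith(prefix):
            continue
        # Longer host and path matches are more specific.
        key = (len(p.hostname or ""), len(p.path))
        if best_key is None or key > best_key:
            best_key, best_value = key, value
    return best_value

patterns = {
    "https://gitlab.com/": "blob:limit=1m",              # whole domain
    "https://gitlab.com/example/": "blob:none",          # namespace
    "https://gitlab.com/example/project.git": "tree:0",  # single project
}
print(best_filter("https://gitlab.com/example/project.git", patterns))  # tree:0
print(best_filter("https://gitlab.com/example/other.git", patterns))    # blob:none
print(best_filter("https://gitlab.com/misc/repo.git", patterns))        # blob:limit=1m
```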
* [PATCH v2] clone: add clone.<url>.defaultObjectFilter config
2026-03-01 16:44 [PATCH] fetch, clone: add fetch.blobSizeLimit config Alan Braithwaite via GitGitGadget
2026-03-02 11:53 ` Patrick Steinhardt
@ 2026-03-05 0:57 ` Alan Braithwaite via GitGitGadget
2026-03-05 19:01 ` Junio C Hamano
2026-03-06 6:55 ` [PATCH v3] " Alan Braithwaite via GitGitGadget
1 sibling, 2 replies; 28+ messages in thread
From: Alan Braithwaite via GitGitGadget @ 2026-03-05 0:57 UTC (permalink / raw)
To: git
Cc: ps, christian.couder, jonathantanmy, me, gitster, Jeff King,
Alan Braithwaite, Alan Braithwaite
From: Alan Braithwaite <alan@braithwaite.dev>
Add a new configuration option that lets users specify a default
partial clone filter per URL pattern. When cloning a repository
whose URL matches a configured pattern, git-clone automatically
applies the filter, equivalent to passing --filter on the command
line.
[clone "https://github.com/"]
defaultObjectFilter = blob:limit=5m
[clone "https://internal.corp.com/large-project/"]
defaultObjectFilter = blob:none
URL matching uses the existing urlmatch_config_entry() infrastructure,
following the same rules as http.<url>.* — you can match a domain,
a namespace path, or a specific project, and the most specific match
wins.
The config only affects the initial clone. Once the clone completes,
the filter is recorded in remote.<name>.partialCloneFilter, so
subsequent fetches inherit it automatically. An explicit --filter
flag on the command line takes precedence.
Only the URL-qualified form (clone.<url>.defaultObjectFilter) is
honored; a bare clone.defaultObjectFilter without a URL subsection
is ignored.
Signed-off-by: Alan Braithwaite <alan@braithwaite.dev>
---
fetch, clone: add fetch.blobSizeLimit config
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2058%2Fabraithwaite%2Falan%2Ffetch-blob-size-limit-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2058/abraithwaite/alan/fetch-blob-size-limit-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/2058
Range-diff vs v1:
1: 818b64e2e2 ! 1: 4a73edd2e8 fetch, clone: add fetch.blobSizeLimit config
@@ Metadata
Author: Alan Braithwaite <alan@braithwaite.dev>
## Commit message ##
- fetch, clone: add fetch.blobSizeLimit config
+ clone: add clone.<url>.defaultObjectFilter config
- External tools like git-lfs and git-fat use the filter clean/smudge
- mechanism to manage large binary objects, but this requires pointer
- files, a separate storage backend, and careful coordination. Git's
- partial clone infrastructure provides a more native approach: large
- blobs can be excluded at the protocol level during fetch and lazily
- retrieved on demand. However, enabling this requires passing
- `--filter=blob:limit=<size>` on every clone, which is not
- discoverable and cannot be set as a global default.
+ Add a new configuration option that lets users specify a default
+ partial clone filter per URL pattern. When cloning a repository
+ whose URL matches a configured pattern, git-clone automatically
+ applies the filter, equivalent to passing --filter on the command
+ line.
- Add a new `fetch.blobSizeLimit` configuration option that enables
- size-based partial clone behavior globally. When set, both `git
- clone` and `git fetch` automatically apply a `blob:limit=<size>`
- filter. Blobs larger than the threshold that are not needed for the
- current worktree are excluded from the transfer and lazily fetched
- on demand when needed (e.g., during checkout, diff, or merge).
+ [clone "https://github.com/"]
+ defaultObjectFilter = blob:limit=5m
- This makes it easy to work with repositories that have accumulated
- large binary files in their history, without downloading all of
- them upfront.
+ [clone "https://internal.corp.com/large-project/"]
+ defaultObjectFilter = blob:none
- The precedence order is:
- 1. Explicit `--filter=` on the command line (highest)
- 2. Existing `remote.<name>.partialclonefilter`
- 3. `fetch.blobSizeLimit` (new, lowest)
+ URL matching uses the existing urlmatch_config_entry() infrastructure,
+ following the same rules as http.<url>.* — you can match a domain,
+ a namespace path, or a specific project, and the most specific match
+ wins.
- Once a clone or fetch applies this setting, the remote is registered
- as a promisor remote with the corresponding filter spec, so
- subsequent fetches inherit it automatically. If the server does not
- support object filtering, the setting is silently ignored.
+ The config only affects the initial clone. Once the clone completes,
+ the filter is recorded in remote.<name>.partialCloneFilter, so
+ subsequent fetches inherit it automatically. An explicit --filter
+ flag on the command line takes precedence.
+
+ Only the URL-qualified form (clone.<url>.defaultObjectFilter) is
+ honored; a bare clone.defaultObjectFilter without a URL subsection
+ is ignored.
Signed-off-by: Alan Braithwaite <alan@braithwaite.dev>
- ## Documentation/config/fetch.adoc ##
-@@ Documentation/config/fetch.adoc: config setting.
- file helps performance of many Git commands, including `git merge-base`,
- `git push -f`, and `git log --graph`. Defaults to `false`.
-
-+`fetch.blobSizeLimit`::
-+ When set to a size value (e.g., `1m`, `100k`, `1g`), both
-+ linkgit:git-clone[1] and linkgit:git-fetch[1] will automatically
-+ use `--filter=blob:limit=<value>` to enable partial clone
-+ behavior. Blobs larger than this threshold are excluded from the
-+ initial transfer and lazily fetched on demand when needed (e.g.,
-+ during checkout).
+ ## Documentation/config/clone.adoc ##
+@@ Documentation/config/clone.adoc: endif::[]
+ If a partial clone filter is provided (see `--filter` in
+ linkgit:git-rev-list[1]) and `--recurse-submodules` is used, also apply
+ the filter to submodules.
++
++`clone.<url>.defaultObjectFilter`::
++ When set to a filter spec string (e.g., `blob:limit=1m`,
++ `blob:none`, `tree:0`), linkgit:git-clone[1] will automatically
++ use `--filter=<value>` when the clone URL matches `<url>`.
++ Objects matching the filter are excluded from the initial
++ transfer and lazily fetched on demand (e.g., during checkout).
++ Subsequent fetches inherit the filter via the per-remote config
++ that is written during the clone.
++
-+This provides a convenient way to enable size-based partial clones
-+globally without passing `--filter` on every command. Once a clone or
-+fetch applies this setting, the remote is registered as a promisor
-+remote with the corresponding filter, so subsequent fetches inherit
-+the filter automatically.
++The URL matching follows the same rules as `http.<url>.*` (see
++linkgit:git-config[1]). The most specific URL match wins. You can
++match a complete domain, a namespace, or a specific project:
++
-+An explicit `--filter` option on the command line takes precedence over
-+this config. An existing `remote.<name>.partialclonefilter` also takes
-+precedence. If the server does not support object filtering, the
-+setting is silently ignored.
++----
++[clone "https://github.com/"]
++ defaultObjectFilter = blob:limit=5m
+
- `fetch.bundleURI`::
- This value stores a URI for downloading Git object data from a bundle
- URI before performing an incremental fetch from the origin Git server.
++[clone "https://internal.corp.com/large-project/"]
++ defaultObjectFilter = blob:none
++----
+++
++An explicit `--filter` option on the command line takes precedence
++over this config. Only affects the initial clone; it has no effect
++on later fetches into an existing repository. If the server does
++not support object filtering, the setting is silently ignored.
## builtin/clone.c ##
-@@ builtin/clone.c: static struct string_list option_optional_reference = STRING_LIST_INIT_NODUP;
- static int max_jobs = -1;
- static struct string_list option_recurse_submodules = STRING_LIST_INIT_NODUP;
- static int config_filter_submodules = -1; /* unspecified */
-+static char *config_blob_size_limit;
- static int option_remote_submodules;
-
- static int recurse_submodules_cb(const struct option *opt,
+@@
+ #include "path.h"
+ #include "pkt-line.h"
+ #include "list-objects-filter-options.h"
++#include "urlmatch.h"
+ #include "hook.h"
+ #include "bundle.h"
+ #include "bundle-uri.h"
@@ builtin/clone.c: static int git_clone_config(const char *k, const char *v,
- config_reject_shallow = git_config_bool(k, v);
- if (!strcmp(k, "clone.filtersubmodules"))
- config_filter_submodules = git_config_bool(k, v);
-+ if (!strcmp(k, "fetch.blobsizelimit")) {
-+ free(config_blob_size_limit);
-+ git_config_string(&config_blob_size_limit, k, v);
-+ }
-
return git_default_config(k, v, ctx, cb);
}
-@@ builtin/clone.c: int cmd_clone(int argc,
- argc = parse_options(argc, argv, prefix, builtin_clone_options,
- builtin_clone_usage, 0);
-+ if (!filter_options.choice && config_blob_size_limit) {
-+ struct strbuf buf = STRBUF_INIT;
-+ strbuf_addf(&buf, "blob:limit=%s", config_blob_size_limit);
-+ parse_list_objects_filter(&filter_options, buf.buf);
-+ strbuf_release(&buf);
++struct clone_filter_data {
++ char *default_object_filter;
++};
++
++static int clone_filter_collect(const char *var, const char *value,
++ const struct config_context *ctx UNUSED,
++ void *cb)
++{
++ struct clone_filter_data *data = cb;
++
++ if (!strcmp(var, "clone.defaultobjectfilter")) {
++ free(data->default_object_filter);
++ data->default_object_filter = xstrdup(value);
+ }
++ return 0;
++}
++
++/*
++ * Look up clone.<url>.defaultObjectFilter using the urlmatch
++ * infrastructure. Only URL-qualified forms are supported; a bare
++ * clone.defaultObjectFilter (without a URL) is ignored.
++ */
++static char *get_default_object_filter(const char *url)
++{
++ struct urlmatch_config config = URLMATCH_CONFIG_INIT;
++ struct clone_filter_data data = { 0 };
++ struct string_list_item *item;
++ char *normalized_url;
++
++ config.section = "clone";
++ config.key = "defaultobjectfilter";
++ config.collect_fn = clone_filter_collect;
++ config.cascade_fn = git_clone_config;
++ config.cb = &data;
++
++ normalized_url = url_normalize(url, &config.url);
++
++ repo_config(the_repository, urlmatch_config_entry, &config);
++ free(normalized_url);
+
- if (argc > 2)
- usage_msg_opt(_("Too many arguments."),
- builtin_clone_usage, builtin_clone_options);
-@@ builtin/clone.c: int cmd_clone(int argc,
- ref_storage_format);
-
- list_objects_filter_release(&filter_options);
-+ free(config_blob_size_limit);
-
- string_list_clear(&option_not, 0);
- string_list_clear(&option_config, 0);
-
- ## builtin/fetch.c ##
-@@ builtin/fetch.c: struct fetch_config {
- int recurse_submodules;
- int parallel;
- int submodule_fetch_jobs;
-+ char *blob_size_limit;
- };
-
- static int git_fetch_config(const char *k, const char *v,
-@@ builtin/fetch.c: static int git_fetch_config(const char *k, const char *v,
- return 0;
- }
-
-+ if (!strcmp(k, "fetch.blobsizelimit"))
-+ return git_config_string(&fetch_config->blob_size_limit, k, v);
-+
- if (!strcmp(k, "fetch.output")) {
- if (!v)
- return config_error_nonbool(k);
-@@ builtin/fetch.c: static int fetch_multiple(struct string_list *list, int max_children,
- * or inherit the default filter-spec from the config.
- */
- static inline void fetch_one_setup_partial(struct remote *remote,
-- struct list_objects_filter_options *filter_options)
-+ struct list_objects_filter_options *filter_options,
-+ const struct fetch_config *config)
- {
- /*
- * Explicit --no-filter argument overrides everything, regardless
-@@ builtin/fetch.c: static inline void fetch_one_setup_partial(struct remote *remote,
- return;
-
- /*
-- * If no prior partial clone/fetch and the current fetch DID NOT
-- * request a partial-fetch, do a normal fetch.
-+ * If no prior partial clone/fetch, the current fetch did not
-+ * request a partial-fetch, and no global blob size limit is
-+ * configured, do a normal fetch.
- */
-- if (!repo_has_promisor_remote(the_repository) && !filter_options->choice)
-+ if (!repo_has_promisor_remote(the_repository) &&
-+ !filter_options->choice && !config->blob_size_limit)
- return;
-
- /*
-@@ builtin/fetch.c: static inline void fetch_one_setup_partial(struct remote *remote,
- /*
- * Do a partial-fetch from the promisor remote using either the
- * explicitly given filter-spec or inherit the filter-spec from
-- * the config.
-+ * the per-remote config.
++ /*
++ * Reject the bare form clone.defaultObjectFilter (no URL
++ * subsection). urlmatch stores the best match in vars with
++ * hostmatch_len == 0 for non-URL-qualified entries; discard
++ * the result if that is what we got.
+ */
-+ if (repo_has_promisor_remote(the_repository)) {
-+ partial_clone_get_default_filter_spec(filter_options,
-+ remote->name);
-+ if (filter_options->choice)
-+ return;
++ item = string_list_lookup(&config.vars, "defaultobjectfilter");
++ if (item) {
++ const struct urlmatch_item *m = item->util;
++ if (!m->hostmatch_len && !m->pathmatch_len) {
++ FREE_AND_NULL(data.default_object_filter);
++ }
+ }
+
-+ /*
-+ * Fall back to the global fetch.blobSizeLimit config. This
-+ * enables partial clone behavior without requiring --filter
-+ * on the command line or a pre-existing promisor remote.
- */
-- if (!filter_options->choice)
-- partial_clone_get_default_filter_spec(filter_options, remote->name);
-- return;
-+ if (!filter_options->choice && config->blob_size_limit) {
-+ struct strbuf buf = STRBUF_INIT;
-+ strbuf_addf(&buf, "blob:limit=%s", config->blob_size_limit);
-+ parse_list_objects_filter(filter_options, buf.buf);
-+ strbuf_release(&buf);
-+ partial_clone_register(remote->name, filter_options);
-+ }
- }
++ urlmatch_config_release(&config);
++
++ return data.default_object_filter;
++}
++
+ static int write_one_config(const char *key, const char *value,
+ const struct config_context *ctx,
+ void *data)
+@@ builtin/clone.c: int cmd_clone(int argc,
+ } else
+ die(_("repository '%s' does not exist"), repo_name);
- static int fetch_one(struct remote *remote, int argc, const char **argv,
-@@ builtin/fetch.c: int cmd_fetch(int argc,
- oidset_clear(&acked_commits);
- trace2_region_leave("fetch", "negotiate-only", the_repository);
- } else if (remote) {
-- if (filter_options.choice || repo_has_promisor_remote(the_repository)) {
-+ if (filter_options.choice || repo_has_promisor_remote(the_repository) ||
-+ config.blob_size_limit) {
- trace2_region_enter("fetch", "setup-partial", the_repository);
-- fetch_one_setup_partial(remote, &filter_options);
-+ fetch_one_setup_partial(remote, &filter_options, &config);
- trace2_region_leave("fetch", "setup-partial", the_repository);
- }
- trace2_region_enter("fetch", "fetch-one", the_repository);
-@@ builtin/fetch.c: int cmd_fetch(int argc,
- cleanup:
- string_list_clear(&list, 0);
- list_objects_filter_release(&filter_options);
-+ free(config.blob_size_limit);
- return result;
- }
++ if (!filter_options.choice) {
++ char *config_filter = get_default_object_filter(repo);
++ if (config_filter) {
++ parse_list_objects_filter(&filter_options, config_filter);
++ free(config_filter);
++ }
++ }
++
+ /* no need to be strict, transport_set_option() will validate it again */
+ if (option_depth && atoi(option_depth) < 1)
+ die(_("depth %s is not a positive number"), option_depth);
## t/t5616-partial-clone.sh ##
@@ t/t5616-partial-clone.sh: test_expect_success 'after fetching descendants of non-promisor commits, gc work
git -C partial gc --prune=now
'
-+# Test fetch.blobSizeLimit config
++# Test clone.<url>.defaultObjectFilter config
++
++test_expect_success 'setup for clone.defaultObjectFilter tests' '
++ git init default-filter-src &&
++ echo "small" >default-filter-src/small.txt &&
++ dd if=/dev/zero of=default-filter-src/large.bin bs=1024 count=100 2>/dev/null &&
++ git -C default-filter-src add . &&
++ git -C default-filter-src commit -m "initial" &&
++
++ git clone --bare "file://$(pwd)/default-filter-src" default-filter-srv.bare &&
++ git -C default-filter-srv.bare config --local uploadpack.allowfilter 1 &&
++ git -C default-filter-srv.bare config --local uploadpack.allowanysha1inwant 1
++'
+
-+test_expect_success 'setup for fetch.blobSizeLimit tests' '
-+ git init blob-limit-src &&
-+ echo "small" >blob-limit-src/small.txt &&
-+ dd if=/dev/zero of=blob-limit-src/large.bin bs=1024 count=100 2>/dev/null &&
-+ git -C blob-limit-src add . &&
-+ git -C blob-limit-src commit -m "initial" &&
++test_expect_success 'clone with clone.<url>.defaultObjectFilter applies filter' '
++ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
++ git -c "clone.$SERVER_URL.defaultObjectFilter=blob:limit=1k" clone \
++ "$SERVER_URL" default-filter-clone &&
+
-+ git clone --bare "file://$(pwd)/blob-limit-src" blob-limit-srv.bare &&
-+ git -C blob-limit-srv.bare config --local uploadpack.allowfilter 1 &&
-+ git -C blob-limit-srv.bare config --local uploadpack.allowanysha1inwant 1
++ test "$(git -C default-filter-clone config --local remote.origin.promisor)" = "true" &&
++ test "$(git -C default-filter-clone config --local remote.origin.partialclonefilter)" = "blob:limit=1024"
+'
+
-+test_expect_success 'clone with fetch.blobSizeLimit config applies filter' '
-+ git -c fetch.blobSizeLimit=1k clone \
-+ "file://$(pwd)/blob-limit-srv.bare" blob-limit-clone &&
++test_expect_success 'clone with --filter overrides clone.<url>.defaultObjectFilter' '
++ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
++ git -c "clone.$SERVER_URL.defaultObjectFilter=blob:limit=1k" \
++ clone --filter=blob:none "$SERVER_URL" default-filter-override &&
+
-+ test "$(git -C blob-limit-clone config --local remote.origin.promisor)" = "true" &&
-+ test "$(git -C blob-limit-clone config --local remote.origin.partialclonefilter)" = "blob:limit=1024"
++ test "$(git -C default-filter-override config --local remote.origin.partialclonefilter)" = "blob:none"
+'
+
-+test_expect_success 'clone with --filter overrides fetch.blobSizeLimit' '
-+ git -c fetch.blobSizeLimit=1k clone --filter=blob:none \
-+ "file://$(pwd)/blob-limit-srv.bare" blob-limit-override &&
++test_expect_success 'clone with clone.<url>.defaultObjectFilter=blob:none works' '
++ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
++ git -c "clone.$SERVER_URL.defaultObjectFilter=blob:none" clone \
++ "$SERVER_URL" default-filter-blobnone &&
+
-+ test "$(git -C blob-limit-override config --local remote.origin.partialclonefilter)" = "blob:none"
++ test "$(git -C default-filter-blobnone config --local remote.origin.promisor)" = "true" &&
++ test "$(git -C default-filter-blobnone config --local remote.origin.partialclonefilter)" = "blob:none"
+'
+
-+test_expect_success 'fetch with fetch.blobSizeLimit registers promisor remote' '
-+ git clone --no-checkout "file://$(pwd)/blob-limit-srv.bare" blob-limit-fetch &&
++test_expect_success 'clone.<url>.defaultObjectFilter with tree:0 works' '
++ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
++ git -c "clone.$SERVER_URL.defaultObjectFilter=tree:0" clone \
++ "$SERVER_URL" default-filter-tree0 &&
+
-+ # Sanity: not yet a partial clone
-+ test_must_fail git -C blob-limit-fetch config --local remote.origin.promisor &&
-+
-+ # Add a new commit to the server
-+ echo "new-small" >blob-limit-src/new-small.txt &&
-+ dd if=/dev/zero of=blob-limit-src/new-large.bin bs=1024 count=100 2>/dev/null &&
-+ git -C blob-limit-src add . &&
-+ git -C blob-limit-src commit -m "second" &&
-+ git -C blob-limit-src push "file://$(pwd)/blob-limit-srv.bare" main &&
++ test "$(git -C default-filter-tree0 config --local remote.origin.promisor)" = "true" &&
++ test "$(git -C default-filter-tree0 config --local remote.origin.partialclonefilter)" = "tree:0"
++'
+
-+ # Fetch with the config set
-+ git -C blob-limit-fetch -c fetch.blobSizeLimit=1k fetch origin &&
++test_expect_success 'most specific URL match wins for clone.defaultObjectFilter' '
++ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
++ git \
++ -c "clone.file://.defaultObjectFilter=blob:limit=1k" \
++ -c "clone.$SERVER_URL.defaultObjectFilter=blob:none" \
++ clone "$SERVER_URL" default-filter-url-specific &&
+
-+ test "$(git -C blob-limit-fetch config --local remote.origin.promisor)" = "true" &&
-+ test "$(git -C blob-limit-fetch config --local remote.origin.partialclonefilter)" = "blob:limit=1024"
++ test "$(git -C default-filter-url-specific config --local remote.origin.partialclonefilter)" = "blob:none"
+'
+
-+test_expect_success 'fetch.blobSizeLimit does not override existing partialclonefilter' '
-+ git clone --filter=blob:none \
-+ "file://$(pwd)/blob-limit-srv.bare" blob-limit-existing &&
++test_expect_success 'non-matching URL does not apply clone.defaultObjectFilter' '
++ git \
++ -c "clone.https://other.example.com/.defaultObjectFilter=blob:none" \
++ clone "file://$(pwd)/default-filter-srv.bare" default-filter-url-nomatch &&
+
-+ test "$(git -C blob-limit-existing config --local remote.origin.partialclonefilter)" = "blob:none" &&
++ test_must_fail git -C default-filter-url-nomatch config --local remote.origin.promisor
++'
+
-+ # Fetch with a different blobSizeLimit; existing filter should win
-+ git -C blob-limit-existing -c fetch.blobSizeLimit=1k fetch origin &&
++test_expect_success 'bare clone.defaultObjectFilter without URL is ignored' '
++ git -c clone.defaultObjectFilter=blob:none \
++ clone "file://$(pwd)/default-filter-srv.bare" default-filter-bare-key &&
+
-+ test "$(git -C blob-limit-existing config --local remote.origin.partialclonefilter)" = "blob:none"
++ test_must_fail git -C default-filter-bare-key config --local remote.origin.promisor
+'
. "$TEST_DIRECTORY"/lib-httpd.sh
Documentation/config/clone.adoc | 26 ++++++++++++
builtin/clone.c | 68 ++++++++++++++++++++++++++++++
t/t5616-partial-clone.sh | 73 +++++++++++++++++++++++++++++++++
3 files changed, 167 insertions(+)
diff --git a/Documentation/config/clone.adoc b/Documentation/config/clone.adoc
index 0a10efd174..5805ab51c2 100644
--- a/Documentation/config/clone.adoc
+++ b/Documentation/config/clone.adoc
@@ -21,3 +21,29 @@ endif::[]
If a partial clone filter is provided (see `--filter` in
linkgit:git-rev-list[1]) and `--recurse-submodules` is used, also apply
the filter to submodules.
+
+`clone.<url>.defaultObjectFilter`::
+ When set to a filter spec string (e.g., `blob:limit=1m`,
+ `blob:none`, `tree:0`), linkgit:git-clone[1] will automatically
+ use `--filter=<value>` when the clone URL matches `<url>`.
+ Objects matching the filter are excluded from the initial
+ transfer and lazily fetched on demand (e.g., during checkout).
+ Subsequent fetches inherit the filter via the per-remote config
+ that is written during the clone.
++
+The URL matching follows the same rules as `http.<url>.*` (see
+linkgit:git-config[1]). The most specific URL match wins. You can
+match a complete domain, a namespace, or a specific project:
++
+----
+[clone "https://github.com/"]
+ defaultObjectFilter = blob:limit=5m
+
+[clone "https://internal.corp.com/large-project/"]
+ defaultObjectFilter = blob:none
+----
++
+An explicit `--filter` option on the command line takes precedence
+over this config. Only affects the initial clone; it has no effect
+on later fetches into an existing repository. If the server does
+not support object filtering, the setting is silently ignored.
diff --git a/builtin/clone.c b/builtin/clone.c
index 45d8fa0eed..5e20b5343d 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -44,6 +44,7 @@
#include "path.h"
#include "pkt-line.h"
#include "list-objects-filter-options.h"
+#include "urlmatch.h"
#include "hook.h"
#include "bundle.h"
#include "bundle-uri.h"
@@ -757,6 +758,65 @@ static int git_clone_config(const char *k, const char *v,
return git_default_config(k, v, ctx, cb);
}
+struct clone_filter_data {
+ char *default_object_filter;
+};
+
+static int clone_filter_collect(const char *var, const char *value,
+ const struct config_context *ctx UNUSED,
+ void *cb)
+{
+ struct clone_filter_data *data = cb;
+
+ if (!strcmp(var, "clone.defaultobjectfilter")) {
+ free(data->default_object_filter);
+ data->default_object_filter = xstrdup(value);
+ }
+ return 0;
+}
+
+/*
+ * Look up clone.<url>.defaultObjectFilter using the urlmatch
+ * infrastructure. Only URL-qualified forms are supported; a bare
+ * clone.defaultObjectFilter (without a URL) is ignored.
+ */
+static char *get_default_object_filter(const char *url)
+{
+ struct urlmatch_config config = URLMATCH_CONFIG_INIT;
+ struct clone_filter_data data = { 0 };
+ struct string_list_item *item;
+ char *normalized_url;
+
+ config.section = "clone";
+ config.key = "defaultobjectfilter";
+ config.collect_fn = clone_filter_collect;
+ config.cascade_fn = git_clone_config;
+ config.cb = &data;
+
+ normalized_url = url_normalize(url, &config.url);
+
+ repo_config(the_repository, urlmatch_config_entry, &config);
+ free(normalized_url);
+
+ /*
+ * Reject the bare form clone.defaultObjectFilter (no URL
+ * subsection). urlmatch stores the best match in vars with
+ * hostmatch_len == 0 for non-URL-qualified entries; discard
+ * the result if that is what we got.
+ */
+ item = string_list_lookup(&config.vars, "defaultobjectfilter");
+ if (item) {
+ const struct urlmatch_item *m = item->util;
+ if (!m->hostmatch_len && !m->pathmatch_len) {
+ FREE_AND_NULL(data.default_object_filter);
+ }
+ }
+
+ urlmatch_config_release(&config);
+
+ return data.default_object_filter;
+}
+
static int write_one_config(const char *key, const char *value,
const struct config_context *ctx,
void *data)
@@ -1057,6 +1117,14 @@ int cmd_clone(int argc,
} else
die(_("repository '%s' does not exist"), repo_name);
+ if (!filter_options.choice) {
+ char *config_filter = get_default_object_filter(repo);
+ if (config_filter) {
+ parse_list_objects_filter(&filter_options, config_filter);
+ free(config_filter);
+ }
+ }
+
/* no need to be strict, transport_set_option() will validate it again */
if (option_depth && atoi(option_depth) < 1)
die(_("depth %s is not a positive number"), option_depth);
diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index 1e354e057f..33010f3b7d 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -722,6 +722,79 @@ test_expect_success 'after fetching descendants of non-promisor commits, gc work
git -C partial gc --prune=now
'
+# Test clone.<url>.defaultObjectFilter config
+
+test_expect_success 'setup for clone.defaultObjectFilter tests' '
+ git init default-filter-src &&
+ echo "small" >default-filter-src/small.txt &&
+ dd if=/dev/zero of=default-filter-src/large.bin bs=1024 count=100 2>/dev/null &&
+ git -C default-filter-src add . &&
+ git -C default-filter-src commit -m "initial" &&
+
+ git clone --bare "file://$(pwd)/default-filter-src" default-filter-srv.bare &&
+ git -C default-filter-srv.bare config --local uploadpack.allowfilter 1 &&
+ git -C default-filter-srv.bare config --local uploadpack.allowanysha1inwant 1
+'
+
+test_expect_success 'clone with clone.<url>.defaultObjectFilter applies filter' '
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git -c "clone.$SERVER_URL.defaultObjectFilter=blob:limit=1k" clone \
+ "$SERVER_URL" default-filter-clone &&
+
+ test "$(git -C default-filter-clone config --local remote.origin.promisor)" = "true" &&
+ test "$(git -C default-filter-clone config --local remote.origin.partialclonefilter)" = "blob:limit=1024"
+'
+
+test_expect_success 'clone with --filter overrides clone.<url>.defaultObjectFilter' '
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git -c "clone.$SERVER_URL.defaultObjectFilter=blob:limit=1k" \
+ clone --filter=blob:none "$SERVER_URL" default-filter-override &&
+
+ test "$(git -C default-filter-override config --local remote.origin.partialclonefilter)" = "blob:none"
+'
+
+test_expect_success 'clone with clone.<url>.defaultObjectFilter=blob:none works' '
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git -c "clone.$SERVER_URL.defaultObjectFilter=blob:none" clone \
+ "$SERVER_URL" default-filter-blobnone &&
+
+ test "$(git -C default-filter-blobnone config --local remote.origin.promisor)" = "true" &&
+ test "$(git -C default-filter-blobnone config --local remote.origin.partialclonefilter)" = "blob:none"
+'
+
+test_expect_success 'clone.<url>.defaultObjectFilter with tree:0 works' '
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git -c "clone.$SERVER_URL.defaultObjectFilter=tree:0" clone \
+ "$SERVER_URL" default-filter-tree0 &&
+
+ test "$(git -C default-filter-tree0 config --local remote.origin.promisor)" = "true" &&
+ test "$(git -C default-filter-tree0 config --local remote.origin.partialclonefilter)" = "tree:0"
+'
+
+test_expect_success 'most specific URL match wins for clone.defaultObjectFilter' '
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git \
+ -c "clone.file://.defaultObjectFilter=blob:limit=1k" \
+ -c "clone.$SERVER_URL.defaultObjectFilter=blob:none" \
+ clone "$SERVER_URL" default-filter-url-specific &&
+
+ test "$(git -C default-filter-url-specific config --local remote.origin.partialclonefilter)" = "blob:none"
+'
+
+test_expect_success 'non-matching URL does not apply clone.defaultObjectFilter' '
+ git \
+ -c "clone.https://other.example.com/.defaultObjectFilter=blob:none" \
+ clone "file://$(pwd)/default-filter-srv.bare" default-filter-url-nomatch &&
+
+ test_must_fail git -C default-filter-url-nomatch config --local remote.origin.promisor
+'
+
+test_expect_success 'bare clone.defaultObjectFilter without URL is ignored' '
+ git -c clone.defaultObjectFilter=blob:none \
+ clone "file://$(pwd)/default-filter-srv.bare" default-filter-bare-key &&
+
+ test_must_fail git -C default-filter-bare-key config --local remote.origin.promisor
+'
. "$TEST_DIRECTORY"/lib-httpd.sh
start_httpd
base-commit: 7b2bccb0d58d4f24705bf985de1f4612e4cf06e5
--
gitgitgadget
* Re: [PATCH v2] clone: add clone.<url>.defaultObjectFilter config
2026-03-05 0:57 ` [PATCH v2] clone: add clone.<url>.defaultObjectFilter config Alan Braithwaite via GitGitGadget
@ 2026-03-05 19:01 ` Junio C Hamano
2026-03-05 23:11 ` Alan Braithwaite
2026-03-06 6:55 ` [PATCH v3] " Alan Braithwaite via GitGitGadget
1 sibling, 1 reply; 28+ messages in thread
From: Junio C Hamano @ 2026-03-05 19:01 UTC (permalink / raw)
To: Alan Braithwaite via GitGitGadget
Cc: git, ps, christian.couder, jonathantanmy, me, Jeff King,
Alan Braithwaite
"Alan Braithwaite via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Alan Braithwaite <alan@braithwaite.dev>
>
> Add a new configuration option that lets users specify a default
> partial clone filter per URL pattern. When cloning a repository
> whose URL matches a configured pattern, git-clone automatically
> applies the filter, equivalent to passing --filter on the command
> line.
>
> [clone "https://github.com/"]
> defaultObjectFilter = blob:limit=5m
>
> [clone "https://internal.corp.com/large-project/"]
> defaultObjectFilter = blob:none
>
> URL matching uses the existing urlmatch_config_entry() infrastructure,
> following the same rules as http.<url>.* — you can match a domain,
> a namespace path, or a specific project, and the most specific match
> wins.
>
> The config only affects the initial clone. Once the clone completes,
> the filter is recorded in remote.<name>.partialCloneFilter, so
> subsequent fetches inherit it automatically. An explicit --filter
> flag on the command line takes precedence.
The motivation behind the change is clearly described. Reusing the
existing urlmatch_config_entry() infrastructure is very appropriate
as it makes the feature intuitive for those familiar with
http.<url>.* settings.
> Only the URL-qualified form (clone.<url>.defaultObjectFilter) is
> honored; a bare clone.defaultObjectFilter without a URL subsection
> is ignored.
This is unlike how http.<url>.<var> configuration variables work,
and while I can see that server operators may not want to see users
set clone.defaultObjectFilter and affect traffic with _all_ sites, I
am afraid that this design choice may appear a bit counter-intuitive
to end users.
> Signed-off-by: Alan Braithwaite <alan@braithwaite.dev>
> Documentation/config/clone.adoc | 26 ++++++++++++
> builtin/clone.c | 68 ++++++++++++++++++++++++++++++
> t/t5616-partial-clone.sh | 73 +++++++++++++++++++++++++++++++++
> 3 files changed, 167 insertions(+)
>
> diff --git a/builtin/clone.c b/builtin/clone.c
> index 45d8fa0eed..5e20b5343d 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -44,6 +44,7 @@
> #include "path.h"
> #include "pkt-line.h"
> #include "list-objects-filter-options.h"
> +#include "urlmatch.h"
> #include "hook.h"
> #include "bundle.h"
> #include "bundle-uri.h"
> @@ -757,6 +758,65 @@ static int git_clone_config(const char *k, const char *v,
> return git_default_config(k, v, ctx, cb);
> }
>
> +struct clone_filter_data {
> + char *default_object_filter;
> +};
> +
> +static int clone_filter_collect(const char *var, const char *value,
> + const struct config_context *ctx UNUSED,
> + void *cb)
> +{
> + struct clone_filter_data *data = cb;
> +
> + if (!strcmp(var, "clone.defaultobjectfilter")) {
> + free(data->default_object_filter);
> + data->default_object_filter = xstrdup(value);
> + }
> + return 0;
> +}
This will segfault with a "value-less truth", i.e.,
[clone "<URL>"]
defaultObjectFilter
so there should be
if (!value)
return config_error_nonbool(var);
in it.
I cannot convince myself that a new structure only to hold a single
"char *" member is not over-engineering. Wouldn't it work equally
well (unless you have an immediate plan to add more members to the
struct, that is):
char **filter_spec_p = cb;
if (!strcmp(var, "clone.defaultobjectfilter")) {
if (!value)
			return config_error_nonbool(var);
free(*filter_spec_p);
*filter_spec_p = xstrdup(value);
}
return 0;
> +/*
> + * Look up clone.<url>.defaultObjectFilter using the urlmatch
> + * infrastructure. Only URL-qualified forms are supported; a bare
> + * clone.defaultObjectFilter (without a URL) is ignored.
> + */
> +static char *get_default_object_filter(const char *url)
> +{
> + struct urlmatch_config config = URLMATCH_CONFIG_INIT;
> + struct clone_filter_data data = { 0 };
> + struct string_list_item *item;
> + char *normalized_url;
> +
> + config.section = "clone";
> + config.key = "defaultobjectfilter";
> + config.collect_fn = clone_filter_collect;
> + config.cascade_fn = git_clone_config;
> + config.cb = &data;
> +
> + normalized_url = url_normalize(url, &config.url);
> +
> + repo_config(the_repository, urlmatch_config_entry, &config);
> + free(normalized_url);
This forces a second full scan of the configuration space. But it
cannot be avoided, because the existing repo_config() call has to
happen early before we call parse_options() to give us the
configured default to overwrite with the command line, and we would
not know what our URL is before we called parse_options().
However, I think you want to leave the .cascade_fn NULL; you do not
want urlmatch_config_entry() to call git_clone_config() AGAIN on the
configuration variables, as the first call to repo_config() before
we call parse_options() should have already handled them, no?
Thanks.
* Re: [PATCH v2] clone: add clone.<url>.defaultObjectFilter config
2026-03-05 19:01 ` Junio C Hamano
@ 2026-03-05 23:11 ` Alan Braithwaite
0 siblings, 0 replies; 28+ messages in thread
From: Alan Braithwaite @ 2026-03-05 23:11 UTC (permalink / raw)
To: Junio C Hamano, Alan Braithwaite
Cc: git, Patrick Steinhardt, christian.couder, jonathantanmy, me,
Jeff King
Junio C Hamano wrote:
> This is unlike how http.<url>.<var> configuration variables work,
> and while I can see that server operators may not want to see users
> set clone.defaultObjectFilter and affect traffic with _all_ sites, I
> am afraid that this design choice may appear a bit counter-intuitive
> to end users.
Funny enough, I actually prefer that but gathered from the previous
commentary that it wasn't desired. I'd be more than content to add it.
Junio C Hamano wrote:
> I cannot convince myself that a new structure only to hold a single
> "char *" member is not over-engineering. Wouldn't it work equally
> well (unless you have an immediate plan to add more members to the
> struct, that is):
You're right; it's been a while since I've written C. Thanks for catching
that. I think my mind was going somewhere else with it, but YAGNI.
Junio C Hamano wrote:
> However, I think you want to leave the .cascade_fn NULL; you do not
> want urlmatch_config_entry() to call git_clone_config() AGAIN on the
> configuration variables, as the first call to repo_config() before
> we call parse_options() should have already handled them, no?
Good catch. I'll fix it. Will set cascade_fn to NULL so the second
pass only looks at clone.<url>.defaultObjectFilter entries.
Thanks for the review and for your patience as I shake the gopher
out of me and figure out how to do real programming again.
Thanks,
- Alan
* [PATCH v3] clone: add clone.<url>.defaultObjectFilter config
2026-03-05 0:57 ` [PATCH v2] clone: add clone.<url>.defaultObjectFilter config Alan Braithwaite via GitGitGadget
2026-03-05 19:01 ` Junio C Hamano
@ 2026-03-06 6:55 ` Alan Braithwaite via GitGitGadget
2026-03-06 10:39 ` brian m. carlson
2026-03-06 21:47 ` [PATCH v4] " Alan Braithwaite via GitGitGadget
1 sibling, 2 replies; 28+ messages in thread
From: Alan Braithwaite via GitGitGadget @ 2026-03-06 6:55 UTC (permalink / raw)
To: git
Cc: ps, christian.couder, jonathantanmy, me, gitster, Jeff King,
Alan Braithwaite, Alan Braithwaite
From: Alan Braithwaite <alan@braithwaite.dev>
Add a new configuration option that lets users specify a default
partial clone filter per URL pattern. When cloning a repository
whose URL matches a configured pattern, git-clone automatically
applies the filter, equivalent to passing --filter on the command
line.
[clone "https://github.com/"]
defaultObjectFilter = blob:limit=5m
[clone "https://internal.corp.com/large-project/"]
defaultObjectFilter = blob:none
URL matching uses the existing urlmatch_config_entry() infrastructure,
following the same rules as http.<url>.* — you can match a domain,
a namespace path, or a specific project, and the most specific match
wins.
The config only affects the initial clone. Once the clone completes,
the filter is recorded in remote.<name>.partialCloneFilter, so
subsequent fetches inherit it automatically. An explicit --filter
flag on the command line takes precedence.
Only the URL-qualified form (clone.<url>.defaultObjectFilter) is
honored; a bare clone.defaultObjectFilter without a URL subsection
is ignored.
Signed-off-by: Alan Braithwaite <alan@braithwaite.dev>
---
fetch, clone: add fetch.blobSizeLimit config
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2058%2Fabraithwaite%2Falan%2Ffetch-blob-size-limit-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2058/abraithwaite/alan/fetch-blob-size-limit-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/2058
Range-diff vs v2:
1: 4a73edd2e8 ! 1: 5408412f2a clone: add clone.<url>.defaultObjectFilter config
@@ Documentation/config/clone.adoc: endif::[]
linkgit:git-rev-list[1]) and `--recurse-submodules` is used, also apply
the filter to submodules.
+
++`clone.defaultObjectFilter`::
+`clone.<url>.defaultObjectFilter`::
+ When set to a filter spec string (e.g., `blob:limit=1m`,
+ `blob:none`, `tree:0`), linkgit:git-clone[1] will automatically
-+ use `--filter=<value>` when the clone URL matches `<url>`.
++ use `--filter=<value>` to enable partial clone behavior.
+ Objects matching the filter are excluded from the initial
+ transfer and lazily fetched on demand (e.g., during checkout).
+ Subsequent fetches inherit the filter via the per-remote config
+ that is written during the clone.
++
-+The URL matching follows the same rules as `http.<url>.*` (see
-+linkgit:git-config[1]). The most specific URL match wins. You can
-+match a complete domain, a namespace, or a specific project:
++The bare `clone.defaultObjectFilter` applies to all clones. The
++URL-qualified form `clone.<url>.defaultObjectFilter` restricts the
++setting to clones whose URL matches `<url>`, following the same
++rules as `http.<url>.*` (see linkgit:git-config[1]). The most
++specific URL match wins. You can match a domain, a namespace, or a
++specific project:
++
+----
++[clone]
++ defaultObjectFilter = blob:limit=1m
++
+[clone "https://github.com/"]
+ defaultObjectFilter = blob:limit=5m
+
@@ builtin/clone.c: static int git_clone_config(const char *k, const char *v,
return git_default_config(k, v, ctx, cb);
}
-+struct clone_filter_data {
-+ char *default_object_filter;
-+};
-+
+static int clone_filter_collect(const char *var, const char *value,
+ const struct config_context *ctx UNUSED,
+ void *cb)
+{
-+ struct clone_filter_data *data = cb;
++ char **filter_spec_p = cb;
+
+ if (!strcmp(var, "clone.defaultobjectfilter")) {
-+ free(data->default_object_filter);
-+ data->default_object_filter = xstrdup(value);
++ if (!value)
++ return config_error_nonbool(var);
++ free(*filter_spec_p);
++ *filter_spec_p = xstrdup(value);
+ }
+ return 0;
+}
+
+/*
-+ * Look up clone.<url>.defaultObjectFilter using the urlmatch
-+ * infrastructure. Only URL-qualified forms are supported; a bare
-+ * clone.defaultObjectFilter (without a URL) is ignored.
++ * Look up clone.defaultObjectFilter or clone.<url>.defaultObjectFilter
++ * using the urlmatch infrastructure. A URL-qualified entry that matches
++ * the clone URL takes precedence over the bare form, following the same
++ * rules as http.<url>.* configuration variables.
+ */
+static char *get_default_object_filter(const char *url)
+{
+ struct urlmatch_config config = URLMATCH_CONFIG_INIT;
-+ struct clone_filter_data data = { 0 };
-+ struct string_list_item *item;
++ char *filter_spec = NULL;
+ char *normalized_url;
+
+ config.section = "clone";
+ config.key = "defaultobjectfilter";
+ config.collect_fn = clone_filter_collect;
-+ config.cascade_fn = git_clone_config;
-+ config.cb = &data;
++ config.cb = &filter_spec;
+
+ normalized_url = url_normalize(url, &config.url);
+
+ repo_config(the_repository, urlmatch_config_entry, &config);
+ free(normalized_url);
-+
-+ /*
-+ * Reject the bare form clone.defaultObjectFilter (no URL
-+ * subsection). urlmatch stores the best match in vars with
-+ * hostmatch_len == 0 for non-URL-qualified entries; discard
-+ * the result if that is what we got.
-+ */
-+ item = string_list_lookup(&config.vars, "defaultobjectfilter");
-+ if (item) {
-+ const struct urlmatch_item *m = item->util;
-+ if (!m->hostmatch_len && !m->pathmatch_len) {
-+ FREE_AND_NULL(data.default_object_filter);
-+ }
-+ }
-+
+ urlmatch_config_release(&config);
+
-+ return data.default_object_filter;
++ return filter_spec;
+}
+
static int write_one_config(const char *key, const char *value,
@@ t/t5616-partial-clone.sh: test_expect_success 'after fetching descendants of non
+ test_must_fail git -C default-filter-url-nomatch config --local remote.origin.promisor
+'
+
-+test_expect_success 'bare clone.defaultObjectFilter without URL is ignored' '
++test_expect_success 'bare clone.defaultObjectFilter applies to all clones' '
+ git -c clone.defaultObjectFilter=blob:none \
+ clone "file://$(pwd)/default-filter-srv.bare" default-filter-bare-key &&
+
-+ test_must_fail git -C default-filter-bare-key config --local remote.origin.promisor
++ test "$(git -C default-filter-bare-key config --local remote.origin.promisor)" = "true" &&
++ test "$(git -C default-filter-bare-key config --local remote.origin.partialclonefilter)" = "blob:none"
++'
++
++test_expect_success 'URL-specific clone.defaultObjectFilter overrides bare form' '
++ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
++ git \
++ -c clone.defaultObjectFilter=blob:limit=1k \
++ -c "clone.$SERVER_URL.defaultObjectFilter=blob:none" \
++ clone "$SERVER_URL" default-filter-url-over-bare &&
++
++ test "$(git -C default-filter-url-over-bare config --local remote.origin.partialclonefilter)" = "blob:none"
+'
. "$TEST_DIRECTORY"/lib-httpd.sh
Documentation/config/clone.adoc | 33 +++++++++++++
builtin/clone.c | 50 ++++++++++++++++++++
t/t5616-partial-clone.sh | 84 +++++++++++++++++++++++++++++++++
3 files changed, 167 insertions(+)
diff --git a/Documentation/config/clone.adoc b/Documentation/config/clone.adoc
index 0a10efd174..7ef6321be2 100644
--- a/Documentation/config/clone.adoc
+++ b/Documentation/config/clone.adoc
@@ -21,3 +21,36 @@ endif::[]
If a partial clone filter is provided (see `--filter` in
linkgit:git-rev-list[1]) and `--recurse-submodules` is used, also apply
the filter to submodules.
+
+`clone.defaultObjectFilter`::
+`clone.<url>.defaultObjectFilter`::
+ When set to a filter spec string (e.g., `blob:limit=1m`,
+ `blob:none`, `tree:0`), linkgit:git-clone[1] will automatically
+ use `--filter=<value>` to enable partial clone behavior.
+ Objects matching the filter are excluded from the initial
+ transfer and lazily fetched on demand (e.g., during checkout).
+ Subsequent fetches inherit the filter via the per-remote config
+ that is written during the clone.
++
+The bare `clone.defaultObjectFilter` applies to all clones. The
+URL-qualified form `clone.<url>.defaultObjectFilter` restricts the
+setting to clones whose URL matches `<url>`, following the same
+rules as `http.<url>.*` (see linkgit:git-config[1]). The most
+specific URL match wins. You can match a domain, a namespace, or a
+specific project:
++
+----
+[clone]
+ defaultObjectFilter = blob:limit=1m
+
+[clone "https://github.com/"]
+ defaultObjectFilter = blob:limit=5m
+
+[clone "https://internal.corp.com/large-project/"]
+ defaultObjectFilter = blob:none
+----
++
+An explicit `--filter` option on the command line takes precedence
+over this config. The setting only affects the initial clone and has
+no effect on later fetches into an existing repository. If the server
+does not support object filtering, the setting is silently ignored.
diff --git a/builtin/clone.c b/builtin/clone.c
index 45d8fa0eed..b549191707 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -44,6 +44,7 @@
#include "path.h"
#include "pkt-line.h"
#include "list-objects-filter-options.h"
+#include "urlmatch.h"
#include "hook.h"
#include "bundle.h"
#include "bundle-uri.h"
@@ -757,6 +758,47 @@ static int git_clone_config(const char *k, const char *v,
return git_default_config(k, v, ctx, cb);
}
+static int clone_filter_collect(const char *var, const char *value,
+ const struct config_context *ctx UNUSED,
+ void *cb)
+{
+ char **filter_spec_p = cb;
+
+ if (!strcmp(var, "clone.defaultobjectfilter")) {
+ if (!value)
+ return config_error_nonbool(var);
+ free(*filter_spec_p);
+ *filter_spec_p = xstrdup(value);
+ }
+ return 0;
+}
+
+/*
+ * Look up clone.defaultObjectFilter or clone.<url>.defaultObjectFilter
+ * using the urlmatch infrastructure. A URL-qualified entry that matches
+ * the clone URL takes precedence over the bare form, following the same
+ * rules as http.<url>.* configuration variables.
+ */
+static char *get_default_object_filter(const char *url)
+{
+ struct urlmatch_config config = URLMATCH_CONFIG_INIT;
+ char *filter_spec = NULL;
+ char *normalized_url;
+
+ config.section = "clone";
+ config.key = "defaultobjectfilter";
+ config.collect_fn = clone_filter_collect;
+ config.cb = &filter_spec;
+
+ normalized_url = url_normalize(url, &config.url);
+
+ repo_config(the_repository, urlmatch_config_entry, &config);
+ free(normalized_url);
+ urlmatch_config_release(&config);
+
+ return filter_spec;
+}
+
static int write_one_config(const char *key, const char *value,
const struct config_context *ctx,
void *data)
@@ -1057,6 +1099,14 @@ int cmd_clone(int argc,
} else
die(_("repository '%s' does not exist"), repo_name);
+ if (!filter_options.choice) {
+ char *config_filter = get_default_object_filter(repo);
+ if (config_filter) {
+ parse_list_objects_filter(&filter_options, config_filter);
+ free(config_filter);
+ }
+ }
+
/* no need to be strict, transport_set_option() will validate it again */
if (option_depth && atoi(option_depth) < 1)
die(_("depth %s is not a positive number"), option_depth);
diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index 1e354e057f..a4bfdb329e 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -722,6 +722,90 @@ test_expect_success 'after fetching descendants of non-promisor commits, gc work
git -C partial gc --prune=now
'
+# Test clone.<url>.defaultObjectFilter config
+
+test_expect_success 'setup for clone.defaultObjectFilter tests' '
+ git init default-filter-src &&
+ echo "small" >default-filter-src/small.txt &&
+ dd if=/dev/zero of=default-filter-src/large.bin bs=1024 count=100 2>/dev/null &&
+ git -C default-filter-src add . &&
+ git -C default-filter-src commit -m "initial" &&
+
+ git clone --bare "file://$(pwd)/default-filter-src" default-filter-srv.bare &&
+ git -C default-filter-srv.bare config --local uploadpack.allowfilter 1 &&
+ git -C default-filter-srv.bare config --local uploadpack.allowanysha1inwant 1
+'
+
+test_expect_success 'clone with clone.<url>.defaultObjectFilter applies filter' '
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git -c "clone.$SERVER_URL.defaultObjectFilter=blob:limit=1k" clone \
+ "$SERVER_URL" default-filter-clone &&
+
+ test "$(git -C default-filter-clone config --local remote.origin.promisor)" = "true" &&
+ test "$(git -C default-filter-clone config --local remote.origin.partialclonefilter)" = "blob:limit=1024"
+'
+
+test_expect_success 'clone with --filter overrides clone.<url>.defaultObjectFilter' '
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git -c "clone.$SERVER_URL.defaultObjectFilter=blob:limit=1k" \
+ clone --filter=blob:none "$SERVER_URL" default-filter-override &&
+
+ test "$(git -C default-filter-override config --local remote.origin.partialclonefilter)" = "blob:none"
+'
+
+test_expect_success 'clone with clone.<url>.defaultObjectFilter=blob:none works' '
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git -c "clone.$SERVER_URL.defaultObjectFilter=blob:none" clone \
+ "$SERVER_URL" default-filter-blobnone &&
+
+ test "$(git -C default-filter-blobnone config --local remote.origin.promisor)" = "true" &&
+ test "$(git -C default-filter-blobnone config --local remote.origin.partialclonefilter)" = "blob:none"
+'
+
+test_expect_success 'clone.<url>.defaultObjectFilter with tree:0 works' '
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git -c "clone.$SERVER_URL.defaultObjectFilter=tree:0" clone \
+ "$SERVER_URL" default-filter-tree0 &&
+
+ test "$(git -C default-filter-tree0 config --local remote.origin.promisor)" = "true" &&
+ test "$(git -C default-filter-tree0 config --local remote.origin.partialclonefilter)" = "tree:0"
+'
+
+test_expect_success 'most specific URL match wins for clone.defaultObjectFilter' '
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git \
+ -c "clone.file://.defaultObjectFilter=blob:limit=1k" \
+ -c "clone.$SERVER_URL.defaultObjectFilter=blob:none" \
+ clone "$SERVER_URL" default-filter-url-specific &&
+
+ test "$(git -C default-filter-url-specific config --local remote.origin.partialclonefilter)" = "blob:none"
+'
+
+test_expect_success 'non-matching URL does not apply clone.defaultObjectFilter' '
+ git \
+ -c "clone.https://other.example.com/.defaultObjectFilter=blob:none" \
+ clone "file://$(pwd)/default-filter-srv.bare" default-filter-url-nomatch &&
+
+ test_must_fail git -C default-filter-url-nomatch config --local remote.origin.promisor
+'
+
+test_expect_success 'bare clone.defaultObjectFilter applies to all clones' '
+ git -c clone.defaultObjectFilter=blob:none \
+ clone "file://$(pwd)/default-filter-srv.bare" default-filter-bare-key &&
+
+ test "$(git -C default-filter-bare-key config --local remote.origin.promisor)" = "true" &&
+ test "$(git -C default-filter-bare-key config --local remote.origin.partialclonefilter)" = "blob:none"
+'
+
+test_expect_success 'URL-specific clone.defaultObjectFilter overrides bare form' '
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git \
+ -c clone.defaultObjectFilter=blob:limit=1k \
+ -c "clone.$SERVER_URL.defaultObjectFilter=blob:none" \
+ clone "$SERVER_URL" default-filter-url-over-bare &&
+
+ test "$(git -C default-filter-url-over-bare config --local remote.origin.partialclonefilter)" = "blob:none"
+'
. "$TEST_DIRECTORY"/lib-httpd.sh
start_httpd
base-commit: 7b2bccb0d58d4f24705bf985de1f4612e4cf06e5
--
gitgitgadget
* Re: [PATCH v3] clone: add clone.<url>.defaultObjectFilter config
2026-03-06 6:55 ` [PATCH v3] " Alan Braithwaite via GitGitGadget
@ 2026-03-06 10:39 ` brian m. carlson
2026-03-06 19:33 ` Junio C Hamano
2026-03-06 21:47 ` [PATCH v4] " Alan Braithwaite via GitGitGadget
1 sibling, 1 reply; 28+ messages in thread
From: brian m. carlson @ 2026-03-06 10:39 UTC (permalink / raw)
To: Alan Braithwaite via GitGitGadget
Cc: git, ps, christian.couder, jonathantanmy, me, gitster, Jeff King,
Alan Braithwaite
On 2026-03-06 at 06:55:13, Alan Braithwaite via GitGitGadget wrote:
> From: Alan Braithwaite <alan@braithwaite.dev>
>
> Add a new configuration option that lets users specify a default
> partial clone filter per URL pattern. When cloning a repository
> whose URL matches a configured pattern, git-clone automatically
> applies the filter, equivalent to passing --filter on the command
> line.
>
> [clone "https://github.com/"]
> defaultObjectFilter = blob:limit=5m
>
> [clone "https://internal.corp.com/large-project/"]
> defaultObjectFilter = blob:none
>
> URL matching uses the existing urlmatch_config_entry() infrastructure,
> following the same rules as http.<url>.* — you can match a domain,
> a namespace path, or a specific project, and the most specific match
> wins.
>
> The config only affects the initial clone. Once the clone completes,
> the filter is recorded in remote.<name>.partialCloneFilter, so
> subsequent fetches inherit it automatically. An explicit --filter
> flag on the command line takes precedence.
>
> Only the URL-qualified form (clone.<url>.defaultObjectFilter) is
> honored; a bare clone.defaultObjectFilter without a URL subsection
> is ignored.
We've historically not implemented default filtering for clones because
it makes it hard to reason about the behaviour of the clone command.
For instance, if I have a script that clones a repository, it almost
certainly expects a full clone unless it requested something else.
For instance, I run `foo setup` which clones my repository and then I
suspend my laptop. I go to the airport and get on an airplane which lacks
Wi-Fi. I then run `foo blargle`, which operates on the repository, but
that fails because it was a partial clone and I'm offline. I didn't
realize this wouldn't work because I didn't know that the foo command
required a full clone since it's just a script I got from my distro.
We've traditionally placed this kind of customizable configuration into
`scalar` instead, which is designed to be configurable and set options
for large repositories that would want to control clone and fetch
options.
--
brian m. carlson (they/them)
Toronto, Ontario, CA
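[Editorial note: the offline failure mode brian describes can be reproduced with stock git. The sketch below is not part of the patch under review; all paths and file names are made up.]

```shell
# Self-contained sketch: a blob:none partial clone cannot read
# filtered-out blobs once the promisor remote becomes unreachable.
tmp=$(mktemp -d) && cd "$tmp"
git init -q src
echo payload >src/big.bin
git -C src add big.bin
git -C src -c user.name=a -c user.email=a@example.com commit -qm init
git -C src config uploadpack.allowfilter 1
git clone -q --no-checkout --filter=blob:none "file://$tmp/src" work
# Point the promisor remote at a nonexistent path to simulate being offline.
git -C work config remote.origin.url "file://$tmp/gone"
if git -C work cat-file blob HEAD:big.bin >/dev/null 2>&1
then
	echo "unexpected: blob was available locally"
else
	echo "lazy fetch failed as expected"
fi
```

With --no-checkout the filtered blob is never lazily fetched during clone, so the first object access after "going offline" fails, which is exactly the `foo blargle` scenario above.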
* Re: [PATCH v3] clone: add clone.<url>.defaultObjectFilter config
2026-03-06 10:39 ` brian m. carlson
@ 2026-03-06 19:33 ` Junio C Hamano
2026-03-06 21:50 ` Alan Braithwaite
0 siblings, 1 reply; 28+ messages in thread
From: Junio C Hamano @ 2026-03-06 19:33 UTC (permalink / raw)
To: brian m. carlson
Cc: Alan Braithwaite via GitGitGadget, git, ps, christian.couder, me,
Jeff King, Alan Braithwaite
"brian m. carlson" <sandals@crustytoothpaste.net> writes:
> We've historically not implemented default filtering for clones because
> it makes it hard to reason about the behaviour of the clone command.
> For instance, if I have a script that clones a repository, it almost
> certainly expects a full clone unless it requested something else.
> ...
> We've traditionally placed this kind of customizable configuration into
> `scalar` instead, which is designed to be configurable and set options
> for large repositories that would want to control clone and fetch
> options.
Hmph, my knee-jerk reaction to the early part of your message was
"oh, but isn't clone a Porcelain (admittedly without corresponding
plumbing) whose defaults and end-user experiences are meant to be
updated from time to time to help users?" but I didn't realize that
we have another class, which is "scalar", these days that we can add
these settings to. I do not have objections to adding something to
"scalar", but I personally feel that the configuration for clone
would be such a bad thing to have.
Do we have a way to defeat the configured filter to say "no
filtering, we want everything" from the command line? If not, that
needs to be addressed, if we were to add this configuration.
Thanks.
* [PATCH v4] clone: add clone.<url>.defaultObjectFilter config
2026-03-06 6:55 ` [PATCH v3] " Alan Braithwaite via GitGitGadget
2026-03-06 10:39 ` brian m. carlson
@ 2026-03-06 21:47 ` Alan Braithwaite via GitGitGadget
2026-03-06 22:18 ` Junio C Hamano
2026-03-07 1:33 ` [PATCH v5] " Alan Braithwaite via GitGitGadget
1 sibling, 2 replies; 28+ messages in thread
From: Alan Braithwaite via GitGitGadget @ 2026-03-06 21:47 UTC (permalink / raw)
To: git
Cc: ps, christian.couder, jonathantanmy, me, gitster, Jeff King,
brian m. carlson, Alan Braithwaite, Alan Braithwaite
From: Alan Braithwaite <alan@braithwaite.dev>
Add a new configuration option that lets users specify a default
partial clone filter per URL pattern. When cloning a repository
whose URL matches a configured pattern, git-clone automatically
applies the filter, equivalent to passing --filter on the command
line.
[clone "https://github.com/"]
defaultObjectFilter = blob:limit=5m
[clone "https://internal.corp.com/large-project/"]
defaultObjectFilter = blob:none
URL matching uses the existing urlmatch_config_entry() infrastructure,
following the same rules as http.<url>.* — you can match a domain,
a namespace path, or a specific project, and the most specific match
wins.
The config only affects the initial clone. Once the clone completes,
the filter is recorded in remote.<name>.partialCloneFilter, so
subsequent fetches inherit it automatically. An explicit --filter
flag on the command line takes precedence.
Only the URL-qualified form (clone.<url>.defaultObjectFilter) is
honored; a bare clone.defaultObjectFilter without a URL subsection
is ignored.
Signed-off-by: Alan Braithwaite <alan@braithwaite.dev>
---
fetch, clone: add fetch.blobSizeLimit config
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2058%2Fabraithwaite%2Falan%2Ffetch-blob-size-limit-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2058/abraithwaite/alan/fetch-blob-size-limit-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/2058
Range-diff vs v3:
1: 5408412f2a ! 1: 4bf3e1ec63 clone: add clone.<url>.defaultObjectFilter config
@@ Documentation/config/clone.adoc: endif::[]
+----
++
+An explicit `--filter` option on the command line takes precedence
-+over this config. Only affects the initial clone; it has no effect
-+on later fetches into an existing repository. If the server does
-+not support object filtering, the setting is silently ignored.
++over this config, and `--no-filter` defeats it entirely to force a
++full clone. Only affects the initial clone; it has no effect on
++later fetches into an existing repository. If the server does not
++support object filtering, the setting is silently ignored.
## builtin/clone.c ##
@@
@@ builtin/clone.c: int cmd_clone(int argc,
} else
die(_("repository '%s' does not exist"), repo_name);
-+ if (!filter_options.choice) {
++ if (!filter_options.choice && !filter_options.no_filter) {
+ char *config_filter = get_default_object_filter(repo);
+ if (config_filter) {
+ parse_list_objects_filter(&filter_options, config_filter);
@@ t/t5616-partial-clone.sh: test_expect_success 'after fetching descendants of non
+ clone "$SERVER_URL" default-filter-url-over-bare &&
+
+ test "$(git -C default-filter-url-over-bare config --local remote.origin.partialclonefilter)" = "blob:none"
++'
++
++test_expect_success '--no-filter defeats clone.defaultObjectFilter' '
++ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
++ git -c "clone.$SERVER_URL.defaultObjectFilter=blob:none" \
++ clone --no-filter "$SERVER_URL" default-filter-no-filter &&
++
++ test_must_fail git -C default-filter-no-filter config --local remote.origin.promisor
+'
. "$TEST_DIRECTORY"/lib-httpd.sh
Documentation/config/clone.adoc | 34 ++++++++++++
builtin/clone.c | 50 ++++++++++++++++++
t/t5616-partial-clone.sh | 92 +++++++++++++++++++++++++++++++++
3 files changed, 176 insertions(+)
diff --git a/Documentation/config/clone.adoc b/Documentation/config/clone.adoc
index 0a10efd174..1d6c0957a0 100644
--- a/Documentation/config/clone.adoc
+++ b/Documentation/config/clone.adoc
@@ -21,3 +21,37 @@ endif::[]
If a partial clone filter is provided (see `--filter` in
linkgit:git-rev-list[1]) and `--recurse-submodules` is used, also apply
the filter to submodules.
+
+`clone.defaultObjectFilter`::
+`clone.<url>.defaultObjectFilter`::
+ When set to a filter spec string (e.g., `blob:limit=1m`,
+ `blob:none`, `tree:0`), linkgit:git-clone[1] will automatically
+ use `--filter=<value>` to enable partial clone behavior.
+ Objects matching the filter are excluded from the initial
+ transfer and lazily fetched on demand (e.g., during checkout).
+ Subsequent fetches inherit the filter via the per-remote config
+ that is written during the clone.
++
+The bare `clone.defaultObjectFilter` applies to all clones. The
+URL-qualified form `clone.<url>.defaultObjectFilter` restricts the
+setting to clones whose URL matches `<url>`, following the same
+rules as `http.<url>.*` (see linkgit:git-config[1]). The most
+specific URL match wins. You can match a domain, a namespace, or a
+specific project:
++
+----
+[clone]
+ defaultObjectFilter = blob:limit=1m
+
+[clone "https://github.com/"]
+ defaultObjectFilter = blob:limit=5m
+
+[clone "https://internal.corp.com/large-project/"]
+ defaultObjectFilter = blob:none
+----
++
+An explicit `--filter` option on the command line takes precedence
+over this config, and `--no-filter` defeats it entirely to force a
+full clone. Only affects the initial clone; it has no effect on
+later fetches into an existing repository. If the server does not
+support object filtering, the setting is silently ignored.
diff --git a/builtin/clone.c b/builtin/clone.c
index 45d8fa0eed..1207655815 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -44,6 +44,7 @@
#include "path.h"
#include "pkt-line.h"
#include "list-objects-filter-options.h"
+#include "urlmatch.h"
#include "hook.h"
#include "bundle.h"
#include "bundle-uri.h"
@@ -757,6 +758,47 @@ static int git_clone_config(const char *k, const char *v,
return git_default_config(k, v, ctx, cb);
}
+static int clone_filter_collect(const char *var, const char *value,
+ const struct config_context *ctx UNUSED,
+ void *cb)
+{
+ char **filter_spec_p = cb;
+
+ if (!strcmp(var, "clone.defaultobjectfilter")) {
+ if (!value)
+ return config_error_nonbool(var);
+ free(*filter_spec_p);
+ *filter_spec_p = xstrdup(value);
+ }
+ return 0;
+}
+
+/*
+ * Look up clone.defaultObjectFilter or clone.<url>.defaultObjectFilter
+ * using the urlmatch infrastructure. A URL-qualified entry that matches
+ * the clone URL takes precedence over the bare form, following the same
+ * rules as http.<url>.* configuration variables.
+ */
+static char *get_default_object_filter(const char *url)
+{
+ struct urlmatch_config config = URLMATCH_CONFIG_INIT;
+ char *filter_spec = NULL;
+ char *normalized_url;
+
+ config.section = "clone";
+ config.key = "defaultobjectfilter";
+ config.collect_fn = clone_filter_collect;
+ config.cb = &filter_spec;
+
+ normalized_url = url_normalize(url, &config.url);
+
+ repo_config(the_repository, urlmatch_config_entry, &config);
+ free(normalized_url);
+ urlmatch_config_release(&config);
+
+ return filter_spec;
+}
+
static int write_one_config(const char *key, const char *value,
const struct config_context *ctx,
void *data)
@@ -1057,6 +1099,14 @@ int cmd_clone(int argc,
} else
die(_("repository '%s' does not exist"), repo_name);
+ if (!filter_options.choice && !filter_options.no_filter) {
+ char *config_filter = get_default_object_filter(repo);
+ if (config_filter) {
+ parse_list_objects_filter(&filter_options, config_filter);
+ free(config_filter);
+ }
+ }
+
/* no need to be strict, transport_set_option() will validate it again */
if (option_depth && atoi(option_depth) < 1)
die(_("depth %s is not a positive number"), option_depth);
diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index 1e354e057f..e85d2a8ce8 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -722,6 +722,98 @@ test_expect_success 'after fetching descendants of non-promisor commits, gc work
git -C partial gc --prune=now
'
+# Test clone.<url>.defaultObjectFilter config
+
+test_expect_success 'setup for clone.defaultObjectFilter tests' '
+ git init default-filter-src &&
+ echo "small" >default-filter-src/small.txt &&
+ dd if=/dev/zero of=default-filter-src/large.bin bs=1024 count=100 2>/dev/null &&
+ git -C default-filter-src add . &&
+ git -C default-filter-src commit -m "initial" &&
+
+ git clone --bare "file://$(pwd)/default-filter-src" default-filter-srv.bare &&
+ git -C default-filter-srv.bare config --local uploadpack.allowfilter 1 &&
+ git -C default-filter-srv.bare config --local uploadpack.allowanysha1inwant 1
+'
+
+test_expect_success 'clone with clone.<url>.defaultObjectFilter applies filter' '
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git -c "clone.$SERVER_URL.defaultObjectFilter=blob:limit=1k" clone \
+ "$SERVER_URL" default-filter-clone &&
+
+ test "$(git -C default-filter-clone config --local remote.origin.promisor)" = "true" &&
+ test "$(git -C default-filter-clone config --local remote.origin.partialclonefilter)" = "blob:limit=1024"
+'
+
+test_expect_success 'clone with --filter overrides clone.<url>.defaultObjectFilter' '
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git -c "clone.$SERVER_URL.defaultObjectFilter=blob:limit=1k" \
+ clone --filter=blob:none "$SERVER_URL" default-filter-override &&
+
+ test "$(git -C default-filter-override config --local remote.origin.partialclonefilter)" = "blob:none"
+'
+
+test_expect_success 'clone with clone.<url>.defaultObjectFilter=blob:none works' '
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git -c "clone.$SERVER_URL.defaultObjectFilter=blob:none" clone \
+ "$SERVER_URL" default-filter-blobnone &&
+
+ test "$(git -C default-filter-blobnone config --local remote.origin.promisor)" = "true" &&
+ test "$(git -C default-filter-blobnone config --local remote.origin.partialclonefilter)" = "blob:none"
+'
+
+test_expect_success 'clone.<url>.defaultObjectFilter with tree:0 works' '
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git -c "clone.$SERVER_URL.defaultObjectFilter=tree:0" clone \
+ "$SERVER_URL" default-filter-tree0 &&
+
+ test "$(git -C default-filter-tree0 config --local remote.origin.promisor)" = "true" &&
+ test "$(git -C default-filter-tree0 config --local remote.origin.partialclonefilter)" = "tree:0"
+'
+
+test_expect_success 'most specific URL match wins for clone.defaultObjectFilter' '
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git \
+ -c "clone.file://.defaultObjectFilter=blob:limit=1k" \
+ -c "clone.$SERVER_URL.defaultObjectFilter=blob:none" \
+ clone "$SERVER_URL" default-filter-url-specific &&
+
+ test "$(git -C default-filter-url-specific config --local remote.origin.partialclonefilter)" = "blob:none"
+'
+
+test_expect_success 'non-matching URL does not apply clone.defaultObjectFilter' '
+ git \
+ -c "clone.https://other.example.com/.defaultObjectFilter=blob:none" \
+ clone "file://$(pwd)/default-filter-srv.bare" default-filter-url-nomatch &&
+
+ test_must_fail git -C default-filter-url-nomatch config --local remote.origin.promisor
+'
+
+test_expect_success 'bare clone.defaultObjectFilter applies to all clones' '
+ git -c clone.defaultObjectFilter=blob:none \
+ clone "file://$(pwd)/default-filter-srv.bare" default-filter-bare-key &&
+
+ test "$(git -C default-filter-bare-key config --local remote.origin.promisor)" = "true" &&
+ test "$(git -C default-filter-bare-key config --local remote.origin.partialclonefilter)" = "blob:none"
+'
+
+test_expect_success 'URL-specific clone.defaultObjectFilter overrides bare form' '
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git \
+ -c clone.defaultObjectFilter=blob:limit=1k \
+ -c "clone.$SERVER_URL.defaultObjectFilter=blob:none" \
+ clone "$SERVER_URL" default-filter-url-over-bare &&
+
+ test "$(git -C default-filter-url-over-bare config --local remote.origin.partialclonefilter)" = "blob:none"
+'
+
+test_expect_success '--no-filter defeats clone.defaultObjectFilter' '
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git -c "clone.$SERVER_URL.defaultObjectFilter=blob:none" \
+ clone --no-filter "$SERVER_URL" default-filter-no-filter &&
+
+ test_must_fail git -C default-filter-no-filter config --local remote.origin.promisor
+'
. "$TEST_DIRECTORY"/lib-httpd.sh
start_httpd
base-commit: 7b2bccb0d58d4f24705bf985de1f4612e4cf06e5
--
gitgitgadget
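[Editorial note: the per-remote recording that the commit message relies on, remote.<name>.partialCloneFilter being written at clone time, is existing stock-git behavior. A self-contained sketch, with made-up repo names:]

```shell
# Sketch: a clone with --filter records promisor/partialCloneFilter
# config on the remote, so later fetches reuse the filter.
tmp=$(mktemp -d) && cd "$tmp"
git init -q src
echo data >src/f.txt
git -C src add f.txt
git -C src -c user.name=a -c user.email=a@example.com commit -qm init
git -C src config uploadpack.allowfilter 1
git clone -q --filter=blob:limit=1k "file://$tmp/src" work
git -C work config --local remote.origin.promisor
# The size suffix is stored in expanded form (1k becomes 1024).
git -C work config --local remote.origin.partialclonefilter
```

This is why the new config only needs to act at clone time: everything after that rides on the existing promisor-remote machinery.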
* Re: [PATCH v3] clone: add clone.<url>.defaultObjectFilter config
2026-03-06 19:33 ` Junio C Hamano
@ 2026-03-06 21:50 ` Alan Braithwaite
0 siblings, 0 replies; 28+ messages in thread
From: Alan Braithwaite @ 2026-03-06 21:50 UTC (permalink / raw)
To: Junio C Hamano, brian m. carlson
Cc: Alan Braithwaite, git, Patrick Steinhardt, christian.couder, me,
Jeff King
> Do we have a way to defeat the configured filter to say "no
> filtering, we want everything" from the command line? If not, that
> needs to be addressed, if we were to add this configuration.
Great point. I added a check for the --no-filter flag and made it
override any defaultObjectFilter setting for the clone.
Thanks,
- Alan
On Fri, Mar 6, 2026, at 11:33, Junio C Hamano wrote:
> "brian m. carlson" <sandals@crustytoothpaste.net> writes:
>
>> We've historically not implemented default filtering for clones because
>> it makes it hard to reason about the behaviour of the clone command.
>> For instance, if I have a script that clones a repository, it almost
>> certainly expects a full clone unless it requested something else.
>> ...
>> We've traditionally placed this kind of customizable configuration into
>> `scalar` instead, which is designed to be configurable and set options
>> for large repositories that would want to control clone and fetch
>> options.
>
> Hmph, my knee-jerk reaction to the early part of your message was
> "oh, but isn't clone a Porcelain (admittedly without corresponding
> plumbing) whose defaults and end-user experiences are meant to be
> updated from time to time to help users?" but I didn't realize that
> we have another class, which is "scalar", these days that we can add
> these settings to. I do not have objections to adding something to
> "scalar", but I personally feel that the configuration for clone
> would be such a bad thing to have.
>
> Do we have a way to defeat the configured filter to say "no
> filtering, we want everything" from the command line? If not, that
> needs to be addressed, if we were to add this configuration.
>
> Thanks.
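[Editorial note: the --no-filter escape hatch discussed here already exists in clone's option parser in recent git, independent of this patch. A sketch with made-up paths:]

```shell
# Sketch: --no-filter yields an ordinary full clone, with no promisor
# configuration written for the remote.
tmp=$(mktemp -d) && cd "$tmp"
git init -q src
echo data >src/f.txt
git -C src add f.txt
git -C src -c user.name=a -c user.email=a@example.com commit -qm init
git clone -q --no-filter "file://$tmp/src" work
if git -C work config --local remote.origin.promisor >/dev/null 2>&1
then
	echo "promisor config present"
else
	echo "full clone: no promisor config"
fi
```

The patch therefore only needs to consult filter_options.no_filter before applying the configured default, as the v4 range-diff shows.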
* Re: [PATCH v4] clone: add clone.<url>.defaultObjectFilter config
2026-03-06 21:47 ` [PATCH v4] " Alan Braithwaite via GitGitGadget
@ 2026-03-06 22:18 ` Junio C Hamano
2026-03-07 1:04 ` Alan Braithwaite
2026-03-07 1:33 ` [PATCH v5] " Alan Braithwaite via GitGitGadget
1 sibling, 1 reply; 28+ messages in thread
From: Junio C Hamano @ 2026-03-06 22:18 UTC (permalink / raw)
To: Alan Braithwaite via GitGitGadget
Cc: git, ps, christian.couder, jonathantanmy, me, Jeff King,
brian m. carlson, Alan Braithwaite
"Alan Braithwaite via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Alan Braithwaite <alan@braithwaite.dev>
>
> Add a new configuration option that lets users specify a default
> partial clone filter per URL pattern. When cloning a repository
> whose URL matches a configured pattern, git-clone automatically
> applies the filter, equivalent to passing --filter on the command
> line.
>
> [clone "https://github.com/"]
> defaultObjectFilter = blob:limit=5m
>
> [clone "https://internal.corp.com/large-project/"]
> defaultObjectFilter = blob:none
>
> URL matching uses the existing urlmatch_config_entry() infrastructure,
> following the same rules as http.<url>.* — you can match a domain,
> a namespace path, or a specific project, and the most specific match
> wins.
>
> The config only affects the initial clone. Once the clone completes,
> the filter is recorded in remote.<name>.partialCloneFilter, so
> subsequent fetches inherit it automatically. An explicit --filter
> flag on the command line takes precedence.
>
> Only the URL-qualified form (clone.<url>.defaultObjectFilter) is
> honored; a bare clone.defaultObjectFilter without a URL subsection
> is ignored.
Is this still valid? It is inconsistent with the updated
documentation where both clone.defaultObjectFilter and
clone.<url>.defaultObjectFilter are listed.
These iterations of patches may require a bit more careful
proofreading before getting sent to the mailing list for others to
comment on, I suspect?
> Signed-off-by: Alan Braithwaite <alan@braithwaite.dev>
> ---
> ...
> +`clone.defaultObjectFilter`::
> +`clone.<url>.defaultObjectFilter`::
> + When set to a filter spec string (e.g., `blob:limit=1m`,
> + `blob:none`, `tree:0`), linkgit:git-clone[1] will automatically
> + use `--filter=<value>` to enable partial clone behavior.
> + Objects matching the filter are excluded from the initial
> + transfer and lazily fetched on demand (e.g., during checkout).
> + Subsequent fetches inherit the filter via the per-remote config
> + that is written during the clone.
> ++
> +The bare `clone.defaultObjectFilter` applies to all clones. The
> +URL-qualified form `clone.<url>.defaultObjectFilter` restricts the
> +setting to clones whose URL matches `<url>`, following the same
> +rules as `http.<url>.*` (see linkgit:git-config[1]). The most
> +specific URL match wins. You can match a domain, a namespace, or a
> +specific project:
In the test script we see a handful of lines like these
> + test "$(git -C default-filter-blobnone config --local remote.origin.promisor)" = "true" &&
> + test "$(git -C default-filter-blobnone config --local remote.origin.partialclonefilter)" = "blob:none"
added. They may have been written to mimic an existing line in a
test elsewhere, but see efforts by others like
https://lore.kernel.org/git/20260305225128.54283-1-francescopaparatto@gmail.com/
Thanks.
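[Editorial note: the style change Junio points at replaces inline `test "$(...)" = ...` comparisons with expect/actual files compared by test_cmp, so a failure shows a diff. Outside the git test suite the same shape looks like the sketch below, with diff -u standing in for test_cmp and a throwaway repo name:]

```shell
# Sketch of the expect/actual pattern: write the expected value to a
# file, capture the command's output, and compare the two files.
tmp=$(mktemp -d) && cd "$tmp"
git init -q repo
git -C repo config remote.origin.promisor true
echo true >expect
git -C repo config --local remote.origin.promisor >actual
diff -u expect actual && echo "check passed"
```

On mismatch, diff -u (like test_cmp) prints exactly which value differed instead of a bare "test failed".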
* Re: [PATCH v4] clone: add clone.<url>.defaultObjectFilter config
2026-03-06 22:18 ` Junio C Hamano
@ 2026-03-07 1:04 ` Alan Braithwaite
0 siblings, 0 replies; 28+ messages in thread
From: Alan Braithwaite @ 2026-03-07 1:04 UTC (permalink / raw)
To: Junio C Hamano
Cc: git, Patrick Steinhardt, christian.couder, me, Jeff King,
brian m. carlson
Thanks for the careful review, Junio. You're right on both
counts. The stale commit message and the test style were
sloppy oversights that I should have caught before resubmitting.
I'll be more disciplined about reviewing the full diff
(including the commit message) against the actual behavior for
future patches. Thanks for helping out on my first patch.
The incoming patch addresses both issues: the commit
message now accurately describes the bare and URL-qualified
forms, and all tests use the test_cmp pattern. I'll be
submitting what I think should be the final version shortly,
but I'm happy to continue iterating if anything else looks
concerning.
Thanks,
- Alan
On Fri, Mar 6, 2026, at 14:18, Junio C Hamano wrote:
> "Alan Braithwaite via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
>> From: Alan Braithwaite <alan@braithwaite.dev>
>>
>> Add a new configuration option that lets users specify a default
>> partial clone filter per URL pattern. When cloning a repository
>> whose URL matches a configured pattern, git-clone automatically
>> applies the filter, equivalent to passing --filter on the command
>> line.
>>
>> [clone "https://github.com/"]
>> defaultObjectFilter = blob:limit=5m
>>
>> [clone "https://internal.corp.com/large-project/"]
>> defaultObjectFilter = blob:none
>>
>> URL matching uses the existing urlmatch_config_entry() infrastructure,
>> following the same rules as http.<url>.* — you can match a domain,
>> a namespace path, or a specific project, and the most specific match
>> wins.
>>
>> The config only affects the initial clone. Once the clone completes,
>> the filter is recorded in remote.<name>.partialCloneFilter, so
>> subsequent fetches inherit it automatically. An explicit --filter
>> flag on the command line takes precedence.
>>
>> Only the URL-qualified form (clone.<url>.defaultObjectFilter) is
>> honored; a bare clone.defaultObjectFilter without a URL subsection
>> is ignored.
>
> Is this still valid? It is inconsistent with the updated
> documentation where both clone.defaultObjectFilter and
> clone.<url>.defaultObjectFilter are listed.
>
> These iterations of patches may require a bit more careful
> proofreading before getting sent to the mailing list for others to
> comment on, I suspect?
>
>> Signed-off-by: Alan Braithwaite <alan@braithwaite.dev>
>> ---
>> ...
>> +`clone.defaultObjectFilter`::
>> +`clone.<url>.defaultObjectFilter`::
>> + When set to a filter spec string (e.g., `blob:limit=1m`,
>> + `blob:none`, `tree:0`), linkgit:git-clone[1] will automatically
>> + use `--filter=<value>` to enable partial clone behavior.
>> + Objects matching the filter are excluded from the initial
>> + transfer and lazily fetched on demand (e.g., during checkout).
>> + Subsequent fetches inherit the filter via the per-remote config
>> + that is written during the clone.
>> ++
>> +The bare `clone.defaultObjectFilter` applies to all clones. The
>> +URL-qualified form `clone.<url>.defaultObjectFilter` restricts the
>> +setting to clones whose URL matches `<url>`, following the same
>> +rules as `http.<url>.*` (see linkgit:git-config[1]). The most
>> +specific URL match wins. You can match a domain, a namespace, or a
>> +specific project:
>
>
> In the test script we see a handful of lines like these
>
>> + test "$(git -C default-filter-blobnone config --local remote.origin.promisor)" = "true" &&
>> + test "$(git -C default-filter-blobnone config --local remote.origin.partialclonefilter)" = "blob:none"
>
> added. They may have been written to mimick an existing line in a
> test elsewhere, but see efforts by others like
>
>
> https://lore.kernel.org/git/20260305225128.54283-1-francescopaparatto@gmail.com/
>
> Thanks.
* [PATCH v5] clone: add clone.<url>.defaultObjectFilter config
2026-03-06 21:47 ` [PATCH v4] " Alan Braithwaite via GitGitGadget
2026-03-06 22:18 ` Junio C Hamano
@ 2026-03-07 1:33 ` Alan Braithwaite via GitGitGadget
2026-03-11 7:44 ` Patrick Steinhardt
2026-03-15 5:37 ` [PATCH v6] " Alan Braithwaite via GitGitGadget
1 sibling, 2 replies; 28+ messages in thread
From: Alan Braithwaite via GitGitGadget @ 2026-03-07 1:33 UTC (permalink / raw)
To: git
Cc: ps, christian.couder, jonathantanmy, me, gitster, Jeff King,
brian m. carlson, Alan Braithwaite, Alan Braithwaite
From: Alan Braithwaite <alan@braithwaite.dev>
Add a new configuration option that lets users specify a default
partial clone filter, optionally scoped by URL pattern. When
cloning a repository whose URL matches a configured pattern,
git-clone automatically applies the filter, equivalent to passing
--filter on the command line.
[clone]
defaultObjectFilter = blob:limit=1m
[clone "https://github.com/"]
defaultObjectFilter = blob:limit=5m
[clone "https://internal.corp.com/large-project/"]
defaultObjectFilter = blob:none
The bare clone.defaultObjectFilter applies to all clones. The
URL-qualified form clone.<url>.defaultObjectFilter restricts the
setting to matching URLs. URL matching uses the existing
urlmatch_config_entry() infrastructure, following the same rules as
http.<url>.* — a domain, namespace, or specific project can be
matched, and the most specific match wins.
The config only affects the initial clone. Once the clone completes,
the filter is recorded in remote.<name>.partialCloneFilter, so
subsequent fetches inherit it automatically. An explicit --filter
on the command line takes precedence, and --no-filter defeats the
configured default entirely.
Signed-off-by: Alan Braithwaite <alan@braithwaite.dev>
---
fetch, clone: add fetch.blobSizeLimit config
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2058%2Fabraithwaite%2Falan%2Ffetch-blob-size-limit-v5
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2058/abraithwaite/alan/fetch-blob-size-limit-v5
Pull-Request: https://github.com/gitgitgadget/git/pull/2058
Range-diff vs v4:
1: 4bf3e1ec63 ! 1: fa1ea69bdb clone: add clone.<url>.defaultObjectFilter config
@@ Commit message
clone: add clone.<url>.defaultObjectFilter config
Add a new configuration option that lets users specify a default
- partial clone filter per URL pattern. When cloning a repository
- whose URL matches a configured pattern, git-clone automatically
- applies the filter, equivalent to passing --filter on the command
- line.
+ partial clone filter, optionally scoped by URL pattern. When
+ cloning a repository whose URL matches a configured pattern,
+ git-clone automatically applies the filter, equivalent to passing
+ --filter on the command line.
+
+ [clone]
+ defaultObjectFilter = blob:limit=1m
[clone "https://github.com/"]
defaultObjectFilter = blob:limit=5m
@@ Commit message
[clone "https://internal.corp.com/large-project/"]
defaultObjectFilter = blob:none
- URL matching uses the existing urlmatch_config_entry() infrastructure,
- following the same rules as http.<url>.* — you can match a domain,
- a namespace path, or a specific project, and the most specific match
- wins.
+ The bare clone.defaultObjectFilter applies to all clones. The
+ URL-qualified form clone.<url>.defaultObjectFilter restricts the
+ setting to matching URLs. URL matching uses the existing
+ urlmatch_config_entry() infrastructure, following the same rules as
+ http.<url>.* — a domain, namespace, or specific project can be
+ matched, and the most specific match wins.
The config only affects the initial clone. Once the clone completes,
the filter is recorded in remote.<name>.partialCloneFilter, so
subsequent fetches inherit it automatically. An explicit --filter
- flag on the command line takes precedence.
-
- Only the URL-qualified form (clone.<url>.defaultObjectFilter) is
- honored; a bare clone.defaultObjectFilter without a URL subsection
- is ignored.
+ on the command line takes precedence, and --no-filter defeats the
+ configured default entirely.
Signed-off-by: Alan Braithwaite <alan@braithwaite.dev>
@@ t/t5616-partial-clone.sh: test_expect_success 'after fetching descendants of non
+ git -c "clone.$SERVER_URL.defaultObjectFilter=blob:limit=1k" clone \
+ "$SERVER_URL" default-filter-clone &&
+
-+ test "$(git -C default-filter-clone config --local remote.origin.promisor)" = "true" &&
-+ test "$(git -C default-filter-clone config --local remote.origin.partialclonefilter)" = "blob:limit=1024"
++ echo true >expect &&
++ git -C default-filter-clone config --local remote.origin.promisor >actual &&
++ test_cmp expect actual &&
++
++ echo "blob:limit=1024" >expect &&
++ git -C default-filter-clone config --local remote.origin.partialclonefilter >actual &&
++ test_cmp expect actual
+'
+
+test_expect_success 'clone with --filter overrides clone.<url>.defaultObjectFilter' '
@@ t/t5616-partial-clone.sh: test_expect_success 'after fetching descendants of non
+ git -c "clone.$SERVER_URL.defaultObjectFilter=blob:limit=1k" \
+ clone --filter=blob:none "$SERVER_URL" default-filter-override &&
+
-+ test "$(git -C default-filter-override config --local remote.origin.partialclonefilter)" = "blob:none"
++ echo "blob:none" >expect &&
++ git -C default-filter-override config --local remote.origin.partialclonefilter >actual &&
++ test_cmp expect actual
+'
+
+test_expect_success 'clone with clone.<url>.defaultObjectFilter=blob:none works' '
@@ t/t5616-partial-clone.sh: test_expect_success 'after fetching descendants of non
+ git -c "clone.$SERVER_URL.defaultObjectFilter=blob:none" clone \
+ "$SERVER_URL" default-filter-blobnone &&
+
-+ test "$(git -C default-filter-blobnone config --local remote.origin.promisor)" = "true" &&
-+ test "$(git -C default-filter-blobnone config --local remote.origin.partialclonefilter)" = "blob:none"
++ echo true >expect &&
++ git -C default-filter-blobnone config --local remote.origin.promisor >actual &&
++ test_cmp expect actual &&
++
++ echo "blob:none" >expect &&
++ git -C default-filter-blobnone config --local remote.origin.partialclonefilter >actual &&
++ test_cmp expect actual
+'
+
+test_expect_success 'clone.<url>.defaultObjectFilter with tree:0 works' '
@@ t/t5616-partial-clone.sh: test_expect_success 'after fetching descendants of non
+ git -c "clone.$SERVER_URL.defaultObjectFilter=tree:0" clone \
+ "$SERVER_URL" default-filter-tree0 &&
+
-+ test "$(git -C default-filter-tree0 config --local remote.origin.promisor)" = "true" &&
-+ test "$(git -C default-filter-tree0 config --local remote.origin.partialclonefilter)" = "tree:0"
++ echo true >expect &&
++ git -C default-filter-tree0 config --local remote.origin.promisor >actual &&
++ test_cmp expect actual &&
++
++ echo "tree:0" >expect &&
++ git -C default-filter-tree0 config --local remote.origin.partialclonefilter >actual &&
++ test_cmp expect actual
+'
+
+test_expect_success 'most specific URL match wins for clone.defaultObjectFilter' '
@@ t/t5616-partial-clone.sh: test_expect_success 'after fetching descendants of non
+ -c "clone.$SERVER_URL.defaultObjectFilter=blob:none" \
+ clone "$SERVER_URL" default-filter-url-specific &&
+
-+ test "$(git -C default-filter-url-specific config --local remote.origin.partialclonefilter)" = "blob:none"
++ echo "blob:none" >expect &&
++ git -C default-filter-url-specific config --local remote.origin.partialclonefilter >actual &&
++ test_cmp expect actual
+'
+
+test_expect_success 'non-matching URL does not apply clone.defaultObjectFilter' '
@@ t/t5616-partial-clone.sh: test_expect_success 'after fetching descendants of non
+ git -c clone.defaultObjectFilter=blob:none \
+ clone "file://$(pwd)/default-filter-srv.bare" default-filter-bare-key &&
+
-+ test "$(git -C default-filter-bare-key config --local remote.origin.promisor)" = "true" &&
-+ test "$(git -C default-filter-bare-key config --local remote.origin.partialclonefilter)" = "blob:none"
++ echo true >expect &&
++ git -C default-filter-bare-key config --local remote.origin.promisor >actual &&
++ test_cmp expect actual &&
++
++ echo "blob:none" >expect &&
++ git -C default-filter-bare-key config --local remote.origin.partialclonefilter >actual &&
++ test_cmp expect actual
+'
+
+test_expect_success 'URL-specific clone.defaultObjectFilter overrides bare form' '
@@ t/t5616-partial-clone.sh: test_expect_success 'after fetching descendants of non
+ -c "clone.$SERVER_URL.defaultObjectFilter=blob:none" \
+ clone "$SERVER_URL" default-filter-url-over-bare &&
+
-+ test "$(git -C default-filter-url-over-bare config --local remote.origin.partialclonefilter)" = "blob:none"
++ echo "blob:none" >expect &&
++ git -C default-filter-url-over-bare config --local remote.origin.partialclonefilter >actual &&
++ test_cmp expect actual
+'
+
+test_expect_success '--no-filter defeats clone.defaultObjectFilter' '
Documentation/config/clone.adoc | 34 +++++++++
builtin/clone.c | 50 ++++++++++++++
t/t5616-partial-clone.sh | 118 ++++++++++++++++++++++++++++++++
3 files changed, 202 insertions(+)
diff --git a/Documentation/config/clone.adoc b/Documentation/config/clone.adoc
index 0a10efd174..1d6c0957a0 100644
--- a/Documentation/config/clone.adoc
+++ b/Documentation/config/clone.adoc
@@ -21,3 +21,37 @@ endif::[]
If a partial clone filter is provided (see `--filter` in
linkgit:git-rev-list[1]) and `--recurse-submodules` is used, also apply
the filter to submodules.
+
+`clone.defaultObjectFilter`::
+`clone.<url>.defaultObjectFilter`::
+ When set to a filter spec string (e.g., `blob:limit=1m`,
+ `blob:none`, `tree:0`), linkgit:git-clone[1] will automatically
+ use `--filter=<value>` to enable partial clone behavior.
+ Objects matching the filter are excluded from the initial
+ transfer and lazily fetched on demand (e.g., during checkout).
+ Subsequent fetches inherit the filter via the per-remote config
+ that is written during the clone.
++
+The bare `clone.defaultObjectFilter` applies to all clones. The
+URL-qualified form `clone.<url>.defaultObjectFilter` restricts the
+setting to clones whose URL matches `<url>`, following the same
+rules as `http.<url>.*` (see linkgit:git-config[1]). The most
+specific URL match wins. You can match a domain, a namespace, or a
+specific project:
++
+----
+[clone]
+ defaultObjectFilter = blob:limit=1m
+
+[clone "https://github.com/"]
+ defaultObjectFilter = blob:limit=5m
+
+[clone "https://internal.corp.com/large-project/"]
+ defaultObjectFilter = blob:none
+----
++
+An explicit `--filter` option on the command line takes precedence
+over this config, and `--no-filter` defeats it entirely to force a
+full clone. This setting only affects the initial clone; it has no
+effect on later fetches into an existing repository. If the server
+does not support object filtering, the setting is silently ignored.
diff --git a/builtin/clone.c b/builtin/clone.c
index 45d8fa0eed..1207655815 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -44,6 +44,7 @@
#include "path.h"
#include "pkt-line.h"
#include "list-objects-filter-options.h"
+#include "urlmatch.h"
#include "hook.h"
#include "bundle.h"
#include "bundle-uri.h"
@@ -757,6 +758,47 @@ static int git_clone_config(const char *k, const char *v,
return git_default_config(k, v, ctx, cb);
}
+static int clone_filter_collect(const char *var, const char *value,
+ const struct config_context *ctx UNUSED,
+ void *cb)
+{
+ char **filter_spec_p = cb;
+
+ if (!strcmp(var, "clone.defaultobjectfilter")) {
+ if (!value)
+ return config_error_nonbool(var);
+ free(*filter_spec_p);
+ *filter_spec_p = xstrdup(value);
+ }
+ return 0;
+}
+
+/*
+ * Look up clone.defaultObjectFilter or clone.<url>.defaultObjectFilter
+ * using the urlmatch infrastructure. A URL-qualified entry that matches
+ * the clone URL takes precedence over the bare form, following the same
+ * rules as http.<url>.* configuration variables.
+ */
+static char *get_default_object_filter(const char *url)
+{
+ struct urlmatch_config config = URLMATCH_CONFIG_INIT;
+ char *filter_spec = NULL;
+ char *normalized_url;
+
+ config.section = "clone";
+ config.key = "defaultobjectfilter";
+ config.collect_fn = clone_filter_collect;
+ config.cb = &filter_spec;
+
+ normalized_url = url_normalize(url, &config.url);
+
+ repo_config(the_repository, urlmatch_config_entry, &config);
+ free(normalized_url);
+ urlmatch_config_release(&config);
+
+ return filter_spec;
+}
+
static int write_one_config(const char *key, const char *value,
const struct config_context *ctx,
void *data)
@@ -1057,6 +1099,14 @@ int cmd_clone(int argc,
} else
die(_("repository '%s' does not exist"), repo_name);
+ if (!filter_options.choice && !filter_options.no_filter) {
+ char *config_filter = get_default_object_filter(repo);
+ if (config_filter) {
+ parse_list_objects_filter(&filter_options, config_filter);
+ free(config_filter);
+ }
+ }
+
/* no need to be strict, transport_set_option() will validate it again */
if (option_depth && atoi(option_depth) < 1)
die(_("depth %s is not a positive number"), option_depth);
diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index 1e354e057f..1254901f3e 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -722,6 +722,124 @@ test_expect_success 'after fetching descendants of non-promisor commits, gc work
git -C partial gc --prune=now
'
+# Test clone.<url>.defaultObjectFilter config
+
+test_expect_success 'setup for clone.defaultObjectFilter tests' '
+ git init default-filter-src &&
+ echo "small" >default-filter-src/small.txt &&
+ dd if=/dev/zero of=default-filter-src/large.bin bs=1024 count=100 2>/dev/null &&
+ git -C default-filter-src add . &&
+ git -C default-filter-src commit -m "initial" &&
+
+ git clone --bare "file://$(pwd)/default-filter-src" default-filter-srv.bare &&
+ git -C default-filter-srv.bare config --local uploadpack.allowfilter 1 &&
+ git -C default-filter-srv.bare config --local uploadpack.allowanysha1inwant 1
+'
+
+test_expect_success 'clone with clone.<url>.defaultObjectFilter applies filter' '
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git -c "clone.$SERVER_URL.defaultObjectFilter=blob:limit=1k" clone \
+ "$SERVER_URL" default-filter-clone &&
+
+ echo true >expect &&
+ git -C default-filter-clone config --local remote.origin.promisor >actual &&
+ test_cmp expect actual &&
+
+ echo "blob:limit=1024" >expect &&
+ git -C default-filter-clone config --local remote.origin.partialclonefilter >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'clone with --filter overrides clone.<url>.defaultObjectFilter' '
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git -c "clone.$SERVER_URL.defaultObjectFilter=blob:limit=1k" \
+ clone --filter=blob:none "$SERVER_URL" default-filter-override &&
+
+ echo "blob:none" >expect &&
+ git -C default-filter-override config --local remote.origin.partialclonefilter >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'clone with clone.<url>.defaultObjectFilter=blob:none works' '
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git -c "clone.$SERVER_URL.defaultObjectFilter=blob:none" clone \
+ "$SERVER_URL" default-filter-blobnone &&
+
+ echo true >expect &&
+ git -C default-filter-blobnone config --local remote.origin.promisor >actual &&
+ test_cmp expect actual &&
+
+ echo "blob:none" >expect &&
+ git -C default-filter-blobnone config --local remote.origin.partialclonefilter >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'clone.<url>.defaultObjectFilter with tree:0 works' '
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git -c "clone.$SERVER_URL.defaultObjectFilter=tree:0" clone \
+ "$SERVER_URL" default-filter-tree0 &&
+
+ echo true >expect &&
+ git -C default-filter-tree0 config --local remote.origin.promisor >actual &&
+ test_cmp expect actual &&
+
+ echo "tree:0" >expect &&
+ git -C default-filter-tree0 config --local remote.origin.partialclonefilter >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'most specific URL match wins for clone.defaultObjectFilter' '
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git \
+ -c "clone.file://.defaultObjectFilter=blob:limit=1k" \
+ -c "clone.$SERVER_URL.defaultObjectFilter=blob:none" \
+ clone "$SERVER_URL" default-filter-url-specific &&
+
+ echo "blob:none" >expect &&
+ git -C default-filter-url-specific config --local remote.origin.partialclonefilter >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'non-matching URL does not apply clone.defaultObjectFilter' '
+ git \
+ -c "clone.https://other.example.com/.defaultObjectFilter=blob:none" \
+ clone "file://$(pwd)/default-filter-srv.bare" default-filter-url-nomatch &&
+
+ test_must_fail git -C default-filter-url-nomatch config --local remote.origin.promisor
+'
+
+test_expect_success 'bare clone.defaultObjectFilter applies to all clones' '
+ git -c clone.defaultObjectFilter=blob:none \
+ clone "file://$(pwd)/default-filter-srv.bare" default-filter-bare-key &&
+
+ echo true >expect &&
+ git -C default-filter-bare-key config --local remote.origin.promisor >actual &&
+ test_cmp expect actual &&
+
+ echo "blob:none" >expect &&
+ git -C default-filter-bare-key config --local remote.origin.partialclonefilter >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'URL-specific clone.defaultObjectFilter overrides bare form' '
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git \
+ -c clone.defaultObjectFilter=blob:limit=1k \
+ -c "clone.$SERVER_URL.defaultObjectFilter=blob:none" \
+ clone "$SERVER_URL" default-filter-url-over-bare &&
+
+ echo "blob:none" >expect &&
+ git -C default-filter-url-over-bare config --local remote.origin.partialclonefilter >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success '--no-filter defeats clone.defaultObjectFilter' '
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git -c "clone.$SERVER_URL.defaultObjectFilter=blob:none" \
+ clone --no-filter "$SERVER_URL" default-filter-no-filter &&
+
+ test_must_fail git -C default-filter-no-filter config --local remote.origin.promisor
+'
. "$TEST_DIRECTORY"/lib-httpd.sh
start_httpd
base-commit: 7b2bccb0d58d4f24705bf985de1f4612e4cf06e5
--
gitgitgadget
^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH v5] clone: add clone.<url>.defaultObjectFilter config
2026-03-07 1:33 ` [PATCH v5] " Alan Braithwaite via GitGitGadget
@ 2026-03-11 7:44 ` Patrick Steinhardt
2026-03-15 1:33 ` Alan Braithwaite
2026-03-15 5:37 ` [PATCH v6] " Alan Braithwaite via GitGitGadget
1 sibling, 1 reply; 28+ messages in thread
From: Patrick Steinhardt @ 2026-03-11 7:44 UTC (permalink / raw)
To: Alan Braithwaite via GitGitGadget
Cc: git, christian.couder, jonathantanmy, me, gitster, Jeff King,
brian m. carlson, Alan Braithwaite
On Sat, Mar 07, 2026 at 01:33:56AM +0000, Alan Braithwaite via GitGitGadget wrote:
> diff --git a/Documentation/config/clone.adoc b/Documentation/config/clone.adoc
> index 0a10efd174..1d6c0957a0 100644
> --- a/Documentation/config/clone.adoc
> +++ b/Documentation/config/clone.adoc
> @@ -21,3 +21,37 @@ endif::[]
> If a partial clone filter is provided (see `--filter` in
> linkgit:git-rev-list[1]) and `--recurse-submodules` is used, also apply
> the filter to submodules.
> +
> +`clone.defaultObjectFilter`::
> +`clone.<url>.defaultObjectFilter`::
> + When set to a filter spec string (e.g., `blob:limit=1m`,
> + `blob:none`, `tree:0`), linkgit:git-clone[1] will automatically
> + use `--filter=<value>` to enable partial clone behavior.
> + Objects matching the filter are excluded from the initial
> + transfer and lazily fetched on demand (e.g., during checkout).
> + Subsequent fetches inherit the filter via the per-remote config
> + that is written during the clone.
> ++
> +The bare `clone.defaultObjectFilter` applies to all clones. The
> +URL-qualified form `clone.<url>.defaultObjectFilter` restricts the
> +setting to clones whose URL matches `<url>`, following the same
> +rules as `http.<url>.*` (see linkgit:git-config[1]). The most
> +specific URL match wins. You can match a domain, a namespace, or a
> +specific project:
> ++
> +----
> +[clone]
> + defaultObjectFilter = blob:limit=1m
> +
> +[clone "https://github.com/"]
> + defaultObjectFilter = blob:limit=5m
> +
> +[clone "https://internal.corp.com/large-project/"]
> + defaultObjectFilter = blob:none
> +----
> ++
> +An explicit `--filter` option on the command line takes precedence
> +over this config, and `--no-filter` defeats it entirely to force a
> +full clone. Only affects the initial clone; it has no effect on
> +later fetches into an existing repository. If the server does not
> +support object filtering, the setting is silently ignored.
This all reads good to me.
> diff --git a/builtin/clone.c b/builtin/clone.c
> index 45d8fa0eed..1207655815 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -757,6 +758,47 @@ static int git_clone_config(const char *k, const char *v,
> return git_default_config(k, v, ctx, cb);
> }
>
> +static int clone_filter_collect(const char *var, const char *value,
> + const struct config_context *ctx UNUSED,
> + void *cb)
> +{
> + char **filter_spec_p = cb;
> +
> + if (!strcmp(var, "clone.defaultobjectfilter")) {
> + if (!value)
> + return config_error_nonbool(var);
> + free(*filter_spec_p);
> + *filter_spec_p = xstrdup(value);
> + }
> + return 0;
> +}
> +
> +/*
> + * Look up clone.defaultObjectFilter or clone.<url>.defaultObjectFilter
> + * using the urlmatch infrastructure. A URL-qualified entry that matches
> + * the clone URL takes precedence over the bare form, following the same
> + * rules as http.<url>.* configuration variables.
> + */
> +static char *get_default_object_filter(const char *url)
> +{
> + struct urlmatch_config config = URLMATCH_CONFIG_INIT;
> + char *filter_spec = NULL;
> + char *normalized_url;
> +
> + config.section = "clone";
> + config.key = "defaultobjectfilter";
> + config.collect_fn = clone_filter_collect;
> + config.cb = &filter_spec;
> +
> + normalized_url = url_normalize(url, &config.url);
`url_normalize()` will return a `NULL` pointer in case it cannot parse
the URL. We need to be prepared for this, otherwise we might segfault.
I guess the best route is to simply ignore the URL in that case --
otherwise, we would always error out in case the remote has a weird URL
configured.
> diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
> index 1e354e057f..1254901f3e 100755
> --- a/t/t5616-partial-clone.sh
> +++ b/t/t5616-partial-clone.sh
> @@ -722,6 +722,124 @@ test_expect_success 'after fetching descendants of non-promisor commits, gc work
> git -C partial gc --prune=now
> '
>
> +# Test clone.<url>.defaultObjectFilter config
> +
> +test_expect_success 'setup for clone.defaultObjectFilter tests' '
> + git init default-filter-src &&
> + echo "small" >default-filter-src/small.txt &&
> + dd if=/dev/zero of=default-filter-src/large.bin bs=1024 count=100 2>/dev/null &&
> + git -C default-filter-src add . &&
> + git -C default-filter-src commit -m "initial" &&
> +
> + git clone --bare "file://$(pwd)/default-filter-src" default-filter-srv.bare &&
> + git -C default-filter-srv.bare config --local uploadpack.allowfilter 1 &&
> + git -C default-filter-srv.bare config --local uploadpack.allowanysha1inwant 1
> +'
> +
> +test_expect_success 'clone with clone.<url>.defaultObjectFilter applies filter' '
> + SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
> + git -c "clone.$SERVER_URL.defaultObjectFilter=blob:limit=1k" clone \
> + "$SERVER_URL" default-filter-clone &&
Do we want to "test_when_finished rm -rf default-filter-clone" here and
for all the subsequent tests?
Patrick
* Re: [PATCH v5] clone: add clone.<url>.defaultObjectFilter config
2026-03-11 7:44 ` Patrick Steinhardt
@ 2026-03-15 1:33 ` Alan Braithwaite
0 siblings, 0 replies; 28+ messages in thread
From: Alan Braithwaite @ 2026-03-15 1:33 UTC (permalink / raw)
To: Patrick Steinhardt, Alan Braithwaite
Cc: git, christian.couder, jonathantanmy, me, Junio C Hamano,
Jeff King, brian m. carlson
Thanks for the review, Patrick.
> `url_normalize()` will return a `NULL` pointer in case
> it cannot parse the URL. We need to be prepared for
> this, otherwise we might segfault.
Good catch. The updated patch guards on the return value
and skips the urlmatch lookup entirely when the URL cannot
be normalized. Today `match_urls()` happens to handle
this safely (it returns 0 when `url->url` is NULL), but an
explicit NULL check guards against future regressions in
that code path.
> Do we want to "test_when_finished rm -rf
> default-filter-clone" here and for all the subsequent
> tests?
Done -- added `test_when_finished` cleanup to each test.
Patch incoming. :)
Thanks,
- Alan
* [PATCH v6] clone: add clone.<url>.defaultObjectFilter config
2026-03-07 1:33 ` [PATCH v5] " Alan Braithwaite via GitGitGadget
2026-03-11 7:44 ` Patrick Steinhardt
@ 2026-03-15 5:37 ` Alan Braithwaite via GitGitGadget
2026-03-15 21:32 ` Junio C Hamano
2026-03-16 7:47 ` Patrick Steinhardt
1 sibling, 2 replies; 28+ messages in thread
From: Alan Braithwaite via GitGitGadget @ 2026-03-15 5:37 UTC (permalink / raw)
To: git
Cc: ps, christian.couder, jonathantanmy, me, gitster, Jeff King,
brian m. carlson, Alan Braithwaite, Alan Braithwaite
From: Alan Braithwaite <alan@braithwaite.dev>
Add a new configuration option that lets users specify a default
partial clone filter, optionally scoped by URL pattern. When
cloning a repository whose URL matches a configured pattern,
git-clone automatically applies the filter, equivalent to passing
--filter on the command line.
[clone]
defaultObjectFilter = blob:limit=1m
[clone "https://github.com/"]
defaultObjectFilter = blob:limit=5m
[clone "https://internal.corp.com/large-project/"]
defaultObjectFilter = blob:none
The bare clone.defaultObjectFilter applies to all clones. The
URL-qualified form clone.<url>.defaultObjectFilter restricts the
setting to matching URLs. URL matching uses the existing
urlmatch_config_entry() infrastructure, following the same rules as
http.<url>.* — a domain, namespace, or specific project can be
matched, and the most specific match wins.
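Since `git config --get-urlmatch` drives the same matching code, the
rules can be checked from the shell. A sketch using the example
configuration above (the `clone.defaultObjectFilter` key is the one
proposed here, but `--get-urlmatch` accepts any `<section>.<key>`
name, so this works with stock git):

```shell
# Write the example configuration from above to a scratch file.
cfg=$(mktemp)
cat >"$cfg" <<'EOF'
[clone]
	defaultObjectFilter = blob:limit=1m
[clone "https://github.com/"]
	defaultObjectFilter = blob:limit=5m
[clone "https://internal.corp.com/large-project/"]
	defaultObjectFilter = blob:none
EOF

# The most specific URL match wins:
git config --file="$cfg" --get-urlmatch \
	clone.defaultObjectFilter \
	"https://internal.corp.com/large-project/repo.git"
# blob:none

# With no matching URL section, the bare key is the fallback:
git config --file="$cfg" --get-urlmatch \
	clone.defaultObjectFilter \
	"https://example.org/other.git"
# blob:limit=1m
```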
The config only affects the initial clone. Once the clone completes,
the filter is recorded in remote.<name>.partialCloneFilter, so
subsequent fetches inherit it automatically. An explicit --filter
on the command line takes precedence, and --no-filter defeats the
configured default entirely.
Signed-off-by: Alan Braithwaite <alan@braithwaite.dev>
---
fetch, clone: add fetch.blobSizeLimit config
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2058%2Fabraithwaite%2Falan%2Ffetch-blob-size-limit-v6
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2058/abraithwaite/alan/fetch-blob-size-limit-v6
Pull-Request: https://github.com/gitgitgadget/git/pull/2058
Range-diff vs v5:
1: fa1ea69bdb ! 1: 480453b2e7 clone: add clone.<url>.defaultObjectFilter config
@@ builtin/clone.c: static int git_clone_config(const char *k, const char *v,
+ config.cb = &filter_spec;
+
+ normalized_url = url_normalize(url, &config.url);
++ if (!normalized_url) {
++ urlmatch_config_release(&config);
++ return NULL;
++ }
+
+ repo_config(the_repository, urlmatch_config_entry, &config);
+ free(normalized_url);
@@ t/t5616-partial-clone.sh: test_expect_success 'after fetching descendants of non
+test_expect_success 'setup for clone.defaultObjectFilter tests' '
+ git init default-filter-src &&
+ echo "small" >default-filter-src/small.txt &&
-+ dd if=/dev/zero of=default-filter-src/large.bin bs=1024 count=100 2>/dev/null &&
+ git -C default-filter-src add . &&
+ git -C default-filter-src commit -m "initial" &&
+
@@ t/t5616-partial-clone.sh: test_expect_success 'after fetching descendants of non
+'
+
+test_expect_success 'clone with clone.<url>.defaultObjectFilter applies filter' '
++ test_when_finished "rm -r default-filter-clone" &&
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git -c "clone.$SERVER_URL.defaultObjectFilter=blob:limit=1k" clone \
+ "$SERVER_URL" default-filter-clone &&
@@ t/t5616-partial-clone.sh: test_expect_success 'after fetching descendants of non
+'
+
+test_expect_success 'clone with --filter overrides clone.<url>.defaultObjectFilter' '
++ test_when_finished "rm -r default-filter-override" &&
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git -c "clone.$SERVER_URL.defaultObjectFilter=blob:limit=1k" \
+ clone --filter=blob:none "$SERVER_URL" default-filter-override &&
@@ t/t5616-partial-clone.sh: test_expect_success 'after fetching descendants of non
+'
+
+test_expect_success 'clone with clone.<url>.defaultObjectFilter=blob:none works' '
++ test_when_finished "rm -r default-filter-blobnone" &&
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git -c "clone.$SERVER_URL.defaultObjectFilter=blob:none" clone \
+ "$SERVER_URL" default-filter-blobnone &&
@@ t/t5616-partial-clone.sh: test_expect_success 'after fetching descendants of non
+'
+
+test_expect_success 'clone.<url>.defaultObjectFilter with tree:0 works' '
++ test_when_finished "rm -r default-filter-tree0" &&
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git -c "clone.$SERVER_URL.defaultObjectFilter=tree:0" clone \
+ "$SERVER_URL" default-filter-tree0 &&
@@ t/t5616-partial-clone.sh: test_expect_success 'after fetching descendants of non
+'
+
+test_expect_success 'most specific URL match wins for clone.defaultObjectFilter' '
++ test_when_finished "rm -r default-filter-url-specific" &&
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git \
+ -c "clone.file://.defaultObjectFilter=blob:limit=1k" \
@@ t/t5616-partial-clone.sh: test_expect_success 'after fetching descendants of non
+'
+
+test_expect_success 'non-matching URL does not apply clone.defaultObjectFilter' '
++ test_when_finished "rm -r default-filter-url-nomatch" &&
+ git \
+ -c "clone.https://other.example.com/.defaultObjectFilter=blob:none" \
+ clone "file://$(pwd)/default-filter-srv.bare" default-filter-url-nomatch &&
@@ t/t5616-partial-clone.sh: test_expect_success 'after fetching descendants of non
+'
+
+test_expect_success 'bare clone.defaultObjectFilter applies to all clones' '
++ test_when_finished "rm -r default-filter-bare-key" &&
+ git -c clone.defaultObjectFilter=blob:none \
+ clone "file://$(pwd)/default-filter-srv.bare" default-filter-bare-key &&
+
@@ t/t5616-partial-clone.sh: test_expect_success 'after fetching descendants of non
+'
+
+test_expect_success 'URL-specific clone.defaultObjectFilter overrides bare form' '
++ test_when_finished "rm -r default-filter-url-over-bare" &&
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git \
+ -c clone.defaultObjectFilter=blob:limit=1k \
@@ t/t5616-partial-clone.sh: test_expect_success 'after fetching descendants of non
+'
+
+test_expect_success '--no-filter defeats clone.defaultObjectFilter' '
++ test_when_finished "rm -r default-filter-no-filter" &&
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git -c "clone.$SERVER_URL.defaultObjectFilter=blob:none" \
+ clone --no-filter "$SERVER_URL" default-filter-no-filter &&
Documentation/config/clone.adoc | 34 +++++++++
builtin/clone.c | 54 ++++++++++++++
t/t5616-partial-clone.sh | 126 ++++++++++++++++++++++++++++++++
3 files changed, 214 insertions(+)
diff --git a/Documentation/config/clone.adoc b/Documentation/config/clone.adoc
index 0a10efd174..1d6c0957a0 100644
--- a/Documentation/config/clone.adoc
+++ b/Documentation/config/clone.adoc
@@ -21,3 +21,37 @@ endif::[]
If a partial clone filter is provided (see `--filter` in
linkgit:git-rev-list[1]) and `--recurse-submodules` is used, also apply
the filter to submodules.
+
+`clone.defaultObjectFilter`::
+`clone.<url>.defaultObjectFilter`::
+ When set to a filter spec string (e.g., `blob:limit=1m`,
+ `blob:none`, `tree:0`), linkgit:git-clone[1] will automatically
+ use `--filter=<value>` to enable partial clone behavior.
+ Objects matching the filter are excluded from the initial
+ transfer and lazily fetched on demand (e.g., during checkout).
+ Subsequent fetches inherit the filter via the per-remote config
+ that is written during the clone.
++
+The bare `clone.defaultObjectFilter` applies to all clones. The
+URL-qualified form `clone.<url>.defaultObjectFilter` restricts the
+setting to clones whose URL matches `<url>`, following the same
+rules as `http.<url>.*` (see linkgit:git-config[1]). The most
+specific URL match wins. You can match a domain, a namespace, or a
+specific project:
++
+----
+[clone]
+ defaultObjectFilter = blob:limit=1m
+
+[clone "https://github.com/"]
+ defaultObjectFilter = blob:limit=5m
+
+[clone "https://internal.corp.com/large-project/"]
+ defaultObjectFilter = blob:none
+----
++
+An explicit `--filter` option on the command line takes precedence
+over this config, and `--no-filter` defeats it entirely to force a
+full clone. This setting only affects the initial clone; it has no
+effect on later fetches into an existing repository. If the server
+does not support object filtering, the setting is silently ignored.
diff --git a/builtin/clone.c b/builtin/clone.c
index 45d8fa0eed..18316a7da9 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -44,6 +44,7 @@
#include "path.h"
#include "pkt-line.h"
#include "list-objects-filter-options.h"
+#include "urlmatch.h"
#include "hook.h"
#include "bundle.h"
#include "bundle-uri.h"
@@ -757,6 +758,51 @@ static int git_clone_config(const char *k, const char *v,
return git_default_config(k, v, ctx, cb);
}
+static int clone_filter_collect(const char *var, const char *value,
+ const struct config_context *ctx UNUSED,
+ void *cb)
+{
+ char **filter_spec_p = cb;
+
+ if (!strcmp(var, "clone.defaultobjectfilter")) {
+ if (!value)
+ return config_error_nonbool(var);
+ free(*filter_spec_p);
+ *filter_spec_p = xstrdup(value);
+ }
+ return 0;
+}
+
+/*
+ * Look up clone.defaultObjectFilter or clone.<url>.defaultObjectFilter
+ * using the urlmatch infrastructure. A URL-qualified entry that matches
+ * the clone URL takes precedence over the bare form, following the same
+ * rules as http.<url>.* configuration variables.
+ */
+static char *get_default_object_filter(const char *url)
+{
+ struct urlmatch_config config = URLMATCH_CONFIG_INIT;
+ char *filter_spec = NULL;
+ char *normalized_url;
+
+ config.section = "clone";
+ config.key = "defaultobjectfilter";
+ config.collect_fn = clone_filter_collect;
+ config.cb = &filter_spec;
+
+ normalized_url = url_normalize(url, &config.url);
+ if (!normalized_url) {
+ urlmatch_config_release(&config);
+ return NULL;
+ }
+
+ repo_config(the_repository, urlmatch_config_entry, &config);
+ free(normalized_url);
+ urlmatch_config_release(&config);
+
+ return filter_spec;
+}
+
static int write_one_config(const char *key, const char *value,
const struct config_context *ctx,
void *data)
@@ -1057,6 +1103,14 @@ int cmd_clone(int argc,
} else
die(_("repository '%s' does not exist"), repo_name);
+ if (!filter_options.choice && !filter_options.no_filter) {
+ char *config_filter = get_default_object_filter(repo);
+ if (config_filter) {
+ parse_list_objects_filter(&filter_options, config_filter);
+ free(config_filter);
+ }
+ }
+
/* no need to be strict, transport_set_option() will validate it again */
if (option_depth && atoi(option_depth) < 1)
die(_("depth %s is not a positive number"), option_depth);
diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index 1e354e057f..e8cf5e353a 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -722,6 +722,132 @@ test_expect_success 'after fetching descendants of non-promisor commits, gc work
git -C partial gc --prune=now
'
+# Test clone.<url>.defaultObjectFilter config
+
+test_expect_success 'setup for clone.defaultObjectFilter tests' '
+ git init default-filter-src &&
+ echo "small" >default-filter-src/small.txt &&
+ git -C default-filter-src add . &&
+ git -C default-filter-src commit -m "initial" &&
+
+ git clone --bare "file://$(pwd)/default-filter-src" default-filter-srv.bare &&
+ git -C default-filter-srv.bare config --local uploadpack.allowfilter 1 &&
+ git -C default-filter-srv.bare config --local uploadpack.allowanysha1inwant 1
+'
+
+test_expect_success 'clone with clone.<url>.defaultObjectFilter applies filter' '
+ test_when_finished "rm -r default-filter-clone" &&
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git -c "clone.$SERVER_URL.defaultObjectFilter=blob:limit=1k" clone \
+ "$SERVER_URL" default-filter-clone &&
+
+ echo true >expect &&
+ git -C default-filter-clone config --local remote.origin.promisor >actual &&
+ test_cmp expect actual &&
+
+ echo "blob:limit=1024" >expect &&
+ git -C default-filter-clone config --local remote.origin.partialclonefilter >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'clone with --filter overrides clone.<url>.defaultObjectFilter' '
+ test_when_finished "rm -r default-filter-override" &&
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git -c "clone.$SERVER_URL.defaultObjectFilter=blob:limit=1k" \
+ clone --filter=blob:none "$SERVER_URL" default-filter-override &&
+
+ echo "blob:none" >expect &&
+ git -C default-filter-override config --local remote.origin.partialclonefilter >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'clone with clone.<url>.defaultObjectFilter=blob:none works' '
+ test_when_finished "rm -r default-filter-blobnone" &&
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git -c "clone.$SERVER_URL.defaultObjectFilter=blob:none" clone \
+ "$SERVER_URL" default-filter-blobnone &&
+
+ echo true >expect &&
+ git -C default-filter-blobnone config --local remote.origin.promisor >actual &&
+ test_cmp expect actual &&
+
+ echo "blob:none" >expect &&
+ git -C default-filter-blobnone config --local remote.origin.partialclonefilter >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'clone.<url>.defaultObjectFilter with tree:0 works' '
+ test_when_finished "rm -r default-filter-tree0" &&
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git -c "clone.$SERVER_URL.defaultObjectFilter=tree:0" clone \
+ "$SERVER_URL" default-filter-tree0 &&
+
+ echo true >expect &&
+ git -C default-filter-tree0 config --local remote.origin.promisor >actual &&
+ test_cmp expect actual &&
+
+ echo "tree:0" >expect &&
+ git -C default-filter-tree0 config --local remote.origin.partialclonefilter >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'most specific URL match wins for clone.defaultObjectFilter' '
+ test_when_finished "rm -r default-filter-url-specific" &&
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git \
+ -c "clone.file://.defaultObjectFilter=blob:limit=1k" \
+ -c "clone.$SERVER_URL.defaultObjectFilter=blob:none" \
+ clone "$SERVER_URL" default-filter-url-specific &&
+
+ echo "blob:none" >expect &&
+ git -C default-filter-url-specific config --local remote.origin.partialclonefilter >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'non-matching URL does not apply clone.defaultObjectFilter' '
+ test_when_finished "rm -r default-filter-url-nomatch" &&
+ git \
+ -c "clone.https://other.example.com/.defaultObjectFilter=blob:none" \
+ clone "file://$(pwd)/default-filter-srv.bare" default-filter-url-nomatch &&
+
+ test_must_fail git -C default-filter-url-nomatch config --local remote.origin.promisor
+'
+
+test_expect_success 'bare clone.defaultObjectFilter applies to all clones' '
+ test_when_finished "rm -r default-filter-bare-key" &&
+ git -c clone.defaultObjectFilter=blob:none \
+ clone "file://$(pwd)/default-filter-srv.bare" default-filter-bare-key &&
+
+ echo true >expect &&
+ git -C default-filter-bare-key config --local remote.origin.promisor >actual &&
+ test_cmp expect actual &&
+
+ echo "blob:none" >expect &&
+ git -C default-filter-bare-key config --local remote.origin.partialclonefilter >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'URL-specific clone.defaultObjectFilter overrides bare form' '
+ test_when_finished "rm -r default-filter-url-over-bare" &&
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git \
+ -c clone.defaultObjectFilter=blob:limit=1k \
+ -c "clone.$SERVER_URL.defaultObjectFilter=blob:none" \
+ clone "$SERVER_URL" default-filter-url-over-bare &&
+
+ echo "blob:none" >expect &&
+ git -C default-filter-url-over-bare config --local remote.origin.partialclonefilter >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success '--no-filter defeats clone.defaultObjectFilter' '
+ test_when_finished "rm -r default-filter-no-filter" &&
+ SERVER_URL="file://$(pwd)/default-filter-srv.bare" &&
+ git -c "clone.$SERVER_URL.defaultObjectFilter=blob:none" \
+ clone --no-filter "$SERVER_URL" default-filter-no-filter &&
+
+ test_must_fail git -C default-filter-no-filter config --local remote.origin.promisor
+'
. "$TEST_DIRECTORY"/lib-httpd.sh
start_httpd
base-commit: 7b2bccb0d58d4f24705bf985de1f4612e4cf06e5
--
gitgitgadget
* Re: [PATCH v6] clone: add clone.<url>.defaultObjectFilter config
2026-03-15 5:37 ` [PATCH v6] " Alan Braithwaite via GitGitGadget
@ 2026-03-15 21:32 ` Junio C Hamano
2026-03-16 7:47 ` Patrick Steinhardt
1 sibling, 0 replies; 28+ messages in thread
From: Junio C Hamano @ 2026-03-15 21:32 UTC (permalink / raw)
To: Alan Braithwaite via GitGitGadget
Cc: git, ps, christian.couder, jonathantanmy, me, Jeff King,
brian m. carlson, Alan Braithwaite
"Alan Braithwaite via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Alan Braithwaite <alan@braithwaite.dev>
>
> Add a new configuration option that lets users specify a default
> partial clone filter, optionally scoped by URL pattern. When
> cloning a repository whose URL matches a configured pattern,
> git-clone automatically applies the filter, equivalent to passing
> --filter on the command line.
>
> [clone]
> defaultObjectFilter = blob:limit=1m
>
> [clone "https://github.com/"]
> defaultObjectFilter = blob:limit=5m
>
> [clone "https://internal.corp.com/large-project/"]
> defaultObjectFilter = blob:none
>
> The bare clone.defaultObjectFilter applies to all clones. The
> URL-qualified form clone.<url>.defaultObjectFilter restricts the
> setting to matching URLs. URL matching uses the existing
> urlmatch_config_entry() infrastructure, following the same rules as
> http.<url>.* — a domain, namespace, or specific project can be
> matched, and the most specific match wins.
>
> The config only affects the initial clone. Once the clone completes,
> the filter is recorded in remote.<name>.partialCloneFilter, so
> subsequent fetches inherit it automatically. An explicit --filter
> on the command line takes precedence, and --no-filter defeats the
> configured default entirely.
>
> Signed-off-by: Alan Braithwaite <alan@braithwaite.dev>
> ---
> fetch, clone: add fetch.blobSizeLimit config
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2058%2Fabraithwaite%2Falan%2Ffetch-blob-size-limit-v6
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2058/abraithwaite/alan/fetch-blob-size-limit-v6
> Pull-Request: https://github.com/gitgitgadget/git/pull/2058
As a bystander reviewer, I would have appreciated some mention of
where some changes relative to the previous iteration came from.
E.g., the check for the !normalized_url case comes from a realization
that url_normalize() can return NULL.  The use of test_when_finished
all over the place is to clean up cruft after each test did its thing.
What I am most unsure about is what the removal of "large.bin" in a
test is about.  What was it trying to achieve by having a file that
weighs 100kB, and what was the reason the file got removed?  Is it
because whatever the presence of the file was trying to verify in
the previous iteration is already checked by other means, and if so,
what is it?  Or is it something else?
Mechanically generated range-diff alone does not answer questions
like the above.
Other than the "dd" thing, everything is looking good.
Will replace. Thanks.
* Re: [PATCH v6] clone: add clone.<url>.defaultObjectFilter config
2026-03-15 5:37 ` [PATCH v6] " Alan Braithwaite via GitGitGadget
2026-03-15 21:32 ` Junio C Hamano
@ 2026-03-16 7:47 ` Patrick Steinhardt
1 sibling, 0 replies; 28+ messages in thread
From: Patrick Steinhardt @ 2026-03-16 7:47 UTC (permalink / raw)
To: Alan Braithwaite via GitGitGadget
Cc: git, christian.couder, jonathantanmy, me, gitster, Jeff King,
brian m. carlson, Alan Braithwaite
On Sun, Mar 15, 2026 at 05:37:02AM +0000, Alan Braithwaite via GitGitGadget wrote:
> 1: fa1ea69bdb ! 1: 480453b2e7 clone: add clone.<url>.defaultObjectFilter config
> @@ t/t5616-partial-clone.sh: test_expect_success 'after fetching descendants of non
> +test_expect_success 'setup for clone.defaultObjectFilter tests' '
> + git init default-filter-src &&
> + echo "small" >default-filter-src/small.txt &&
> -+ dd if=/dev/zero of=default-filter-src/large.bin bs=1024 count=100 2>/dev/null &&
> + git -C default-filter-src add . &&
> + git -C default-filter-src commit -m "initial" &&
> +
As Junio already pointed out, this change here is a bit puzzling. Not
that I think it's a problem, but one wonders why this existed in the
first place if it seemed to not be necessary.
> diff --git a/builtin/clone.c b/builtin/clone.c
> index 45d8fa0eed..18316a7da9 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -757,6 +758,51 @@ static int git_clone_config(const char *k, const char *v,
> return git_default_config(k, v, ctx, cb);
> }
>
> +static int clone_filter_collect(const char *var, const char *value,
> + const struct config_context *ctx UNUSED,
> + void *cb)
> +{
> + char **filter_spec_p = cb;
> +
> + if (!strcmp(var, "clone.defaultobjectfilter")) {
> + if (!value)
> + return config_error_nonbool(var);
> + free(*filter_spec_p);
> + *filter_spec_p = xstrdup(value);
> + }
> + return 0;
> +}
> +
> +/*
> + * Look up clone.defaultObjectFilter or clone.<url>.defaultObjectFilter
> + * using the urlmatch infrastructure. A URL-qualified entry that matches
> + * the clone URL takes precedence over the bare form, following the same
> + * rules as http.<url>.* configuration variables.
> + */
> +static char *get_default_object_filter(const char *url)
> +{
> + struct urlmatch_config config = URLMATCH_CONFIG_INIT;
> + char *filter_spec = NULL;
> + char *normalized_url;
> +
> + config.section = "clone";
> + config.key = "defaultobjectfilter";
> + config.collect_fn = clone_filter_collect;
> + config.cb = &filter_spec;
> +
> + normalized_url = url_normalize(url, &config.url);
> + if (!normalized_url) {
> + urlmatch_config_release(&config);
> + return NULL;
> + }
We haven't allocated anything, right? So in theory, we should be able to
return early without calling `urlmatch_config_release()`. This could be
stressed further by moving the error path earlier, so that it's the
first thing we do in the function.
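
Something like this, perhaps (not even compile-tested; same function,
just with the normalization hoisted to the top so that the early
return has nothing to release):

```c
static char *get_default_object_filter(const char *url)
{
	struct urlmatch_config config = URLMATCH_CONFIG_INIT;
	char *filter_spec = NULL;
	char *normalized_url;

	/*
	 * Normalize first: if the URL cannot be normalized, nothing
	 * has been set up yet and we can bail out without cleanup.
	 */
	normalized_url = url_normalize(url, &config.url);
	if (!normalized_url)
		return NULL;

	config.section = "clone";
	config.key = "defaultobjectfilter";
	config.collect_fn = clone_filter_collect;
	config.cb = &filter_spec;

	repo_config(the_repository, urlmatch_config_entry, &config);
	free(normalized_url);
	urlmatch_config_release(&config);

	return filter_spec;
}
```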
Patrick