git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] Make git-clone(1) more resilient when using bundle-URI
@ 2025-09-12 16:46 Toon Claes
  2025-09-12 16:46 ` [PATCH 1/2] bundle-uri: ignore bundles without uri Toon Claes
  2025-09-12 16:46 ` [PATCH 2/2] bundle-uri: do not abort on invalid packet line Toon Claes
  0 siblings, 2 replies; 10+ messages in thread
From: Toon Claes @ 2025-09-12 16:46 UTC (permalink / raw)
  To: git; +Cc: Justin Tobler, Toon Claes

I've found a few scenarios where the server can make git-clone(1) abort
in case it sends bogus responses to the bundle-uri command. Because the
use of bundle URI is optional, make git-clone(1) handle these cases
better and enable it to continue the clone.

Greets,
Toon

---
Toon Claes (2):
      bundle-uri: ignore bundles without uri
      bundle-uri: do not abort on invalid packet line

 bundle-uri.c                |  3 +++
 connect.c                   |  4 ++--
 t/t5558-clone-bundle-uri.sh | 49 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 54 insertions(+), 2 deletions(-)
---
base-commit: ab427cd991100e94792fce124b0934135abdea4b
change-id: 20250909-b4-toon-bundle-uri-no-uri-dab15bef3c00

Best regards,
--  
Toon Claes <toon@iotcl.com>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/2] bundle-uri: ignore bundles without uri
  2025-09-12 16:46 [PATCH 0/2] Make git-clone(1) more resilient when using bundle-URI Toon Claes
@ 2025-09-12 16:46 ` Toon Claes
  2025-09-12 17:11   ` Justin Tobler
  2025-09-12 16:46 ` [PATCH 2/2] bundle-uri: do not abort on invalid packet line Toon Claes
  1 sibling, 1 reply; 10+ messages in thread
From: Toon Claes @ 2025-09-12 16:46 UTC (permalink / raw)
  To: git; +Cc: Justin Tobler, Toon Claes

Bundle-URI can use the heuristic 'creationToken'. With this heuristic
each bundle should specify a 'creationToken' next to the 'uri'
attribute. But this allows misconfiguration where only a 'creationToken'
and no 'uri' is specified for a bundle . Because Git expects each bundle
to have a 'uri', this causes a segmentation fault.

Harden Git against bundles with missing 'uri' and skip bundles which
miss this attribute.

Signed-off-by: Toon Claes <toon@iotcl.com>
---
 bundle-uri.c                |  3 +++
 t/t5558-clone-bundle-uri.sh | 24 ++++++++++++++++++++++++
 2 files changed, 27 insertions(+)

diff --git a/bundle-uri.c b/bundle-uri.c
index 57cccfc6b8..a1120508bf 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -745,6 +745,9 @@ static int fetch_bundle_uri_internal(struct repository *r,
 	int result = 0;
 	struct remote_bundle_info *bcopy;
 
+	if (!bundle->uri)
+		return -1;
+
 	if (depth >= max_bundle_uri_depth) {
 		warning(_("exceeded bundle URI recursion limit (%d)"),
 			max_bundle_uri_depth);
diff --git a/t/t5558-clone-bundle-uri.sh b/t/t5558-clone-bundle-uri.sh
index 7a0943bd36..3cf498b950 100755
--- a/t/t5558-clone-bundle-uri.sh
+++ b/t/t5558-clone-bundle-uri.sh
@@ -468,6 +468,30 @@ test_expect_success 'negotiation: bundle list with all wanted commits' '
 	test_grep ! "clone> want " trace-packet.txt
 '
 
+test_expect_success 'negotiation: bundle list with heuristic but uri missing' '
+	cat >bundle-list <<-EOF &&
+	[bundle]
+		version = 1
+		mode = all
+		heuristic = creationToken
+
+	[bundle "bundle-1"]
+		creationToken = 1
+	EOF
+
+	git clone --no-local --single-branch --branch=left --no-tags \
+		--bundle-uri="file://$(pwd)/bundle-list" \
+		clone-from nego-bundle-list-uri-missing &&
+
+	git -C nego-bundle-list-all for-each-ref --format="%(refname)" >refs &&
+	grep "refs/bundles/heads/" refs >actual &&
+	cat >expect <<-\EOF &&
+	refs/bundles/heads/base
+	refs/bundles/heads/left
+	EOF
+	test_cmp expect actual
+'
+
 #########################################################################
 # HTTP tests begin here
 

-- 
2.51.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/2] bundle-uri: do not abort on invalid packet line
  2025-09-12 16:46 [PATCH 0/2] Make git-clone(1) more resilient when using bundle-URI Toon Claes
  2025-09-12 16:46 ` [PATCH 1/2] bundle-uri: ignore bundles without uri Toon Claes
@ 2025-09-12 16:46 ` Toon Claes
  2025-09-12 17:37   ` Justin Tobler
  1 sibling, 1 reply; 10+ messages in thread
From: Toon Claes @ 2025-09-12 16:46 UTC (permalink / raw)
  To: git; +Cc: Justin Tobler, Toon Claes

On clone, when the client sends the `bundle-uri` command, the server
might respond with invalid data. For example if it sends information
about a bundle where the 'uri' is empty, it produces the following
error:

    Cloning into 'foo'...
    error: bundle-uri: line has empty key or value
    error: error on bundle-uri response line 4: bundle.bundle-1.uri=
    error: could not retrieve server-advertised bundle-uri list

This error doesn't cause git-clone(1) to abort, because the return value
from `transport_get_remote_bundle_uri()` is ignored in
`builtin/clone.c`. This should allow the clone to continue *without* the
use of bundle URIs.

Although when cloning over HTTP, the following error occurs after the
above error messages:

    fatal: expected 'packfile'

This is happens because there remains unprocessed data from the
bundle-URI negotiation.

Fix the error by continuing to read packet data when an invalid
bundle-uri line is received.

Signed-off-by: Toon Claes <toon@iotcl.com>
---
 connect.c                   |  4 ++--
 t/t5558-clone-bundle-uri.sh | 25 +++++++++++++++++++++++++
 2 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/connect.c b/connect.c
index 8352b71faf..d2e2bd8cce 100644
--- a/connect.c
+++ b/connect.c
@@ -536,8 +536,8 @@ int get_remote_bundle_uri(int fd_out, struct packet_reader *reader,
 		if (!bundle_uri_parse_line(bundles, line))
 			continue;
 
-		return error(_("error on bundle-uri response line %d: %s"),
-			     line_nr, line);
+		warning(_("ignore invalid bundle-uri response line %d: %s"),
+			    line_nr, line);
 	}
 
 	if (reader->status != PACKET_READ_FLUSH)
diff --git a/t/t5558-clone-bundle-uri.sh b/t/t5558-clone-bundle-uri.sh
index 3cf498b950..73aebd0b81 100755
--- a/t/t5558-clone-bundle-uri.sh
+++ b/t/t5558-clone-bundle-uri.sh
@@ -1326,6 +1326,31 @@ test_expect_success 'bundles with newline in target path are rejected' '
 	test_path_is_missing escape
 '
 
+test_expect_success 'bundles advertised with missing URI' '
+	git clone --no-local --mirror clone-from \
+		"$HTTPD_DOCUMENT_ROOT_PATH/no-uri.git" &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/no-uri.git" config uploadpack.advertiseBundleURIs true &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/no-uri.git" config bundle.version 1 &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/no-uri.git" config bundle.mode all &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/no-uri.git" config bundle.bundle-1.creationToken 1 &&
+
+	git -c transfer.bundleURI=true clone \
+		"$HTTPD_URL/smart/no-uri.git" target-no-uri
+'
+
+test_expect_success 'bundles advertised with empty URI' '
+	git clone --no-local --mirror clone-from \
+		"$HTTPD_DOCUMENT_ROOT_PATH/empty-uri.git" &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/empty-uri.git" config uploadpack.advertiseBundleURIs true &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/empty-uri.git" config bundle.version 1 &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/empty-uri.git" config bundle.mode all &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/empty-uri.git" config bundle.bundle-1.uri "" &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/empty-uri.git" config bundle.bundle-1.creationToken 1 &&
+
+	git -c transfer.bundleURI=true clone \
+		"$HTTPD_URL/smart/empty-uri.git" target-empty-uri
+'
+
 # Do not add tests here unless they use the HTTP server, as they will
 # not run unless the HTTP dependencies exist.
 

-- 
2.51.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] bundle-uri: ignore bundles without uri
  2025-09-12 16:46 ` [PATCH 1/2] bundle-uri: ignore bundles without uri Toon Claes
@ 2025-09-12 17:11   ` Justin Tobler
  2025-09-12 17:18     ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Justin Tobler @ 2025-09-12 17:11 UTC (permalink / raw)
  To: Toon Claes; +Cc: git

On 25/09/12 06:46PM, Toon Claes wrote:
> Bundle-URI can use the heuristic 'creationToken'. With this heuristic
> each bundle should specify a 'creationToken' next to the 'uri'
> attribute. But this allows misconfiguration where only a 'creationToken'
> and no 'uri' is specified for a bundle . Because Git expects each bundle
> to have a 'uri', this causes a segmentation fault.
> 
> Harden Git against bundles with missing 'uri' and skip bundles which
> miss this attribute.

Ultimately, the remote Git server is the source of truth and bundle-uri
serves as a supplementary mechanism to retrieve objects. In cases where
there are errors related to retrieving objects from the specified
bundle-uri, it certainly makes sense for Git to warn/ignore these issues
and continue on as if there was no bundle-uri.

I'm not sure though if this should extend to client-side
misconfiguration. We don't want to segfault, but maybe we should return
an error indicating the misconfiguration instead of just papering over
it? At the very least, it would probably make sense to provide some sort
of warning that the bundle-uri was misconfigured and not used.

-Justin

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] bundle-uri: ignore bundles without uri
  2025-09-12 17:11   ` Justin Tobler
@ 2025-09-12 17:18     ` Junio C Hamano
  2025-09-16 15:25       ` Toon Claes
  0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2025-09-12 17:18 UTC (permalink / raw)
  To: Justin Tobler; +Cc: Toon Claes, git

Justin Tobler <jltobler@gmail.com> writes:

> On 25/09/12 06:46PM, Toon Claes wrote:
>> Bundle-URI can use the heuristic 'creationToken'. With this heuristic
>> each bundle should specify a 'creationToken' next to the 'uri'
>> attribute. But this allows misconfiguration where only a 'creationToken'
>> and no 'uri' is specified for a bundle . Because Git expects each bundle
>> to have a 'uri', this causes a segmentation fault.
>> 
>> Harden Git against bundles with missing 'uri' and skip bundles which
>> miss this attribute.
>
> Ultimately, the remote Git server is the source of truth and bundle-uri
> serves as a supplementary mechanism to retrieve objects. In cases where
> there are errors related to retrieving objects from the specified
> bundle-uri, it certainly makes sense for Git to warn/ignore these issues
> and continue on as if there was no bundle-uri.

OK.

> I'm not sure though if this should extend to client-side
> misconfiguration. We don't want to segfault, but maybe we should return
> an error indicating the misconfiguration instead of just papering over
> it? At the very least, it would probably make sense to provide some sort
> of warning that the bundle-uri was misconfigured and not used.

I tend to agree.  Instead of papering over a misconfiguration, it
would be better to let the users know, so they have a chance to
report and/or correct such a misconfiguration.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] bundle-uri: do not abort on invalid packet line
  2025-09-12 16:46 ` [PATCH 2/2] bundle-uri: do not abort on invalid packet line Toon Claes
@ 2025-09-12 17:37   ` Justin Tobler
  2025-09-17  8:40     ` Toon Claes
  0 siblings, 1 reply; 10+ messages in thread
From: Justin Tobler @ 2025-09-12 17:37 UTC (permalink / raw)
  To: Toon Claes; +Cc: git

On 25/09/12 06:46PM, Toon Claes wrote:
> On clone, when the client sends the `bundle-uri` command, the server
> might respond with invalid data. For example if it sends information
> about a bundle where the 'uri' is empty, it produces the following
> error:
> 
>     Cloning into 'foo'...
>     error: bundle-uri: line has empty key or value
>     error: error on bundle-uri response line 4: bundle.bundle-1.uri=
>     error: could not retrieve server-advertised bundle-uri list
> 
> This error doesn't cause git-clone(1) to abort, because the return value
> from `transport_get_remote_bundle_uri()` is ignored in
> `builtin/clone.c`. This should allow the clone to continue *without* the
> use of bundle URIs.
> 
> Although when cloning over HTTP, the following error occurs after the
> above error messages:
> 
>     fatal: expected 'packfile'
> 
> This is happens because there remains unprocessed data from the
> bundle-URI negotiation.
> 
> Fix the error by continuing to read packet data when an invalid
> bundle-uri line is received.

Is there any reason that the server should be expected to invalid data
to the client? If the server is misconfigured, I wonder if it should
instead handle this issue by not sending the invalid bundle-uri in the
first place and printing a warning message on the server-side. From
client perspective, if it's a server-side issue there may not be much
they can do about the error and it could cause some confusion.

> Signed-off-by: Toon Claes <toon@iotcl.com>
> ---
>  connect.c                   |  4 ++--
>  t/t5558-clone-bundle-uri.sh | 25 +++++++++++++++++++++++++
>  2 files changed, 27 insertions(+), 2 deletions(-)
> 
> diff --git a/connect.c b/connect.c
> index 8352b71faf..d2e2bd8cce 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -536,8 +536,8 @@ int get_remote_bundle_uri(int fd_out, struct packet_reader *reader,
>  		if (!bundle_uri_parse_line(bundles, line))
>  			continue;
>  
> -		return error(_("error on bundle-uri response line %d: %s"),
> -			     line_nr, line);
> +		warning(_("ignore invalid bundle-uri response line %d: %s"),
> +			    line_nr, line);

If I'm understanding correctly, an error here indicates some sort of
issue between the client and remote Git server while figuring out the
bundle-uri capability. I think it is reasonable for the client to always
expect the server to communicate in a way it understands and IMO should
probably be handled by fixing the server-side instead.

-Justin

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] bundle-uri: ignore bundles without uri
  2025-09-12 17:18     ` Junio C Hamano
@ 2025-09-16 15:25       ` Toon Claes
  2025-09-16 16:59         ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Toon Claes @ 2025-09-16 15:25 UTC (permalink / raw)
  To: Junio C Hamano, Justin Tobler; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> I tend to agree.  Instead of papering over a misconfiguration, it
> would be better to let the users know, so they have a chance to
> report and/or correct such a misconfiguration.

So are you okay if I do `return error("some message")` instead of
`return -1`, or do you expect more of a change?

-- 
Cheers,
Toon

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] bundle-uri: ignore bundles without uri
  2025-09-16 15:25       ` Toon Claes
@ 2025-09-16 16:59         ` Junio C Hamano
  0 siblings, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2025-09-16 16:59 UTC (permalink / raw)
  To: Toon Claes; +Cc: Justin Tobler, git

Toon Claes <toon@iotcl.com> writes:

> Junio C Hamano <gitster@pobox.com> writes:
>
>> I tend to agree.  Instead of papering over a misconfiguration, it
>> would be better to let the users know, so they have a chance to
>> report and/or correct such a misconfiguration.
>
> So are you okay if I do `return error("some message")` instead of
> `return -1`, or do you expect more of a change?

I am not sure what you are asking.  The comment was more about my
preference of honestly returning failure with explanation over
pretending success without actually doing anything, wasn't it?



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] bundle-uri: do not abort on invalid packet line
  2025-09-12 17:37   ` Justin Tobler
@ 2025-09-17  8:40     ` Toon Claes
  2025-09-17 14:30       ` Justin Tobler
  0 siblings, 1 reply; 10+ messages in thread
From: Toon Claes @ 2025-09-17  8:40 UTC (permalink / raw)
  To: Justin Tobler; +Cc: git

Justin Tobler <jltobler@gmail.com> writes:

> Is there any reason that the server should be expected to invalid data
> to the client?

You mean "should be sending"? Well, at $DAYJOB we've had some
unintentional misconfiguration that led to a similar failure[1]. I
wasn't able to reproduce the exact same failure in test, because it
requires to pass different configuration to the `GET info/refs` and
`POST upload-pack` handlers, but the effect is the same.

[1]: https://gitlab.com/gitlab-org/git/-/issues/564

> If the server is misconfigured, I wonder if it should
> instead handle this issue by not sending the invalid bundle-uri in the
> first place and printing a warning message on the server-side. From
> client perspective, if it's a server-side issue there may not be much
> they can do about the error and it could cause some confusion.

I can include a server-side fix in this series as well, but that doesn't
hold back there might be servers out there in the field that don't have
that fix and still serve invalid data. I think ideally we should fix it
on both sides.


>> Signed-off-by: Toon Claes <toon@iotcl.com>
>> ---
>>  connect.c                   |  4 ++--
>>  t/t5558-clone-bundle-uri.sh | 25 +++++++++++++++++++++++++
>>  2 files changed, 27 insertions(+), 2 deletions(-)
>> 
>> diff --git a/connect.c b/connect.c
>> index 8352b71faf..d2e2bd8cce 100644
>> --- a/connect.c
>> +++ b/connect.c
>> @@ -536,8 +536,8 @@ int get_remote_bundle_uri(int fd_out, struct packet_reader *reader,
>>  		if (!bundle_uri_parse_line(bundles, line))
>>  			continue;
>>  
>> -		return error(_("error on bundle-uri response line %d: %s"),
>> -			     line_nr, line);
>> +		warning(_("ignore invalid bundle-uri response line %d: %s"),
>> +			    line_nr, line);
>
> If I'm understanding correctly, an error here indicates some sort of
> issue between the client and remote Git server while figuring out the
> bundle-uri capability.


> I think it is reasonable for the client to always
> expect the server to communicate in a way it understands and IMO should
> probably be handled by fixing the server-side instead.

Ideally yes, but broken servers exist now. Having clients deal with it
properly is anyway advised in my opinion.

-- 
Cheers,
Toon

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] bundle-uri: do not abort on invalid packet line
  2025-09-17  8:40     ` Toon Claes
@ 2025-09-17 14:30       ` Justin Tobler
  0 siblings, 0 replies; 10+ messages in thread
From: Justin Tobler @ 2025-09-17 14:30 UTC (permalink / raw)
  To: Toon Claes; +Cc: git

On 25/09/17 10:40AM, Toon Claes wrote:
> Justin Tobler <jltobler@gmail.com> writes:
> 
> > If the server is misconfigured, I wonder if it should
> > instead handle this issue by not sending the invalid bundle-uri in the
> > first place and printing a warning message on the server-side. From
> > client perspective, if it's a server-side issue there may not be much
> > they can do about the error and it could cause some confusion.
> 
> I can include a server-side fix in this series as well, but that doesn't
> hold back there might be servers out there in the field that don't have
> that fix and still serve invalid data. I think ideally we should fix it
> on both sides.

From my perspective, a server-side fix is the most important as it
addresses the root of the problem. That is, a remote Git server with
misconfigured bundle-uri is communicating with a client in an invalid
manner.

We could address this by updating the client to simply ignore the
miscommunicating server. I'm hesitant though because this feels like a
slightly different class of issue here being that, from my
understanding, the server is not correctly negotiating the capability
with the client. Maybe it is fine though for clients to ignore errors in
this process?

I think I would be more inclined to agree with a client-side fix if the
issue was more pronounced in practice. From my understanding, for this
to occur:

  1. A remote Git server must be advertising bundle-uri.
  2. The bundle-uri must be misconfigured on the server-side.
  3. The client must request the bundle-uri from the server via
  transfer.bundleURI.

A user can workaround this problem simply by disabling bundle-uri.
I suspect there are not many users in this situation though.

> >> Signed-off-by: Toon Claes <toon@iotcl.com>
> >> ---
> >>  connect.c                   |  4 ++--
> >>  t/t5558-clone-bundle-uri.sh | 25 +++++++++++++++++++++++++
> >>  2 files changed, 27 insertions(+), 2 deletions(-)
> >> 
> >> diff --git a/connect.c b/connect.c
> >> index 8352b71faf..d2e2bd8cce 100644
> >> --- a/connect.c
> >> +++ b/connect.c
> >> @@ -536,8 +536,8 @@ int get_remote_bundle_uri(int fd_out, struct packet_reader *reader,
> >>  		if (!bundle_uri_parse_line(bundles, line))
> >>  			continue;
> >>  
> >> -		return error(_("error on bundle-uri response line %d: %s"),
> >> -			     line_nr, line);
> >> +		warning(_("ignore invalid bundle-uri response line %d: %s"),
> >> +			    line_nr, line);
> >
> > If I'm understanding correctly, an error here indicates some sort of
> > issue between the client and remote Git server while figuring out the
> > bundle-uri capability.
> 
> 
> > I think it is reasonable for the client to always
> > expect the server to communicate in a way it understands and IMO should
> > probably be handled by fixing the server-side instead.
> 
> Ideally yes, but broken servers exist now. Having clients deal with it
> properly is anyway advised in my opinion.

Another benefit of addressing this on the server-side is that it should
transparently fix the issue for clients. I would like to think server
providers using the bundle-uri feature would be more inclined than
clients to update quickly and address the problem. Regardless, I still
think the issue is niche enough not to be a huge problem for clients.

IMO, I think a server-side fix should come first. If this step isn't
enough, then it may make sense to also address it on the client side.

-Justin

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2025-09-17 14:30 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-09-12 16:46 [PATCH 0/2] Make git-clone(1) more resilient when using bundle-URI Toon Claes
2025-09-12 16:46 ` [PATCH 1/2] bundle-uri: ignore bundles without uri Toon Claes
2025-09-12 17:11   ` Justin Tobler
2025-09-12 17:18     ` Junio C Hamano
2025-09-16 15:25       ` Toon Claes
2025-09-16 16:59         ` Junio C Hamano
2025-09-12 16:46 ` [PATCH 2/2] bundle-uri: do not abort on invalid packet line Toon Claes
2025-09-12 17:37   ` Justin Tobler
2025-09-17  8:40     ` Toon Claes
2025-09-17 14:30       ` Justin Tobler

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).