* [PATCH 0/2] Make git-clone(1) more resilient when using bundle-URI
@ 2025-09-12 16:46 Toon Claes
2025-09-12 16:46 ` [PATCH 1/2] bundle-uri: ignore bundles without uri Toon Claes
2025-09-12 16:46 ` [PATCH 2/2] bundle-uri: do not abort on invalid packet line Toon Claes
0 siblings, 2 replies; 10+ messages in thread
From: Toon Claes @ 2025-09-12 16:46 UTC (permalink / raw)
To: git; +Cc: Justin Tobler, Toon Claes
I've found a few scenarios where the server can make git-clone(1) abort
in case it sends bogus responses to the bundle-uri command. Because the
use of bundle URI is optional, make git-clone(1) handle these cases
better and enable it to continue the clone.
Greets,
Toon
---
Toon Claes (2):
bundle-uri: ignore bundles without uri
bundle-uri: do not abort on invalid packet line
bundle-uri.c | 3 +++
connect.c | 4 ++--
t/t5558-clone-bundle-uri.sh | 49 +++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 54 insertions(+), 2 deletions(-)
---
base-commit: ab427cd991100e94792fce124b0934135abdea4b
change-id: 20250909-b4-toon-bundle-uri-no-uri-dab15bef3c00
Best regards,
--
Toon Claes <toon@iotcl.com>
^ permalink raw reply [flat|nested] 10+ messages in thread* [PATCH 1/2] bundle-uri: ignore bundles without uri 2025-09-12 16:46 [PATCH 0/2] Make git-clone(1) more resilient when using bundle-URI Toon Claes @ 2025-09-12 16:46 ` Toon Claes 2025-09-12 17:11 ` Justin Tobler 2025-09-12 16:46 ` [PATCH 2/2] bundle-uri: do not abort on invalid packet line Toon Claes 1 sibling, 1 reply; 10+ messages in thread From: Toon Claes @ 2025-09-12 16:46 UTC (permalink / raw) To: git; +Cc: Justin Tobler, Toon Claes Bundle-URI can use the heuristic 'creationToken'. With this heuristic each bundle should specify a 'creationToken' next to the 'uri' attribute. But this allows misconfiguration where only a 'creationToken' and no 'uri' is specified for a bundle . Because Git expects each bundle to have a 'uri', this causes a segmentation fault. Harden Git against bundles with missing 'uri' and skip bundles which miss this attribute. Signed-off-by: Toon Claes <toon@iotcl.com> --- bundle-uri.c | 3 +++ t/t5558-clone-bundle-uri.sh | 24 ++++++++++++++++++++++++ 2 files changed, 27 insertions(+) diff --git a/bundle-uri.c b/bundle-uri.c index 57cccfc6b8..a1120508bf 100644 --- a/bundle-uri.c +++ b/bundle-uri.c @@ -745,6 +745,9 @@ static int fetch_bundle_uri_internal(struct repository *r, int result = 0; struct remote_bundle_info *bcopy; + if (!bundle->uri) + return -1; + if (depth >= max_bundle_uri_depth) { warning(_("exceeded bundle URI recursion limit (%d)"), max_bundle_uri_depth); diff --git a/t/t5558-clone-bundle-uri.sh b/t/t5558-clone-bundle-uri.sh index 7a0943bd36..3cf498b950 100755 --- a/t/t5558-clone-bundle-uri.sh +++ b/t/t5558-clone-bundle-uri.sh @@ -468,6 +468,30 @@ test_expect_success 'negotiation: bundle list with all wanted commits' ' test_grep ! "clone> want " trace-packet.txt ' +test_expect_success 'negotiation: bundle list with heuristic but uri missing' ' + cat >bundle-list <<-EOF && + [bundle] + version = 1 + mode = all + heuristic = creationToken + + [bundle "bundle-1"] + creationToken = 1 + EOF + + git clone --no-local --single-branch --branch=left --no-tags \ + --bundle-uri="file://$(pwd)/bundle-list" \ + clone-from nego-bundle-list-uri-missing && + + git -C nego-bundle-list-all for-each-ref --format="%(refname)" >refs && + grep "refs/bundles/heads/" refs >actual && + cat >expect <<-\EOF && + refs/bundles/heads/base + refs/bundles/heads/left + EOF + test_cmp expect actual +' + ######################################################################### # HTTP tests begin here -- 2.51.0 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 1/2] bundle-uri: ignore bundles without uri 2025-09-12 16:46 ` [PATCH 1/2] bundle-uri: ignore bundles without uri Toon Claes @ 2025-09-12 17:11 ` Justin Tobler 2025-09-12 17:18 ` Junio C Hamano 0 siblings, 1 reply; 10+ messages in thread From: Justin Tobler @ 2025-09-12 17:11 UTC (permalink / raw) To: Toon Claes; +Cc: git On 25/09/12 06:46PM, Toon Claes wrote: > Bundle-URI can use the heuristic 'creationToken'. With this heuristic > each bundle should specify a 'creationToken' next to the 'uri' > attribute. But this allows misconfiguration where only a 'creationToken' > and no 'uri' is specified for a bundle . Because Git expects each bundle > to have a 'uri', this causes a segmentation fault. > > Harden Git against bundles with missing 'uri' and skip bundles which > miss this attribute. Ultimately, the remote Git server is the source of truth and bundle-uri serves as a supplementary mechanism to retrieve objects. In cases where there are errors related to retrieving objects from the specified bundle-uri, it certainly makes sense for Git to warn/ignore these issues and continue on as if there was no bundle-uri. I'm not sure though if this should extend to client-side misconfiguration. We don't want to segfault, but maybe we should return an error indicating the misconfiguration instead of just papering over it? At the very least, it would probably make sense to provide some sort of warning that the bundle-uri was misconfigured and not used. -Justin ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/2] bundle-uri: ignore bundles without uri 2025-09-12 17:11 ` Justin Tobler @ 2025-09-12 17:18 ` Junio C Hamano 2025-09-16 15:25 ` Toon Claes 0 siblings, 1 reply; 10+ messages in thread From: Junio C Hamano @ 2025-09-12 17:18 UTC (permalink / raw) To: Justin Tobler; +Cc: Toon Claes, git Justin Tobler <jltobler@gmail.com> writes: > On 25/09/12 06:46PM, Toon Claes wrote: >> Bundle-URI can use the heuristic 'creationToken'. With this heuristic >> each bundle should specify a 'creationToken' next to the 'uri' >> attribute. But this allows misconfiguration where only a 'creationToken' >> and no 'uri' is specified for a bundle . Because Git expects each bundle >> to have a 'uri', this causes a segmentation fault. >> >> Harden Git against bundles with missing 'uri' and skip bundles which >> miss this attribute. > > Ultimately, the remote Git server is the source of truth and bundle-uri > serves as a supplementary mechanism to retrieve objects. In cases where > there are errors related to retrieving objects from the specified > bundle-uri, it certainly makes sense for Git to warn/ignore these issues > and continue on as if there was no bundle-uri. OK. > I'm not sure though if this should extend to client-side > misconfiguration. We don't want to segfault, but maybe we should return > an error indicating the misconfiguration instead of just papering over > it? At the very least, it would probably make sense to provide some sort > of warning that the bundle-uri was misconfigured and not used. I tend to agree. Instead of papering over a misconfiguration, it would be better to let the users know, so they have a chance to report and/or correct such a misconfiguration. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/2] bundle-uri: ignore bundles without uri 2025-09-12 17:18 ` Junio C Hamano @ 2025-09-16 15:25 ` Toon Claes 2025-09-16 16:59 ` Junio C Hamano 0 siblings, 1 reply; 10+ messages in thread From: Toon Claes @ 2025-09-16 15:25 UTC (permalink / raw) To: Junio C Hamano, Justin Tobler; +Cc: git Junio C Hamano <gitster@pobox.com> writes: > I tend to agree. Instead of papering over a misconfiguration, it > would be better to let the users know, so they have a chance to > report and/or correct such a misconfiguration. So are you okay if I do `return error("some message")` instead of `return -1`, or do you expect more of a change? -- Cheers, Toon ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/2] bundle-uri: ignore bundles without uri 2025-09-16 15:25 ` Toon Claes @ 2025-09-16 16:59 ` Junio C Hamano 0 siblings, 0 replies; 10+ messages in thread From: Junio C Hamano @ 2025-09-16 16:59 UTC (permalink / raw) To: Toon Claes; +Cc: Justin Tobler, git Toon Claes <toon@iotcl.com> writes: > Junio C Hamano <gitster@pobox.com> writes: > >> I tend to agree. Instead of papering over a misconfiguration, it >> would be better to let the users know, so they have a chance to >> report and/or correct such a misconfiguration. > > So are you okay if I do `return error("some message")` instead of > `return -1`, or do you expect more of a change? I am not sure what you are asking. The comment was more about my preference of honestly returning failure with explanation over pretending success without actually doing anything, wasn't it? ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 2/2] bundle-uri: do not abort on invalid packet line 2025-09-12 16:46 [PATCH 0/2] Make git-clone(1) more resilient when using bundle-URI Toon Claes 2025-09-12 16:46 ` [PATCH 1/2] bundle-uri: ignore bundles without uri Toon Claes @ 2025-09-12 16:46 ` Toon Claes 2025-09-12 17:37 ` Justin Tobler 1 sibling, 1 reply; 10+ messages in thread From: Toon Claes @ 2025-09-12 16:46 UTC (permalink / raw) To: git; +Cc: Justin Tobler, Toon Claes On clone, when the client sends the `bundle-uri` command, the server might respond with invalid data. For example if it sends information about a bundle where the 'uri' is empty, it produces the following error: Cloning into 'foo'... error: bundle-uri: line has empty key or value error: error on bundle-uri response line 4: bundle.bundle-1.uri= error: could not retrieve server-advertised bundle-uri list This error doesn't cause git-clone(1) to abort, because the return value from `transport_get_remote_bundle_uri()` is ignored in `builtin/clone.c`. This should allow the clone to continue *without* the use of bundle URIs. Although when cloning over HTTP, the following error occurs after the above error messages: fatal: expected 'packfile' This is happens because there remains unprocessed data from the bundle-URI negotiation. Fix the error by continuing to read packet data when an invalid bundle-uri line is received. Signed-off-by: Toon Claes <toon@iotcl.com> --- connect.c | 4 ++-- t/t5558-clone-bundle-uri.sh | 25 +++++++++++++++++++++++++ 2 files changed, 27 insertions(+), 2 deletions(-) diff --git a/connect.c b/connect.c index 8352b71faf..d2e2bd8cce 100644 --- a/connect.c +++ b/connect.c @@ -536,8 +536,8 @@ int get_remote_bundle_uri(int fd_out, struct packet_reader *reader, if (!bundle_uri_parse_line(bundles, line)) continue; - return error(_("error on bundle-uri response line %d: %s"), - line_nr, line); + warning(_("ignore invalid bundle-uri response line %d: %s"), + line_nr, line); } if (reader->status != PACKET_READ_FLUSH) diff --git a/t/t5558-clone-bundle-uri.sh b/t/t5558-clone-bundle-uri.sh index 3cf498b950..73aebd0b81 100755 --- a/t/t5558-clone-bundle-uri.sh +++ b/t/t5558-clone-bundle-uri.sh @@ -1326,6 +1326,31 @@ test_expect_success 'bundles with newline in target path are rejected' ' test_path_is_missing escape ' +test_expect_success 'bundles advertised with missing URI' ' + git clone --no-local --mirror clone-from \ + "$HTTPD_DOCUMENT_ROOT_PATH/no-uri.git" && + git -C "$HTTPD_DOCUMENT_ROOT_PATH/no-uri.git" config uploadpack.advertiseBundleURIs true && + git -C "$HTTPD_DOCUMENT_ROOT_PATH/no-uri.git" config bundle.version 1 && + git -C "$HTTPD_DOCUMENT_ROOT_PATH/no-uri.git" config bundle.mode all && + git -C "$HTTPD_DOCUMENT_ROOT_PATH/no-uri.git" config bundle.bundle-1.creationToken 1 && + + git -c transfer.bundleURI=true clone \ + "$HTTPD_URL/smart/no-uri.git" target-no-uri +' + +test_expect_success 'bundles advertised with empty URI' ' + git clone --no-local --mirror clone-from \ + "$HTTPD_DOCUMENT_ROOT_PATH/empty-uri.git" && + git -C "$HTTPD_DOCUMENT_ROOT_PATH/empty-uri.git" config uploadpack.advertiseBundleURIs true && + git -C "$HTTPD_DOCUMENT_ROOT_PATH/empty-uri.git" config bundle.version 1 && + git -C "$HTTPD_DOCUMENT_ROOT_PATH/empty-uri.git" config bundle.mode all && + git -C "$HTTPD_DOCUMENT_ROOT_PATH/empty-uri.git" config bundle.bundle-1.uri "" && + git -C "$HTTPD_DOCUMENT_ROOT_PATH/empty-uri.git" config bundle.bundle-1.creationToken 1 && + + git -c transfer.bundleURI=true clone \ + "$HTTPD_URL/smart/empty-uri.git" target-empty-uri +' + # Do not add tests here unless they use the HTTP server, as they will # not run unless the HTTP dependencies exist. -- 2.51.0 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] bundle-uri: do not abort on invalid packet line 2025-09-12 16:46 ` [PATCH 2/2] bundle-uri: do not abort on invalid packet line Toon Claes @ 2025-09-12 17:37 ` Justin Tobler 2025-09-17 8:40 ` Toon Claes 0 siblings, 1 reply; 10+ messages in thread From: Justin Tobler @ 2025-09-12 17:37 UTC (permalink / raw) To: Toon Claes; +Cc: git On 25/09/12 06:46PM, Toon Claes wrote: > On clone, when the client sends the `bundle-uri` command, the server > might respond with invalid data. For example if it sends information > about a bundle where the 'uri' is empty, it produces the following > error: > > Cloning into 'foo'... > error: bundle-uri: line has empty key or value > error: error on bundle-uri response line 4: bundle.bundle-1.uri= > error: could not retrieve server-advertised bundle-uri list > > This error doesn't cause git-clone(1) to abort, because the return value > from `transport_get_remote_bundle_uri()` is ignored in > `builtin/clone.c`. This should allow the clone to continue *without* the > use of bundle URIs. > > Although when cloning over HTTP, the following error occurs after the > above error messages: > > fatal: expected 'packfile' > > This is happens because there remains unprocessed data from the > bundle-URI negotiation. > > Fix the error by continuing to read packet data when an invalid > bundle-uri line is received. Is there any reason that the server should be expected to invalid data to the client? If the server is misconfigured, I wonder if it should instead handle this issue by not sending the invalid bundle-uri in the first place and printing a warning message on the server-side. From client perspective, if it's a server-side issue there may not be much they can do about the error and it could cause some confusion. > Signed-off-by: Toon Claes <toon@iotcl.com> > --- > connect.c | 4 ++-- > t/t5558-clone-bundle-uri.sh | 25 +++++++++++++++++++++++++ > 2 files changed, 27 insertions(+), 2 deletions(-) > > diff --git a/connect.c b/connect.c > index 8352b71faf..d2e2bd8cce 100644 > --- a/connect.c > +++ b/connect.c > @@ -536,8 +536,8 @@ int get_remote_bundle_uri(int fd_out, struct packet_reader *reader, > if (!bundle_uri_parse_line(bundles, line)) > continue; > > - return error(_("error on bundle-uri response line %d: %s"), > - line_nr, line); > + warning(_("ignore invalid bundle-uri response line %d: %s"), > + line_nr, line); If I'm understanding correctly, an error here indicates some sort of issue between the client and remote Git server while figuring out the bundle-uri capability. I think it is reasonable for the client to always expect the server to communicate in a way it understands and IMO should probably be handled by fixing the server-side instead. -Justin ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] bundle-uri: do not abort on invalid packet line 2025-09-12 17:37 ` Justin Tobler @ 2025-09-17 8:40 ` Toon Claes 2025-09-17 14:30 ` Justin Tobler 0 siblings, 1 reply; 10+ messages in thread From: Toon Claes @ 2025-09-17 8:40 UTC (permalink / raw) To: Justin Tobler; +Cc: git Justin Tobler <jltobler@gmail.com> writes: > Is there any reason that the server should be expected to invalid data > to the client? You mean "should be sending"? Well, at $DAYJOB we've had some unintentional misconfiguration that led to a similar failure[1]. I wasn't able to reproduce the exact same failure in test, because it requires to pass different configuration to the `GET info/refs` and `POST upload-pack` handlers, but the effect is the same. [1]: https://gitlab.com/gitlab-org/git/-/issues/564 > If the server is misconfigured, I wonder if it should > instead handle this issue by not sending the invalid bundle-uri in the > first place and printing a warning message on the server-side. From > client perspective, if it's a server-side issue there may not be much > they can do about the error and it could cause some confusion. I can include a server-side fix in this series as well, but that doesn't hold back there might be servers out there in the field that don't have that fix and still serve invalid data. I think ideally we should fix it on both sides. >> Signed-off-by: Toon Claes <toon@iotcl.com> >> --- >> connect.c | 4 ++-- >> t/t5558-clone-bundle-uri.sh | 25 +++++++++++++++++++++++++ >> 2 files changed, 27 insertions(+), 2 deletions(-) >> >> diff --git a/connect.c b/connect.c >> index 8352b71faf..d2e2bd8cce 100644 >> --- a/connect.c >> +++ b/connect.c >> @@ -536,8 +536,8 @@ int get_remote_bundle_uri(int fd_out, struct packet_reader *reader, >> if (!bundle_uri_parse_line(bundles, line)) >> continue; >> >> - return error(_("error on bundle-uri response line %d: %s"), >> - line_nr, line); >> + warning(_("ignore invalid bundle-uri response line %d: %s"), >> + line_nr, line); > > If I'm understanding correctly, an error here indicates some sort of > issue between the client and remote Git server while figuring out the > bundle-uri capability. > I think it is reasonable for the client to always > expect the server to communicate in a way it understands and IMO should > probably be handled by fixing the server-side instead. Ideally yes, but broken servers exist now. Having clients deal with it properly is anyway advised in my opinion. -- Cheers, Toon ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] bundle-uri: do not abort on invalid packet line 2025-09-17 8:40 ` Toon Claes @ 2025-09-17 14:30 ` Justin Tobler 0 siblings, 0 replies; 10+ messages in thread From: Justin Tobler @ 2025-09-17 14:30 UTC (permalink / raw) To: Toon Claes; +Cc: git On 25/09/17 10:40AM, Toon Claes wrote: > Justin Tobler <jltobler@gmail.com> writes: > > > If the server is misconfigured, I wonder if it should > > instead handle this issue by not sending the invalid bundle-uri in the > > first place and printing a warning message on the server-side. From > > client perspective, if it's a server-side issue there may not be much > > they can do about the error and it could cause some confusion. > > I can include a server-side fix in this series as well, but that doesn't > hold back there might be servers out there in the field that don't have > that fix and still serve invalid data. I think ideally we should fix it > on both sides. From my perspective, a server-side fix is the most important as it addresses the root of the problem. That is, a remote Git server with misconfigured bundle-uri is communicating with a client in an invalid manner. We could address this by updating the client to simply ignore the miscommunicating server. I'm hesitant though because this feels like a slightly different class of issue here being that, from my understanding, the server is not correctly negotiating the capability with the client. Maybe it is fine though for clients to ignore errors in this process? I think I would be more inclined to agree with a client-side fix if the issue was more pronounced in practice. From my understanding, for this to occur: 1. A remote Git server must be advertising bundle-uri. 2. The bundle-uri must be misconfigured on the server-side. 3. The client must request the bundle-uri from the server via transfer.bundleURI. A user can workaround this problem simply by disabling bundle-uri. I suspect there are not many users in this situation though. > >> Signed-off-by: Toon Claes <toon@iotcl.com> > >> --- > >> connect.c | 4 ++-- > >> t/t5558-clone-bundle-uri.sh | 25 +++++++++++++++++++++++++ > >> 2 files changed, 27 insertions(+), 2 deletions(-) > >> > >> diff --git a/connect.c b/connect.c > >> index 8352b71faf..d2e2bd8cce 100644 > >> --- a/connect.c > >> +++ b/connect.c > >> @@ -536,8 +536,8 @@ int get_remote_bundle_uri(int fd_out, struct packet_reader *reader, > >> if (!bundle_uri_parse_line(bundles, line)) > >> continue; > >> > >> - return error(_("error on bundle-uri response line %d: %s"), > >> - line_nr, line); > >> + warning(_("ignore invalid bundle-uri response line %d: %s"), > >> + line_nr, line); > > > > If I'm understanding correctly, an error here indicates some sort of > > issue between the client and remote Git server while figuring out the > > bundle-uri capability. > > > > I think it is reasonable for the client to always > > expect the server to communicate in a way it understands and IMO should > > probably be handled by fixing the server-side instead. > > Ideally yes, but broken servers exist now. Having clients deal with it > properly is anyway advised in my opinion. Another benefit of addressing this on the server-side is that it should transparently fix the issue for clients. I would like to think server providers using the bundle-uri feature would be more inclined than clients to update quickly and address the problem. Regardless, I still think the issue is niche enough not to be a huge problem for clients. IMO, I think a server-side fix should come first. If this step isn't enough, then it may make sense to also address it on the client side. -Justin ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2025-09-17 14:30 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-09-12 16:46 [PATCH 0/2] Make git-clone(1) more resilient when using bundle-URI Toon Claes 2025-09-12 16:46 ` [PATCH 1/2] bundle-uri: ignore bundles without uri Toon Claes 2025-09-12 17:11 ` Justin Tobler 2025-09-12 17:18 ` Junio C Hamano 2025-09-16 15:25 ` Toon Claes 2025-09-16 16:59 ` Junio C Hamano 2025-09-12 16:46 ` [PATCH 2/2] bundle-uri: do not abort on invalid packet line Toon Claes 2025-09-12 17:37 ` Justin Tobler 2025-09-17 8:40 ` Toon Claes 2025-09-17 14:30 ` Justin Tobler
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).