* tests broken with curl-8.7.0
@ 2024-03-30 0:02 Jeff King
2024-03-30 8:54 ` Daniel Stenberg
0 siblings, 1 reply; 15+ messages in thread
From: Jeff King @ 2024-03-30 0:02 UTC (permalink / raw)
To: git; +Cc: Daniel Stenberg
I noticed some http-related failures in the test suite on my Debian
unstable system, which recently got an upgraded curl package. It looks
like it's related to cases where we use the remote-curl helper in
"connect" mode (i.e., protocol v2) and the http buffer is small
(requiring us to stream the data to curl). Besides just running t5551,
an easy reproduction is:
[this works]
$ git ls-remote https://git.kernel.org/pub/scm/git/git.git | wc -l
1867
[this doesn't]
$ git -c http.postbuffer=65536 ls-remote https://git.kernel.org/pub/scm/git/git.git
fatal: expected flush after ref listing
The error message comes from ls-remote itself, which was expecting a
FLUSH packet from the remote. Instead it gets the RESPONSE_END from
remote-curl (remember that in connect mode, remote-curl is just ferrying
bytes back and forth between ls-remote and the server).
It works with older versions of libcurl, but not 8.7.0 (or 8.7.1).
Bisecting in libcurl points to 9369c30cd (lib: Curl_read/Curl_write
clarifications, 2024-02-15).
Running with GIT_TRACE_CURL=1 shows weirdness on the POST we send to
issue the ls-refs command. With older curl, I see this:
=> Send header: POST /pub/scm/git/git.git/git-upload-pack HTTP/1.1
=> Send header: Host: git.kernel.org
=> Send header: User-Agent: git/2.44.0.789.g252ee96bc5.dirty
=> Send header: Accept-Encoding: deflate, gzip
=> Send header: Content-Type: application/x-git-upload-pack-request
=> Send header: Accept: application/x-git-upload-pack-result
=> Send header: Git-Protocol: version=2
=> Send header: Transfer-Encoding: chunked
=> Send header:
=> Send data: 14..0014command=ls-refs...
=> Send data: 2a..002aagent=git/2.44.0.789.g252ee96bc5.dirty..
[and so on until...]
== Info: Signaling end of chunked upload via terminating chunk.
But with the broken version, I get:
=> Send header: POST /pub/scm/git/git.git/git-upload-pack HTTP/1.1
=> Send header: Host: git.kernel.org
=> Send header: User-Agent: git/2.44.0.789.g252ee96bc5.dirty
=> Send header: Accept-Encoding: deflate, gzip, br, zstd
=> Send header: Content-Type: application/x-git-upload-pack-request
=> Send header: Accept: application/x-git-upload-pack-result
=> Send header: Git-Protocol: version=2
=> Send header: Transfer-Encoding: chunked
=> Send header:
=> Send data, 0000000014 bytes (0x0000000e)
=> Send data: 4..0014..0....
== Info: upload completely sent off: 14 bytes
So we only get the first 4 bytes, and then we quit (the double mention
of 14 is confusing, but I think it is both the size of the pkt-line
("command=ls-refs\n") but also the length of the 4-byte string when
framed with chunked transfer-encoding). Those 4 bytes are the first
thing returned by rpc_out(), which we use as our CURLOPT_READFUNCTION.
It's possible that we're doing something wrong with our read/write
function callbacks. But I don't see how; we say "here's 4 bytes", but
then we never get called again. It's like curl is giving up on trying to
read the post input early for some reason.
I'm not sure how to dig further. That commit is pretty big and scary. I
did check that the tip of master in curl.git is still affected (I'd
hoped maybe the 0-length write fixes in b30d694a027 would be related,
but that's not it).
Ideas?
-Peff
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: tests broken with curl-8.7.0 2024-03-30 0:02 tests broken with curl-8.7.0 Jeff King @ 2024-03-30 8:54 ` Daniel Stenberg 2024-04-02 20:02 ` [PATCH 0/2] git+curl 8.7.0 workaround Jeff King 0 siblings, 1 reply; 15+ messages in thread From: Daniel Stenberg @ 2024-03-30 8:54 UTC (permalink / raw) To: Jeff King; +Cc: git On Fri, 29 Mar 2024, Jeff King wrote: > I noticed some http-related failures in the test suite on my Debian unstable > system, which recently got an upgraded curl package. It looks like it's > related to cases where we use the remote-curl helper in "connect" mode > (i.e., protocol v2) and the http buffer is small (requiring us to stream the > data to curl). Besides just running t5551, an easy reproduction is: This smells like a libcurl regression to me. I "imported" this into our issue tracker here: https://github.com/curl/curl/issues/13229 -- / daniel.haxx.se ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 0/2] git+curl 8.7.0 workaround 2024-03-30 8:54 ` Daniel Stenberg @ 2024-04-02 20:02 ` Jeff King 2024-04-02 20:05 ` [PATCH 1/2] http: reset POSTFIELDSIZE when clearing curl handle Jeff King ` (3 more replies) 0 siblings, 4 replies; 15+ messages in thread From: Jeff King @ 2024-04-02 20:02 UTC (permalink / raw) To: git; +Cc: Daniel Stenberg On Sat, Mar 30, 2024 at 09:54:02AM +0100, Daniel Stenberg wrote: > On Fri, 29 Mar 2024, Jeff King wrote: > > > I noticed some http-related failures in the test suite on my Debian > > unstable system, which recently got an upgraded curl package. It looks > > like it's related to cases where we use the remote-curl helper in > > "connect" mode (i.e., protocol v2) and the http buffer is small > > (requiring us to stream the data to curl). Besides just running t5551, > > an easy reproduction is: > > This smells like a libcurl regression to me. I "imported" this into our > issue tracker here: https://github.com/curl/curl/issues/13229 This was all resolved in that issue, but just to summarize for the list here: it was a regression in curl and there's a fix already. Thanks Daniel for your (as usual) prompt digging into the problem (and likewise to Stefan for the actual fix). Ultimately the issue will be fixed by moving to a different version of libcurl, but here's an easy workaround in the meantime, with a small doc cleanup I found along the way. [1/2]: http: reset POSTFIELDSIZE when clearing curl handle [2/2]: INSTALL: bump libcurl version to 7.21.3 INSTALL | 2 +- http.c | 1 + 2 files changed, 2 insertions(+), 1 deletion(-) -Peff ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 1/2] http: reset POSTFIELDSIZE when clearing curl handle 2024-04-02 20:02 ` [PATCH 0/2] git+curl 8.7.0 workaround Jeff King @ 2024-04-02 20:05 ` Jeff King 2024-04-02 20:27 ` Junio C Hamano ` (2 more replies) 2024-04-02 20:06 ` [PATCH 2/2] INSTALL: bump libcurl version to 7.21.3 Jeff King ` (2 subsequent siblings) 3 siblings, 3 replies; 15+ messages in thread From: Jeff King @ 2024-04-02 20:05 UTC (permalink / raw) To: git; +Cc: Daniel Stenberg In get_active_slot(), we return a CURL handle that may have been used before (reusing them is good because it lets curl reuse the same connection across many requests). We set a few curl options back to defaults that may have been modified by previous requests. We reset POSTFIELDS to NULL, but do not reset POSTFIELDSIZE (which defaults to "-1"). This usually doesn't matter because most POSTs will set both fields together anyway. But there is one exception: when handling a large request in remote-curl's post_rpc(), we don't set _either_, and instead set a READFUNCTION to stream data into libcurl. This can interact weirdly with a stale POSTFIELDSIZE setting, because curl will assume it should read only some set number of bytes from our READFUNCTION. However, it has worked in practice because we also manually set a "Transfer-Encoding: chunked" header, which libcurl uses as a clue to set the POSTFIELDSIZE to -1 itself. So everything works, but we're better off resetting the size manually for a few reasons: - there was a regression in curl 8.7.0 where the chunked header detection didn't kick in, causing any large HTTP requests made by Git to fail. This has since been fixed (but not yet released). In the issue, curl folks recommended setting it explicitly to -1: https://github.com/curl/curl/issues/13229#issuecomment-2029826058 and it indeed works around the regression. So even though it won't be strictly necessary after the fix there, this will help folks who end up using the affected libcurl versions. - it's consistent with what a new curl handle would look like. Since get_active_slot() may or may not return a used handle, this reduces the possibility of heisenbugs that only appear with certain request patterns. Note that the recommendation in the curl issue is to actually drop the manual Transfer-Encoding header. Modern libcurl will add the header itself when streaming from a READFUNCTION. However, that code wasn't added until 802aa5ae2 (HTTP: use chunked Transfer-Encoding for HTTP_POST if size unknown, 2019-07-22), which is in curl 7.66.0. We claim to support back to 7.19.5, so those older versions still need the manual header. Signed-off-by: Jeff King <peff@peff.net> --- http.c | 1 + 1 file changed, 1 insertion(+) diff --git a/http.c b/http.c index e73b136e58..3d80bd6116 100644 --- a/http.c +++ b/http.c @@ -1452,6 +1452,7 @@ struct active_request_slot *get_active_slot(void) curl_easy_setopt(slot->curl, CURLOPT_READFUNCTION, NULL); curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, NULL); curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDS, NULL); + curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDSIZE, -1L); curl_easy_setopt(slot->curl, CURLOPT_UPLOAD, 0); curl_easy_setopt(slot->curl, CURLOPT_HTTPGET, 1); curl_easy_setopt(slot->curl, CURLOPT_FAILONERROR, 1); -- 2.44.0.789.g5ea01f6724 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] http: reset POSTFIELDSIZE when clearing curl handle 2024-04-02 20:05 ` [PATCH 1/2] http: reset POSTFIELDSIZE when clearing curl handle Jeff King @ 2024-04-02 20:27 ` Junio C Hamano 2024-04-03 3:20 ` Jeff King 2024-04-03 6:30 ` Patrick Steinhardt 2 siblings, 0 replies; 15+ messages in thread From: Junio C Hamano @ 2024-04-02 20:27 UTC (permalink / raw) To: Jeff King; +Cc: git, Daniel Stenberg Jeff King <peff@peff.net> writes: > In get_active_slot(), we return a CURL handle that may have been used > before (reusing them is good because it lets curl reuse the same > connection across many requests). We set a few curl options back to > defaults that may have been modified by previous requests. > > We reset POSTFIELDS to NULL, but do not reset POSTFIELDSIZE (which > defaults to "-1"). This usually doesn't matter because most POSTs will > set both fields together anyway. But there is one exception: when > handling a large request in remote-curl's post_rpc(), we don't set > _either_, and instead set a READFUNCTION to stream data into libcurl. > > This can interact weirdly with a stale POSTFIELDSIZE setting, because > curl will assume it should read only some set number of bytes from our > READFUNCTION. However, it has worked in practice because we also > manually set a "Transfer-Encoding: chunked" header, which libcurl uses > as a clue to set the POSTFIELDSIZE to -1 itself. > > So everything works, but we're better off resetting the size manually > for a few reasons: > > - there was a regression in curl 8.7.0 where the chunked header > detection didn't kick in, causing any large HTTP requests made by > Git to fail. This has since been fixed (but not yet released). In > the issue, curl folks recommended setting it explicitly to -1: > > https://github.com/curl/curl/issues/13229#issuecomment-2029826058 > > and it indeed works around the regression. So even though it won't > be strictly necessary after the fix there, this will help folks who > end up using the affected libcurl versions. > > - it's consistent with what a new curl handle would look like. Since > get_active_slot() may or may not return a used handle, this reduces > the possibility of heisenbugs that only appear with certain request > patterns. > > Note that the recommendation in the curl issue is to actually drop the > manual Transfer-Encoding header. Modern libcurl will add the header > itself when streaming from a READFUNCTION. However, that code wasn't > added until 802aa5ae2 (HTTP: use chunked Transfer-Encoding for HTTP_POST > if size unknown, 2019-07-22), which is in curl 7.66.0. We claim to > support back to 7.19.5, so those older versions still need the manual > header. > > Signed-off-by: Jeff King <peff@peff.net> > --- > http.c | 1 + > 1 file changed, 1 insertion(+) As always, well articulated. Thanks. Will queue. > diff --git a/http.c b/http.c > index e73b136e58..3d80bd6116 100644 > --- a/http.c > +++ b/http.c > @@ -1452,6 +1452,7 @@ struct active_request_slot *get_active_slot(void) > curl_easy_setopt(slot->curl, CURLOPT_READFUNCTION, NULL); > curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, NULL); > curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDS, NULL); > + curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDSIZE, -1L); > curl_easy_setopt(slot->curl, CURLOPT_UPLOAD, 0); > curl_easy_setopt(slot->curl, CURLOPT_HTTPGET, 1); > curl_easy_setopt(slot->curl, CURLOPT_FAILONERROR, 1); ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] http: reset POSTFIELDSIZE when clearing curl handle 2024-04-02 20:05 ` [PATCH 1/2] http: reset POSTFIELDSIZE when clearing curl handle Jeff King 2024-04-02 20:27 ` Junio C Hamano @ 2024-04-03 3:20 ` Jeff King 2024-04-03 6:30 ` Patrick Steinhardt 2 siblings, 0 replies; 15+ messages in thread From: Jeff King @ 2024-04-03 3:20 UTC (permalink / raw) To: git; +Cc: Daniel Stenberg On Tue, Apr 02, 2024 at 04:05:17PM -0400, Jeff King wrote: > So everything works, but we're better off resetting the size manually > for a few reasons: > > - there was a regression in curl 8.7.0 where the chunked header > detection didn't kick in, causing any large HTTP requests made by > Git to fail. This has since been fixed (but not yet released). In > the issue, curl folks recommended setting it explicitly to -1: > > https://github.com/curl/curl/issues/13229#issuecomment-2029826058 > > and it indeed works around the regression. So even though it won't > be strictly necessary after the fix there, this will help folks who > end up using the affected libcurl versions. Hmph. This isn't quite enough to make things work with 8.7.0, because there are two things wrong there: 1. curl uses the leftover POSTFIELDSIZE, which this patch fixes. So HTTP/1.1 is fine (including t5551). 2. In HTTP/2 mode, it sends chunked data, even though HTTP/2 does not use "Transfer-Encoding: chunked" (it correctly does not send the header, but it puts unnecessary chunk framing around the data). t5559 (which covers HTTP/2) fails, and so does accessing HTTP/2 capable hosts like github.com (though score another win for t5559; it has introduced some hassles, but it diagnosed a real problem we would not have otherwise seen in the test suite). This second problem can be fixed by eliminating the manual transfer-encoding header, which is what's confusing curl's HTTP/2 code. But as I said earlier... > Note that the recommendation in the curl issue is to actually drop the > manual Transfer-Encoding header. Modern libcurl will add the header > itself when streaming from a READFUNCTION. However, that code wasn't > added until 802aa5ae2 (HTTP: use chunked Transfer-Encoding for HTTP_POST > if size unknown, 2019-07-22), which is in curl 7.66.0. We claim to > support back to 7.19.5, so those older versions still need the manual > header. ...we can't do so unconditionally. So we'd need something like: diff --git a/remote-curl.c b/remote-curl.c index 31b02b8840..215bcc6e10 100644 --- a/remote-curl.c +++ b/remote-curl.c @@ -955,7 +955,9 @@ static int post_rpc(struct rpc_state *rpc, int stateless_connect, int flush_rece /* The request body is large and the size cannot be predicted. * We must use chunked encoding to send it. */ +#if LIBCURL_VERSION_NUM < 0x074200 headers = curl_slist_append(headers, "Transfer-Encoding: chunked"); +#endif rpc->initial_buffer = 1; curl_easy_setopt(slot->curl, CURLOPT_READFUNCTION, rpc_out); curl_easy_setopt(slot->curl, CURLOPT_INFILE, rpc); I was hoping not to have to carry any version-specific cruft (especially since newer versions of curl should fix the regression). But it's not too bad, and it actually marks the code as something that we can ditch in the future when that version of curl becomes obsolete. I can try to prepare a cleaner patch for it tomorrow, but comments welcome in the meantime. -Peff ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] http: reset POSTFIELDSIZE when clearing curl handle 2024-04-02 20:05 ` [PATCH 1/2] http: reset POSTFIELDSIZE when clearing curl handle Jeff King 2024-04-02 20:27 ` Junio C Hamano 2024-04-03 3:20 ` Jeff King @ 2024-04-03 6:30 ` Patrick Steinhardt 2024-04-03 6:34 ` Patrick Steinhardt 2 siblings, 1 reply; 15+ messages in thread From: Patrick Steinhardt @ 2024-04-03 6:30 UTC (permalink / raw) To: Jeff King; +Cc: git, Daniel Stenberg [-- Attachment #1: Type: text/plain, Size: 4596 bytes --] On Tue, Apr 02, 2024 at 04:05:17PM -0400, Jeff King wrote: > In get_active_slot(), we return a CURL handle that may have been used > before (reusing them is good because it lets curl reuse the same > connection across many requests). We set a few curl options back to > defaults that may have been modified by previous requests. > > We reset POSTFIELDS to NULL, but do not reset POSTFIELDSIZE (which > defaults to "-1"). This usually doesn't matter because most POSTs will > set both fields together anyway. But there is one exception: when > handling a large request in remote-curl's post_rpc(), we don't set > _either_, and instead set a READFUNCTION to stream data into libcurl. > > This can interact weirdly with a stale POSTFIELDSIZE setting, because > curl will assume it should read only some set number of bytes from our > READFUNCTION. However, it has worked in practice because we also > manually set a "Transfer-Encoding: chunked" header, which libcurl uses > as a clue to set the POSTFIELDSIZE to -1 itself. > > So everything works, but we're better off resetting the size manually > for a few reasons: > > - there was a regression in curl 8.7.0 where the chunked header > detection didn't kick in, causing any large HTTP requests made by > Git to fail. This has since been fixed (but not yet released). In > the issue, curl folks recommended setting it explicitly to -1: > > https://github.com/curl/curl/issues/13229#issuecomment-2029826058 > > and it indeed works around the regression. So even though it won't > be strictly necessary after the fix there, this will help folks who > end up using the affected libcurl versions. > > - it's consistent with what a new curl handle would look like. Since > get_active_slot() may or may not return a used handle, this reduces > the possibility of heisenbugs that only appear with certain request > patterns. > > Note that the recommendation in the curl issue is to actually drop the > manual Transfer-Encoding header. Modern libcurl will add the header > itself when streaming from a READFUNCTION. However, that code wasn't > added until 802aa5ae2 (HTTP: use chunked Transfer-Encoding for HTTP_POST > if size unknown, 2019-07-22), which is in curl 7.66.0. We claim to > support back to 7.19.5, so those older versions still need the manual > header. > > Signed-off-by: Jeff King <peff@peff.net> > --- > http.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/http.c b/http.c > index e73b136e58..3d80bd6116 100644 > --- a/http.c > +++ b/http.c > @@ -1452,6 +1452,7 @@ struct active_request_slot *get_active_slot(void) > curl_easy_setopt(slot->curl, CURLOPT_READFUNCTION, NULL); > curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, NULL); > curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDS, NULL); > + curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDSIZE, -1L); > curl_easy_setopt(slot->curl, CURLOPT_UPLOAD, 0); > curl_easy_setopt(slot->curl, CURLOPT_HTTPGET, 1); > curl_easy_setopt(slot->curl, CURLOPT_FAILONERROR, 1); Can't we refactor this code to instead use `curl_easy_reset()`? That function already resets most of the data we want to reset and would also end up setting `POSFIELDSIZE = -1` via `Curl_init_userdefined()`. So wouldn't the following be a more sensible fix? diff --git a/http.c b/http.c index e73b136e58..e5f5bc23db 100644 --- a/http.c +++ b/http.c @@ -1442,20 +1442,14 @@ struct active_request_slot *get_active_slot(void) slot->finished = NULL; slot->callback_data = NULL; slot->callback_func = NULL; + curl_easy_reset(slot->curl); curl_easy_setopt(slot->curl, CURLOPT_COOKIEFILE, curl_cookie_file); if (curl_save_cookies) curl_easy_setopt(slot->curl, CURLOPT_COOKIEJAR, curl_cookie_file); curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, pragma_header); curl_easy_setopt(slot->curl, CURLOPT_RESOLVE, host_resolutions); curl_easy_setopt(slot->curl, CURLOPT_ERRORBUFFER, curl_errorstr); - curl_easy_setopt(slot->curl, CURLOPT_CUSTOMREQUEST, NULL); - curl_easy_setopt(slot->curl, CURLOPT_READFUNCTION, NULL); - curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, NULL); - curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDS, NULL); - curl_easy_setopt(slot->curl, CURLOPT_UPLOAD, 0); - curl_easy_setopt(slot->curl, CURLOPT_HTTPGET, 1); curl_easy_setopt(slot->curl, CURLOPT_FAILONERROR, 1); - curl_easy_setopt(slot->curl, CURLOPT_RANGE, NULL); /* * Default following to off unless "ALWAYS" is configured; this gives Patrick [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] http: reset POSTFIELDSIZE when clearing curl handle 2024-04-03 6:30 ` Patrick Steinhardt @ 2024-04-03 6:34 ` Patrick Steinhardt 2024-04-03 20:18 ` Jeff King 0 siblings, 1 reply; 15+ messages in thread From: Patrick Steinhardt @ 2024-04-03 6:34 UTC (permalink / raw) To: Jeff King; +Cc: git, Daniel Stenberg [-- Attachment #1: Type: text/plain, Size: 5054 bytes --] On Wed, Apr 03, 2024 at 08:30:54AM +0200, Patrick Steinhardt wrote: > On Tue, Apr 02, 2024 at 04:05:17PM -0400, Jeff King wrote: > > In get_active_slot(), we return a CURL handle that may have been used > > before (reusing them is good because it lets curl reuse the same > > connection across many requests). We set a few curl options back to > > defaults that may have been modified by previous requests. > > > > We reset POSTFIELDS to NULL, but do not reset POSTFIELDSIZE (which > > defaults to "-1"). This usually doesn't matter because most POSTs will > > set both fields together anyway. But there is one exception: when > > handling a large request in remote-curl's post_rpc(), we don't set > > _either_, and instead set a READFUNCTION to stream data into libcurl. > > > > This can interact weirdly with a stale POSTFIELDSIZE setting, because > > curl will assume it should read only some set number of bytes from our > > READFUNCTION. However, it has worked in practice because we also > > manually set a "Transfer-Encoding: chunked" header, which libcurl uses > > as a clue to set the POSTFIELDSIZE to -1 itself. > > > > So everything works, but we're better off resetting the size manually > > for a few reasons: > > > > - there was a regression in curl 8.7.0 where the chunked header > > detection didn't kick in, causing any large HTTP requests made by > > Git to fail. This has since been fixed (but not yet released). In > > the issue, curl folks recommended setting it explicitly to -1: > > > > https://github.com/curl/curl/issues/13229#issuecomment-2029826058 > > > > and it indeed works around the regression. So even though it won't > > be strictly necessary after the fix there, this will help folks who > > end up using the affected libcurl versions. > > > > - it's consistent with what a new curl handle would look like. Since > > get_active_slot() may or may not return a used handle, this reduces > > the possibility of heisenbugs that only appear with certain request > > patterns. > > > > Note that the recommendation in the curl issue is to actually drop the > > manual Transfer-Encoding header. Modern libcurl will add the header > > itself when streaming from a READFUNCTION. However, that code wasn't > > added until 802aa5ae2 (HTTP: use chunked Transfer-Encoding for HTTP_POST > > if size unknown, 2019-07-22), which is in curl 7.66.0. We claim to > > support back to 7.19.5, so those older versions still need the manual > > header. > > > > Signed-off-by: Jeff King <peff@peff.net> > > --- > > http.c | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/http.c b/http.c > > index e73b136e58..3d80bd6116 100644 > > --- a/http.c > > +++ b/http.c > > @@ -1452,6 +1452,7 @@ struct active_request_slot *get_active_slot(void) > > curl_easy_setopt(slot->curl, CURLOPT_READFUNCTION, NULL); > > curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, NULL); > > curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDS, NULL); > > + curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDSIZE, -1L); > > curl_easy_setopt(slot->curl, CURLOPT_UPLOAD, 0); > > curl_easy_setopt(slot->curl, CURLOPT_HTTPGET, 1); > > curl_easy_setopt(slot->curl, CURLOPT_FAILONERROR, 1); > > Can't we refactor this code to instead use `curl_easy_reset()`? That > function already resets most of the data we want to reset and would also > end up setting `POSFIELDSIZE = -1` via `Curl_init_userdefined()`. So > wouldn't the following be a more sensible fix? > > diff --git a/http.c b/http.c > index e73b136e58..e5f5bc23db 100644 > --- a/http.c > +++ b/http.c > @@ -1442,20 +1442,14 @@ struct active_request_slot *get_active_slot(void) > slot->finished = NULL; > slot->callback_data = NULL; > slot->callback_func = NULL; > + curl_easy_reset(slot->curl); > curl_easy_setopt(slot->curl, CURLOPT_COOKIEFILE, curl_cookie_file); > if (curl_save_cookies) > curl_easy_setopt(slot->curl, CURLOPT_COOKIEJAR, curl_cookie_file); > curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, pragma_header); > curl_easy_setopt(slot->curl, CURLOPT_RESOLVE, host_resolutions); > curl_easy_setopt(slot->curl, CURLOPT_ERRORBUFFER, curl_errorstr); > - curl_easy_setopt(slot->curl, CURLOPT_CUSTOMREQUEST, NULL); > - curl_easy_setopt(slot->curl, CURLOPT_READFUNCTION, NULL); > - curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, NULL); > - curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDS, NULL); > - curl_easy_setopt(slot->curl, CURLOPT_UPLOAD, 0); > - curl_easy_setopt(slot->curl, CURLOPT_HTTPGET, 1); > curl_easy_setopt(slot->curl, CURLOPT_FAILONERROR, 1); > - curl_easy_setopt(slot->curl, CURLOPT_RANGE, NULL); > > /* > * Default following to off unless "ALWAYS" is configured; this gives Oh well, the answer is "no", or at least not as easily as this, as the failing tests tell us. I guess it resets more data than we actually want it to reset, but I didn't dig any deeper than that. Patrick [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] http: reset POSTFIELDSIZE when clearing curl handle 2024-04-03 6:34 ` Patrick Steinhardt @ 2024-04-03 20:18 ` Jeff King 0 siblings, 0 replies; 15+ messages in thread From: Jeff King @ 2024-04-03 20:18 UTC (permalink / raw) To: Patrick Steinhardt; +Cc: git, Daniel Stenberg On Wed, Apr 03, 2024 at 08:34:37AM +0200, Patrick Steinhardt wrote: > > Can't we refactor this code to instead use `curl_easy_reset()`? That > > function already resets most of the data we want to reset and would also > > end up setting `POSFIELDSIZE = -1` via `Curl_init_userdefined()`. So > > wouldn't the following be a more sensible fix? > [...] > Oh well, the answer is "no", or at least not as easily as this, as the > failing tests tell us. I guess it resets more data than we actually want > it to reset, but I didn't dig any deeper than that. Yeah. The curl setup is really in two parts: 1. we make a handle in get_curl_handle(), which also sets up a bunch of options. We use that to make a single "curl_default" handle, and then when we want a new handle we curl_easy_duphandle() that 2. when we want to make a request we call get_active_slot(), which will either return an already-used handle or duphandle() a new one. And then reset some options, but also do some more setup. Your patch touches spot (2), so it's erasing all of the setup done in (1). I don't think there's a way to say "go back to state when we called duphandle(), but keep reusing connections, etc". Possibly it could be solved by pushing all of the setup from (1) into (2). Though that would probably require separating out some config handling, etc, from the actual curl_easy_setopt() calls (we wouldn't want to complain about invalid config for _every_ request, for example, just once per program run). This is all also a bit more complicated than it needs to be for smart-http. The dumb-http code may want to have several handles going at once to do individual object/pack downloads. Whereas for smart http, we really are only working with one handle, and doing one request at a time. I don't know if we'll ever drop dumb-http support, but there would probably be a lot of cleanup possibilities if we did. -Peff ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 2/2] INSTALL: bump libcurl version to 7.21.3 2024-04-02 20:02 ` [PATCH 0/2] git+curl 8.7.0 workaround Jeff King 2024-04-02 20:05 ` [PATCH 1/2] http: reset POSTFIELDSIZE when clearing curl handle Jeff King @ 2024-04-02 20:06 ` Jeff King 2024-04-02 20:21 ` [PATCH 0/2] git+curl 8.7.0 workaround rsbecker 2024-04-05 20:04 ` [PATCH 3/2] remote-curl: add Transfer-Encoding header only for older curl Jeff King 3 siblings, 0 replies; 15+ messages in thread From: Jeff King @ 2024-04-02 20:06 UTC (permalink / raw) To: git; +Cc: Daniel Stenberg Our documentation claims we support curl versions back to 7.19.5. But we can no longer compile with that version since adding an unconditional use of CURLOPT_RESOLVE in 511cfd3bff (http: add custom hostname to IP address resolutions, 2022-05-16). That feature wasn't added to libcurl until 7.21.3. We could add #ifdefs to make this work back to 7.19.5. But given that nobody noticed the compilation failure in the intervening two years, it makes more sense to bump the version in the documentation to 7.21.3 (which is itself over 13 years old). We could perhaps go forward even more (which would let us drop some cruft from git-curl-compat.h), but this should be an obviously safe jump, and we can move forward later. Note that user-visible syntax for CURLOPT_RESOLVE has grown new features in subsequent curl versions. Our documentation mentions "+" and "-" entries, which require more recent versions than 7.21.3. We could perhaps clarify that in our docs, but it's probably not worth cluttering them with restrictions of ancient curl versions. Signed-off-by: Jeff King <peff@peff.net> --- INSTALL | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/INSTALL b/INSTALL index c6fb240c91..2a46d04592 100644 --- a/INSTALL +++ b/INSTALL @@ -139,7 +139,7 @@ Issues of note: not need that functionality, use NO_CURL to build without it. - Git requires version "7.19.5" or later of "libcurl" to build + Git requires version "7.21.3" or later of "libcurl" to build without NO_CURL. This version requirement may be bumped in the future. -- 2.44.0.789.g5ea01f6724 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* RE: [PATCH 0/2] git+curl 8.7.0 workaround 2024-04-02 20:02 ` [PATCH 0/2] git+curl 8.7.0 workaround Jeff King 2024-04-02 20:05 ` [PATCH 1/2] http: reset POSTFIELDSIZE when clearing curl handle Jeff King 2024-04-02 20:06 ` [PATCH 2/2] INSTALL: bump libcurl version to 7.21.3 Jeff King @ 2024-04-02 20:21 ` rsbecker 2024-04-02 20:31 ` Jeff King 2024-04-05 20:04 ` [PATCH 3/2] remote-curl: add Transfer-Encoding header only for older curl Jeff King 3 siblings, 1 reply; 15+ messages in thread From: rsbecker @ 2024-04-02 20:21 UTC (permalink / raw) To: 'Jeff King', git; +Cc: 'Daniel Stenberg' On Tuesday, April 2, 2024 4:03 PM, Peff wrote: >To: git@vger.kernel.org >Cc: Daniel Stenberg <daniel@haxx.se> >Subject: [PATCH 0/2] git+curl 8.7.0 workaround > >On Sat, Mar 30, 2024 at 09:54:02AM +0100, Daniel Stenberg wrote: > >> On Fri, 29 Mar 2024, Jeff King wrote: >> >> > I noticed some http-related failures in the test suite on my Debian >> > unstable system, which recently got an upgraded curl package. It >> > looks like it's related to cases where we use the remote-curl helper >> > in "connect" mode (i.e., protocol v2) and the http buffer is small >> > (requiring us to stream the data to curl). Besides just running >> > t5551, an easy reproduction is: >> >> This smells like a libcurl regression to me. I "imported" this into >> our issue tracker here: https://github.com/curl/curl/issues/13229 > >This was all resolved in that issue, but just to summarize for the list >here: it was a regression in curl and there's a fix already. Thanks Daniel for your (as >usual) prompt digging into the problem (and likewise to Stefan for the actual fix). > >Ultimately the issue will be fixed by moving to a different version of libcurl, but >here's an easy workaround in the meantime, with a small doc cleanup I found along >the way. > > [1/2]: http: reset POSTFIELDSIZE when clearing curl handle > [2/2]: INSTALL: bump libcurl version to 7.21.3 > > INSTALL | 2 +- > http.c | 1 + > 2 files changed, 2 insertions(+), 1 deletion(-) Do we have an ETA for this fix? That or do we know when curl is planning on resolving this? Thanks, Randall ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/2] git+curl 8.7.0 workaround 2024-04-02 20:21 ` [PATCH 0/2] git+curl 8.7.0 workaround rsbecker @ 2024-04-02 20:31 ` Jeff King 0 siblings, 0 replies; 15+ messages in thread From: Jeff King @ 2024-04-02 20:31 UTC (permalink / raw) To: rsbecker; +Cc: git, 'Daniel Stenberg' On Tue, Apr 02, 2024 at 04:21:56PM -0400, rsbecker@nexbridge.com wrote: > Do we have an ETA for this fix? That or do we know when curl is > planning on resolving this? The fix is in curl's master branch already. According to their docs/RELEASE-PROCEDURE.md, the next release would be May 22. But Daniel is the more authoritative word. ;) -Peff ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 3/2] remote-curl: add Transfer-Encoding header only for older curl 2024-04-02 20:02 ` [PATCH 0/2] git+curl 8.7.0 workaround Jeff King ` (2 preceding siblings ...) 2024-04-02 20:21 ` [PATCH 0/2] git+curl 8.7.0 workaround rsbecker @ 2024-04-05 20:04 ` Jeff King 2024-04-05 21:30 ` Daniel Stenberg 2024-04-05 21:49 ` Junio C Hamano 3 siblings, 2 replies; 15+ messages in thread From: Jeff King @ 2024-04-05 20:04 UTC (permalink / raw) To: git; +Cc: Daniel Stenberg, Junio C Hamano On Tue, Apr 02, 2024 at 04:02:54PM -0400, Jeff King wrote: > Ultimately the issue will be fixed by moving to a different version of > libcurl, but here's an easy workaround in the meantime, with a small doc > cleanup I found along the way. > > [1/2]: http: reset POSTFIELDSIZE when clearing curl handle > [2/2]: INSTALL: bump libcurl version to 7.21.3 Here's a final patch on top that gives us a workaround for the HTTP/2 half of this. It's a cleaned-up version of what I showed earlier. -- >8 -- Subject: remote-curl: add Transfer-Encoding header only for older curl As of curl 7.66.0, we don't need to manually specify a "chunked" Transfer-Encoding header. Instead, modern curl deduces the need for it in a POST that has a POSTFIELDSIZE of -1 and uses READFUNCTION rather than POSTFIELDS. That version is recent enough that we can't just drop the header; we need to do so conditionally. Since it's only a single line, it seems like the simplest thing would just be to keep setting it unconditionally (after all, the #ifdefs are much longer than the actual code). But there's another wrinkle: HTTP/2. Curl may choose to use HTTP/2 under the hood if the server supports it. And in that protocol, we do not use the chunked encoding for streaming at all. Most versions of curl handle this just fine by recognizing and removing the header. But there's a regression in curl 8.7.0 and 8.7.1 where it doesn't, and large requests over HTTP/2 are broken (which t5559 notices). That regression has since been fixed upstream, but not yet released. Make the setting of this header conditional, which will let Git work even with those buggy curl versions. And as a bonus, it serves as a reminder that we can eventually clean up the code as we bump the supported curl versions. Signed-off-by: Jeff King <peff@peff.net> --- Manually tested against curl 8.7.0 (where skipping the header fixes t5559), as well as 7.65.0 (which continues to pass t5551, but would fail if we didn't set the header). git-curl-compat.h | 9 +++++++++ remote-curl.c | 3 +++ 2 files changed, 12 insertions(+) diff --git a/git-curl-compat.h b/git-curl-compat.h index fd96b3cdff..e1d0bdd273 100644 --- a/git-curl-compat.h +++ b/git-curl-compat.h @@ -126,6 +126,15 @@ #define GIT_CURL_HAVE_CURLSSLSET_NO_BACKENDS #endif +/** + * Versions before curl 7.66.0 (September 2019) required manually setting the + * transfer-encoding for a streaming POST; after that this is handled + * automatically. + */ +#if LIBCURL_VERSION_NUM < 0x074200 +#define GIT_CURL_NEED_TRANSFER_ENCODING_HEADER +#endif + /** * CURLOPT_PROTOCOLS_STR and CURLOPT_REDIR_PROTOCOLS_STR were added in 7.85.0, * released in August 2022. diff --git a/remote-curl.c b/remote-curl.c index 31b02b8840..0b6d7815fd 100644 --- a/remote-curl.c +++ b/remote-curl.c @@ -1,4 +1,5 @@ #include "git-compat-util.h" +#include "git-curl-compat.h" #include "config.h" #include "environment.h" #include "gettext.h" @@ -955,7 +956,9 @@ static int post_rpc(struct rpc_state *rpc, int stateless_connect, int flush_rece /* The request body is large and the size cannot be predicted. * We must use chunked encoding to send it. */ +#ifdef GIT_CURL_NEED_TRANSFER_ENCODING_HEADER headers = curl_slist_append(headers, "Transfer-Encoding: chunked"); +#endif rpc->initial_buffer = 1; curl_easy_setopt(slot->curl, CURLOPT_READFUNCTION, rpc_out); curl_easy_setopt(slot->curl, CURLOPT_INFILE, rpc); -- 2.44.0.872.gbe303efee2 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 3/2] remote-curl: add Transfer-Encoding header only for older curl 2024-04-05 20:04 ` [PATCH 3/2] remote-curl: add Transfer-Encoding header only for older curl Jeff King @ 2024-04-05 21:30 ` Daniel Stenberg 2024-04-05 21:49 ` Junio C Hamano 1 sibling, 0 replies; 15+ messages in thread From: Daniel Stenberg @ 2024-04-05 21:30 UTC (permalink / raw) To: Jeff King; +Cc: git, Junio C Hamano On Fri, 5 Apr 2024, Jeff King wrote: > Here's a final patch on top that gives us a workaround for the HTTP/2 > half of this. It's a cleaned-up version of what I showed earlier. +1 from me. (and appologies for this regression, but you know...) -- / daniel.haxx.se ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 3/2] remote-curl: add Transfer-Encoding header only for older curl 2024-04-05 20:04 ` [PATCH 3/2] remote-curl: add Transfer-Encoding header only for older curl Jeff King 2024-04-05 21:30 ` Daniel Stenberg @ 2024-04-05 21:49 ` Junio C Hamano 1 sibling, 0 replies; 15+ messages in thread From: Junio C Hamano @ 2024-04-05 21:49 UTC (permalink / raw) To: Jeff King; +Cc: git, Daniel Stenberg Jeff King <peff@peff.net> writes: > ... And as a bonus, it serves as a > reminder that we can eventually clean up the code as we bump the > supported curl versions. ;-). The way git-curl-compat.h lays out various features/workarounds and the lowest version that needs them makes it very easy to do that. Thanks, will queue. ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2024-04-05 21:49 UTC | newest] Thread overview: 15+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-03-30 0:02 tests broken with curl-8.7.0 Jeff King 2024-03-30 8:54 ` Daniel Stenberg 2024-04-02 20:02 ` [PATCH 0/2] git+curl 8.7.0 workaround Jeff King 2024-04-02 20:05 ` [PATCH 1/2] http: reset POSTFIELDSIZE when clearing curl handle Jeff King 2024-04-02 20:27 ` Junio C Hamano 2024-04-03 3:20 ` Jeff King 2024-04-03 6:30 ` Patrick Steinhardt 2024-04-03 6:34 ` Patrick Steinhardt 2024-04-03 20:18 ` Jeff King 2024-04-02 20:06 ` [PATCH 2/2] INSTALL: bump libcurl version to 7.21.3 Jeff King 2024-04-02 20:21 ` [PATCH 0/2] git+curl 8.7.0 workaround rsbecker 2024-04-02 20:31 ` Jeff King 2024-04-05 20:04 ` [PATCH 3/2] remote-curl: add Transfer-Encoding header only for older curl Jeff King 2024-04-05 21:30 ` Daniel Stenberg 2024-04-05 21:49 ` Junio C Hamano
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).