* [PATCH] http: do not ignore proxy path @ 2024-07-26 15:52 Ryan Hendrickson via GitGitGadget 2024-07-26 16:29 ` Junio C Hamano ` (2 more replies) 0 siblings, 3 replies; 31+ messages in thread From: Ryan Hendrickson via GitGitGadget @ 2024-07-26 15:52 UTC (permalink / raw) To: git; +Cc: Ryan Hendrickson, Ryan Hendrickson From: Ryan Hendrickson <ryan.hendrickson@alum.mit.edu> The documentation for `http.proxy` describes that option, and the environment variables it overrides, as supporting "the syntax understood by curl". curl allows SOCKS proxies to use a path to a Unix domain socket, like `socks5h://localhost/path/to/socket.sock`. Git should therefore include, if present, the path part of the proxy URL in what it passes to libcurl. Signed-off-by: Ryan Hendrickson <ryan.hendrickson@alum.mit.edu> --- http: do not ignore proxy path * Documentation: do I need to add anything? * Tests: I could use a pointer on how best to add a test for this. Adding a case to t5564-http-proxy.sh seems straightforward but I don't think httpd can be configured to listen to domain sockets; can I use netcat? Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1767%2Frhendric%2Frhendric%2Fhttp-proxy-path-v1 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1767/rhendric/rhendric/http-proxy-path-v1 Pull-Request: https://github.com/gitgitgadget/git/pull/1767 http.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/http.c b/http.c index 623ed234891..0cd75986a6b 100644 --- a/http.c +++ b/http.c @@ -1265,7 +1265,13 @@ static CURL *get_curl_handle(void) if (!proxy_auth.host) die("Invalid proxy URL '%s'", curl_http_proxy); - curl_easy_setopt(result, CURLOPT_PROXY, proxy_auth.host); + if (proxy_auth.path) { + struct strbuf proxy = STRBUF_INIT; + strbuf_addf(&proxy, "%s/%s", proxy_auth.host, proxy_auth.path); + curl_easy_setopt(result, CURLOPT_PROXY, proxy.buf); + strbuf_release(&proxy); + } else + curl_easy_setopt(result, CURLOPT_PROXY, proxy_auth.host); var_override(&curl_no_proxy, getenv("NO_PROXY")); var_override(&curl_no_proxy, getenv("no_proxy")); curl_easy_setopt(result, CURLOPT_NOPROXY, curl_no_proxy); base-commit: ad57f148c6b5f8735b62238dda8f571c582e0e54 -- gitgitgadget ^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [PATCH] http: do not ignore proxy path 2024-07-26 15:52 [PATCH] http: do not ignore proxy path Ryan Hendrickson via GitGitGadget @ 2024-07-26 16:29 ` Junio C Hamano 2024-07-26 17:12 ` Ryan Hendrickson 2024-07-26 21:11 ` Jeff King 2024-07-27 6:44 ` [PATCH v2] " Ryan Hendrickson via GitGitGadget 2 siblings, 1 reply; 31+ messages in thread From: Junio C Hamano @ 2024-07-26 16:29 UTC (permalink / raw) To: Ryan Hendrickson via GitGitGadget; +Cc: git, Ryan Hendrickson "Ryan Hendrickson via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: Ryan Hendrickson <ryan.hendrickson@alum.mit.edu> > > The documentation for `http.proxy` describes that option, and the > environment variables it overrides, as supporting "the syntax understood > by curl". curl allows SOCKS proxies to use a path to a Unix domain > socket, like `socks5h://localhost/path/to/socket.sock`. Git should > therefore include, if present, the path part of the proxy URL in what it > passes to libcurl. > > Signed-off-by: Ryan Hendrickson <ryan.hendrickson@alum.mit.edu> > --- > http: do not ignore proxy path > > * Documentation: do I need to add anything? http.proxy documentation says The syntax thus is [protocol://][user[:password]@]proxyhost[:port]. but the updated code pays attention to what can come after the "host[:post]" part, does it not? > diff --git a/http.c b/http.c > index 623ed234891..0cd75986a6b 100644 > --- a/http.c > +++ b/http.c I am unfamiliar with this code path, so let me follow along aloud. > @@ -1265,7 +1265,13 @@ static CURL *get_curl_handle(void) > if (!proxy_auth.host) > die("Invalid proxy URL '%s'", curl_http_proxy); We grabbed the value from the configuration variable (or various environment variables like $http_proxy) in the curl_http_proxy variable, and then passed it to credential_from_url() to parse parts out of [protocol://][user[:password]@]proxyhost[:port][/p/a/t/h]. The parsed result is in proxy_auth struct and there is no hope it can be usable if the .host member is missing. > - curl_easy_setopt(result, CURLOPT_PROXY, proxy_auth.host); We used to only use the .host member but the curl_http_proxy could have had a path after it, held in the .path member. > + if (proxy_auth.path) { > + struct strbuf proxy = STRBUF_INIT; > + strbuf_addf(&proxy, "%s/%s", proxy_auth.host, proxy_auth.path); > + curl_easy_setopt(result, CURLOPT_PROXY, proxy.buf); > + strbuf_release(&proxy); > + } else > + curl_easy_setopt(result, CURLOPT_PROXY, proxy_auth.host); Style. If "if" side needs {braces} because it consists of multiple statements, the corresponding "else" side should also have {braces} around its body, even if it only has a single statement. If you have the proxy strbuf in a bit wider scope, then the above becomes if (proxy_auth.path) strbuf_addf(&proxy, "%s/%s", proxy_auth.host, proxy_auth.path); else strbuf_addstr(&proxy, proxy_auth.host); curl_easy_setopt(result, CURLOPT_PROXY, proxy.buf); strbuf_release(&proxy); which might (I am not decided) be easier to follow. Could existing users have been taking advantage of the fact that the extra /path at the end of http.proxy (and $http_proxy and friends) are ignored? For them, this change will appear as a regression. Other than that (and the lack of any documentation and test updates), looking good. Thanks. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH] http: do not ignore proxy path 2024-07-26 16:29 ` Junio C Hamano @ 2024-07-26 17:12 ` Ryan Hendrickson 2024-07-26 17:45 ` Junio C Hamano 0 siblings, 1 reply; 31+ messages in thread From: Ryan Hendrickson @ 2024-07-26 17:12 UTC (permalink / raw) To: Junio C Hamano; +Cc: git At 2024-07-26 09:29-0700, Junio C Hamano <gitster@pobox.com> sent: > http.proxy documentation says > > The syntax thus is [protocol://][user[:password]@]proxyhost[:port]. > > but the updated code pays attention to what can come after the > "host[:post]" part, does it not? Correct; I'll add a "[/path]" to that construction. >> + if (proxy_auth.path) { >> + struct strbuf proxy = STRBUF_INIT; >> + strbuf_addf(&proxy, "%s/%s", proxy_auth.host, proxy_auth.path); >> + curl_easy_setopt(result, CURLOPT_PROXY, proxy.buf); >> + strbuf_release(&proxy); >> + } else >> + curl_easy_setopt(result, CURLOPT_PROXY, proxy_auth.host); > > Style. If "if" side needs {braces} because it consists of multiple > statements, the corresponding "else" side should also have {braces} > around its body, even if it only has a single statement. > > If you have the proxy strbuf in a bit wider scope, then the above becomes > > if (proxy_auth.path) > strbuf_addf(&proxy, "%s/%s", proxy_auth.host, proxy_auth.path); > else > strbuf_addstr(&proxy, proxy_auth.host); > curl_easy_setopt(result, CURLOPT_PROXY, proxy.buf); > strbuf_release(&proxy); > > which might (I am not decided) be easier to follow. For what it's worth, I was following the precedent set by the if-statement starting at line 1256 (a few lines above this patch): > if (strstr(curl_http_proxy, "://")) > credential_from_url(&proxy_auth, curl_http_proxy); > else { > struct strbuf url = STRBUF_INIT; > strbuf_addf(&url, "http://%s", curl_http_proxy); > credential_from_url(&proxy_auth, url.buf); > strbuf_release(&url); > } <https://github.com/git/git/blob/ad57f148c6b5f8735b62238dda8f571c582e0e54/http.c#L1256> I have no problem with being inconsistent with surrounding code in these style choices; just let me know what I should do in light of that. > Could existing users have been taking advantage of the fact that the > extra /path at the end of http.proxy (and $http_proxy and friends) > are ignored? For them, this change will appear as a regression. That is possible, though I have difficulty imagining a scenario in which it would be intentional. What do you recommend I do about that possibility? R ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH] http: do not ignore proxy path 2024-07-26 17:12 ` Ryan Hendrickson @ 2024-07-26 17:45 ` Junio C Hamano 0 siblings, 0 replies; 31+ messages in thread From: Junio C Hamano @ 2024-07-26 17:45 UTC (permalink / raw) To: Ryan Hendrickson; +Cc: git Ryan Hendrickson <ryan.hendrickson@alum.mit.edu> writes: > For what it's worth, I was following the precedent set by the > if-statement starting at line 1256 (a few lines above this patch): It is worth nothing. Existing violation does not make it a good idea to add more of them. It makes it more work to clean them all up later. >> Could existing users have been taking advantage of the fact that the >> extra /path at the end of http.proxy (and $http_proxy and friends) >> are ignored? For them, this change will appear as a regression. > > That is possible, though I have difficulty imagining a scenario in > which it would be intentional. > > What do you recommend I do about that possibility? I have no idea. I already said that I am not familiar with this code path, and it is likely I am a worse person than you are to find a potential creative (ab)uses of the mechanism to take advantage of the fact that the current code ignores the path part. Having said that, given that http.proxy falls back to environment variables that have been established a lot longer than Git itself, like HTTPS_PROXY etc., if we _know_ that setting HTTPS_PROXY to such a value with /path part causes curl (or other popular programs) to try using such a value without stripping the path part (and even better if that causes them to fail), then such an observation would make a very good argument why it is extremely unlikely that existing users used such a setting, hence it is unlikely this patch would make a regression. Thanks. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH] http: do not ignore proxy path 2024-07-26 15:52 [PATCH] http: do not ignore proxy path Ryan Hendrickson via GitGitGadget 2024-07-26 16:29 ` Junio C Hamano @ 2024-07-26 21:11 ` Jeff King 2024-07-26 22:43 ` Ryan Hendrickson 2024-07-27 6:44 ` [PATCH v2] " Ryan Hendrickson via GitGitGadget 2 siblings, 1 reply; 31+ messages in thread From: Jeff King @ 2024-07-26 21:11 UTC (permalink / raw) To: Ryan Hendrickson via GitGitGadget; +Cc: git, Ryan Hendrickson On Fri, Jul 26, 2024 at 03:52:50PM +0000, Ryan Hendrickson via GitGitGadget wrote: > * Tests: I could use a pointer on how best to add a test for this. > Adding a case to t5564-http-proxy.sh seems straightforward but I > don't think httpd can be configured to listen to domain sockets; can > I use netcat? I don't offhand know of a way to test this without a custom program like netcat. If it's the only option, it's OK to use tools that might not be available everywhere as long as the tests are marked as optional with the appropriate prerequisite. You can find prior art by looking for test_lazy_prereq calls (e.g., the ones for GZIP or ZIPINFO are pretty straight-forward). I would warn that there are several not-quite-compatible variants of netcat floating around, which can create headaches. You might be better off with a short perl script using IO::Socket::UNIX or similar. > diff --git a/http.c b/http.c > index 623ed234891..0cd75986a6b 100644 > --- a/http.c > +++ b/http.c > @@ -1265,7 +1265,13 @@ static CURL *get_curl_handle(void) > if (!proxy_auth.host) > die("Invalid proxy URL '%s'", curl_http_proxy); > > - curl_easy_setopt(result, CURLOPT_PROXY, proxy_auth.host); > + if (proxy_auth.path) { > + struct strbuf proxy = STRBUF_INIT; > + strbuf_addf(&proxy, "%s/%s", proxy_auth.host, proxy_auth.path); > + curl_easy_setopt(result, CURLOPT_PROXY, proxy.buf); > + strbuf_release(&proxy); > + } else > + curl_easy_setopt(result, CURLOPT_PROXY, proxy_auth.host); The fields in the proxy_auth struct have been parsed from the url, with any url encoding removed. But then we paste them back together into a pseudo-url without doing any further encoding. Is that correct? I doubt that the host contains a "/", but if you had a path that contained a "%", then the URL form of that is going to be %25. Which is curl expecting to get here? I say "pseudo-url" because it is weird to slap a path on the end of the hostname but leave off the scheme, etc. Which kind of makes me wonder why we pass proxy_auth.host in the first place, and not simply the original curl_http_proxy. It looks like a weird interaction between 372370f167 (http: use credential API to handle proxy authentication, 2016-01-26) and 57415089bd (http: honor empty http.proxy option to bypass proxy, 2017-04-11). The former added a _second_ CURLOPT_PROXY call, and the latter removed the first one. I wonder if we could go back to passing the string straight to curl (as we did prior to 2016), and keeping the proxy_auth struct purely as a mechanism for gathering credentials. That would presumably fix your use case. And this is also perhaps an interesting data point for Junio's question about regressions (if people were passing paths, it used to work and was broken in 2016). -Peff ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH] http: do not ignore proxy path 2024-07-26 21:11 ` Jeff King @ 2024-07-26 22:43 ` Ryan Hendrickson 2024-07-29 19:31 ` Jeff King 0 siblings, 1 reply; 31+ messages in thread From: Ryan Hendrickson @ 2024-07-26 22:43 UTC (permalink / raw) To: Jeff King; +Cc: git At 2024-07-26 17:11-0400, Jeff King <peff@peff.net> sent: > I would warn that there are several not-quite-compatible variants of > netcat floating around, which can create headaches. You might be better > off with a short perl script using IO::Socket::UNIX or similar. Ah, okay, thanks for the pointer! >> diff --git a/http.c b/http.c >> index 623ed234891..0cd75986a6b 100644 >> --- a/http.c >> +++ b/http.c >> @@ -1265,7 +1265,13 @@ static CURL *get_curl_handle(void) >> if (!proxy_auth.host) >> die("Invalid proxy URL '%s'", curl_http_proxy); >> >> - curl_easy_setopt(result, CURLOPT_PROXY, proxy_auth.host); >> + if (proxy_auth.path) { >> + struct strbuf proxy = STRBUF_INIT; >> + strbuf_addf(&proxy, "%s/%s", proxy_auth.host, proxy_auth.path); >> + curl_easy_setopt(result, CURLOPT_PROXY, proxy.buf); >> + strbuf_release(&proxy); >> + } else >> + curl_easy_setopt(result, CURLOPT_PROXY, proxy_auth.host); > > The fields in the proxy_auth struct have been parsed from the url, with > any url encoding removed. But then we paste them back together into a > pseudo-url without doing any further encoding. Is that correct? > > I doubt that the host contains a "/", but if you had a path that > contained a "%", then the URL form of that is going to be %25. Which is > curl expecting to get here? Oh, I see! Yes, this is an issue with my patch: if I create a socket file named "%30", command-line curl wants http_proxy to contain "%2530", and patched Git wants http_proxy to contain "%252530". Good edge case to put in a test. > I wonder if we could go back to passing the string straight to curl (as > we did prior to 2016), and keeping the proxy_auth struct purely as a > mechanism for gathering credentials. Hmm, that would be nice, but I think curl doesn't deal well with the extra case that Git supports of specifying a username but no password. It causes one of the existing tests to fail if I pass the string straight through. On top of that, all of those starts_with tests for checking the protocol are written quite loosely, so in practice Git "supports" the protocols "socks://" and "socksonmyfeet://" by mapping them both to SOCKS4, and curl would not like it if it received those strings directly. So given that Git wants to handle the protocol and the credentials, it makes sense that only the host and the path are passed to curl. I just have to make sure that they are correctly re-encoded. R ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH] http: do not ignore proxy path 2024-07-26 22:43 ` Ryan Hendrickson @ 2024-07-29 19:31 ` Jeff King 0 siblings, 0 replies; 31+ messages in thread From: Jeff King @ 2024-07-29 19:31 UTC (permalink / raw) To: Ryan Hendrickson; +Cc: git On Fri, Jul 26, 2024 at 06:43:29PM -0400, Ryan Hendrickson wrote: > > I wonder if we could go back to passing the string straight to curl (as > > we did prior to 2016), and keeping the proxy_auth struct purely as a > > mechanism for gathering credentials. > > Hmm, that would be nice, but I think curl doesn't deal well with the extra > case that Git supports of specifying a username but no password. It causes > one of the existing tests to fail if I pass the string straight through. OK, thanks for trying. It would have been nice, but I'm not surprised that there's an unusual interaction. > On top of that, all of those starts_with tests for checking the protocol are > written quite loosely, so in practice Git "supports" the protocols > "socks://" and "socksonmyfeet://" by mapping them both to SOCKS4, and curl > would not like it if it received those strings directly. > > So given that Git wants to handle the protocol and the credentials, it makes > sense that only the host and the path are passed to curl. I just have to > make sure that they are correctly re-encoded. Makes sense. -Peff ^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH v2] http: do not ignore proxy path 2024-07-26 15:52 [PATCH] http: do not ignore proxy path Ryan Hendrickson via GitGitGadget 2024-07-26 16:29 ` Junio C Hamano 2024-07-26 21:11 ` Jeff King @ 2024-07-27 6:44 ` Ryan Hendrickson via GitGitGadget 2024-07-29 20:09 ` Jeff King 2024-07-31 16:01 ` [PATCH v3] " Ryan Hendrickson via GitGitGadget 2 siblings, 2 replies; 31+ messages in thread From: Ryan Hendrickson via GitGitGadget @ 2024-07-27 6:44 UTC (permalink / raw) To: git; +Cc: Ryan Hendrickson, Jeff King, Ryan Hendrickson, Ryan Hendrickson From: Ryan Hendrickson <ryan.hendrickson@alum.mit.edu> The documentation for `http.proxy` describes that option, and the environment variables it overrides, as supporting "the syntax understood by curl". curl allows SOCKS proxies to use a path to a Unix domain socket, like `socks5h://localhost/path/to/socket.sock`. Git should therefore include, if present, the path part of the proxy URL in what it passes to libcurl. Signed-off-by: Ryan Hendrickson <ryan.hendrickson@alum.mit.edu> --- http: do not ignore proxy path Re earlier discussion about regressions, it turns out that curl has only supported this syntax since 2022 [https://curl.se/ch/7.84.0.html]. Earlier versions of curl, if provided an http_proxy that contained a path, would also ignore it. curl didn't seem to consider this a breaking change when the feature was introduced [https://github.com/curl/curl/pull/8668], though; is that a valid argument for Git not to lose sleep over it either? A test has been added, but unfortunately it is not being run on any of GitHub's CI workflows, because the runner images in use all have versions of libcurl that are too old. It works on my machine, but unknown troubles may await someone else. Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1767%2Frhendric%2Frhendric%2Fhttp-proxy-path-v2 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1767/rhendric/rhendric/http-proxy-path-v2 Pull-Request: https://github.com/gitgitgadget/git/pull/1767 Range-diff vs v1: 1: 6b6ef680dce < -: ----------- http: do not ignore proxy path -: ----------- > 1: b4715eba5b1 http: do not ignore proxy path Documentation/config/http.txt | 4 +- http.c | 20 ++- t/socks5-proxy-server/LICENSE | 21 +++ t/socks5-proxy-server/README.md | 27 +++ t/socks5-proxy-server/src/socks5.pl | 260 ++++++++++++++++++++++++++++ t/t5564-http-proxy.sh | 18 ++ 6 files changed, 347 insertions(+), 3 deletions(-) create mode 100644 t/socks5-proxy-server/LICENSE create mode 100644 t/socks5-proxy-server/README.md create mode 100755 t/socks5-proxy-server/src/socks5.pl diff --git a/Documentation/config/http.txt b/Documentation/config/http.txt index 162b33fc52f..a14371b5c96 100644 --- a/Documentation/config/http.txt +++ b/Documentation/config/http.txt @@ -5,8 +5,8 @@ http.proxy:: proxy string with a user name but no password, in which case git will attempt to acquire one in the same way it does for other credentials. See linkgit:gitcredentials[7] for more information. The syntax thus is - '[protocol://][user[:password]@]proxyhost[:port]'. This can be overridden - on a per-remote basis; see remote.<name>.proxy + '[protocol://][user[:password]@]proxyhost[:port][/path]'. This can be + overridden on a per-remote basis; see remote.<name>.proxy + Any proxy, however configured, must be completely transparent and must not modify, transform, or buffer the request or response in any way. Proxies which diff --git a/http.c b/http.c index 623ed234891..a6001296dd0 100644 --- a/http.c +++ b/http.c @@ -1227,6 +1227,7 @@ static CURL *get_curl_handle(void) */ curl_easy_setopt(result, CURLOPT_PROXY, ""); } else if (curl_http_proxy) { + struct strbuf proxy = STRBUF_INIT; if (starts_with(curl_http_proxy, "socks5h")) curl_easy_setopt(result, CURLOPT_PROXYTYPE, CURLPROXY_SOCKS5_HOSTNAME); @@ -1265,7 +1266,24 @@ static CURL *get_curl_handle(void) if (!proxy_auth.host) die("Invalid proxy URL '%s'", curl_http_proxy); - curl_easy_setopt(result, CURLOPT_PROXY, proxy_auth.host); + strbuf_addstr(&proxy, proxy_auth.host); + if (proxy_auth.path) { + int path_is_supported = 0; + /* curl_version_info was added in curl 7.10 */ +#if LIBCURL_VERSION_NUM >= 0x070a00 + curl_version_info_data *ver = curl_version_info(CURLVERSION_NOW); + path_is_supported = ver->version_num >= 0x075400; +#endif + if (path_is_supported) { + strbuf_addch(&proxy, '/'); + strbuf_add_percentencode(&proxy, proxy_auth.path, 0); + } else { + die("libcurl 7.84 or later is required to support paths in proxy URLs"); + } + } + curl_easy_setopt(result, CURLOPT_PROXY, proxy.buf); + strbuf_release(&proxy); + var_override(&curl_no_proxy, getenv("NO_PROXY")); var_override(&curl_no_proxy, getenv("no_proxy")); curl_easy_setopt(result, CURLOPT_NOPROXY, curl_no_proxy); diff --git a/t/socks5-proxy-server/LICENSE b/t/socks5-proxy-server/LICENSE new file mode 100644 index 00000000000..94981550c94 --- /dev/null +++ b/t/socks5-proxy-server/LICENSE @@ -0,0 +1,21 @@ +MIT License + +Copyright (c) 2014 kaimi.io + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. diff --git a/t/socks5-proxy-server/README.md b/t/socks5-proxy-server/README.md new file mode 100644 index 00000000000..6eed0e5ce7a --- /dev/null +++ b/t/socks5-proxy-server/README.md @@ -0,0 +1,27 @@ +socks5-proxy-server +=================== + +This Perl code implements a SOCKS5 proxy server that listens for incoming connections and processes them in separate threads. The server takes three input parameters: `path`, `login`, and `password`. + +When a client attempts to connect to the server, the server checks if the client supports any of the available authentication methods (no authentication or login/password authentication). If a suitable method is found, the server establishes a connection with the target server and begins forwarding data between the client and the target server. + +The code uses the `IO::Select` module for working with sockets and the `threads` module for creating and managing threads. It includes several functions, including: + +- `main`: the main function that creates threads for processing incoming connections. +- `replenish`: a function that creates additional threads if the number of free threads is less than the specified value. +- `new_client`: a function that handles incoming client connections and checks if the available authentication methods are supported by the client. +- `socks_auth`: a function that performs login and password authentication. +- `handle_client`: a function that establishes a connection with the target server and forwards data between the client and the target server. + +This code includes the use of the following Perl modules: `IO::Select`, `Socket`, `threads`, `threads::shared`. + +To run this code, enter the following command: +`perl socks5.pl path login password` + +Note that this code is designed for educational purposes and should not be used in production environments without proper modifications and security measures. 😊 + +This code is distributed under the [MIT License](LICENSE), which allows for free use, modification, and distribution of the code as long as the original copyright notice and license are included. + +--- + +Originally from: https://github.com/kaimi-io/socks5-proxy-server diff --git a/t/socks5-proxy-server/src/socks5.pl b/t/socks5-proxy-server/src/socks5.pl new file mode 100755 index 00000000000..3ef66a1a287 --- /dev/null +++ b/t/socks5-proxy-server/src/socks5.pl @@ -0,0 +1,260 @@ +#!/usr/bin/perl +#В коде использованы части rdss (.Slip) и sss (hm2k) +use strict; +use IO::Select; +use Socket; +use threads; +use threads::shared; + +if(scalar @ARGV < 1) +{ + die "Usage: serv.pl path login pass\n"; +} + +my ($path, $login, $passw) = @ARGV; + +my %config = (thr_init => 10, thr_min => 2, thr_max => 20, conn_limit => SOMAXCONN); + +$| = 1; + +my %status : shared; +my $accept : shared; + +my $sock_addr = sockaddr_un $path or die $!; +socket my $socket, PF_UNIX, SOCK_STREAM, 0; +bind $socket, $sock_addr or die $!; +listen $socket, $config{'conn_limit'} or die $!; + + +my $sel = IO::Select->new($socket); + +print "Starting...\n"; + +replenish($config{'thr_init'}); + +while(1) +{ + lock %status; + cond_wait %status; + + my @free = sort {$a <=> $b} grep {$status{$_} eq 'free'} keys %status; + + if(scalar @free < $config{'thr_min'}) + { + replenish($#free - $config{'thr_min'}); + } + elsif(scalar @free > $config{'thr_max'}) + { + my @kill = @free[0..$#free - $config{'thr_max'}]; + status($_ => 'kill') for @kill; + } +} + +status($_ => 'kill') for keys %status; + +sub main +{ + my $sock = shift; + my $loop = 50; + + my $tid = threads->tid; + my $conn; + + threads->self->detach; + status($tid, 'free'); + + while(status($tid) ne 'kill' && $loop > 0) + { + next unless $sel->can_read(.1); + { + lock $accept; + next unless accept $conn, $sock; + } + $loop--; + status($tid => 'work'); + new_client($conn); + close $conn; + status($tid => 'free'); + } + + status($tid, 'kill'); +} + +sub status +{ + my ($tid, $state) = @_; + lock %status; + + return $status{$tid} unless $state; + if($state) + { + $status{$tid} = $state unless defined $status{$tid} and $status{$tid} eq 'kill'; + } + else + { + delete $status{$tid}; + } + + cond_broadcast %status; +} + +sub replenish +{ + threads->create(\&main, $socket) for 1..$_[0]; +} + +sub new_client +{ + my $sock = shift; + + sysread $sock, my $buf, 2; + + my ($ver, $auth_num) = unpack('H2H2', $buf); + #Версия протокола + return unless $ver eq '05'; + sysread $sock, $buf, $auth_num; + + my $ok = 0; + #Перечисление методов авторизации + for(my ($mode, $i) = (0, 0); $i < $auth_num; $mode = ord substr $buf, ++$i, 1) + { + #0 - Без авторизации; 2 - Username/Password + if($mode == 0 && !$login) + { + syswrite $sock, "\x05\x00"; + $ok++; + last; + } + elsif($mode == 2 && $login) + { + return unless socks_auth($sock); + $ok++; + last; + } + } + #Подходящие методы есть + if($ok) + { + sysread $sock, $buf, 3; + my ($ver, $cmd, $r) = unpack 'H2H2H2', $buf; + + if($ver eq '05' && $r eq '00') + { + my ($client_host, $client_host_raw, $client_port, $client_port_raw) = get_conn_info($sock); + return unless ($client_host || $client_port); + + syswrite $sock, "\x05\x00\x00".$client_host_raw.$client_port_raw; + handle_client($sock, $client_host, $client_port, $cmd); + } + } + else + { + syswrite $sock, "\x05\xFF"; + } +} + +sub socks_auth +{ + my $sock = shift; + + syswrite $sock, "\x05\x02"; + sysread $sock, my $buf, 1; + + if(ord $buf == 1) + { + #username length : username : password length : password + sysread $sock, $buf, 1; + sysread $sock, my $s_login, ord($buf); + sysread $sock, $buf, 1; + sysread $sock, my $s_passw, ord($buf); + + #0x00 = success; any other value = failure + if($login eq $s_login && $passw eq $s_passw) + { + syswrite $sock, "\x05\x00"; + return 1; + } + else + { + syswrite $sock, "\x05\x01"; + } + } + + return 0; +} + +sub handle_client +{ + my ($sock, $host, $port, $cmd) = @_; + + #0x01 = establish a TCP/IP stream connection + if($cmd == 1) + { + my $sock_addr = sockaddr_in $port, inet_aton($host) or return; + socket my $target, PF_INET, SOCK_STREAM, getprotobyname('tcp') or return; + connect $target, $sock_addr or return; + + while($sock || $target) + { + my ($rin, $cbuf, $tbuf, $rout, $eout) = ('', '', '', '', ''); + vec($rin, fileno($sock), 1) = 1 if $sock; + vec($rin, fileno($target), 1) = 1 if $target; + select($rout = $rin, undef, $eout = $rin, 120); + return if(!$rout && !$eout); + + if($sock && (vec($eout, fileno($sock), 1) || vec($rout, fileno($sock), 1))) + { + my $res = sysread $sock, $tbuf, 1024; + return if(!defined $res || !$res); + } + if($target && (vec($eout, fileno($target), 1) || vec($rout, fileno($target), 1))) + { + my $res = sysread $target, $cbuf, 1024; + return if(!defined $res || !$res); + } + while(my $len = length($tbuf)) + { + my $r = syswrite $target, $tbuf, $len; + return if(!defined $r || $r <= 0); + $tbuf = substr($tbuf, $r); + } + while(my $len = length($cbuf)) + { + my $r = syswrite $sock, $cbuf, $len; + return if(!defined $r || $r <= 0); + $cbuf = substr($cbuf, $r); + } + } + } +} + +sub get_conn_info +{ + my $sock = shift; + + my ($host, $raw_host) = ('', ''); + sysread $sock, my $buf, 1; + ($raw_host, $buf) = ($buf, ord $buf); + #0x01 = IPv4 address; 0x03 = Domain name + if($buf == 1) + { + #4 bytes for IPv4 address + sysread $sock, $buf, 4; + $raw_host .= $buf; + $host = join '.', map(ord, split //, $buf); + } + elsif($buf == 3) + { + #1 byte of name length followed by the name for Domain name + sysread $sock, $buf, 1; + sysread $sock, $host, ord($buf); + $raw_host .= $host; + } + + #port number in a network byte order, 2 bytes + my ($port, $raw_port) = ('', ''); + sysread $sock, $raw_port, 2; + $port = ord(substr($raw_port, 0, 1)) << 8 | ord(substr($raw_port, 1, 1)); + + return $host, $raw_host, $port, $raw_port; +} diff --git a/t/t5564-http-proxy.sh b/t/t5564-http-proxy.sh index bb35b87071d..bafaa31adf8 100755 --- a/t/t5564-http-proxy.sh +++ b/t/t5564-http-proxy.sh @@ -39,4 +39,22 @@ test_expect_success 'clone can prompt for proxy password' ' expect_askpass pass proxuser ' +start_socks() { + # The %30 tests that the correct amount of percent-encoding is applied + # to the proxy string passed to curl. + "$PERL_PATH" "$TEST_DIRECTORY/socks5-proxy-server/src/socks5.pl" \ + "$TRASH_DIRECTORY/%30.sock" proxuser proxpass & + socks_pid=$! +} + +test_lazy_prereq LIBCURL_7_84 'expr x$(curl-config --vernum) \>= x075400' + +test_expect_success PERL,LIBCURL_7_84 'clone via Unix socket' ' + start_socks && + test_when_finished "rm -rf clone && kill $socks_pid" && + test_config_global http.proxy "socks5://proxuser:proxpass@localhost$PWD/%2530.sock" && + GIT_TRACE_CURL=$PWD/trace git clone "$HTTPD_URL/smart/repo.git" clone && + grep -i "SOCKS5 request granted." trace +' + test_done base-commit: ad57f148c6b5f8735b62238dda8f571c582e0e54 -- gitgitgadget ^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [PATCH v2] http: do not ignore proxy path 2024-07-27 6:44 ` [PATCH v2] " Ryan Hendrickson via GitGitGadget @ 2024-07-29 20:09 ` Jeff King 2024-07-31 15:33 ` Ryan Hendrickson 2024-07-31 16:01 ` [PATCH v3] " Ryan Hendrickson via GitGitGadget 1 sibling, 1 reply; 31+ messages in thread From: Jeff King @ 2024-07-29 20:09 UTC (permalink / raw) To: Ryan Hendrickson via GitGitGadget; +Cc: git, Ryan Hendrickson On Sat, Jul 27, 2024 at 06:44:42AM +0000, Ryan Hendrickson via GitGitGadget wrote: > Re earlier discussion about regressions, it turns out that curl has only > supported this syntax since 2022 [https://curl.se/ch/7.84.0.html]. > Earlier versions of curl, if provided an http_proxy that contained a > path, would also ignore it. curl didn't seem to consider this a breaking > change when the feature was introduced > [https://github.com/curl/curl/pull/8668], though; is that a valid > argument for Git not to lose sleep over it either? IMHO it is probably OK. The path did not do anything before then, so why would anybody pass it? Looking into the curl support, I did notice two things: - unix sockets are only supported for socks proxies. Is it meaningful to have a path on an http proxy? I don't think so (there is no place to send it in the "GET" line). But perhaps we should limit the new code only to socks. - curl says you should put "localhost" in the host field for this, though obviously it is not otherwise used. Should we likewise require that to kick in the code? > @@ -1265,7 +1266,24 @@ static CURL *get_curl_handle(void) > if (!proxy_auth.host) > die("Invalid proxy URL '%s'", curl_http_proxy); > > - curl_easy_setopt(result, CURLOPT_PROXY, proxy_auth.host); > + strbuf_addstr(&proxy, proxy_auth.host); > + if (proxy_auth.path) { There's one gotcha here I wondered about: we usually throw away the path element, unless credential.useHTTPPath is set. That only happens after we load the per-credential config, though, via credential_apply_config(), which in turn is triggered by a call to credential_fill(), etc. I think this is OK here, since we have just called credential_from_url() ourselves, and the earliest credential_fill() we'd hit is from init_curl_proxy_auth(), which is called right after the code you're touching. > + int path_is_supported = 0; > + /* curl_version_info was added in curl 7.10 */ > +#if LIBCURL_VERSION_NUM >= 0x070a00 > + curl_version_info_data *ver = curl_version_info(CURLVERSION_NOW); > + path_is_supported = ver->version_num >= 0x075400; > +#endif We've been trying to keep our curl version #ifdefs in git-curl-compat.h, with human-readable names. But in this case, I think we can ditch it entirely, as we require curl 7.21.3 or later already. This will be the first time we check curl_version_info() instead of a build-time option. I think that makes sense here. Most features require extra information at build-time (e.g., CURLOPT_* constants), but in this case it is purely a check on the runtime behavior. We perhaps could get away with just letting an older curl quietly ignore the path field, but what you have here is probably a bit friendlier for users. > diff --git a/t/socks5-proxy-server/src/socks5.pl b/t/socks5-proxy-server/src/socks5.pl > new file mode 100755 > index 00000000000..3ef66a1a287 > --- /dev/null > +++ b/t/socks5-proxy-server/src/socks5.pl This is a lot more code than I was hoping for. There are definitely parts of it we don't care about (e.g, threading, handling multiple connections at once) that could be pared down a bit. I don't think we particularly care about auth either (we just want to make sure we talk to the unix-socket proxy at all). What if we switched to socks4, which drops all of the auth bits and supports only IPs, not hostnames (that came in socks4a). This is the shortest I could come up with (only lightly tested): -- >8 -- use strict; use IO::Select; use IO::Socket::UNIX; use IO::Socket::INET; my $path = shift; unlink($path); my $server = IO::Socket::UNIX->new(Listen => 1, Local => $path) or die "unable to listen on $path: $!"; $| = 1; print "ready to rumble\n"; while (my $client = $server->accept()) { sysread $client, my $buf, 8; my ($version, $cmd, $port, $ip) = unpack 'CCnN', $buf; next unless $version == 4; # socks4 next unless $cmd == 1; # TCP stream connection # skip NUL-terminated id while (sysread $client, my $char, 1) { last unless ord($char); } # version(0), reply(5a == granted), port (ignored), ip (ignored) syswrite $client, "\x00\x5a\x00\x00\x00\x00\x00\x00"; my $remote = IO::Socket::INET->new(PeerHost => $ip, PeerPort => $port) or die "unable to connect to $ip/$port: $!"; my $io = IO::Select->new($client, $remote); while ($io->count) { for my $fh ($io->can_read(0)) { for my $pair ([$client, $remote], [$remote, $client]) { my ($from, $to) = @$pair; next unless $fh == $from; my $r = sysread $from, my $buf, 1024; if (!defined $r || $r <= 0) { $io->remove($from); next; } syswrite $to, $buf; } } } } -- >8 -- That's still more than I'd like, but quite a bit smaller. I dunno. Probably the one you found is more battle-tested, but I really think we have pretty simple requirements. > diff --git a/t/t5564-http-proxy.sh b/t/t5564-http-proxy.sh > index bb35b87071d..bafaa31adf8 100755 > --- a/t/t5564-http-proxy.sh > +++ b/t/t5564-http-proxy.sh > @@ -39,4 +39,22 @@ test_expect_success 'clone can prompt for proxy password' ' > expect_askpass pass proxuser > ' > > +start_socks() { > + # The %30 tests that the correct amount of percent-encoding is applied > + # to the proxy string passed to curl. > + "$PERL_PATH" "$TEST_DIRECTORY/socks5-proxy-server/src/socks5.pl" \ > + "$TRASH_DIRECTORY/%30.sock" proxuser proxpass & > + socks_pid=$! > +} > + > +test_lazy_prereq LIBCURL_7_84 'expr x$(curl-config --vernum) \>= x075400' This is the first time we'd be relying on curl-config in the test suite. I suspect it would _usually_ work, but I'd worry about a few things: 1. The Makefile allows for a different path for $(CURL_CONFIG), or even skipping it entirely by setting $(CURLDIR). 2. We'd usually build and test in the same environment, but not necessarily. In particular, I know Windows uses separate CI jobs for this, and I'm not sure if curl-config will be around in the test jobs. I can see two paths forward: a. There's been recent discussion about adding an option for Git to report the run-time curl version. That doesn't exist yet, though "git version --build-options" will report the build-time version. That might be good enough for the purposes of testing a build. b. Write the test such that it expects the request to work _or_ we see the "version too old" failure. > +test_expect_success PERL,LIBCURL_7_84 'clone via Unix socket' ' I'm not sure if this PERL prereq is sufficient. We also need to know that we can make unix sockets. Probably we need to actually run the socks proxy and make sure it gets to the "starting..." line before accepting that it works. Which also gets us to... > + start_socks && > + test_when_finished "rm -rf clone && kill $socks_pid" && > + test_config_global http.proxy "socks5://proxuser:proxpass@localhost$PWD/%2530.sock" && > + GIT_TRACE_CURL=$PWD/trace git clone "$HTTPD_URL/smart/repo.git" clone && This is racy. We started the proxy in the background, but we don't know that it's ready to accept connections by the time we run "git clone". My first few runs all failed, but putting a "sleep 1" in between fixed it (which obviously is not a real fix, but just a debugging aid). If you usually see success, try running the test script with "--stress" to create extra load, and you should see failures. The usual trick here is to start the process in the background, and then in the foreground wait for some "I'm ready" output, which involves ugly tricks with fifos. There's prior art in lib-git-daemon.sh (though it's a little more complicated there, because we want to relay its stderr, too, but with a script under our control we can just put the ready message on stdout). So coupled with the prereq fix that I mentioned above, you might be able to do something like (totally untested): start_socks_proxy () { mkfifo socks_output && { perl proxy.pl ... >socks_output & socks_pid=$! } && read line <socks_output && test "$line" = "Starting..." } test_expect_success 'try to start socks proxy' ' if start_socks_proxy then test_seq_prereq SOCKS_PROXY fi ' test_expect_success SOCKS_PROXY 'clone via unix socket' ' ... ' -Peff ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v2] http: do not ignore proxy path 2024-07-29 20:09 ` Jeff King @ 2024-07-31 15:33 ` Ryan Hendrickson 0 siblings, 0 replies; 31+ messages in thread From: Ryan Hendrickson @ 2024-07-31 15:33 UTC (permalink / raw) To: Jeff King; +Cc: git At 2024-07-29 16:09-0400, Jeff King <peff@peff.net> sent: > Looking into the curl support, I did notice two things: > > - unix sockets are only supported for socks proxies. Is it meaningful > to have a path on an http proxy? I don't think so (there is no place > to send it in the "GET" line). But perhaps we should limit the new > code only to socks. > > - curl says you should put "localhost" in the host field for this, > though obviously it is not otherwise used. Should we likewise > require that to kick in the code? Sounds good, I've added checks and tests for both of these cases. >> @@ -1265,7 +1266,24 @@ static CURL *get_curl_handle(void) >> if (!proxy_auth.host) >> die("Invalid proxy URL '%s'", curl_http_proxy); >> >> - curl_easy_setopt(result, CURLOPT_PROXY, proxy_auth.host); >> + strbuf_addstr(&proxy, proxy_auth.host); >> + if (proxy_auth.path) { > > There's one gotcha here I wondered about: we usually throw away the path > element, unless credential.useHTTPPath is set. That only happens after > we load the per-credential config, though, via credential_apply_config(), > which in turn is triggered by a call to credential_fill(), etc. > > I think this is OK here, since we have just called credential_from_url() > ourselves, and the earliest credential_fill() we'd hit is from > init_curl_proxy_auth(), which is called right after the code you're > touching. Yes, good lookout but I don't think this is a problem either. >> + int path_is_supported = 0; >> + /* curl_version_info was added in curl 7.10 */ >> +#if LIBCURL_VERSION_NUM >= 0x070a00 >> + curl_version_info_data *ver = curl_version_info(CURLVERSION_NOW); >> + path_is_supported = ver->version_num >= 0x075400; >> +#endif > > We've been trying to keep our curl version #ifdefs in git-curl-compat.h, > with human-readable names. But in this case, I think we can ditch it > entirely, as we require curl 7.21.3 or later already. Ah, okay, that is good to know! I'll remove the #if. > This will be the first time we check curl_version_info() instead of a > build-time option. I think that makes sense here. Most features require > extra information at build-time (e.g., CURLOPT_* constants), but in this > case it is purely a check on the runtime behavior. > > We perhaps could get away with just letting an older curl quietly ignore > the path field, but what you have here is probably a bit friendlier for > users. Yes, curl has a nasty tendency to quietly ignore a lot of things. I didn't want users to be uncertain about whether they were using the feature incorrectly if an older curl was the actual issue. > What if we switched to socks4, which drops all of the auth bits and > supports only IPs, not hostnames (that came in socks4a). This is the > shortest I could come up with (only lightly tested): Ah, many thanks! I like this much better, and I'm not proficient enough with Perl to have written it myself. > I can see two paths forward: > > a. There's been recent discussion about adding an option for Git to > report the run-time curl version. That doesn't exist yet, though > "git version --build-options" will report the build-time version. > That might be good enough for the purposes of testing a build. > > b. Write the test such that it expects the request to work _or_ we see > the "version too old" failure. Got it, I'll go with option b. > If you usually see success, try running the test script with "--stress" > to create extra load, and you should see failures. Huh. I never saw any race issues on my machine even with --stress, but your approach is safer so I'm running with it. Thank you! R ^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH v3] http: do not ignore proxy path 2024-07-27 6:44 ` [PATCH v2] " Ryan Hendrickson via GitGitGadget 2024-07-29 20:09 ` Jeff King @ 2024-07-31 16:01 ` Ryan Hendrickson via GitGitGadget 2024-07-31 22:24 ` Junio C Hamano 2024-08-01 5:22 ` [PATCH v4] " Ryan Hendrickson via GitGitGadget 1 sibling, 2 replies; 31+ messages in thread From: Ryan Hendrickson via GitGitGadget @ 2024-07-31 16:01 UTC (permalink / raw) To: git; +Cc: Ryan Hendrickson, Jeff King, Ryan Hendrickson, Ryan Hendrickson From: Ryan Hendrickson <ryan.hendrickson@alum.mit.edu> The documentation for `http.proxy` describes that option, and the environment variables it overrides, as supporting "the syntax understood by curl". curl allows SOCKS proxies to use a path to a Unix domain socket, like `socks5h://localhost/path/to/socket.sock`. Git should therefore include, if present, the path part of the proxy URL in what it passes to libcurl. Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Ryan Hendrickson <ryan.hendrickson@alum.mit.edu> --- http: do not ignore proxy path Re earlier discussion about regressions, it turns out that curl has only supported this syntax since 2022 [https://curl.se/ch/7.84.0.html]. Earlier versions of curl, if provided an http_proxy that contained a path, would also ignore it. curl didn't seem to consider this a breaking change when the feature was introduced [https://github.com/curl/curl/pull/8668], though; is that a valid argument for Git not to lose sleep over it either? A test has been added, but unfortunately it is not being run on any of GitHub's CI workflows, because the runner images in use all have versions of libcurl that are too old. It works on my machine, but unknown troubles may await someone else. Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1767%2Frhendric%2Frhendric%2Fhttp-proxy-path-v3 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1767/rhendric/rhendric/http-proxy-path-v3 Pull-Request: https://github.com/gitgitgadget/git/pull/1767 Range-diff vs v2: 1: b4715eba5b1 ! 1: 7a58da7102e http: do not ignore proxy path @@ Commit message therefore include, if present, the path part of the proxy URL in what it passes to libcurl. + Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Ryan Hendrickson <ryan.hendrickson@alum.mit.edu> ## Documentation/config/http.txt ## @@ http.c: static CURL *get_curl_handle(void) - curl_easy_setopt(result, CURLOPT_PROXY, proxy_auth.host); + strbuf_addstr(&proxy, proxy_auth.host); + if (proxy_auth.path) { -+ int path_is_supported = 0; -+ /* curl_version_info was added in curl 7.10 */ -+#if LIBCURL_VERSION_NUM >= 0x070a00 + curl_version_info_data *ver = curl_version_info(CURLVERSION_NOW); -+ path_is_supported = ver->version_num >= 0x075400; -+#endif -+ if (path_is_supported) { -+ strbuf_addch(&proxy, '/'); -+ strbuf_add_percentencode(&proxy, proxy_auth.path, 0); -+ } else { ++ if (ver->version_num < 0x075400) + die("libcurl 7.84 or later is required to support paths in proxy URLs"); -+ } ++ ++ if (!starts_with(proxy_auth.protocol, "socks")) ++ die("Invalid proxy URL '%s': only SOCKS proxies support paths", ++ curl_http_proxy); ++ ++ if (strcasecmp(proxy_auth.host, "localhost")) ++ die("Invalid proxy URL '%s': host must be localhost if a path is present", ++ curl_http_proxy); ++ ++ strbuf_addch(&proxy, '/'); ++ strbuf_add_percentencode(&proxy, proxy_auth.path, 0); + } + curl_easy_setopt(result, CURLOPT_PROXY, proxy.buf); + strbuf_release(&proxy); @@ http.c: static CURL *get_curl_handle(void) var_override(&curl_no_proxy, getenv("no_proxy")); curl_easy_setopt(result, CURLOPT_NOPROXY, curl_no_proxy); - ## t/socks5-proxy-server/LICENSE (new) ## -@@ -+MIT License -+ -+Copyright (c) 2014 kaimi.io -+ -+Permission is hereby granted, free of charge, to any person obtaining a copy -+of this software and associated documentation files (the "Software"), to deal -+in the Software without restriction, including without limitation the rights -+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell -+copies of the Software, and to permit persons to whom the Software is -+furnished to do so, subject to the following conditions: -+ -+The above copyright notice and this permission notice shall be included in all -+copies or substantial portions of the Software. -+ -+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR -+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, -+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE -+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER -+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, -+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE -+SOFTWARE. - - ## t/socks5-proxy-server/README.md (new) ## -@@ -+socks5-proxy-server -+=================== -+ -+This Perl code implements a SOCKS5 proxy server that listens for incoming connections and processes them in separate threads. The server takes three input parameters: `path`, `login`, and `password`. -+ -+When a client attempts to connect to the server, the server checks if the client supports any of the available authentication methods (no authentication or login/password authentication). If a suitable method is found, the server establishes a connection with the target server and begins forwarding data between the client and the target server. -+ -+The code uses the `IO::Select` module for working with sockets and the `threads` module for creating and managing threads. It includes several functions, including: -+ -+- `main`: the main function that creates threads for processing incoming connections. -+- `replenish`: a function that creates additional threads if the number of free threads is less than the specified value. -+- `new_client`: a function that handles incoming client connections and checks if the available authentication methods are supported by the client. -+- `socks_auth`: a function that performs login and password authentication. -+- `handle_client`: a function that establishes a connection with the target server and forwards data between the client and the target server. -+ -+This code includes the use of the following Perl modules: `IO::Select`, `Socket`, `threads`, `threads::shared`. -+ -+To run this code, enter the following command: -+`perl socks5.pl path login password` -+ -+Note that this code is designed for educational purposes and should not be used in production environments without proper modifications and security measures. 😊 -+ -+This code is distributed under the [MIT License](LICENSE), which allows for free use, modification, and distribution of the code as long as the original copyright notice and license are included. -+ -+--- -+ -+Originally from: https://github.com/kaimi-io/socks5-proxy-server - - ## t/socks5-proxy-server/src/socks5.pl (new) ## + ## t/socks4-proxy.pl (new) ## @@ -+#!/usr/bin/perl -+#В коде использованы части rdss (.Slip) и sss (hm2k) +use strict; +use IO::Select; -+use Socket; -+use threads; -+use threads::shared; ++use IO::Socket::UNIX; ++use IO::Socket::INET; + -+if(scalar @ARGV < 1) -+{ -+ die "Usage: serv.pl path login pass\n"; -+} ++my $path = shift; + -+my ($path, $login, $passw) = @ARGV; -+ -+my %config = (thr_init => 10, thr_min => 2, thr_max => 20, conn_limit => SOMAXCONN); ++unlink($path); ++my $server = IO::Socket::UNIX->new(Listen => 1, Local => $path) ++ or die "unable to listen on $path: $!"; + +$| = 1; ++print "ready\n"; + -+my %status : shared; -+my $accept : shared; -+ -+my $sock_addr = sockaddr_un $path or die $!; -+socket my $socket, PF_UNIX, SOCK_STREAM, 0; -+bind $socket, $sock_addr or die $!; -+listen $socket, $config{'conn_limit'} or die $!; -+ -+ -+my $sel = IO::Select->new($socket); -+ -+print "Starting...\n"; -+ -+replenish($config{'thr_init'}); -+ -+while(1) -+{ -+ lock %status; -+ cond_wait %status; ++while (my $client = $server->accept()) { ++ sysread $client, my $buf, 8; ++ my ($version, $cmd, $port, $ip) = unpack 'CCnN', $buf; ++ next unless $version == 4; # socks4 ++ next unless $cmd == 1; # TCP stream connection + -+ my @free = sort {$a <=> $b} grep {$status{$_} eq 'free'} keys %status; -+ -+ if(scalar @free < $config{'thr_min'}) -+ { -+ replenish($#free - $config{'thr_min'}); -+ } -+ elsif(scalar @free > $config{'thr_max'}) -+ { -+ my @kill = @free[0..$#free - $config{'thr_max'}]; -+ status($_ => 'kill') for @kill; ++ # skip NUL-terminated id ++ while (sysread $client, my $char, 1) { ++ last unless ord($char); + } -+} -+ -+status($_ => 'kill') for keys %status; -+ -+sub main -+{ -+ my $sock = shift; -+ my $loop = 50; -+ -+ my $tid = threads->tid; -+ my $conn; -+ -+ threads->self->detach; -+ status($tid, 'free'); -+ -+ while(status($tid) ne 'kill' && $loop > 0) -+ { -+ next unless $sel->can_read(.1); -+ { -+ lock $accept; -+ next unless accept $conn, $sock; -+ } -+ $loop--; -+ status($tid => 'work'); -+ new_client($conn); -+ close $conn; -+ status($tid => 'free'); -+ } -+ -+ status($tid, 'kill'); -+} -+ -+sub status -+{ -+ my ($tid, $state) = @_; -+ lock %status; -+ -+ return $status{$tid} unless $state; -+ if($state) -+ { -+ $status{$tid} = $state unless defined $status{$tid} and $status{$tid} eq 'kill'; -+ } -+ else -+ { -+ delete $status{$tid}; -+ } -+ -+ cond_broadcast %status; -+} -+ -+sub replenish -+{ -+ threads->create(\&main, $socket) for 1..$_[0]; -+} -+ -+sub new_client -+{ -+ my $sock = shift; -+ -+ sysread $sock, my $buf, 2; -+ -+ my ($ver, $auth_num) = unpack('H2H2', $buf); -+ #Версия протокола -+ return unless $ver eq '05'; -+ sysread $sock, $buf, $auth_num; + -+ my $ok = 0; -+ #Перечисление методов авторизации -+ for(my ($mode, $i) = (0, 0); $i < $auth_num; $mode = ord substr $buf, ++$i, 1) -+ { -+ #0 - Без авторизации; 2 - Username/Password -+ if($mode == 0 && !$login) -+ { -+ syswrite $sock, "\x05\x00"; -+ $ok++; -+ last; -+ } -+ elsif($mode == 2 && $login) -+ { -+ return unless socks_auth($sock); -+ $ok++; -+ last; -+ } -+ } -+ #Подходящие методы есть -+ if($ok) -+ { -+ sysread $sock, $buf, 3; -+ my ($ver, $cmd, $r) = unpack 'H2H2H2', $buf; -+ -+ if($ver eq '05' && $r eq '00') -+ { -+ my ($client_host, $client_host_raw, $client_port, $client_port_raw) = get_conn_info($sock); -+ return unless ($client_host || $client_port); -+ -+ syswrite $sock, "\x05\x00\x00".$client_host_raw.$client_port_raw; -+ handle_client($sock, $client_host, $client_port, $cmd); -+ } -+ } -+ else -+ { -+ syswrite $sock, "\x05\xFF"; -+ } -+} -+ -+sub socks_auth -+{ -+ my $sock = shift; -+ -+ syswrite $sock, "\x05\x02"; -+ sysread $sock, my $buf, 1; -+ -+ if(ord $buf == 1) -+ { -+ #username length : username : password length : password -+ sysread $sock, $buf, 1; -+ sysread $sock, my $s_login, ord($buf); -+ sysread $sock, $buf, 1; -+ sysread $sock, my $s_passw, ord($buf); -+ -+ #0x00 = success; any other value = failure -+ if($login eq $s_login && $passw eq $s_passw) -+ { -+ syswrite $sock, "\x05\x00"; -+ return 1; -+ } -+ else -+ { -+ syswrite $sock, "\x05\x01"; -+ } -+ } -+ -+ return 0; -+} -+ -+sub handle_client -+{ -+ my ($sock, $host, $port, $cmd) = @_; -+ -+ #0x01 = establish a TCP/IP stream connection -+ if($cmd == 1) -+ { -+ my $sock_addr = sockaddr_in $port, inet_aton($host) or return; -+ socket my $target, PF_INET, SOCK_STREAM, getprotobyname('tcp') or return; -+ connect $target, $sock_addr or return; -+ -+ while($sock || $target) -+ { -+ my ($rin, $cbuf, $tbuf, $rout, $eout) = ('', '', '', '', ''); -+ vec($rin, fileno($sock), 1) = 1 if $sock; -+ vec($rin, fileno($target), 1) = 1 if $target; -+ select($rout = $rin, undef, $eout = $rin, 120); -+ return if(!$rout && !$eout); -+ -+ if($sock && (vec($eout, fileno($sock), 1) || vec($rout, fileno($sock), 1))) -+ { -+ my $res = sysread $sock, $tbuf, 1024; -+ return if(!defined $res || !$res); -+ } -+ if($target && (vec($eout, fileno($target), 1) || vec($rout, fileno($target), 1))) -+ { -+ my $res = sysread $target, $cbuf, 1024; -+ return if(!defined $res || !$res); -+ } -+ while(my $len = length($tbuf)) -+ { -+ my $r = syswrite $target, $tbuf, $len; -+ return if(!defined $r || $r <= 0); -+ $tbuf = substr($tbuf, $r); -+ } -+ while(my $len = length($cbuf)) -+ { -+ my $r = syswrite $sock, $cbuf, $len; -+ return if(!defined $r || $r <= 0); -+ $cbuf = substr($cbuf, $r); ++ # version(0), reply(5a == granted), port (ignored), ip (ignored) ++ syswrite $client, "\x00\x5a\x00\x00\x00\x00\x00\x00"; ++ ++ my $remote = IO::Socket::INET->new(PeerHost => $ip, PeerPort => $port) ++ or die "unable to connect to $ip/$port: $!"; ++ ++ my $io = IO::Select->new($client, $remote); ++ while ($io->count) { ++ for my $fh ($io->can_read(0)) { ++ for my $pair ([$client, $remote], [$remote, $client]) { ++ my ($from, $to) = @$pair; ++ next unless $fh == $from; ++ ++ my $r = sysread $from, my $buf, 1024; ++ if (!defined $r || $r <= 0) { ++ $io->remove($from); ++ next; ++ } ++ syswrite $to, $buf; + } + } + } -+} -+ -+sub get_conn_info -+{ -+ my $sock = shift; -+ -+ my ($host, $raw_host) = ('', ''); -+ sysread $sock, my $buf, 1; -+ ($raw_host, $buf) = ($buf, ord $buf); -+ #0x01 = IPv4 address; 0x03 = Domain name -+ if($buf == 1) -+ { -+ #4 bytes for IPv4 address -+ sysread $sock, $buf, 4; -+ $raw_host .= $buf; -+ $host = join '.', map(ord, split //, $buf); -+ } -+ elsif($buf == 3) -+ { -+ #1 byte of name length followed by the name for Domain name -+ sysread $sock, $buf, 1; -+ sysread $sock, $host, ord($buf); -+ $raw_host .= $host; -+ } -+ -+ #port number in a network byte order, 2 bytes -+ my ($port, $raw_port) = ('', ''); -+ sysread $sock, $raw_port, 2; -+ $port = ord(substr($raw_port, 0, 1)) << 8 | ord(substr($raw_port, 1, 1)); -+ -+ return $host, $raw_host, $port, $raw_port; +} ## t/t5564-http-proxy.sh ## @@ t/t5564-http-proxy.sh: test_expect_success 'clone can prompt for proxy password' ' +start_socks() { ++ mkfifo socks_output && ++ { ++ "$PERL_PATH" "$TEST_DIRECTORY/socks4-proxy.pl" "$1" >socks_output & ++ socks_pid=$! ++ } && ++ read line <socks_output && ++ test "$line" = ready ++} ++ ++test_expect_success PERL 'try to start SOCKS proxy' ' + # The %30 tests that the correct amount of percent-encoding is applied + # to the proxy string passed to curl. -+ "$PERL_PATH" "$TEST_DIRECTORY/socks5-proxy-server/src/socks5.pl" \ -+ "$TRASH_DIRECTORY/%30.sock" proxuser proxpass & -+ socks_pid=$! -+} ++ if start_socks %30.sock ++ then ++ test_set_prereq SOCKS_PROXY ++ fi ++' ++ ++test_expect_success SOCKS_PROXY 'clone via Unix socket' ' ++ test_when_finished "rm -rf clone" && ++ test_config_global http.proxy "socks4://localhost$PWD/%2530.sock" && { ++ { ++ GIT_TRACE_CURL=$PWD/trace git clone "$HTTPD_URL/smart/repo.git" clone 2>err && ++ grep -i "SOCKS4 request granted." trace ++ } || ++ grep "^fatal: libcurl 7\.84 or later" err ++ } ++' + -+test_lazy_prereq LIBCURL_7_84 'expr x$(curl-config --vernum) \>= x075400' ++test_expect_success SOCKS_PROXY 'stop SOCKS proxy' 'kill "$socks_pid"' + -+test_expect_success PERL,LIBCURL_7_84 'clone via Unix socket' ' -+ start_socks && -+ test_when_finished "rm -rf clone && kill $socks_pid" && -+ test_config_global http.proxy "socks5://proxuser:proxpass@localhost$PWD/%2530.sock" && -+ GIT_TRACE_CURL=$PWD/trace git clone "$HTTPD_URL/smart/repo.git" clone && -+ grep -i "SOCKS5 request granted." trace ++test_expect_success 'Unix socket requires socks*:' ' ++ ! git clone -c http.proxy=localhost/path https://example.com/repo.git 2>err && { ++ grep "^fatal: Invalid proxy URL '\''localhost/path'\'': only SOCKS proxies support paths" err || ++ grep "^fatal: libcurl 7\.84 or later" err ++ } ++' ++ ++test_expect_success 'Unix socket requires localhost' ' ++ ! git clone -c http.proxy=socks4://127.0.0.1/path https://example.com/repo.git 2>err && { ++ grep "^fatal: Invalid proxy URL '\''socks4://127\.0\.0\.1/path'\'': host must be localhost if a path is present" err || ++ grep "^fatal: libcurl 7\.84 or later" err ++ } +' + test_done Documentation/config/http.txt | 4 +-- http.c | 22 +++++++++++++++- t/socks4-proxy.pl | 48 +++++++++++++++++++++++++++++++++++ t/t5564-http-proxy.sh | 46 +++++++++++++++++++++++++++++++++ 4 files changed, 117 insertions(+), 3 deletions(-) create mode 100644 t/socks4-proxy.pl diff --git a/Documentation/config/http.txt b/Documentation/config/http.txt index 162b33fc52f..a14371b5c96 100644 --- a/Documentation/config/http.txt +++ b/Documentation/config/http.txt @@ -5,8 +5,8 @@ http.proxy:: proxy string with a user name but no password, in which case git will attempt to acquire one in the same way it does for other credentials. See linkgit:gitcredentials[7] for more information. The syntax thus is - '[protocol://][user[:password]@]proxyhost[:port]'. This can be overridden - on a per-remote basis; see remote.<name>.proxy + '[protocol://][user[:password]@]proxyhost[:port][/path]'. This can be + overridden on a per-remote basis; see remote.<name>.proxy + Any proxy, however configured, must be completely transparent and must not modify, transform, or buffer the request or response in any way. Proxies which diff --git a/http.c b/http.c index 623ed234891..a50ba095889 100644 --- a/http.c +++ b/http.c @@ -1227,6 +1227,7 @@ static CURL *get_curl_handle(void) */ curl_easy_setopt(result, CURLOPT_PROXY, ""); } else if (curl_http_proxy) { + struct strbuf proxy = STRBUF_INIT; if (starts_with(curl_http_proxy, "socks5h")) curl_easy_setopt(result, CURLOPT_PROXYTYPE, CURLPROXY_SOCKS5_HOSTNAME); @@ -1265,7 +1266,26 @@ static CURL *get_curl_handle(void) if (!proxy_auth.host) die("Invalid proxy URL '%s'", curl_http_proxy); - curl_easy_setopt(result, CURLOPT_PROXY, proxy_auth.host); + strbuf_addstr(&proxy, proxy_auth.host); + if (proxy_auth.path) { + curl_version_info_data *ver = curl_version_info(CURLVERSION_NOW); + if (ver->version_num < 0x075400) + die("libcurl 7.84 or later is required to support paths in proxy URLs"); + + if (!starts_with(proxy_auth.protocol, "socks")) + die("Invalid proxy URL '%s': only SOCKS proxies support paths", + curl_http_proxy); + + if (strcasecmp(proxy_auth.host, "localhost")) + die("Invalid proxy URL '%s': host must be localhost if a path is present", + curl_http_proxy); + + strbuf_addch(&proxy, '/'); + strbuf_add_percentencode(&proxy, proxy_auth.path, 0); + } + curl_easy_setopt(result, CURLOPT_PROXY, proxy.buf); + strbuf_release(&proxy); + var_override(&curl_no_proxy, getenv("NO_PROXY")); var_override(&curl_no_proxy, getenv("no_proxy")); curl_easy_setopt(result, CURLOPT_NOPROXY, curl_no_proxy); diff --git a/t/socks4-proxy.pl b/t/socks4-proxy.pl new file mode 100644 index 00000000000..4c3a35c0083 --- /dev/null +++ b/t/socks4-proxy.pl @@ -0,0 +1,48 @@ +use strict; +use IO::Select; +use IO::Socket::UNIX; +use IO::Socket::INET; + +my $path = shift; + +unlink($path); +my $server = IO::Socket::UNIX->new(Listen => 1, Local => $path) + or die "unable to listen on $path: $!"; + +$| = 1; +print "ready\n"; + +while (my $client = $server->accept()) { + sysread $client, my $buf, 8; + my ($version, $cmd, $port, $ip) = unpack 'CCnN', $buf; + next unless $version == 4; # socks4 + next unless $cmd == 1; # TCP stream connection + + # skip NUL-terminated id + while (sysread $client, my $char, 1) { + last unless ord($char); + } + + # version(0), reply(5a == granted), port (ignored), ip (ignored) + syswrite $client, "\x00\x5a\x00\x00\x00\x00\x00\x00"; + + my $remote = IO::Socket::INET->new(PeerHost => $ip, PeerPort => $port) + or die "unable to connect to $ip/$port: $!"; + + my $io = IO::Select->new($client, $remote); + while ($io->count) { + for my $fh ($io->can_read(0)) { + for my $pair ([$client, $remote], [$remote, $client]) { + my ($from, $to) = @$pair; + next unless $fh == $from; + + my $r = sysread $from, my $buf, 1024; + if (!defined $r || $r <= 0) { + $io->remove($from); + next; + } + syswrite $to, $buf; + } + } + } +} diff --git a/t/t5564-http-proxy.sh b/t/t5564-http-proxy.sh index bb35b87071d..7fcffba67a2 100755 --- a/t/t5564-http-proxy.sh +++ b/t/t5564-http-proxy.sh @@ -39,4 +39,50 @@ test_expect_success 'clone can prompt for proxy password' ' expect_askpass pass proxuser ' +start_socks() { + mkfifo socks_output && + { + "$PERL_PATH" "$TEST_DIRECTORY/socks4-proxy.pl" "$1" >socks_output & + socks_pid=$! + } && + read line <socks_output && + test "$line" = ready +} + +test_expect_success PERL 'try to start SOCKS proxy' ' + # The %30 tests that the correct amount of percent-encoding is applied + # to the proxy string passed to curl. + if start_socks %30.sock + then + test_set_prereq SOCKS_PROXY + fi +' + +test_expect_success SOCKS_PROXY 'clone via Unix socket' ' + test_when_finished "rm -rf clone" && + test_config_global http.proxy "socks4://localhost$PWD/%2530.sock" && { + { + GIT_TRACE_CURL=$PWD/trace git clone "$HTTPD_URL/smart/repo.git" clone 2>err && + grep -i "SOCKS4 request granted." trace + } || + grep "^fatal: libcurl 7\.84 or later" err + } +' + +test_expect_success SOCKS_PROXY 'stop SOCKS proxy' 'kill "$socks_pid"' + +test_expect_success 'Unix socket requires socks*:' ' + ! git clone -c http.proxy=localhost/path https://example.com/repo.git 2>err && { + grep "^fatal: Invalid proxy URL '\''localhost/path'\'': only SOCKS proxies support paths" err || + grep "^fatal: libcurl 7\.84 or later" err + } +' + +test_expect_success 'Unix socket requires localhost' ' + ! git clone -c http.proxy=socks4://127.0.0.1/path https://example.com/repo.git 2>err && { + grep "^fatal: Invalid proxy URL '\''socks4://127\.0\.0\.1/path'\'': host must be localhost if a path is present" err || + grep "^fatal: libcurl 7\.84 or later" err + } +' + test_done base-commit: 39bf06adf96da25b87c9aa7d35a32ef3683eb4a4 -- gitgitgadget ^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [PATCH v3] http: do not ignore proxy path 2024-07-31 16:01 ` [PATCH v3] " Ryan Hendrickson via GitGitGadget @ 2024-07-31 22:24 ` Junio C Hamano 2024-08-01 3:44 ` Ryan Hendrickson 2024-08-01 5:45 ` Jeff King 2024-08-01 5:22 ` [PATCH v4] " Ryan Hendrickson via GitGitGadget 1 sibling, 2 replies; 31+ messages in thread From: Junio C Hamano @ 2024-07-31 22:24 UTC (permalink / raw) To: Ryan Hendrickson via GitGitGadget; +Cc: git, Ryan Hendrickson, Jeff King "Ryan Hendrickson via GitGitGadget" <gitgitgadget@gmail.com> writes: > @@ -5,8 +5,8 @@ http.proxy:: > proxy string with a user name but no password, in which case git will > attempt to acquire one in the same way it does for other credentials. See > linkgit:gitcredentials[7] for more information. The syntax thus is > - '[protocol://][user[:password]@]proxyhost[:port]'. This can be overridden > - on a per-remote basis; see remote.<name>.proxy > + '[protocol://][user[:password]@]proxyhost[:port][/path]'. This can be > + overridden on a per-remote basis; see remote.<name>.proxy OK. > diff --git a/http.c b/http.c > index 623ed234891..a50ba095889 100644 > --- a/http.c > +++ b/http.c > @@ -1227,6 +1227,7 @@ static CURL *get_curl_handle(void) > */ > curl_easy_setopt(result, CURLOPT_PROXY, ""); > } else if (curl_http_proxy) { > + struct strbuf proxy = STRBUF_INIT; > if (starts_with(curl_http_proxy, "socks5h")) > curl_easy_setopt(result, > CURLOPT_PROXYTYPE, CURLPROXY_SOCKS5_HOSTNAME); In a block with local variable decl, be more friendly to readers by having a blank line between the end of declarations and the first statement. > + strbuf_addstr(&proxy, proxy_auth.host); > + if (proxy_auth.path) { > + curl_version_info_data *ver = curl_version_info(CURLVERSION_NOW); > + if (ver->version_num < 0x075400) > + die("libcurl 7.84 or later is required to support paths in proxy URLs"); > + > + if (!starts_with(proxy_auth.protocol, "socks")) > + die("Invalid proxy URL '%s': only SOCKS proxies support paths", > + curl_http_proxy); > + > + if (strcasecmp(proxy_auth.host, "localhost")) > + die("Invalid proxy URL '%s': host must be localhost if a path is present", > + curl_http_proxy); We insist that it must be "localhost", so let's not do strcasecmp() but just do strcmp(). > diff --git a/t/t5564-http-proxy.sh b/t/t5564-http-proxy.sh > index bb35b87071d..7fcffba67a2 100755 > --- a/t/t5564-http-proxy.sh > +++ b/t/t5564-http-proxy.sh > @@ -39,4 +39,50 @@ test_expect_success 'clone can prompt for proxy password' ' > expect_askpass pass proxuser > ' > > +start_socks() { > + mkfifo socks_output && > + { > + "$PERL_PATH" "$TEST_DIRECTORY/socks4-proxy.pl" "$1" >socks_output & > + socks_pid=$! > + } && > + read line <socks_output && > + test "$line" = ready > +} > + > +test_expect_success PERL 'try to start SOCKS proxy' ' > + # The %30 tests that the correct amount of percent-encoding is applied > + # to the proxy string passed to curl. > + if start_socks %30.sock > + then > + test_set_prereq SOCKS_PROXY > + fi > +' Making it a regular test_expect_success would mean GIT_SKIP_TEST mechansim can be used to skip it, which is probably not what you want. Can't this be a more common test_lazy_prereq, perhaps like test_lazy_prereq SOCKS_PROXY ' # useful comment about 30% here ... test_have_prereq PERL && start_socks %30.sock ' or something? Thanks. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v3] http: do not ignore proxy path 2024-07-31 22:24 ` Junio C Hamano @ 2024-08-01 3:44 ` Ryan Hendrickson 2024-08-01 5:21 ` Junio C Hamano 2024-08-01 5:45 ` Jeff King 1 sibling, 1 reply; 31+ messages in thread From: Ryan Hendrickson @ 2024-08-01 3:44 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Jeff King [-- Attachment #1: Type: text/plain, Size: 1662 bytes --] At 2024-07-31 15:24-0700, Junio C Hamano <gitster@pobox.com> sent: > In a block with local variable decl, be more friendly to readers by > having a blank line between the end of declarations and the first > statement. Will do. > We insist that it must be "localhost", so let's not do strcasecmp() > but just do strcmp(). I don't see the wisdom of being more restrictive than curl is in this respect. The documentation makes the claim that curl's syntax is what this option supports, and while that is not literally true due to how protocols are handled, I don't see a reason to intentionally deviate further. I've tested that curl does the right thing with any casing of localhost, both on its own and via Git. Host names are generally case insensitive; the error message should be understood in that context. And if it is misunderstood, there's no negative impact on the user who writes "localhost" (while there is a negative impact on the user who quite reasonably expects "LOCALHOST" to work if we don't follow curl's lead). > Making it a regular test_expect_success would mean GIT_SKIP_TEST > mechansim can be used to skip it, which is probably not what you > want. Can't this be a more common test_lazy_prereq, perhaps like > > test_lazy_prereq SOCKS_PROXY ' > # useful comment about 30% here ... > test_have_prereq PERL && > start_socks %30.sock > ' > > or something? This existing test file is definitely not written with running individual tests in mind; the first test is a prerequisite for all that comes after, and the second test seems to be required for tests 3–5 as well. But sure, I'll use that pattern if you like. R ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v3] http: do not ignore proxy path 2024-08-01 3:44 ` Ryan Hendrickson @ 2024-08-01 5:21 ` Junio C Hamano 0 siblings, 0 replies; 31+ messages in thread From: Junio C Hamano @ 2024-08-01 5:21 UTC (permalink / raw) To: Ryan Hendrickson; +Cc: git, Jeff King Ryan Hendrickson <ryan.hendrickson@alum.mit.edu> writes: >> We insist that it must be "localhost", so let's not do strcasecmp() >> but just do strcmp(). > > I don't see the wisdom of being more restrictive than curl is in this > respect. Ah, if curl is doing case insensitivity, then matching its behaviour is perfectly fine. I was just reacting to this message ... > + die("Invalid proxy URL '%s': host must be localhost if a path is present", ... that results when strcasecmp() does not like it. Thanks. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v3] http: do not ignore proxy path 2024-07-31 22:24 ` Junio C Hamano 2024-08-01 3:44 ` Ryan Hendrickson @ 2024-08-01 5:45 ` Jeff King 2024-08-01 14:40 ` Junio C Hamano 1 sibling, 1 reply; 31+ messages in thread From: Jeff King @ 2024-08-01 5:45 UTC (permalink / raw) To: Junio C Hamano; +Cc: Ryan Hendrickson via GitGitGadget, git, Ryan Hendrickson On Wed, Jul 31, 2024 at 03:24:01PM -0700, Junio C Hamano wrote: > > +start_socks() { > > + mkfifo socks_output && > > + { > > + "$PERL_PATH" "$TEST_DIRECTORY/socks4-proxy.pl" "$1" >socks_output & > > + socks_pid=$! > > + } && > > + read line <socks_output && > > + test "$line" = ready > > +} > > + > > +test_expect_success PERL 'try to start SOCKS proxy' ' > > + # The %30 tests that the correct amount of percent-encoding is applied > > + # to the proxy string passed to curl. > > + if start_socks %30.sock > > + then > > + test_set_prereq SOCKS_PROXY > > + fi > > +' > > Making it a regular test_expect_success would mean GIT_SKIP_TEST > mechansim can be used to skip it, which is probably not what you > want. Can't this be a more common test_lazy_prereq, perhaps like > > test_lazy_prereq SOCKS_PROXY ' > # useful comment about 30% here ... > test_have_prereq PERL && > start_socks %30.sock > ' > > or something? I think Ryan picked up this approach from my earlier mail. And the reason I suggested doing it this way is that the prereq test has an important side effect: it creates the socket and starts the proxy server! I think lazy prereqs only make sense when their only output is the yes/no of whether we match the prereq. And then we don't care when or how often they are evaluated. I do think things would work, assuming the proxy server then can serve multiple tests. It just feels weird, and doing it more like the git-daemon/http tests made more sense to me (though those ones do not even do their work inside an expect block). If we did it in a lazy prereq I think you'd need to munge the path, as the lazy prereq operates in a throwaway directory (so the %30.sock would get removed before we can use it in the next test). -Peff ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v3] http: do not ignore proxy path 2024-08-01 5:45 ` Jeff King @ 2024-08-01 14:40 ` Junio C Hamano 0 siblings, 0 replies; 31+ messages in thread From: Junio C Hamano @ 2024-08-01 14:40 UTC (permalink / raw) To: Jeff King; +Cc: Ryan Hendrickson via GitGitGadget, git, Ryan Hendrickson Jeff King <peff@peff.net> writes: > I think Ryan picked up this approach from my earlier mail. And the > reason I suggested doing it this way is that the prereq test has an > important side effect: it creates the socket and starts the proxy > server! Ah, OK. > I think lazy prereqs only make sense when their only output is the > yes/no of whether we match the prereq. And then we don't care when or > how often they are evaluated. Once the prereq test is attempted the result is cached (that's the whole point of lazy prereq mechanism) so we won't see multiple sock proxies spawned. > I do think things would work, assuming the > proxy server then can serve multiple tests. It just feels weird, and > doing it more like the git-daemon/http tests made more sense to me > (though those ones do not even do their work inside an expect block). OK. That's fine. It is probably not a good fit for the pattern. > If we did it in a lazy prereq I think you'd need to munge the path, as > the lazy prereq operates in a throwaway directory (so the %30.sock would > get removed before we can use it in the next test). True. ^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH v4] http: do not ignore proxy path 2024-07-31 16:01 ` [PATCH v3] " Ryan Hendrickson via GitGitGadget 2024-07-31 22:24 ` Junio C Hamano @ 2024-08-01 5:22 ` Ryan Hendrickson via GitGitGadget 2024-08-01 6:04 ` Jeff King 2024-08-02 5:20 ` [PATCH v5] " Ryan Hendrickson via GitGitGadget 1 sibling, 2 replies; 31+ messages in thread From: Ryan Hendrickson via GitGitGadget @ 2024-08-01 5:22 UTC (permalink / raw) To: git; +Cc: Ryan Hendrickson, Jeff King, Ryan Hendrickson, Ryan Hendrickson From: Ryan Hendrickson <ryan.hendrickson@alum.mit.edu> The documentation for `http.proxy` describes that option, and the environment variables it overrides, as supporting "the syntax understood by curl". curl allows SOCKS proxies to use a path to a Unix domain socket, like `socks5h://localhost/path/to/socket.sock`. Git should therefore include, if present, the path part of the proxy URL in what it passes to libcurl. Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Ryan Hendrickson <ryan.hendrickson@alum.mit.edu> --- http: do not ignore proxy path Re earlier discussion about regressions, it turns out that curl has only supported this syntax since 2022 [https://curl.se/ch/7.84.0.html]. Earlier versions of curl, if provided an http_proxy that contained a path, would also ignore it. curl didn't seem to consider this a breaking change when the feature was introduced [https://github.com/curl/curl/pull/8668], though; is that a valid argument for Git not to lose sleep over it either? A test has been added, but unfortunately it is not being run on any of GitHub's CI workflows, because the runner images in use all have versions of libcurl that are too old. It works on my machine, but unknown troubles may await someone else. Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1767%2Frhendric%2Frhendric%2Fhttp-proxy-path-v4 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1767/rhendric/rhendric/http-proxy-path-v4 Pull-Request: https://github.com/gitgitgadget/git/pull/1767 Range-diff vs v3: 1: 7a58da7102e ! 1: 507fb75c1a6 http: do not ignore proxy path @@ http.c: static CURL *get_curl_handle(void) curl_easy_setopt(result, CURLOPT_PROXY, ""); } else if (curl_http_proxy) { + struct strbuf proxy = STRBUF_INIT; ++ if (starts_with(curl_http_proxy, "socks5h")) curl_easy_setopt(result, CURLOPT_PROXYTYPE, CURLPROXY_SOCKS5_HOSTNAME); @@ http.c: static CURL *get_curl_handle(void) + strbuf_addstr(&proxy, proxy_auth.host); + if (proxy_auth.path) { + curl_version_info_data *ver = curl_version_info(CURLVERSION_NOW); ++ + if (ver->version_num < 0x075400) + die("libcurl 7.84 or later is required to support paths in proxy URLs"); + @@ t/t5564-http-proxy.sh: test_expect_success 'clone can prompt for proxy password' + mkfifo socks_output && + { + "$PERL_PATH" "$TEST_DIRECTORY/socks4-proxy.pl" "$1" >socks_output & -+ socks_pid=$! ++ echo $! > "$TRASH_DIRECTORY/socks.pid" + } && + read line <socks_output && + test "$line" = ready +} + -+test_expect_success PERL 'try to start SOCKS proxy' ' -+ # The %30 tests that the correct amount of percent-encoding is applied -+ # to the proxy string passed to curl. -+ if start_socks %30.sock -+ then -+ test_set_prereq SOCKS_PROXY -+ fi -+' ++# The %30 tests that the correct amount of percent-encoding is applied to the ++# proxy string passed to curl. ++test_lazy_prereq SOCKS_PROXY 'test_have_prereq PERL && start_socks "$TRASH_DIRECTORY/%30.sock"' ++ ++test_atexit 'test ! -e "$TRASH_DIRECTORY/socks.pid" || kill "$(cat "$TRASH_DIRECTORY/socks.pid")"' + +test_expect_success SOCKS_PROXY 'clone via Unix socket' ' + test_when_finished "rm -rf clone" && @@ t/t5564-http-proxy.sh: test_expect_success 'clone can prompt for proxy password' + } +' + -+test_expect_success SOCKS_PROXY 'stop SOCKS proxy' 'kill "$socks_pid"' -+ +test_expect_success 'Unix socket requires socks*:' ' + ! git clone -c http.proxy=localhost/path https://example.com/repo.git 2>err && { + grep "^fatal: Invalid proxy URL '\''localhost/path'\'': only SOCKS proxies support paths" err || Documentation/config/http.txt | 4 +-- http.c | 24 +++++++++++++++++- t/socks4-proxy.pl | 48 +++++++++++++++++++++++++++++++++++ t/t5564-http-proxy.sh | 41 ++++++++++++++++++++++++++++++ 4 files changed, 114 insertions(+), 3 deletions(-) create mode 100644 t/socks4-proxy.pl diff --git a/Documentation/config/http.txt b/Documentation/config/http.txt index 162b33fc52f..a14371b5c96 100644 --- a/Documentation/config/http.txt +++ b/Documentation/config/http.txt @@ -5,8 +5,8 @@ http.proxy:: proxy string with a user name but no password, in which case git will attempt to acquire one in the same way it does for other credentials. See linkgit:gitcredentials[7] for more information. The syntax thus is - '[protocol://][user[:password]@]proxyhost[:port]'. This can be overridden - on a per-remote basis; see remote.<name>.proxy + '[protocol://][user[:password]@]proxyhost[:port][/path]'. This can be + overridden on a per-remote basis; see remote.<name>.proxy + Any proxy, however configured, must be completely transparent and must not modify, transform, or buffer the request or response in any way. Proxies which diff --git a/http.c b/http.c index 623ed234891..6c6cc5c822a 100644 --- a/http.c +++ b/http.c @@ -1227,6 +1227,8 @@ static CURL *get_curl_handle(void) */ curl_easy_setopt(result, CURLOPT_PROXY, ""); } else if (curl_http_proxy) { + struct strbuf proxy = STRBUF_INIT; + if (starts_with(curl_http_proxy, "socks5h")) curl_easy_setopt(result, CURLOPT_PROXYTYPE, CURLPROXY_SOCKS5_HOSTNAME); @@ -1265,7 +1267,27 @@ static CURL *get_curl_handle(void) if (!proxy_auth.host) die("Invalid proxy URL '%s'", curl_http_proxy); - curl_easy_setopt(result, CURLOPT_PROXY, proxy_auth.host); + strbuf_addstr(&proxy, proxy_auth.host); + if (proxy_auth.path) { + curl_version_info_data *ver = curl_version_info(CURLVERSION_NOW); + + if (ver->version_num < 0x075400) + die("libcurl 7.84 or later is required to support paths in proxy URLs"); + + if (!starts_with(proxy_auth.protocol, "socks")) + die("Invalid proxy URL '%s': only SOCKS proxies support paths", + curl_http_proxy); + + if (strcasecmp(proxy_auth.host, "localhost")) + die("Invalid proxy URL '%s': host must be localhost if a path is present", + curl_http_proxy); + + strbuf_addch(&proxy, '/'); + strbuf_add_percentencode(&proxy, proxy_auth.path, 0); + } + curl_easy_setopt(result, CURLOPT_PROXY, proxy.buf); + strbuf_release(&proxy); + var_override(&curl_no_proxy, getenv("NO_PROXY")); var_override(&curl_no_proxy, getenv("no_proxy")); curl_easy_setopt(result, CURLOPT_NOPROXY, curl_no_proxy); diff --git a/t/socks4-proxy.pl b/t/socks4-proxy.pl new file mode 100644 index 00000000000..4c3a35c0083 --- /dev/null +++ b/t/socks4-proxy.pl @@ -0,0 +1,48 @@ +use strict; +use IO::Select; +use IO::Socket::UNIX; +use IO::Socket::INET; + +my $path = shift; + +unlink($path); +my $server = IO::Socket::UNIX->new(Listen => 1, Local => $path) + or die "unable to listen on $path: $!"; + +$| = 1; +print "ready\n"; + +while (my $client = $server->accept()) { + sysread $client, my $buf, 8; + my ($version, $cmd, $port, $ip) = unpack 'CCnN', $buf; + next unless $version == 4; # socks4 + next unless $cmd == 1; # TCP stream connection + + # skip NUL-terminated id + while (sysread $client, my $char, 1) { + last unless ord($char); + } + + # version(0), reply(5a == granted), port (ignored), ip (ignored) + syswrite $client, "\x00\x5a\x00\x00\x00\x00\x00\x00"; + + my $remote = IO::Socket::INET->new(PeerHost => $ip, PeerPort => $port) + or die "unable to connect to $ip/$port: $!"; + + my $io = IO::Select->new($client, $remote); + while ($io->count) { + for my $fh ($io->can_read(0)) { + for my $pair ([$client, $remote], [$remote, $client]) { + my ($from, $to) = @$pair; + next unless $fh == $from; + + my $r = sysread $from, my $buf, 1024; + if (!defined $r || $r <= 0) { + $io->remove($from); + next; + } + syswrite $to, $buf; + } + } + } +} diff --git a/t/t5564-http-proxy.sh b/t/t5564-http-proxy.sh index bb35b87071d..039d7fc748e 100755 --- a/t/t5564-http-proxy.sh +++ b/t/t5564-http-proxy.sh @@ -39,4 +39,45 @@ test_expect_success 'clone can prompt for proxy password' ' expect_askpass pass proxuser ' +start_socks() { + mkfifo socks_output && + { + "$PERL_PATH" "$TEST_DIRECTORY/socks4-proxy.pl" "$1" >socks_output & + echo $! > "$TRASH_DIRECTORY/socks.pid" + } && + read line <socks_output && + test "$line" = ready +} + +# The %30 tests that the correct amount of percent-encoding is applied to the +# proxy string passed to curl. +test_lazy_prereq SOCKS_PROXY 'test_have_prereq PERL && start_socks "$TRASH_DIRECTORY/%30.sock"' + +test_atexit 'test ! -e "$TRASH_DIRECTORY/socks.pid" || kill "$(cat "$TRASH_DIRECTORY/socks.pid")"' + +test_expect_success SOCKS_PROXY 'clone via Unix socket' ' + test_when_finished "rm -rf clone" && + test_config_global http.proxy "socks4://localhost$PWD/%2530.sock" && { + { + GIT_TRACE_CURL=$PWD/trace git clone "$HTTPD_URL/smart/repo.git" clone 2>err && + grep -i "SOCKS4 request granted." trace + } || + grep "^fatal: libcurl 7\.84 or later" err + } +' + +test_expect_success 'Unix socket requires socks*:' ' + ! git clone -c http.proxy=localhost/path https://example.com/repo.git 2>err && { + grep "^fatal: Invalid proxy URL '\''localhost/path'\'': only SOCKS proxies support paths" err || + grep "^fatal: libcurl 7\.84 or later" err + } +' + +test_expect_success 'Unix socket requires localhost' ' + ! git clone -c http.proxy=socks4://127.0.0.1/path https://example.com/repo.git 2>err && { + grep "^fatal: Invalid proxy URL '\''socks4://127\.0\.0\.1/path'\'': host must be localhost if a path is present" err || + grep "^fatal: libcurl 7\.84 or later" err + } +' + test_done base-commit: 39bf06adf96da25b87c9aa7d35a32ef3683eb4a4 -- gitgitgadget ^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [PATCH v4] http: do not ignore proxy path 2024-08-01 5:22 ` [PATCH v4] " Ryan Hendrickson via GitGitGadget @ 2024-08-01 6:04 ` Jeff King 2024-08-01 17:04 ` Junio C Hamano 2024-08-02 5:20 ` [PATCH v5] " Ryan Hendrickson via GitGitGadget 1 sibling, 1 reply; 31+ messages in thread From: Jeff King @ 2024-08-01 6:04 UTC (permalink / raw) To: Ryan Hendrickson via GitGitGadget; +Cc: git, Ryan Hendrickson On Thu, Aug 01, 2024 at 05:22:56AM +0000, Ryan Hendrickson via GitGitGadget wrote: > From: Ryan Hendrickson <ryan.hendrickson@alum.mit.edu> > > The documentation for `http.proxy` describes that option, and the > environment variables it overrides, as supporting "the syntax understood > by curl". curl allows SOCKS proxies to use a path to a Unix domain > socket, like `socks5h://localhost/path/to/socket.sock`. Git should > therefore include, if present, the path part of the proxy URL in what it > passes to libcurl. > > Co-authored-by: Jeff King <peff@peff.net> > Signed-off-by: Ryan Hendrickson <ryan.hendrickson@alum.mit.edu> Thanks for crediting me. I'll add my: Signed-off-by: Jeff King <peff@peff.net> to be explicit that the proxy script is under the DCO. > - curl_easy_setopt(result, CURLOPT_PROXY, proxy_auth.host); > + strbuf_addstr(&proxy, proxy_auth.host); > + if (proxy_auth.path) { > + curl_version_info_data *ver = curl_version_info(CURLVERSION_NOW); > + > + if (ver->version_num < 0x075400) > + die("libcurl 7.84 or later is required to support paths in proxy URLs"); > + > + if (!starts_with(proxy_auth.protocol, "socks")) > + die("Invalid proxy URL '%s': only SOCKS proxies support paths", > + curl_http_proxy); > + > + if (strcasecmp(proxy_auth.host, "localhost")) > + die("Invalid proxy URL '%s': host must be localhost if a path is present", > + curl_http_proxy); > + > + strbuf_addch(&proxy, '/'); > + strbuf_add_percentencode(&proxy, proxy_auth.path, 0); This all looks good to me. I wondered briefly whether proxy_auth.protocol could ever be NULL, but I think we refuse to parse such a URL. > +start_socks() { > + mkfifo socks_output && > + { > + "$PERL_PATH" "$TEST_DIRECTORY/socks4-proxy.pl" "$1" >socks_output & > + echo $! > "$TRASH_DIRECTORY/socks.pid" > + } && > + read line <socks_output && > + test "$line" = ready > +} > + > +# The %30 tests that the correct amount of percent-encoding is applied to the > +# proxy string passed to curl. > +test_lazy_prereq SOCKS_PROXY 'test_have_prereq PERL && start_socks "$TRASH_DIRECTORY/%30.sock"' OK, I see you figured out that the lazy prereq requires giving the full path to the socket. :) I had forgotten that we also run the prereq in a subshell to avoid side effects, but you caught that, as well. All of this to me is good evidence that the non-lazy version you had originally is a better approach. But I don't think it's worth spending time fighting over, so I'm OK either way. > +test_atexit 'test ! -e "$TRASH_DIRECTORY/socks.pid" || kill "$(cat "$TRASH_DIRECTORY/socks.pid")"' Ah, good. I had meant to mention cleaning up at the end (as we do for git-daemon and apache), but forgot. I'm glad you included it here. > +test_expect_success SOCKS_PROXY 'clone via Unix socket' ' > + test_when_finished "rm -rf clone" && > + test_config_global http.proxy "socks4://localhost$PWD/%2530.sock" && { > + { > + GIT_TRACE_CURL=$PWD/trace git clone "$HTTPD_URL/smart/repo.git" clone 2>err && > + grep -i "SOCKS4 request granted." trace > + } || > + grep "^fatal: libcurl 7\.84 or later" err > + } > +' Looks good. Usually I do not bother with escaping "." in grep messages, as it is already a loose match and it keeps the test easier to read. But it is OK to do so. We do have a test_grep wrapper which gives nicer output when the match fails, but maybe that is a bad choice here, since we accept either of the two messages. (Likewise for using test_cmp, I suppose). The use of "||" in the command-chaining is unusual enough that it's probably worth calling out either in a comment or in the commit message. > +test_expect_success 'Unix socket requires socks*:' ' > + ! git clone -c http.proxy=localhost/path https://example.com/repo.git 2>err && { > + grep "^fatal: Invalid proxy URL '\''localhost/path'\'': only SOCKS proxies support paths" err || > + grep "^fatal: libcurl 7\.84 or later" err > + } > +' > + > +test_expect_success 'Unix socket requires localhost' ' > + ! git clone -c http.proxy=socks4://127.0.0.1/path https://example.com/repo.git 2>err && { > + grep "^fatal: Invalid proxy URL '\''socks4://127\.0\.0\.1/path'\'': host must be localhost if a path is present" err || > + grep "^fatal: libcurl 7\.84 or later" err > + } > +' > + Likewise here, I'd probably just match the single-quotes with "." for readability (but it's OK to write it as you did). You can (since a week or so ago) also use a here-doc body like: test_expect_success 'Unix socket requires socks*:' - <<\EOT ... test_grep "^fatal: Invalid proxy URL 'localhost/path': only SOCKS proxies support paths" err || ... EOT but I'm OK with it either way. The same "||" comment applies. -Peff ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v4] http: do not ignore proxy path 2024-08-01 6:04 ` Jeff King @ 2024-08-01 17:04 ` Junio C Hamano 0 siblings, 0 replies; 31+ messages in thread From: Junio C Hamano @ 2024-08-01 17:04 UTC (permalink / raw) To: Jeff King; +Cc: Ryan Hendrickson via GitGitGadget, git, Ryan Hendrickson Jeff King <peff@peff.net> writes: > On Thu, Aug 01, 2024 at 05:22:56AM +0000, Ryan Hendrickson via GitGitGadget wrote: > >> From: Ryan Hendrickson <ryan.hendrickson@alum.mit.edu> >> >> The documentation for `http.proxy` describes that option, and the >> environment variables it overrides, as supporting "the syntax understood >> by curl". curl allows SOCKS proxies to use a path to a Unix domain >> socket, like `socks5h://localhost/path/to/socket.sock`. Git should >> therefore include, if present, the path part of the proxy URL in what it >> passes to libcurl. >> >> Co-authored-by: Jeff King <peff@peff.net> >> Signed-off-by: Ryan Hendrickson <ryan.hendrickson@alum.mit.edu> > > Thanks for crediting me. I'll add my: > > Signed-off-by: Jeff King <peff@peff.net> > > to be explicit that the proxy script is under the DCO. OK, I'll amend it while queuing this v4. Thanks. >> +# The %30 tests that the correct amount of percent-encoding is applied to the >> +# proxy string passed to curl. >> +test_lazy_prereq SOCKS_PROXY 'test_have_prereq PERL && start_socks "$TRASH_DIRECTORY/%30.sock"' > > OK, I see you figured out that the lazy prereq requires giving the full > path to the socket. :) I had forgotten that we also run the prereq in a > subshell to avoid side effects, but you caught that, as well. ;-) > All of this to me is good evidence that the non-lazy version you had > originally is a better approach. But I don't think it's worth spending > time fighting over, so I'm OK either way. I'd be OK either way, too. Thanks, both. ^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH v5] http: do not ignore proxy path 2024-08-01 5:22 ` [PATCH v4] " Ryan Hendrickson via GitGitGadget 2024-08-01 6:04 ` Jeff King @ 2024-08-02 5:20 ` Ryan Hendrickson via GitGitGadget 2024-08-02 15:52 ` Junio C Hamano 1 sibling, 1 reply; 31+ messages in thread From: Ryan Hendrickson via GitGitGadget @ 2024-08-02 5:20 UTC (permalink / raw) To: git; +Cc: Ryan Hendrickson, Jeff King, Ryan Hendrickson, Ryan Hendrickson From: Ryan Hendrickson <ryan.hendrickson@alum.mit.edu> The documentation for `http.proxy` describes that option, and the environment variables it overrides, as supporting "the syntax understood by curl". curl allows SOCKS proxies to use a path to a Unix domain socket, like `socks5h://localhost/path/to/socket.sock`. Git should therefore include, if present, the path part of the proxy URL in what it passes to libcurl. Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Ryan Hendrickson <ryan.hendrickson@alum.mit.edu> Signed-off-by: Jeff King <peff@peff.net> --- http: do not ignore proxy path Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1767%2Frhendric%2Frhendric%2Fhttp-proxy-path-v5 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1767/rhendric/rhendric/http-proxy-path-v5 Pull-Request: https://github.com/gitgitgadget/git/pull/1767 Range-diff vs v4: 1: 507fb75c1a6 ! 1: fa101a3b264 http: do not ignore proxy path @@ Commit message Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Ryan Hendrickson <ryan.hendrickson@alum.mit.edu> + Signed-off-by: Jeff King <peff@peff.net> ## Documentation/config/http.txt ## @@ Documentation/config/http.txt: http.proxy:: @@ t/t5564-http-proxy.sh: test_expect_success 'clone can prompt for proxy password' + +test_atexit 'test ! -e "$TRASH_DIRECTORY/socks.pid" || kill "$(cat "$TRASH_DIRECTORY/socks.pid")"' + ++# The below tests morally ought to be gated on a prerequisite that Git is ++# linked with a libcurl that supports Unix socket paths for proxies (7.84 or ++# later), but this is not easy to test right now. Instead, we || the tests with ++# this function. ++old_libcurl_error() { ++ grep -Fx "fatal: libcurl 7.84 or later is required to support paths in proxy URLs" "$1" ++} ++ +test_expect_success SOCKS_PROXY 'clone via Unix socket' ' + test_when_finished "rm -rf clone" && + test_config_global http.proxy "socks4://localhost$PWD/%2530.sock" && { + { + GIT_TRACE_CURL=$PWD/trace git clone "$HTTPD_URL/smart/repo.git" clone 2>err && -+ grep -i "SOCKS4 request granted." trace ++ grep -i "SOCKS4 request granted" trace + } || -+ grep "^fatal: libcurl 7\.84 or later" err ++ old_libcurl_error err + } +' + -+test_expect_success 'Unix socket requires socks*:' ' ++test_expect_success 'Unix socket requires socks*:' - <<\EOT + ! git clone -c http.proxy=localhost/path https://example.com/repo.git 2>err && { -+ grep "^fatal: Invalid proxy URL '\''localhost/path'\'': only SOCKS proxies support paths" err || -+ grep "^fatal: libcurl 7\.84 or later" err ++ grep -Fx "fatal: Invalid proxy URL 'localhost/path': only SOCKS proxies support paths" err || ++ old_libcurl_error err + } -+' ++EOT + -+test_expect_success 'Unix socket requires localhost' ' ++test_expect_success 'Unix socket requires localhost' - <<\EOT + ! git clone -c http.proxy=socks4://127.0.0.1/path https://example.com/repo.git 2>err && { -+ grep "^fatal: Invalid proxy URL '\''socks4://127\.0\.0\.1/path'\'': host must be localhost if a path is present" err || -+ grep "^fatal: libcurl 7\.84 or later" err ++ grep -Fx "fatal: Invalid proxy URL 'socks4://127.0.0.1/path': host must be localhost if a path is present" err || ++ old_libcurl_error err + } -+' ++EOT + test_done Documentation/config/http.txt | 4 +-- http.c | 24 ++++++++++++++++- t/socks4-proxy.pl | 48 ++++++++++++++++++++++++++++++++++ t/t5564-http-proxy.sh | 49 +++++++++++++++++++++++++++++++++++ 4 files changed, 122 insertions(+), 3 deletions(-) create mode 100644 t/socks4-proxy.pl diff --git a/Documentation/config/http.txt b/Documentation/config/http.txt index 162b33fc52f..a14371b5c96 100644 --- a/Documentation/config/http.txt +++ b/Documentation/config/http.txt @@ -5,8 +5,8 @@ http.proxy:: proxy string with a user name but no password, in which case git will attempt to acquire one in the same way it does for other credentials. See linkgit:gitcredentials[7] for more information. The syntax thus is - '[protocol://][user[:password]@]proxyhost[:port]'. This can be overridden - on a per-remote basis; see remote.<name>.proxy + '[protocol://][user[:password]@]proxyhost[:port][/path]'. This can be + overridden on a per-remote basis; see remote.<name>.proxy + Any proxy, however configured, must be completely transparent and must not modify, transform, or buffer the request or response in any way. Proxies which diff --git a/http.c b/http.c index 623ed234891..6c6cc5c822a 100644 --- a/http.c +++ b/http.c @@ -1227,6 +1227,8 @@ static CURL *get_curl_handle(void) */ curl_easy_setopt(result, CURLOPT_PROXY, ""); } else if (curl_http_proxy) { + struct strbuf proxy = STRBUF_INIT; + if (starts_with(curl_http_proxy, "socks5h")) curl_easy_setopt(result, CURLOPT_PROXYTYPE, CURLPROXY_SOCKS5_HOSTNAME); @@ -1265,7 +1267,27 @@ static CURL *get_curl_handle(void) if (!proxy_auth.host) die("Invalid proxy URL '%s'", curl_http_proxy); - curl_easy_setopt(result, CURLOPT_PROXY, proxy_auth.host); + strbuf_addstr(&proxy, proxy_auth.host); + if (proxy_auth.path) { + curl_version_info_data *ver = curl_version_info(CURLVERSION_NOW); + + if (ver->version_num < 0x075400) + die("libcurl 7.84 or later is required to support paths in proxy URLs"); + + if (!starts_with(proxy_auth.protocol, "socks")) + die("Invalid proxy URL '%s': only SOCKS proxies support paths", + curl_http_proxy); + + if (strcasecmp(proxy_auth.host, "localhost")) + die("Invalid proxy URL '%s': host must be localhost if a path is present", + curl_http_proxy); + + strbuf_addch(&proxy, '/'); + strbuf_add_percentencode(&proxy, proxy_auth.path, 0); + } + curl_easy_setopt(result, CURLOPT_PROXY, proxy.buf); + strbuf_release(&proxy); + var_override(&curl_no_proxy, getenv("NO_PROXY")); var_override(&curl_no_proxy, getenv("no_proxy")); curl_easy_setopt(result, CURLOPT_NOPROXY, curl_no_proxy); diff --git a/t/socks4-proxy.pl b/t/socks4-proxy.pl new file mode 100644 index 00000000000..4c3a35c0083 --- /dev/null +++ b/t/socks4-proxy.pl @@ -0,0 +1,48 @@ +use strict; +use IO::Select; +use IO::Socket::UNIX; +use IO::Socket::INET; + +my $path = shift; + +unlink($path); +my $server = IO::Socket::UNIX->new(Listen => 1, Local => $path) + or die "unable to listen on $path: $!"; + +$| = 1; +print "ready\n"; + +while (my $client = $server->accept()) { + sysread $client, my $buf, 8; + my ($version, $cmd, $port, $ip) = unpack 'CCnN', $buf; + next unless $version == 4; # socks4 + next unless $cmd == 1; # TCP stream connection + + # skip NUL-terminated id + while (sysread $client, my $char, 1) { + last unless ord($char); + } + + # version(0), reply(5a == granted), port (ignored), ip (ignored) + syswrite $client, "\x00\x5a\x00\x00\x00\x00\x00\x00"; + + my $remote = IO::Socket::INET->new(PeerHost => $ip, PeerPort => $port) + or die "unable to connect to $ip/$port: $!"; + + my $io = IO::Select->new($client, $remote); + while ($io->count) { + for my $fh ($io->can_read(0)) { + for my $pair ([$client, $remote], [$remote, $client]) { + my ($from, $to) = @$pair; + next unless $fh == $from; + + my $r = sysread $from, my $buf, 1024; + if (!defined $r || $r <= 0) { + $io->remove($from); + next; + } + syswrite $to, $buf; + } + } + } +} diff --git a/t/t5564-http-proxy.sh b/t/t5564-http-proxy.sh index bb35b87071d..0d6cfebbfab 100755 --- a/t/t5564-http-proxy.sh +++ b/t/t5564-http-proxy.sh @@ -39,4 +39,53 @@ test_expect_success 'clone can prompt for proxy password' ' expect_askpass pass proxuser ' +start_socks() { + mkfifo socks_output && + { + "$PERL_PATH" "$TEST_DIRECTORY/socks4-proxy.pl" "$1" >socks_output & + echo $! > "$TRASH_DIRECTORY/socks.pid" + } && + read line <socks_output && + test "$line" = ready +} + +# The %30 tests that the correct amount of percent-encoding is applied to the +# proxy string passed to curl. +test_lazy_prereq SOCKS_PROXY 'test_have_prereq PERL && start_socks "$TRASH_DIRECTORY/%30.sock"' + +test_atexit 'test ! -e "$TRASH_DIRECTORY/socks.pid" || kill "$(cat "$TRASH_DIRECTORY/socks.pid")"' + +# The below tests morally ought to be gated on a prerequisite that Git is +# linked with a libcurl that supports Unix socket paths for proxies (7.84 or +# later), but this is not easy to test right now. Instead, we || the tests with +# this function. +old_libcurl_error() { + grep -Fx "fatal: libcurl 7.84 or later is required to support paths in proxy URLs" "$1" +} + +test_expect_success SOCKS_PROXY 'clone via Unix socket' ' + test_when_finished "rm -rf clone" && + test_config_global http.proxy "socks4://localhost$PWD/%2530.sock" && { + { + GIT_TRACE_CURL=$PWD/trace git clone "$HTTPD_URL/smart/repo.git" clone 2>err && + grep -i "SOCKS4 request granted" trace + } || + old_libcurl_error err + } +' + +test_expect_success 'Unix socket requires socks*:' - <<\EOT + ! git clone -c http.proxy=localhost/path https://example.com/repo.git 2>err && { + grep -Fx "fatal: Invalid proxy URL 'localhost/path': only SOCKS proxies support paths" err || + old_libcurl_error err + } +EOT + +test_expect_success 'Unix socket requires localhost' - <<\EOT + ! git clone -c http.proxy=socks4://127.0.0.1/path https://example.com/repo.git 2>err && { + grep -Fx "fatal: Invalid proxy URL 'socks4://127.0.0.1/path': host must be localhost if a path is present" err || + old_libcurl_error err + } +EOT + test_done base-commit: 406f326d271e0bacecdb00425422c5fa3f314930 -- gitgitgadget ^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [PATCH v5] http: do not ignore proxy path 2024-08-02 5:20 ` [PATCH v5] " Ryan Hendrickson via GitGitGadget @ 2024-08-02 15:52 ` Junio C Hamano 2024-08-02 16:43 ` Ryan Hendrickson 0 siblings, 1 reply; 31+ messages in thread From: Junio C Hamano @ 2024-08-02 15:52 UTC (permalink / raw) To: Ryan Hendrickson via GitGitGadget; +Cc: git, Ryan Hendrickson, Jeff King "Ryan Hendrickson via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: Ryan Hendrickson <ryan.hendrickson@alum.mit.edu> > > The documentation for `http.proxy` describes that option, and the > environment variables it overrides, as supporting "the syntax understood > by curl". curl allows SOCKS proxies to use a path to a Unix domain > socket, like `socks5h://localhost/path/to/socket.sock`. Git should > therefore include, if present, the path part of the proxy URL in what it > passes to libcurl. > > Co-authored-by: Jeff King <peff@peff.net> > Signed-off-by: Ryan Hendrickson <ryan.hendrickson@alum.mit.edu> > Signed-off-by: Jeff King <peff@peff.net> > --- The trailer lines should be ordered in chronological order to record the provenance of the patch. The last two entries make it look as if what you assembled and signed-off was relayed by peff, with or without further modification, with his sign-off to me, to become the final version, but that is not the story you want to tell. I'd swap the two lines (i.e., sign-offs) while queuing. > + if (!starts_with(proxy_auth.protocol, "socks")) > + die("Invalid proxy URL '%s': only SOCKS proxies support paths", > + curl_http_proxy); Our error messages that are prefixed with "fatal:" do not typically begin with a capital letter. But I'll let it pass, as this copies an existing message in this file. #leftoverbits clean-up needs to correct the ones added by this patch and existing "Invalid proxy URL '%s'" by downcasing "Invalid", possibly enclose the message in _() for i18n, and also downcase "C" in two "Could not set SSL ..." messages. > + if (strcasecmp(proxy_auth.host, "localhost")) > + die("Invalid proxy URL '%s': host must be localhost if a path is present", > + curl_http_proxy); Ditto. Or instead of leaving this for a later clean-up after the dust settles, we could have a separate "preliminary clean-up" patch to address existing ones first. Either is fine, but taking "clean-up after the dust settles" and leaving it a problem for other people is probably easier. > diff --git a/t/t5564-http-proxy.sh b/t/t5564-http-proxy.sh > index bb35b87071d..0d6cfebbfab 100755 > --- a/t/t5564-http-proxy.sh > +++ b/t/t5564-http-proxy.sh > @@ -39,4 +39,53 @@ test_expect_success 'clone can prompt for proxy password' ' > expect_askpass pass proxuser > ' > > +start_socks() { > + mkfifo socks_output && > + { > + "$PERL_PATH" "$TEST_DIRECTORY/socks4-proxy.pl" "$1" >socks_output & > + echo $! > "$TRASH_DIRECTORY/socks.pid" > + } && > + read line <socks_output && > + test "$line" = ready > +} > + > +# The %30 tests that the correct amount of percent-encoding is applied to the > +# proxy string passed to curl. > +test_lazy_prereq SOCKS_PROXY 'test_have_prereq PERL && start_socks "$TRASH_DIRECTORY/%30.sock"' > + > +test_atexit 'test ! -e "$TRASH_DIRECTORY/socks.pid" || kill "$(cat "$TRASH_DIRECTORY/socks.pid")"' Let's line-wrap these overly long ones. # The %30 tests that the correct amount of percent-encoding is # applied to the proxy string passed to curl. test_lazy_prereq SOCKS_PROXY ' test_have_prereq PERL && start_socks "$TRASH_DIRECTORY/%30.sock" ' test_atexit ' test ! -e "$TRASH_DIRECTORY/socks.pid" || kill "$(cat "$TRASH_DIRECTORY/socks.pid")" ' > +# The below tests morally ought to be gated on a prerequisite that Git is > +# linked with a libcurl that supports Unix socket paths for proxies (7.84 or > +# later), but this is not easy to test right now. Instead, we || the tests with > +# this function. > +old_libcurl_error() { > + grep -Fx "fatal: libcurl 7.84 or later is required to support paths in proxy URLs" "$1" > +} Cute. This fixes the polarity of the result from the whole "{ test } || this helper" construct. Even if the test proper fails, we will allow such a failure only when the reason of the failure is due to this message. If I were writing this, I would shorten to look for a bit fuzzier pattern like grep "^fatal: .* is required to support paths in proxy URLs" "$1" as that would allow us to fix the code later without needing to update the pattern, if we discover reasons, other than being older than libcURL 7.84, why paths in proxy URLs cannot be supported. Other than that, very nicely done. Thanks. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v5] http: do not ignore proxy path 2024-08-02 15:52 ` Junio C Hamano @ 2024-08-02 16:43 ` Ryan Hendrickson 2024-08-02 17:10 ` Junio C Hamano 0 siblings, 1 reply; 31+ messages in thread From: Ryan Hendrickson @ 2024-08-02 16:43 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Jeff King At 2024-08-02 08:52-0700, Junio C Hamano <gitster@pobox.com> sent: > I'd swap the two lines (i.e., sign-offs) while queuing. Okay. > Our error messages that are prefixed with "fatal:" do not typically > begin with a capital letter. > > But I'll let it pass, as this copies an existing message in this > file. #leftoverbits clean-up needs to correct the ones added by > this patch and existing "Invalid proxy URL '%s'" by downcasing > "Invalid", possibly enclose the message in _() for i18n, and also > downcase "C" in two "Could not set SSL ..." messages. > > [ . . . ] > > Or instead of leaving this for a later clean-up after the dust > settles, we could have a separate "preliminary clean-up" patch to > address existing ones first. Either is fine, but taking "clean-up > after the dust settles" and leaving it a problem for other people is > probably easier. Hmm. I'd be inclined to take the preliminary clean-up approach, but some of the existing strings (there are also two "Unsupported ..."/"Supported ..." strings near the "Could not set..."s) are going through gettext, and I'm reluctant to interfere with the l10n process. Would a partial clean-up that downcased only the "Invalid proxy URL" message be acceptable, or worse than leaving this as #leftoverbits? > Let's line-wrap these overly long ones. Okay. > If I were writing this, I would shorten to look for a bit fuzzier > pattern like > > grep "^fatal: .* is required to support paths in proxy URLs" "$1" > > as that would allow us to fix the code later without needing to > update the pattern, if we discover reasons, other than being older > than libcURL 7.84, why paths in proxy URLs cannot be supported. Is this blocking feedback? This strikes me as speculative over-engineering; it would not be difficult to generalize the message, and possibly then choose a new, more appropriate function name and update the explanatory comment, when and if such reasons arise. R ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v5] http: do not ignore proxy path 2024-08-02 16:43 ` Ryan Hendrickson @ 2024-08-02 17:10 ` Junio C Hamano 2024-08-02 18:03 ` Ryan Hendrickson 0 siblings, 1 reply; 31+ messages in thread From: Junio C Hamano @ 2024-08-02 17:10 UTC (permalink / raw) To: Ryan Hendrickson; +Cc: git, Jeff King Ryan Hendrickson <ryan.hendrickson@alum.mit.edu> writes: > Hmm. I'd be inclined to take the preliminary clean-up approach, but > some of the existing strings (there are also two "Unsupported > ..."/"Supported ..." strings near the "Could not set..."s) are going > through gettext, and I'm reluctant to interfere with the l10n process. I do not see what you mean by interfering with the localization. If we are updating text to be translated anyway, giving translators the strings that need to be translated _earlier_ rather than later would be more helpful to them, no? >> If I were writing this, I would shorten to look for a bit fuzzier >> pattern like >> >> grep "^fatal: .* is required to support paths in proxy URLs" "$1" >> >> as that would allow us to fix the code later without needing to >> update the pattern, if we discover reasons, other than being older >> than libcURL 7.84, why paths in proxy URLs cannot be supported. > > Is this blocking feedback? This strikes me as speculative > over-engineering No, it is loosening a pattern that is overly tight and as a side effect shortening the line to more readable length ;-). ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v5] http: do not ignore proxy path 2024-08-02 17:10 ` Junio C Hamano @ 2024-08-02 18:03 ` Ryan Hendrickson 2024-08-02 19:28 ` Junio C Hamano 0 siblings, 1 reply; 31+ messages in thread From: Ryan Hendrickson @ 2024-08-02 18:03 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Jeff King At 2024-08-02 10:10-0700, Junio C Hamano <gitster@pobox.com> sent: > Ryan Hendrickson <ryan.hendrickson@alum.mit.edu> writes: > >> Hmm. I'd be inclined to take the preliminary clean-up approach, but >> some of the existing strings (there are also two "Unsupported >> ..."/"Supported ..." strings near the "Could not set..."s) are going >> through gettext, and I'm reluctant to interfere with the l10n process. > > I do not see what you mean by interfering with the localization. > > If we are updating text to be translated anyway, giving translators > the strings that need to be translated _earlier_ rather than later > would be more helpful to them, no? Probably true, but as a new contributor I don't know whether changing msgids means more people need to review the patch, more files need to be changed, a translation team needs to be notified, the change needs to be pushed a different branch... whatever your process is. Localized strings are generally more of a headache for drive-by contributors, in my experience across different projects. >> Is this blocking feedback? This strikes me as speculative >> over-engineering > > No, it is loosening a pattern that is overly tight and as a side > effect shortening the line to more readable length ;-). Blocking or not? R ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v5] http: do not ignore proxy path 2024-08-02 18:03 ` Ryan Hendrickson @ 2024-08-02 19:28 ` Junio C Hamano 2024-08-02 19:39 ` Ryan Hendrickson 0 siblings, 1 reply; 31+ messages in thread From: Junio C Hamano @ 2024-08-02 19:28 UTC (permalink / raw) To: Ryan Hendrickson; +Cc: git, Jeff King Ryan Hendrickson <ryan.hendrickson@alum.mit.edu> writes: > At 2024-08-02 10:10-0700, Junio C Hamano <gitster@pobox.com> sent: > >> Ryan Hendrickson <ryan.hendrickson@alum.mit.edu> writes: >> >>> Hmm. I'd be inclined to take the preliminary clean-up approach, but >>> some of the existing strings (there are also two "Unsupported >>> ..."/"Supported ..." strings near the "Could not set..."s) are going >>> through gettext, and I'm reluctant to interfere with the l10n process. >> >> I do not see what you mean by interfering with the localization. >> >> If we are updating text to be translated anyway, giving translators >> the strings that need to be translated _earlier_ rather than later >> would be more helpful to them, no? > > Probably true, but as a new contributor I don't know whether changing > msgids means more people need to review the patch, more files need to > be changed, a translation team needs to be notified, the change needs > to be pushed a different branch... whatever your process is. Localized > strings are generally more of a headache for drive-by contributors, in > my experience across different projects. > >>> Is this blocking feedback? This strikes me as speculative >>> over-engineering >> >> No, it is loosening a pattern that is overly tight and as a side >> effect shortening the line to more readable length ;-). > > Blocking or not? If we are updating anyway, that question is irrelevant, no? This version may hit 'seen' but until the next version comes it will not advance to 'next'. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v5] http: do not ignore proxy path 2024-08-02 19:28 ` Junio C Hamano @ 2024-08-02 19:39 ` Ryan Hendrickson 2024-08-02 21:13 ` Junio C Hamano 0 siblings, 1 reply; 31+ messages in thread From: Ryan Hendrickson @ 2024-08-02 19:39 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Jeff King [-- Attachment #1: Type: text/plain, Size: 582 bytes --] At 2024-08-02 12:28-0700, Junio C Hamano <gitster@pobox.com> sent: >>>> Is this blocking feedback? This strikes me as speculative >>>> over-engineering >>> >>> No, it is loosening a pattern that is overly tight and as a side >>> effect shortening the line to more readable length ;-). >> >> Blocking or not? > > If we are updating anyway, that question is irrelevant, no? This > version may hit 'seen' but until the next version comes it will not > advance to 'next'. I can't figure out what you mean by this so I am going to proceed as if you had simply said ‘non-blocking’. R ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v5] http: do not ignore proxy path 2024-08-02 19:39 ` Ryan Hendrickson @ 2024-08-02 21:13 ` Junio C Hamano 2024-08-02 21:26 ` Ryan Hendrickson 0 siblings, 1 reply; 31+ messages in thread From: Junio C Hamano @ 2024-08-02 21:13 UTC (permalink / raw) To: Ryan Hendrickson; +Cc: git, Jeff King Ryan Hendrickson <ryan.hendrickson@alum.mit.edu> writes: > At 2024-08-02 12:28-0700, Junio C Hamano <gitster@pobox.com> sent: > >>>>> Is this blocking feedback? This strikes me as speculative >>>>> over-engineering >>>> >>>> No, it is loosening a pattern that is overly tight and as a side >>>> effect shortening the line to more readable length ;-). >>> >>> Blocking or not? >> >> If we are updating anyway, that question is irrelevant, no? This >> version may hit 'seen' but until the next version comes it will not >> advance to 'next'. > > I can't figure out what you mean by this so I am going to proceed as > if you had simply said ‘non-blocking’. It does not make much sense to ask if a suggestion is "blocking" or "non-blocking". If you respond with a reasonable explanation why you do not want to take a suggestion, I may (or may not) say that your reasoning makes sense. IOW, making me say "it is blocking" means you want to me to say that I won't listen to you no matter what you say. That is rarely be the case. In this case, I do not think it makes sense to insist with -Fx that the error message has the exact message. And I do not think your "strikes me as" qualifies as a "reasonable explanation". ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v5] http: do not ignore proxy path 2024-08-02 21:13 ` Junio C Hamano @ 2024-08-02 21:26 ` Ryan Hendrickson 2024-08-02 21:43 ` Junio C Hamano 2024-08-02 21:47 ` Junio C Hamano 0 siblings, 2 replies; 31+ messages in thread From: Ryan Hendrickson @ 2024-08-02 21:26 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Jeff King [-- Attachment #1: Type: text/plain, Size: 2408 bytes --] At 2024-08-02 14:13-0700, Junio C Hamano <gitster@pobox.com> sent: >>>>>> Is this blocking feedback? This strikes me as speculative >>>>>> over-engineering >>>>> >>>>> No, it is loosening a pattern that is overly tight and as a side >>>>> effect shortening the line to more readable length ;-). >>>> >>>> Blocking or not? >>> >>> If we are updating anyway, that question is irrelevant, no? This >>> version may hit 'seen' but until the next version comes it will not >>> advance to 'next'. >> >> I can't figure out what you mean by this so I am going to proceed as >> if you had simply said ‘non-blocking’. > > It does not make much sense to ask if a suggestion is "blocking" or > "non-blocking". If you respond with a reasonable explanation why > you do not want to take a suggestion, I may (or may not) say that > your reasoning makes sense. IOW, making me say "it is blocking" > means you want to me to say that I won't listen to you no matter > what you say. That is rarely be the case. I want you to do what Documentation/ReviewingGuidelines.txt says reviewers should do: - When providing a recommendation, be as clear as possible about whether you consider it "blocking" (the code would be broken or otherwise made worse if an issue isn't fixed) or "non-blocking" (the patch could be made better by taking the recommendation, but acceptance of the series does not require it). Non-blocking recommendations can be particularly ambiguous when they are related to - but outside the scope of - a series ("nice-to-have"s), or when they represent only stylistic differences between the author and reviewer. Because I would rather not have what is likely to be a highly subjective argument about this particular choice in a test script if we don't have to have one. I would also rather not put time into figuring out how best to rename the function "old_curl_version" if it no longer checks for the particular error produced when the curl version is too old, nor would I want to think ahead about whether it is correct for these tests that use this function to continue to pass if different variations on this error are raised. There is one qualifying error currently, and that's what the test matches against. Matching against hypothetical future errors is speculative overengineering if it is not obvious how the errors will vary and what it may mean if they do. R ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v5] http: do not ignore proxy path 2024-08-02 21:26 ` Ryan Hendrickson @ 2024-08-02 21:43 ` Junio C Hamano 2024-08-02 21:47 ` Junio C Hamano 1 sibling, 0 replies; 31+ messages in thread From: Junio C Hamano @ 2024-08-02 21:43 UTC (permalink / raw) To: Ryan Hendrickson; +Cc: git, Jeff King Ryan Hendrickson <ryan.hendrickson@alum.mit.edu> writes: > I would also rather not put time into figuring out how best to rename > the function "old_curl_version" if it no longer checks for the > particular error produced when the curl version is too old, Now, that is a good argument to make sure the "libcurl 7.84 or later" I suggested to omit for the sake of brevity is in the pattern. Thanks. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v5] http: do not ignore proxy path 2024-08-02 21:26 ` Ryan Hendrickson 2024-08-02 21:43 ` Junio C Hamano @ 2024-08-02 21:47 ` Junio C Hamano 2024-08-02 22:14 ` Ryan Hendrickson 1 sibling, 1 reply; 31+ messages in thread From: Junio C Hamano @ 2024-08-02 21:47 UTC (permalink / raw) To: Ryan Hendrickson; +Cc: git, Jeff King Ryan Hendrickson <ryan.hendrickson@alum.mit.edu> writes: > what the test matches against. Matching against hypothetical future > errors is speculative overengineering if it is not obvious how the > errors will vary and what it may mean if they do. The easiest you can imagine is that it turns out the cut-off version of cURL for the feature turned out to be not 7.84, or versions of cURL shipped by some distros have backports of the feature to an older version that explicitly naming 7.84 causes more confusion than naming the exact feature we rely on, and we end up rephrasing the error message. We have done such changes in the past, so it is not really speculating "hypothetical future errors", but applying common sense gained over years working on this project. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v5] http: do not ignore proxy path 2024-08-02 21:47 ` Junio C Hamano @ 2024-08-02 22:14 ` Ryan Hendrickson 0 siblings, 0 replies; 31+ messages in thread From: Ryan Hendrickson @ 2024-08-02 22:14 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Jeff King At 2024-08-02 14:47-0700, Junio C Hamano <gitster@pobox.com> sent: > Ryan Hendrickson <ryan.hendrickson@alum.mit.edu> writes: > >> what the test matches against. Matching against hypothetical future >> errors is speculative overengineering if it is not obvious how the >> errors will vary and what it may mean if they do. > > The easiest you can imagine is that it turns out the cut-off version > of cURL for the feature turned out to be not 7.84, or versions of > cURL shipped by some distros have backports of the feature to an > older version that explicitly naming 7.84 causes more confusion than > naming the exact feature we rely on, and we end up rephrasing the > error message. We have done such changes in the past, so it is not > really speculating "hypothetical future errors", but applying common > sense gained over years working on this project. Okay, so is it a problem to also change the message in the test if that happens? My concern is that if I pick some fragment of the message to elide in the test, the message could still get reworded in a way that requires the test to be changed anyway. Even if not, the comment above it would likely need revision too; and if the test doesn't fail, whoever amends the message is unlikely to notice this. The way the test is written in the current version of the patch, there is no ambiguity about what is being tested and what would need to change if the message changes. Future contributors can grep the code for references to that error message and find this test regardless of which part and how much of the message they choose to grep for. Shaving a dozen or so characters out of line is not more important than those considerations, is it? R ^ permalink raw reply [flat|nested] 31+ messages in thread
end of thread, other threads:[~2024-08-02 22:14 UTC | newest] Thread overview: 31+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-07-26 15:52 [PATCH] http: do not ignore proxy path Ryan Hendrickson via GitGitGadget 2024-07-26 16:29 ` Junio C Hamano 2024-07-26 17:12 ` Ryan Hendrickson 2024-07-26 17:45 ` Junio C Hamano 2024-07-26 21:11 ` Jeff King 2024-07-26 22:43 ` Ryan Hendrickson 2024-07-29 19:31 ` Jeff King 2024-07-27 6:44 ` [PATCH v2] " Ryan Hendrickson via GitGitGadget 2024-07-29 20:09 ` Jeff King 2024-07-31 15:33 ` Ryan Hendrickson 2024-07-31 16:01 ` [PATCH v3] " Ryan Hendrickson via GitGitGadget 2024-07-31 22:24 ` Junio C Hamano 2024-08-01 3:44 ` Ryan Hendrickson 2024-08-01 5:21 ` Junio C Hamano 2024-08-01 5:45 ` Jeff King 2024-08-01 14:40 ` Junio C Hamano 2024-08-01 5:22 ` [PATCH v4] " Ryan Hendrickson via GitGitGadget 2024-08-01 6:04 ` Jeff King 2024-08-01 17:04 ` Junio C Hamano 2024-08-02 5:20 ` [PATCH v5] " Ryan Hendrickson via GitGitGadget 2024-08-02 15:52 ` Junio C Hamano 2024-08-02 16:43 ` Ryan Hendrickson 2024-08-02 17:10 ` Junio C Hamano 2024-08-02 18:03 ` Ryan Hendrickson 2024-08-02 19:28 ` Junio C Hamano 2024-08-02 19:39 ` Ryan Hendrickson 2024-08-02 21:13 ` Junio C Hamano 2024-08-02 21:26 ` Ryan Hendrickson 2024-08-02 21:43 ` Junio C Hamano 2024-08-02 21:47 ` Junio C Hamano 2024-08-02 22:14 ` Ryan Hendrickson
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).