From: Taylor Blau <me@ttaylorr.com>
To: Vaidas Pilkauskas via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, Vaidas Pilkauskas <vaidas.pilkauskas@shopify.com>
Subject: Re: [PATCH 1/3] http: add support for HTTP 429 rate limit retries
Date: Tue, 9 Dec 2025 18:15:11 -0500 [thread overview]
Message-ID: <aTitfzeb7J8TUTYQ@nand.local> (raw)
In-Reply-To: <ae0087cd1c7fbb6b748d6767b476c1bd1a19996f.1764160227.git.gitgitgadget@gmail.com>
On Wed, Nov 26, 2025 at 12:30:25PM +0000, Vaidas Pilkauskas via GitGitGadget wrote:
> Retry behavior is controlled by three new configuration options:
>
> * http.maxRetries: Maximum number of retry attempts (default: 0,
> meaning retries are disabled by default). Users must explicitly
> opt-in to retry behavior.
>
> * http.retryAfter: Default delay in seconds when the server doesn't
> provide a Retry-After header (default: -1, meaning fail if no
> header is provided). This serves as a fallback mechanism.
>
> * http.maxRetryTime: Maximum delay in seconds for a single retry
> (default: 300). If the server requests a delay exceeding this
> limit, Git fails immediately rather than waiting. This prevents
> indefinite blocking on unreasonable server requests.
>
> All three options can be overridden via environment variables:
> GIT_HTTP_MAX_RETRIES, GIT_HTTP_RETRY_AFTER, and
> GIT_HTTP_MAX_RETRY_TIME.
This is great information, and I am glad that it is written down in
http.adoc so that it shows up in git-config(1). I think that it's fine
to omit this level of detail from the commit message, since it
duplicates information from the authoritative source on configuration
knobs.
It might be reasonable to say something like:
Retry behavior is controlled by three new configuration options
(http.maxRetries, http.retryAfter, and http.maxRetryTime) which are
documented in git-config(1).
or something.
> diff --git a/http-push.c b/http-push.c
> index d86ce77119..a602a302ec 100644
> --- a/http-push.c
> +++ b/http-push.c
> @@ -716,6 +716,10 @@ static int fetch_indices(void)
> case HTTP_MISSING_TARGET:
> ret = 0;
> break;
> + case HTTP_RATE_LIMITED:
> + error("rate limited by '%s', please try again later", repo->url);
> + ret = -1;
Other strings in this file aren't marked for translation, but I think
we can/should mark this one like so:
error(_("rate limited by %s ..."), repo->url);
> diff --git a/http.c b/http.c
> index 41f850db16..212805cad5 100644
> --- a/http.c
> +++ b/http.c
> @@ -22,6 +22,7 @@
> #include "object-file.h"
> #include "odb.h"
> #include "tempfile.h"
> +#include "date.h"
>
> static struct trace_key trace_curl = TRACE_KEY_INIT(CURL);
> static int trace_curl_data = 1;
> @@ -149,6 +150,14 @@ static char *cached_accept_language;
> static char *http_ssl_backend;
>
> static int http_schannel_check_revoke = 1;
> +
> +/* Retry configuration */
> +static long http_retry_after = -1; /* Default retry-after in seconds when header is missing (-1 means not set, exit with 128) */
> +static long http_max_retries = 0; /* Maximum number of retry attempts (0 means retries are disabled) */
> +static long http_max_retry_time = 300; /* Maximum time to wait for a single retry (default 5 minutes) */
These comments should be OK to drop, the variables indicate what Git
configuration they correspond to (e.g., http_retry_after ->
http.retryAfter), so git-config(1) is the authoritative source for
documentation here.
> @@ -257,6 +267,47 @@ static size_t fwrite_wwwauth(char *ptr, size_t eltsize, size_t nmemb, void *p UN
> goto exit;
> }
>
> + /* Parse Retry-After header for rate limiting */
> + if (skip_iprefix_mem(ptr, size, "retry-after:", &val, &val_len)) {
Makes sense, though I wonder if we should rename this function, since
fwrite_wwwauth is now doing more than just handling WWW-Authenticate
headers.
Perhaps we should have a single top-level function that is registered as
our CURLOPT_HEADERFUNCTION that dispatches calls to header-specific
functions? Otherwise the actual parsing of the Retry-After header looks
good to me.
> @@ -1422,6 +1488,10 @@ void http_init(struct remote *remote, const char *url, int proactive_auth)
> set_long_from_env(&curl_tcp_keepintvl, "GIT_TCP_KEEPINTVL");
> set_long_from_env(&curl_tcp_keepcnt, "GIT_TCP_KEEPCNT");
>
> + set_long_from_env(&http_retry_after, "GIT_HTTP_RETRY_AFTER");
> + set_long_from_env(&http_max_retries, "GIT_HTTP_MAX_RETRIES");
> + set_long_from_env(&http_max_retry_time, "GIT_HTTP_MAX_RETRY_TIME");
> +
The configuration handling and overrides look good to me.
> @@ -2253,19 +2330,36 @@ static int update_url_from_redirect(struct strbuf *base,
> return 1;
> }
>
> +/*
> + * Sleep for the specified number of seconds before retrying.
> + */
> +static void sleep_for_retry(long retry_after)
> +{
> + if (retry_after > 0) {
> + unsigned int remaining;
> + warning(_("rate limited, waiting %ld seconds before retry"), retry_after);
> + remaining = sleep(retry_after);
What should we do if there are other active request slots? It has been a
couple of years since I have looked at Git's HTTP code, but I imagine
that we should be able to continue processing other requests while
waiting for the retry-after period to elapse here.
> @@ -2302,7 +2396,54 @@ static int http_request_reauth(const char *url,
> BUG("Unknown http_request target");
> }
>
> - credential_fill(the_repository, &http_auth, 1);
> + if (ret == HTTP_RATE_LIMITED) {
Should handling the retry behavior be moved into a separate function? I
think that http_request_reauth() might be clearer if it read:
if (ret == HTTP_RATE_LIMITED)
apply_rate_limit(...); /* presumably with a better name */
else
credential_fill(...);
, and likewise, should we rename this function as it is no longer just
re-authenticating HTTP requests?
> diff --git a/t/t5584-http-429-retry.sh b/t/t5584-http-429-retry.sh
> new file mode 100755
> index 0000000000..8bcc382763
> --- /dev/null
> +++ b/t/t5584-http-429-retry.sh
> @@ -0,0 +1,429 @@
> +#!/bin/sh
> +
> +test_description='test HTTP 429 Too Many Requests retry logic'
> +
> +. ./test-lib.sh
> +
> +. "$TEST_DIRECTORY"/lib-httpd.sh
> +
> +start_httpd
> +
> +test_expect_success 'setup test repository' '
> + test_commit initial &&
> + git clone --bare . "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
> + git --git-dir="$HTTPD_DOCUMENT_ROOT_PATH/repo.git" config http.receivepack true
> +'
> +
> +test_expect_success 'HTTP 429 with retries disabled (maxRetries=0) fails immediately' '
> + write_script "$HTTPD_ROOT_PATH/one-time-script" <<-\EOF &&
> + printf "Status: 429 Too Many Requests\r\n"
> + printf "Retry-After: 1\r\n"
> + printf "Content-Type: text/plain\r\n"
> + printf "\r\n"
> + printf "Rate limited\n"
> + cat "$1" >/dev/null
> + EOF
To avoid having to write this script multiple write, you can write it as
a separate script in t/lib-httpd and then make sure to list it in
prepare_httpd() (from t/lib-httpd.sh).
You can then list it in the apache.conf in the same directory and invoke
it however you like. If you need to take in arguments to the script
(e.g., to change the Retry-After value), you can use a ScriptAliasMatch
instead of a normal ScriptAlias to pass in extra parameters from the URL.
The one-time-script mechanism here will cause the test harness to delete
the script after its first (and only) use, which can be useful for some
cases but I suspect is not necessary for all of these tests.
> +
> + # Set maxRetries to 0 (disabled)
> + test_config http.maxRetries 0 &&
> + test_config http.retryAfter 1 &&
> +
> + # Should fail immediately without any retry attempt
> + test_must_fail git ls-remote "$HTTPD_URL/one_time_script/repo.git" 2>err &&
> +
> + # Verify no retry happened (no "waiting" message in stderr)
> + ! grep -i "waiting.*retry" err &&
test_grep can be helpful when reading the output of test failures, since
it dumps the contents of the file it was searching. Just make sure to
write "test_grep !" instead of "! test_grep" (there are a few such
instances of the latter that I just wrote patches to clean up).
"! test_grep" isn't *wrong* per-se, but it will pollute the test output
with "couldn't find xyz in abc".
I skimmed through the the remainder of the tests since I imagine that
they will change substantially after writing the script out explicitly
instead of using one-time-script, so I'll hold off on reviewing that
portion in more detail until then.
Thanks,
Taylor
next prev parent reply other threads:[~2025-12-09 23:15 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-11-26 12:30 [PATCH 0/3] http: add support for HTTP 429 rate limit retries Vaidas Pilkauskas via GitGitGadget
2025-11-26 12:30 ` [PATCH 1/3] " Vaidas Pilkauskas via GitGitGadget
2025-12-09 23:15 ` Taylor Blau [this message]
2025-12-12 12:36 ` Vaidas Pilkauskas
2025-11-26 12:30 ` [PATCH 2/3] remote-curl: fix memory leak in show_http_message() Vaidas Pilkauskas via GitGitGadget
2025-12-09 23:52 ` Taylor Blau
2025-11-26 12:30 ` [PATCH 3/3] http: add trace2 logging for retry operations Vaidas Pilkauskas via GitGitGadget
2025-12-18 14:44 ` [PATCH v2 0/2] http: add support for HTTP 429 rate limit retries Vaidas Pilkauskas via GitGitGadget
2025-12-18 14:44 ` [PATCH v2 1/2] " Vaidas Pilkauskas via GitGitGadget
2025-12-18 14:44 ` [PATCH v2 2/2] http: add trace2 logging for retry operations Vaidas Pilkauskas via GitGitGadget
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aTitfzeb7J8TUTYQ@nand.local \
--to=me@ttaylorr.com \
--cc=git@vger.kernel.org \
--cc=gitgitgadget@gmail.com \
--cc=vaidas.pilkauskas@shopify.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).