* [PATCH] http: handle absolute-path alternates from server root
@ 2026-05-12 16:26 Jeff King
2026-05-13 1:10 ` Junio C Hamano
0 siblings, 1 reply; 5+ messages in thread
From: Jeff King @ 2026-05-12 16:26 UTC (permalink / raw)
To: git; +Cc: slonkazoid
When a dumb http server reports alternates with an absolute path, we try
to paste that onto the root of the URL we're trying to fetch from. So if
we go to "http://example.com/path/to/child.git" and it tells us about an
alternate at "/parent.git", we'll hit "http://example.com/parent.git".
But there's a bug in computing the base when the URL does not have any
path component at all, like "http://example.com". When looking for the
first slash after the host, strchr() returns NULL, and we compute a
nonsense value for the length of the host portion. And then when we use
that length to copy the base of the URL into a strbuf, we're likely to
fail.
The security implications are minimal here. We store the nonsense length
("serverlen") as an int, so on a 64-bit system it may effectively be
anything (it is zero minus a 64-bit heap pointer, then truncated to
32-bits and stuffed into a signed value). When we feed that length to
strbuf_add(), it is cast into a size_t and one of four things will
happen:
1. If serverlen was negative, it will turn into a very large positive
value and strbuf_add() will fail to allocate, ending the program.
Ditto if serverlen was positive but just very large.
This doesn't really get an attacker anything; the victim will just
fail to clone their evil repo.
2. If serverlen was small enough, we'll successfully extend the target
strbuf, and then copy an arbitrary set of bytes from "base". And
then one of these is true:
a. That set of bytes is much larger than the length of the "base"
string. This is an out-of-bounds read, but there's no
out-of-bounds write, since the strbuf code both allocates and
copies using the same size_t. This is likely to cause a
segfault as we try to read unmapped pages of memory.
b. Like (2a), but if the set of bytes is small enough we might
not segfault. We might read random memory from the process and
copy it into the "target" strbuf.
What happens then? We know that "base" ends with a NUL
terminator, which will be copied into "target" as well. So
even though target.len might be 1000 bytes (or whatever), when
interpreted as a NUL-terminated string, target.buf is still
the exact same string as "base".
And that's all we ever do with target: pass it around as a C
string, and then eventually strbuf_detach() it to become a C
string. So even though there was arbitrary memory copied into
the strbuf, we never access it.
c. The other interesting case is when serverlen is actually
_shorter_ than the length of base. And there we truncate the
string. Probably in a way that makes it totally invalid, but
if you were very unlucky you could turn something like:
http://victim.com.evil.domain:8000
into:
http://victim.com
Which looks like the start of a redirect attack, except that
the attacker could just have written "http://victim.com" in
the first place! Either way we feed it to
is_alternate_allowed(), which is where we check redirect and
protocol rules.
I think we can just treat this like a regular bug.
And it's quite a weird setup in the first place, as it implies that the
root of the web server is serving a repository (i.e., that you can get
something useful from "http://example.com/info/refs"). The bug has been
there since b3661567cf ([PATCH] Add support for alternates in HTTP,
2005-09-14) without anybody noticing.
I kind of doubt anybody really cares about making this work, but it's
easy enough to do so: the host-portion of the URL ends at either the
first slash or the end-of-string. So we can just replace strchr() with
strchrnul().
The test setup is a little gross, as we take over the httpd document
root by shoving our bare-repo components into it. But it demonstrates
the problem and shows that our solution actually allows the alternate to
function, if the server is configured to allow it.
Reported-by: slonkazoid <slonkazoid@slonk.ing>
Signed-off-by: Jeff King <peff@peff.net>
---
http-walker.c | 2 +-
t/t5550-http-fetch-dumb.sh | 20 ++++++++++++++++++++
2 files changed, 21 insertions(+), 1 deletion(-)
diff --git a/http-walker.c b/http-walker.c
index 1b6d496548..f252de089f 100644
--- a/http-walker.c
+++ b/http-walker.c
@@ -268,7 +268,7 @@ static void process_alternates_response(void *callback_data)
*/
const char *colon_ss = strstr(base,"://");
if (colon_ss) {
- serverlen = (strchr(colon_ss + 3, '/')
+ serverlen = (strchrnul(colon_ss + 3, '/')
- base);
okay = 1;
}
diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index 9d0a7f5c4b..b0080bf204 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -555,4 +555,24 @@ test_expect_success 'dumb http can fetch index v1' '
git -C idx-v1 fsck
'
+test_expect_success 'absolute-path alternate when url has no path' '
+ src=$HTTPD_DOCUMENT_ROOT_PATH/repo.git &&
+ alt=absolute-alt.git &&
+ git clone --bare --shared "$src" "$alt" &&
+
+ # Our repo has an alternate pointing to the absolute filesystem path,
+ # but that will not make any sense to an http client. So we will
+ # manually give it the equivalent path that the http server will
+ # understand.
+ echo "/dumb/repo.git/objects" >"$alt/objects/info/http-alternates" &&
+
+ # Now make our alt repository available at the root of the http
+ # server without any path (i.e., just http://localhost:1234).
+ git -C "$alt" update-server-info &&
+ mv absolute-alt.git/* "$HTTPD_DOCUMENT_ROOT_PATH" &&
+
+ git -c http.followRedirects=true clone "$HTTPD_URL" alt-clone.git 2>err &&
+ test_grep "adding alternate object store: $HTTPD_URL/dumb/repo.git" err
+'
+
test_done
--
2.54.0.420.gf0bcdff42b
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] http: handle absolute-path alternates from server root
2026-05-12 16:26 [PATCH] http: handle absolute-path alternates from server root Jeff King
@ 2026-05-13 1:10 ` Junio C Hamano
2026-05-13 18:58 ` Jeff King
0 siblings, 1 reply; 5+ messages in thread
From: Junio C Hamano @ 2026-05-13 1:10 UTC (permalink / raw)
To: Jeff King; +Cc: git, slonkazoid
Jeff King <peff@peff.net> writes:
> ... Probably in a way that makes it totally invalid, but
> if you were very unlucky you could turn something like:
>
> http://victim.com.evil.domain:8000
>
> into:
>
> http://victim.com
>
> Which looks like the start of a redirect attack, except that
> the attacker could just have written "http://victim.com" in
> the first place! Either way we feed it to
> is_alternate_allowed(), which is where we check redirect and
> protocol rules.
Yuck. I know I am the guilty party who introduced the dumb HTTP
walker but I wish we could kill it off after all these years. I did
not even recall that we supported the alternate object store in the
"protocol" until I saw this patch X-<.
> I think we can just treat this like a regular bug.
Absolutely. Thanks.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] http: handle absolute-path alternates from server root
2026-05-13 1:10 ` Junio C Hamano
@ 2026-05-13 18:58 ` Jeff King
2026-05-15 7:41 ` Patrick Steinhardt
0 siblings, 1 reply; 5+ messages in thread
From: Jeff King @ 2026-05-13 18:58 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, slonkazoid
On Wed, May 13, 2026 at 10:10:54AM +0900, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
> > ... Probably in a way that makes it totally invalid, but
> > if you were very unlucky you could turn something like:
> >
> > http://victim.com.evil.domain:8000
> >
> > into:
> >
> > http://victim.com
> >
> > Which looks like the start of a redirect attack, except that
> > the attacker could just have written "http://victim.com" in
> > the first place! Either way we feed it to
> > is_alternate_allowed(), which is where we check redirect and
> > protocol rules.
>
> Yuck. I know I am the guilty party who introduced the dumb HTTP
> walker but I wish we could kill it off after all these years. I did
> not even recall that we supported the alternate object store in the
> "protocol" until I saw this patch X-<.
Me too. It's been the source of many obscure bugs, and I think a couple
of vulnerabilities (even though clients never intend to use dumb clones
in the first place).
We talked about dropping it a few years ago, but Eric countered that
dumb clones are easier on the server in some cases (like gigantic
public-inbox repos that are packed to keep most of the old history in
one big pack that is never updated). The verbatim pack-reuse feature
tries to get smart clones closer to that, but it's hard to beat serving
a static file from the server's perspective. I haven't measured anything
in that area in a while, though.
-Peff
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] http: handle absolute-path alternates from server root
2026-05-13 18:58 ` Jeff King
@ 2026-05-15 7:41 ` Patrick Steinhardt
2026-05-15 17:01 ` Jeff King
0 siblings, 1 reply; 5+ messages in thread
From: Patrick Steinhardt @ 2026-05-15 7:41 UTC (permalink / raw)
To: Jeff King; +Cc: Junio C Hamano, git, slonkazoid
On Wed, May 13, 2026 at 02:58:25PM -0400, Jeff King wrote:
> On Wed, May 13, 2026 at 10:10:54AM +0900, Junio C Hamano wrote:
>
> > Jeff King <peff@peff.net> writes:
> >
> > > ... Probably in a way that makes it totally invalid, but
> > > if you were very unlucky you could turn something like:
> > >
> > > http://victim.com.evil.domain:8000
> > >
> > > into:
> > >
> > > http://victim.com
> > >
> > > Which looks like the start of a redirect attack, except that
> > > the attacker could just have written "http://victim.com" in
> > > the first place! Either way we feed it to
> > > is_alternate_allowed(), which is where we check redirect and
> > > protocol rules.
> >
> > Yuck. I know I am the guilty party who introduced the dumb HTTP
> > walker but I wish we could kill it off after all these years. I did
> > not even recall that we supported the alternate object store in the
> > "protocol" until I saw this patch X-<.
>
> Me too. It's been the source of many obscure bugs, and I think a couple
> of vulnerabilities (even though clients never intend to use dumb clones
> in the first place).
>
> We talked about dropping it a few years ago, but Eric countered that
> dumb clones are easier on the server in some cases (like gigantic
> public-inbox repos that are packed to keep most of the old history in
> one big pack that is never updated). The verbatim pack-reuse feature
> tries to get smart clones closer to that, but it's hard to beat serving
> a static file from the server's perspective. I haven't measured anything
> in that area in a while, though.
In theory we can get much closer with packfile URIs, too, can't we? If
the packfiles are directly accessible anyway the server could just
announce these directly and have the client fetch them. That should
significantly reduce the load on the server even further.
Of course, the big downside is that "fetch.uriProtocols" is empty by
default, so Git will not use them. Makes me wonder whether this is
something we want to eventually change, but I guess the current default
behaviour is somewhat insecure as it would allow the server to redirect
clients to arbitrary locations. It would be great if we had a mechanism
that only allowed packfile URIs that use the same host, which would make
this a lot more reasonable to enable by default.
Patrick
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] http: handle absolute-path alternates from server root
2026-05-15 7:41 ` Patrick Steinhardt
@ 2026-05-15 17:01 ` Jeff King
0 siblings, 0 replies; 5+ messages in thread
From: Jeff King @ 2026-05-15 17:01 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: Junio C Hamano, git, slonkazoid
On Fri, May 15, 2026 at 09:41:06AM +0200, Patrick Steinhardt wrote:
> > We talked about dropping it a few years ago, but Eric countered that
> > dumb clones are easier on the server in some cases (like gigantic
> > public-inbox repos that are packed to keep most of the old history in
> > one big pack that is never updated). The verbatim pack-reuse feature
> > tries to get smart clones closer to that, but it's hard to beat serving
> > a static file from the server's perspective. I haven't measured anything
> > in that area in a while, though.
>
> In theory we can get much closer with packfile URIs, too, can't we? If
> the packfiles are directly accessible anyway the server could just
> announce these directly and have the client fetch them. That should
> significantly reduce the load on the server even further.
Packfile URIs help with the actual pack generation (even if we're
blitting out bits from the disk with verbatim packfile reuse, we still
have to handle gaps and compute the checksum over the output pack).
But it doesn't help with the server computing the set of objects the
client needs in the first place. IIRC, packfile URIs work by the server
saying "oh, I was going to send you object XYZ, but you can get it from
this stable pack instead". So the server still has to compute the set of
objects (and send any that are not mentioned in URI packs). Bitmaps
help, but there's still non-trivial computation and storage on the
server.
Contrast that with a client that instead pulls a packfile over dumb
storage on its own, and then comes to the server for a top-off fetch.
The server still has to do some computation, but it's usually quite
small, because both sides agree quickly that there's no need to dig down
further than the tips in that dumb packfile.
> Of course, the big downside is that "fetch.uriProtocols" is empty by
> default, so Git will not use them. Makes me wonder whether this is
> something we want to eventually change, but I guess the current default
> behaviour is somewhat insecure as it would allow the server to redirect
> clients to arbitrary locations. It would be great if we had a mechanism
> that only allowed packfile URIs that use the same host, which would make
> this a lot more reasonable to enable by default.
It's been a while since I've looked at it, but I seem to recall that the
server-side tools for specifying which packfile URIs to use were not
that mature. Maybe that has changed, though (I'm probably 5 years out of
date since the last time I really thought about these things).
-Peff
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2026-05-15 17:01 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-12 16:26 [PATCH] http: handle absolute-path alternates from server root Jeff King
2026-05-13 1:10 ` Junio C Hamano
2026-05-13 18:58 ` Jeff King
2026-05-15 7:41 ` Patrick Steinhardt
2026-05-15 17:01 ` Jeff King
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox