Git development
From: Jeff King <peff@peff.net>
To: Patrick Steinhardt <ps@pks.im>
Cc: Junio C Hamano <gitster@pobox.com>,
	git@vger.kernel.org, slonkazoid <slonkazoid@slonk.ing>
Subject: Re: [PATCH] http: handle absolute-path alternates from server root
Date: Fri, 15 May 2026 13:01:34 -0400
Message-ID: <20260515170134.GC88375@coredump.intra.peff.net>
In-Reply-To: <agbOEsZ8NmE8SyfV@pks.im>

On Fri, May 15, 2026 at 09:41:06AM +0200, Patrick Steinhardt wrote:

> > We talked about dropping it a few years ago, but Eric countered that
> > dumb clones are easier on the server in some cases (like gigantic
> > public-inbox repos that are packed to keep most of the old history in
> > one big pack that is never updated). The verbatim pack-reuse feature
> > tries to get smart clones closer to that, but it's hard to beat serving
> > a static file from the server's perspective. I haven't measured anything
> > in that area in a while, though.
> 
> In theory we can get much closer with packfile URIs, too, can't we? If
> the packfiles are directly accessible anyway the server could just
> announce these directly and have the client fetch them. That should
> significantly reduce the load on the server even further.

Packfile URIs help with the actual pack generation (even if we're
blitting out bits from the disk with verbatim packfile reuse, we still
have to handle gaps and compute the checksum over the output pack).

But they don't help with the server computing the set of objects the
client needs in the first place. IIRC, packfile URIs work by the server
saying "oh, I was going to send you object XYZ, but you can get it from
this stable pack instead". So the server still has to compute the set of
objects (and send any that are not mentioned in URI packs). Bitmaps
help, but there's still non-trivial computation and storage on the
server.
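
To make that split concrete, here is a rough sketch in Python. It is a
toy model and not Git's actual implementation: the object graph, the
URI, and the names reachable() and plan_response() are all made up for
illustration. The point is only that the graph walk happens before any
object can be swapped for a pack URI.

```python
# Toy model: objects are strings, and each object maps to the objects
# it references (commit -> parent and tree, tree -> blobs, and so on).
def reachable(graph, tips):
    """Return every object reachable from tips, tips included."""
    seen = set()
    stack = list(tips)
    while stack:
        obj = stack.pop()
        if obj in seen:
            continue
        seen.add(obj)
        stack.extend(graph.get(obj, ()))
    return seen


def plan_response(graph, wants, haves, uri_packs):
    """Split the needed objects between URI packs and an inline pack.

    uri_packs maps a (hypothetical) URI to the set of objects that
    the pack at that URI contains.
    """
    # The server still has to walk the graph to learn what is needed...
    needed = reachable(graph, wants) - reachable(graph, haves)
    # ...and only then can it point the client at stable packs.
    advertised = {}
    inline = set(needed)
    for uri, contents in uri_packs.items():
        covered = needed & contents
        if covered:
            advertised[uri] = covered
            inline -= covered
    return advertised, inline


# Made-up history: c1 <- c2 <- c3, each commit with its own tree.
graph = {
    "c3": ["c2", "t3"],
    "c2": ["c1", "t2"],
    "c1": ["t1"],
    "t1": ["b1"],
    "t2": ["b1", "b2"],
    "t3": ["b1", "b2", "b3"],
}
old_pack = {"https://cdn.example/old.pack": {"c1", "t1", "b1"}}
advertised, inline = plan_response(graph, ["c3"], [], old_pack)
```

In this model the old objects come from the URI pack, but the server has
already paid for the full traversal by the time it knows that.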

Contrast that with a client that instead pulls a packfile over dumb
storage on its own, and then comes to the server for a top-off fetch.
The server still has to do some computation, but it's usually quite
small, because both sides agree quickly that there's no need to dig down
further than the tips in that dumb packfile.
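
A similarly rough sketch of why the top-off is cheap, again as a toy
model rather than real negotiation code (topoff_walk() and the linear
history below are assumptions for the example): the server's walk stops
as soon as it hits a commit the client says it has.

```python
def topoff_walk(parents, wants, haves):
    """Walk commits from wants, stopping at commits the client has.

    Returns the commits the server would need to send, in visit order.
    """
    have = set(haves)
    seen = set()
    to_send = []
    stack = list(wants)
    while stack:
        commit = stack.pop()
        if commit in seen or commit in have:
            continue  # common ground, no need to dig further
        seen.add(commit)
        to_send.append(commit)
        stack.extend(parents.get(commit, ()))
    return to_send


# Hypothetical linear history c0 <- c1 <- ... <- c999.
parents = {f"c{i}": [f"c{i - 1}"] for i in range(1, 1000)}

# Full clone: no haves, so the walk covers the whole history.
full = topoff_walk(parents, ["c999"], [])

# Top-off: the client already fetched a static pack whose tip is c997.
topoff = topoff_walk(parents, ["c999"], ["c997"])
```

Here the full walk visits all 1000 commits, while the top-off visits
only c999 and c998 before reaching the tip of the static pack.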

> Of course, the big downside is that "fetch.uriProtocols" is empty by
> default, so Git will not use them. Makes me wonder whether this is
> something we want to eventually change, but I guess the current default
> behaviour is somewhat insecure as it would allow the server to redirect
> clients to arbitrary locations. It would be great if we had a mechanism
> that only allowed packfile URIs that use the same host, which would make
> this a lot more reasonable to enable by default.

It's been a while since I've looked at it, but I seem to recall that the
server-side tools for specifying which packfile URIs to use were not
that mature. Maybe that has changed, though (I'm probably 5 years out of
date since the last time I really thought about these things).

-Peff
