Git development
From: Patrick Steinhardt <ps@pks.im>
To: Jeff King <peff@peff.net>
Cc: Junio C Hamano <gitster@pobox.com>,
	git@vger.kernel.org, slonkazoid <slonkazoid@slonk.ing>
Subject: Re: [PATCH] http: handle absolute-path alternates from server root
Date: Thu, 21 May 2026 13:50:06 +0200
Message-ID: <ag7xbkTF11N22waX@pks.im>
In-Reply-To: <20260515170134.GC88375@coredump.intra.peff.net>

On Fri, May 15, 2026 at 01:01:34PM -0400, Jeff King wrote:
> On Fri, May 15, 2026 at 09:41:06AM +0200, Patrick Steinhardt wrote:
> 
> > > We talked about dropping it a few years ago, but Eric countered that
> > > dumb clones are easier on the server in some cases (like gigantic
> > > public-inbox repos that are packed to keep most of the old history in
> > > one big pack that is never updated). The verbatim pack-reuse feature
> > > tries to get smart clones closer to that, but it's hard to beat serving
> > > a static file from the server's perspective. I haven't measured anything
> > > in that area in a while, though.
> > 
> > In theory we can get much closer with packfile URIs, too, can't we? If
> > the packfiles are directly accessible anyway the server could just
> > announce these directly and have the client fetch them. That should
> > significantly reduce the load on the server even further.
> 
> Packfile URIs help with the actual pack generation (even if we're
> blitting out bits from the disk with verbatim packfile reuse, we still
> have to handle gaps and compute the checksum over the output pack).
> 
> But it doesn't help with the server computing the set of objects the
> client needs in the first place. IIRC, packfile URIs work by the server
> saying "oh, I was going to send you object XYZ, but you can get it from
> this stable pack instead". So the server still has to compute the set of
> objects (and send any that are not mentioned in URI packs). Bitmaps
> help, but there's still non-trivial computation and storage on the
> server.

I guess it depends on the actual server-side implementation, but in the
general case this is of course true. A server could, for example, decide
to overserve objects when the client does a full clone, or it could
arrange packfiles in a special way that allows it to serve at least some
kinds of requests efficiently.

> Contrast that with a client that instead pulls a packfile over dumb
> storage on its own, and then comes to the server for a top-off fetch.
> The server still has to do some computation, but it's usually quite
> small, because both sides agree quickly that there's no need to dig down
> further than the tips in that dumb packfile.

In theory this is possible with packfile URIs as well, by computing the
top-off fetch based on the packfile layout.

But this requires quite a bit of server-side logic and very specific
layouts, I guess.
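
To make the top-off idea concrete, here is a rough sketch in Python. It
is a toy model and not how upload-pack works: the `parents` map, the
function names and the commit names are made up for illustration, and
real Git negotiates haves and wants incrementally (and can use bitmaps)
instead of enumerating everything the client already has.

```python
def reachable(parents, tips):
    """Return every commit reachable from the given tips (inclusive)."""
    seen = set()
    stack = list(tips)
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, ()))
    return seen


def topoff_commits(parents, wants, pack_tips):
    """Commits the server still has to send after the client has
    downloaded a static pack whose tips are `pack_tips`."""
    already_have = reachable(parents, pack_tips)
    return reachable(parents, wants) - already_have


# Hypothetical linear history A <- B <- C <- D <- E, where the static
# pack covers everything up to and including C.
history = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"], "E": ["D"]}
print(sorted(topoff_commits(history, wants=["E"], pack_tips=["C"])))
# prints ['D', 'E']
```

The closer the pack tips are to the current refs, the smaller the
remaining set, which is why the layout of the packfiles matters so much
here.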

> > Of course, the big downside is that "fetch.uriProtocols" is empty by
> > default, so Git will not use them. Makes me wonder whether this is
> > something we want to eventually change, but I guess the current default
> > behaviour is somewhat insecure as it would allow the server to redirect
> > clients to arbitrary locations. It would be great if we had a mechanism
> > that only allowed packfile URIs that use the same host, which would make
> > this a lot more reasonable to enable by default.
> 
> It's been a while since I've looked at it, but I seem to recall that the
> server-side tools for specifying which packfile URIs to use were not
> that mature. Maybe that has changed, though (I'm probably 5 years out of
> date since the last time I really thought about these things).

Packfile URIs definitely need some love to become feasible, yes, and I
don't think they have evolved much since their introduction. I still
feel like they are the better mechanism for offloading traffic compared
to bundle URIs, though, as we already have packfiles around anyway.
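
As for the same-host restriction mentioned above, a client-side check
could look roughly like the following Python sketch. This is purely
hypothetical: Git has no such check that I am aware of, the function
names and the `allowed_schemes` parameter are invented here, and
"fetch.uriProtocols" itself only filters by protocol.

```python
from urllib.parse import urlsplit

# Default ports so that "https://host" and "https://host:443" compare equal.
DEFAULT_PORTS = {"http": 80, "https": 443}


def host_and_port(url):
    """Return (hostname, port) for a URL, or None if it has no usable host."""
    parts = urlsplit(url)
    try:
        port = parts.port
    except ValueError:  # malformed or out-of-range port
        return None
    if not parts.hostname:
        return None
    if port is None:
        port = DEFAULT_PORTS.get(parts.scheme.lower())
    # urlsplit() already lowercases the hostname.
    return parts.hostname, port


def is_same_host_packfile_uri(remote_url, packfile_uri,
                              allowed_schemes=("https",)):
    """Accept a packfile URI only if its scheme is allowed and it points
    at the same host and port as the remote we are fetching from."""
    if urlsplit(packfile_uri).scheme.lower() not in allowed_schemes:
        return False
    remote = host_and_port(remote_url)
    target = host_and_port(packfile_uri)
    return remote is not None and remote == target
```

With such a policy, a remote at https://example.com/repo.git could
advertise https://example.com/packs/base.pack, while a URI pointing at
any other host would be ignored and the objects would have to come
from the regular fetch response.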

Patrick

Thread overview:
2026-05-12 16:26 [PATCH] http: handle absolute-path alternates from server root Jeff King
2026-05-13  1:10 ` Junio C Hamano
2026-05-13 18:58   ` Jeff King
2026-05-15  7:41     ` Patrick Steinhardt
2026-05-15 17:01       ` Jeff King
2026-05-21 11:50         ` Patrick Steinhardt [this message]
2026-05-22  4:55           ` Jeff King
