From: "Shawn O. Pearce" <spearce@spearce.org>
To: Nicolas Pitre <nico@cam.org>
Cc: Julian Phillips <julian@quantumfyre.co.uk>,
	Daniel Barkalow <barkalow@iabervon.org>,
	Junio C Hamano <gitster@pobox.com>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	git@vger.kernel.org
Subject: Re: [PATCH] fix simple deepening of a repo
Date: Mon, 24 Aug 2009 19:12:24 -0700
Message-ID: <20090825021223.GE1033@spearce.org>
In-Reply-To: <alpine.LFD.2.00.0908242001250.6044@xanadu.home>

Nicolas Pitre <nico@cam.org> wrote:
> Well... Johan Herland says he has to deal with repositories containing 
> around 50000 refs.  So in that case it is certainly a good idea not to 
> send the whole 50000 refs back if only one or two (or a hundred) need to 
> be updated.  And quickfetch() won't help in that case since its purpose 
> is only to determine if there is anything at all to update.
...
> 50000 refs * 45 bytes each = 2.25 MB.  That's all wasted bandwidth if 
> only one ref needs updating.

Not just Johan Herland.  Gerrit Code Review creates a new ref
for every patch proposed for review.  Imagine taking every email
message on git ML that has "[PATCH]" in the subject, and creating
a new ref for that in a git.git clone.

We aren't quite at the 50k ref stage yet, but we're starting to
consider that some of our repositories have a ton of refs, and
that the initial advertisement for either fetch or push is horrid.

Since the refs are immutable I could actually teach the JGit
daemon to hide them from JGit's receive-pack, thus cutting down the
advertisement on push, but the refs exist so you can literally say:

  git fetch URL refs/changes/88/4488/2
  git show FETCH_HEAD

to inspect the "v2" version of whatever 4488 is, and if 4488 was
the last commit in a patch series, you'd also be able to do:

  git log -p --reverse ..FETCH_HEAD

to see the complete series.
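
The ref name in that example appears to follow the layout
refs/changes/<last two digits of change>/<change>/<patch set>.
A minimal shell sketch of building such a name, assuming that layout
holds (change_ref is a hypothetical helper, not part of git or Gerrit):

```shell
# Hypothetical helper: build a refs/changes/ name from a change number
# and a patch set number, assuming the layout described above.
change_ref() {
  change=$1
  patchset=$2
  # The first path component is the change number modulo 100, zero-padded.
  printf 'refs/changes/%02d/%d/%d\n' "$((change % 100))" "$change" "$patchset"
}

change_ref 4488 2
```

For change 4488, patch set 2, this prints the same ref name used in the
fetch command above, so it could be passed along as
git fetch URL "$(change_ref 4488 2)".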

Given how infrequently a given change is grabbed, though, I'm
starting to consider either a protocol extension that allows the
client to probe for a ref which wasn't in the initial advertisement,
or taking it on a command line flag, e.g.:

  git fetch --uploadpack='git upload-pack --ref refs/changes/88/4488/2' URL refs/changes/88/4488/2

Personally I'd prefer extending the protocol, because making the
end user supply information twice is stupid.

I don't know enough about Johan's case though to know whether or
not he can get away with hiding the bulk of the refs in the initial
advertisement.  In the case of Gerrit Code Review, the bulk of the
refs are under refs/changes/, only a handful of things are under the
refs/heads/ and refs/tags/ namespaces, and most fetches actually are
for only refs/heads/ and refs/tags/.  So hiding the refs/changes/
namespace would make a large improvement in the advertisement cost.
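
A rough way to estimate that saving from a list of refs, sketched in
shell.  It assumes one "<sha1> <refname>" line per advertised ref and
ignores pkt-line framing and capabilities, so the numbers are only
approximate; advert_bytes is a hypothetical helper name:

```shell
# Hypothetical helper: sum the size of "<sha1> <refname>" lines on stdin,
# skipping refs whose name starts with the prefix given as $1 (if any).
advert_bytes() {
  awk -v hide="$1" '
    hide == "" || index($2, hide) != 1 { total += length($0) + 1 }  # +1 for LF
    END { print total + 0 }
  '
}

# Example input; in a real repository the lines could come from
#   git for-each-ref --format="%(objectname) %(refname)"
refs='1111111111111111111111111111111111111111 refs/heads/master
2222222222222222222222222222222222222222 refs/changes/88/4488/2'

printf '%s\n' "$refs" | advert_bytes ""             # everything
printf '%s\n' "$refs" | advert_bytes refs/changes/  # refs/changes/ hidden
```

Comparing the two totals on a real ref list gives a ballpark figure for
how much of the advertisement is spent on refs/changes/.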

-- 
Shawn.
