git.vger.kernel.org archive mirror
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Jon Smirl <jonsmirl@gmail.com>
Cc: Jeff King <peff@peff.net>,
	jnareb@gmail.com, Nicolas Pitre <nico@cam.org>,
	"Shawn O. Pearce" <spearce@spearce.org>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: git-daemon on NSLU2
Date: Sun, 26 Aug 2007 10:15:24 -0700 (PDT)
Message-ID: <alpine.LFD.0.999.0708260959050.25853@woody.linux-foundation.org>
In-Reply-To: <9e4733910708260934i1381e73ftb31c7de0d23f6cae@mail.gmail.com>



On Sun, 26 Aug 2007, Jon Smirl wrote:
> 
> Changing git-daemon only for the initial clone case also means that
> people don't need to change the way they manage packs.

I do agree that we might want to do some special-case handling for the 
initial clone (because it *is* kind of special), but it's not necessarily 
as easy as just re-using an existing pack.

At a minimum, we'd need to have something that knows how to make a single 
pack out of several packs and some loose objects. That shouldn't be 
*hard*, but it's certainly nontrivial, especially in the presence of the 
same objects possibly being available more than once in different packs.

[ The "duplicate object" thing does actually happen: even if you use only 
  "git native" protocols, you can get duplicate objects because a file was 
  changed back to an earlier version. The incremental packs you get from 
  push/pull'ing between two repositories try to send the minimal 
  incremental changes, but the keyword here is _try_: they will 
  potentially send objects that the receiver already has, if it's not 
  obvious that the receiver has them from the "commit boundary" cases ]

Maybe the client side will handle a pack with duplicate objects perfectly 
fine, and it's not an issue. Maybe. It might even be likely (I can't think 
of anything that would obviously break). But at a minimum, it would be 
something that needs some code on the sending side, and a lot of 
verification that the end result works ok on the receiving side.
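
As a rough illustration of the sending-side step described above, here is a
minimal sketch of merging several object sources into one duplicate-free
list. It is not git's pack code: the `(object_id, payload)` tuples and the
name `merge_object_sources` are assumptions made up for this example, and
real packs also involve deltas and an index that are ignored here.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical model: an object is just (object_id, payload).
Obj = Tuple[str, bytes]


def merge_object_sources(sources: Iterable[Iterable[Obj]]) -> List[Obj]:
    """Combine several packs plus loose objects into one duplicate-free list.

    The first copy of each object id wins; later copies are skipped.
    """
    seen: Dict[str, bytes] = {}
    for source in sources:
        for oid, payload in source:
            if oid in seen:
                # Content-addressed ids should imply identical content.
                if seen[oid] != payload:
                    raise ValueError(f"conflicting content for object {oid}")
                continue
            seen[oid] = payload
    # dict preserves insertion order, so output order is first-seen order.
    return list(seen.items())


if __name__ == "__main__":
    pack_a = [("a1", b"v1"), ("b2", b"v2")]
    pack_b = [("a1", b"v1"), ("c3", b"v3")]  # "a1" is also in pack_a
    loose = [("b2", b"v2")]                  # "b2" is also in pack_a
    merged = merge_object_sources([pack_a, pack_b, loose])
    print([oid for oid, _ in merged])  # prints ['a1', 'b2', 'c3']
```

The sketch only covers the "no object twice" part; whether a receiver would
cope with duplicates anyway is the open question raised above.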

And there's actually a deeper problem: the current native protocol 
guarantees that the objects sent over are only those that are reachable. 
That matters. It matters for subtle security issues (maybe you are 
exporting some repository that was rebased, and has objects that you 
didn't *intend* to make public!), but it also matters for issues like git 
"alternates" files.
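
To make the reachability guarantee concrete, here is a small sketch of a
walk that collects only the objects reachable from a set of refs. The
object names and the `links` mapping (object id to the ids it points at)
are invented for the example; this is not git's revision walker, just the
general idea.

```python
from typing import Dict, Iterable, List, Set


def reachable_objects(refs: Iterable[str], links: Dict[str, List[str]]) -> Set[str]:
    """Return every object id reachable from the given starting points."""
    seen: Set[str] = set()
    stack = list(refs)
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue
        seen.add(oid)
        # Follow commit -> parent/tree and tree -> blob edges.
        stack.extend(links.get(oid, []))
    return seen


if __name__ == "__main__":
    # Hypothetical object graph: "old_commit" was left behind by a rebase.
    links = {
        "commit2": ["commit1", "tree2"],
        "commit1": ["tree1"],
        "tree2": ["blobA"],
        "tree1": ["blobA"],
        "old_commit": ["old_tree"],
        "old_tree": ["secret_blob"],
    }
    to_send = reachable_objects(["commit2"], links)
    print("secret_blob" in to_send)  # prints False
```

Sending an existing pack as-is skips this walk, which is why objects like
the hypothetical `secret_blob` could leak.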

If you only ever look at a single repo, you'll never see the alternates 
issue, but if you're seriously looking at serving git repositories, I 
don't really see the "single repo" case as being at all the most common or 
interesting case. 

And if you look at something like kernel.org, the "alternates" thing is 
*much* more important than how much memory git-daemon uses! Yes, 
kernel.org would probably be much happier if git-daemon wasn't such a 
memory pig occasionally, but on the other hand, the win from using 
alternates and being able to share 99% of all objects in all the various 
related kernel repositories is actually likely to be a *bigger* memory win 
than any git-daemon memory usage, because now the disk caching works a 
hell of a lot better!
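
For readers who have not used alternates: a repository can list other
object directories in `objects/info/alternates`, one path per line, and
objects missing locally are looked up there. The sketch below is a
simplified illustration of that lookup. It only handles loose objects,
does not follow alternates of alternates, and the function names are
assumptions for this example rather than anything in git's source.

```python
from pathlib import Path
from typing import List, Optional


def read_alternates(objects_dir: Path) -> List[Path]:
    """Parse objects/info/alternates into a list of object directories."""
    alt_file = objects_dir / "info" / "alternates"
    if not alt_file.is_file():
        return []
    stores: List[Path] = []
    for line in alt_file.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        path = Path(line)
        # Relative entries are taken relative to the object directory.
        stores.append(path if path.is_absolute() else objects_dir / path)
    return stores


def find_loose_object(objects_dir: Path, oid: str) -> Optional[Path]:
    """Look for a loose object locally first, then in each alternate."""
    for store in [objects_dir] + read_alternates(objects_dir):
        candidate = store / oid[:2] / oid[2:]
        if candidate.is_file():
            return candidate
    return None
```

With many related repositories pointing at one shared object directory,
the same files on disk serve all of them, which is where the disk caching
win comes from.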

So it's not actually clear how the initial clone thing can be optimized on 
the server side.

It's easier to optimize on the *client* side: just do the initial clone 
with rsync/http (and "git gc" it on the client afterwards), and then 
change it to the git native protocol after the clone.
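
A minimal sketch of that client-side sequence, written as a function that
only builds the commands so they can be inspected before running. The
URLs and directory name are placeholders, the helper name is made up for
this example, and it uses http rather than rsync for the initial clone.

```python
from typing import List


def clone_then_switch_plan(http_url: str, git_url: str, directory: str) -> List[List[str]]:
    """Build the commands: clone over http, repack, then switch to git://."""
    return [
        # Initial clone over a "dumb" transport the server can serve cheaply.
        ["git", "clone", http_url, directory],
        # Repack locally so the client ends up with a tidy repository.
        ["git", "-C", directory, "gc"],
        # Point the remote at the native protocol for later fetches.
        ["git", "-C", directory, "config", "remote.origin.url", git_url],
    ]


if __name__ == "__main__":
    plan = clone_then_switch_plan(
        "http://example.com/repo.git",  # placeholder URL
        "git://example.com/repo.git",   # placeholder URL
        "repo",
    )
    for cmd in plan:
        print(" ".join(cmd))
    # To actually run them: subprocess.run(cmd, check=True) for each cmd.
```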

That may not sound very user-friendly, but let's face it, I think there is 
exactly one person in the whole universe that tries to use an NSLU2 as a 
git server. So the "client-side workaround" is likely to affect a very 
limited number of clients ;)

		Linus
