Git development
From: Shawn Pearce <spearce@spearce.org>
To: Sam Vilain <sam@vilain.net>
Cc: git@vger.kernel.org
Subject: Re: is git-unpack-objects now redundant for 'git-push' and friends?
Date: Tue, 10 Oct 2006 12:14:05 -0400
Message-ID: <20061010161405.GB16412@spearce.org>
In-Reply-To: <452B9EE8.5020702@vilain.net>

Sam Vilain <sam@vilain.net> wrote:
> When pushing or pulling to/from a repository, why unpack the objects?
> Why not just fsck and then throw the pack into $GIT_DIR/objects/pack?
> 
> If you're pushing the entire repository, for instance, currently you
> might create 10,000's of files, which will just be thrown away later
> when you `git-repack -d'.
> 
> I suspect that this was never changed, because there never used to be
> more than one packfile allowed, correct?
> 
> If the server *does* send us duplicates of objects we already have for
> some reason, well that's what `git-repack -a -d' is for.
> 
> I'm just wondering if there are any good reasons to do this any more.

There are still a few good reasons to unpack things, even though I
hate sitting through 5000 objects unpacking on a Windows system
or on a very slow NFS system (one NFS-based system I work with unpacks
small <1KiB objects at a rate of about 1 every 15 seconds).

 - We don't completely trust the remote peer.  If the remote sends
   us an object we already have we want to use the bytes we have
   and not the bytes they are sending.  That way even if the remote
   were able to generate a hash collision and produce a replacement
   byte sequence with the same hash, our known good byte sequence
   is not affected.

 - Packs sent over the network can be thin packs.  A thin pack
   may not contain the object used as the delta base for a delta
   within that pack.  This is done to save network bandwidth when
   both sides already have the delta base for an object that is
   being transferred.

 - Local packs cannot reference delta base objects that are not in
   the same pack.  This limitation simplifies the pack access code
   in a lot of the system.  It also means that in a worst-case
   scenario you can recover everything in that pack when all you
   have is the pack file itself.
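
As a rough illustration of the points above, here is a toy Python
model.  It is not Git's actual code: the function names, the dict used
as an object store, and the dict mapping each packed object to its
delta base are all made up for this sketch.

```python
def keep_or_store(store, oid, data):
    """Store an incoming object only if we do not already have its ID.

    `store` is a hypothetical dict of object ID -> bytes standing in
    for the local object database.  Returns True if the incoming bytes
    were stored, False if our existing bytes were kept.
    """
    if oid in store:
        # We already have this object: trust our known-good bytes,
        # not whatever the remote peer sent under the same ID.
        return False
    store[oid] = data
    return True


def missing_delta_bases(pack):
    """Return the delta bases a pack refers to but does not contain.

    `pack` is a hypothetical dict of object ID -> delta base ID, with
    None for objects that are stored whole rather than as a delta.
    """
    return {base for base in pack.values()
            if base is not None and base not in pack}


def is_thin(pack):
    """A pack is thin if any delta base lives outside the pack."""
    return bool(missing_delta_bases(pack))


def usable_as_local_pack(pack):
    """A local pack must carry every delta base it depends on."""
    return not is_thin(pack)
```

In this model a pack like {"b": "a"} is thin because base "a" is not
in the pack, so it could not simply be dropped into the pack directory
as-is, while {"a": None, "b": "a"} is self-contained.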

I'm sure there are more reasons than that, but those are the major
ones.

-- 
Shawn.
