From: "Shawn O. Pearce" <spearce@spearce.org>
To: Jakub Narebski <jnareb@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: Errors cloning large repo
Date: Mon, 12 Mar 2007 10:24:51 -0400
Message-ID: <20070312142451.GC15150@spearce.org>
In-Reply-To: <200703121209.35052.jnareb@gmail.com>

Jakub Narebski <jnareb@gmail.com> wrote:
> But what would happen if a server supporting concatenated packfiles
> sends such a stream to an old client? So I think some kind of protocol
> extension, or at least a new request / new feature, is needed for that.

No, a protocol extension is not required.  The packfile format
is: 12 byte header, objects, 20 byte SHA-1 footer.  When sending
concatenated packfiles to a client the server just needs to:

  - figure out how many objects total will be sent;
  - send its own (new) header with that count;
  - initialize a SHA-1 context and update it with the header;
  - for each packfile to be sent:
    - strip the first 12 bytes of the packfile;
    - send the remaining bytes, except the last 20;
    - update the SHA-1 context with those same bytes;
  - send its own footer with the SHA-1 context.

Very simple.  Even the oldest Git clients (pre multi-ack extension)
would understand that.  That's what's great about the way the
packfile protocol and disk format are organized.  ;-)
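
The steps above can be sketched roughly as follows.  This is only an
illustration, not code from Git: the function name concat_packs is
made up, it works on whole packfiles held in memory as bytes (a real
server would stream), and it assumes all input packs share one version.

```python
import hashlib
import struct

HEADER_LEN = 12   # "PACK" magic, 4-byte version, 4-byte object count
TRAILER_LEN = 20  # SHA-1 over everything that precedes it


def concat_packs(packs):
    """Join complete packfiles (bytes) into a single packfile (bytes).

    Hypothetical helper for illustration only.
    """
    total = 0
    version = None
    # Figure out how many objects will be sent in total.
    for pack in packs:
        magic, ver, count = struct.unpack(">4sII", pack[:HEADER_LEN])
        if magic != b"PACK":
            raise ValueError("not a packfile")
        if version is None:
            version = ver
        elif ver != version:
            raise ValueError("mixed pack versions")
        total += count

    # Send a new header and start the SHA-1 context with it.
    # (Version 2 is assumed when the input list is empty.)
    header = struct.pack(">4sII", b"PACK", version or 2, total)
    sha = hashlib.sha1(header)
    out = [header]

    # Strip each pack's 12-byte header and 20-byte footer, and hash
    # exactly the bytes that are sent.
    for pack in packs:
        body = pack[HEADER_LEN:-TRAILER_LEN]
        sha.update(body)
        out.append(body)

    # Send a new footer computed over the whole combined stream.
    out.append(sha.digest())
    return b"".join(out)
```

The result has the same layout as any single packfile, which is why
the receiving side needs no changes to read it.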
 
> Wouldn't it be better to pack loose objects into separate pack
> (and perhaps save it, if some threshold is crossed, and we have
> writing rights to repo), by the way?

Perhaps.  Interesting food for thought, something nobody has tried
to experiment with.  Currently servers pack to update the fetching
client.  That means they may be sending a mixture of already-packed
(older) objects and loose (newer) objects.  But with the new kept
pack thing in receive-pack it's more likely that things are already
packed on the server, and not loose.  (I suspect most public open
source users are pushing >100 objects when they do push to their
server.)
 
> > The client could easily segment that into multiple packfiles
> > locally using two rules:
> > 
> >   - if the last object was not an OBJ_COMMIT and this object is
> >   an OBJ_COMMIT, start a new packfile with this object.
...
> 
> Without the first rule, wouldn't the client end up with a strange
> packfile?  Or would it have to rewrite a pack?

Nope.  We don't care about the order of the objects in a packfile.
Never have.  Never will.  Even in pack v4 where we have special
object types that should only appear once in a packfile, they can
appear at any position within the packfile.  MUCH simpler code.

-- 
Shawn.