git.vger.kernel.org archive mirror
From: Junio C Hamano <junkio@cox.net>
To: Nicolas Pitre <nico@cam.org>
Cc: git@vger.kernel.org, Linus Torvalds <torvalds@osdl.org>
Subject: Re: Mozilla .git tree
Date: Wed, 30 Aug 2006 04:42:05 -0700
Message-ID: <7vzmdmh2lu.fsf@assigned-by-dhcp.cox.net>
In-Reply-To: <Pine.LNX.4.64.0608300124550.9796@xanadu.home> (Nicolas Pitre's message of "Wed, 30 Aug 2006 01:53:48 -0400 (EDT)")

Nicolas Pitre <nico@cam.org> writes:

> And the zlib header contains a CRC which we're about to use for 
> validating the data when doing delta data reuse in order to prevent pack 
> corruption propagation like the one recently posted on the list.  

Ah, I never thought of using the CRC directly.  I was thinking
about inflating into the void and seeing if it succeeds, which,
as you say, is perhaps quite expensive.
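
A minimal sketch of the "inflate into the void" check, using Python's
zlib module rather than git's own C code.  The function name and the
chunk size are made up for illustration.  The inflated output is
discarded; the only question is whether the stream decodes and
terminates without zlib reporting an error (which includes a mismatch
in the stream's trailing checksum).

```python
import zlib


def inflates_cleanly(data: bytes, chunk_size: int = 4096) -> bool:
    """Inflate a zlib stream, throwing the output away.

    Returns True only if the stream decodes without error and reaches
    its end marker.  `chunk_size` is an arbitrary illustrative value.
    """
    inflater = zlib.decompressobj()
    try:
        for start in range(0, len(data), chunk_size):
            # The return value (inflated bytes) is deliberately ignored.
            inflater.decompress(data[start:start + chunk_size])
            if inflater.eof:
                break
    except zlib.error:
        # Corrupt deflate data or a failed integrity check.
        return False
    # A truncated stream raises nothing but never reaches eof.
    return inflater.eof
```

The cost is the point of the complaint above: every byte has to be
inflated just to learn whether the copy is trustworthy.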

This brings me back to my pet peeve, though.  I do not think a
zlib stream seeks back and leaves some clue at the beginning to
tell me the deflated length, so it is quite hard to find where
each deflated stream ends in a packfile cheaply.  Loose objects
(with new or legacy style header) are easy (st.st_size is
available), but I cannot think of a way short of building a
reverse index of the pack .idx file, which means I am already
talking about a not-so-cheap way X-<.
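
The reverse-index idea can be sketched as follows, in Python rather
than git's C.  The mapping `offsets_by_name` is a hypothetical stand-in
for what would be read out of a .idx file.  Sorting the entries by pack
offset lets each entry's length be taken as the distance to the next
entry; the last entry runs up to the 20-byte SHA-1 trailer at the end
of the .pack.  Note that this yields the length of the whole pack
entry (object header plus deflated data), not the deflated stream
alone.

```python
PACK_TRAILER_LEN = 20  # SHA-1 checksum that closes a .pack file


def entry_lengths(offsets_by_name: dict, pack_size: int) -> dict:
    """Map each object name to the byte length of its pack entry.

    `offsets_by_name` (name -> offset into the .pack) is an assumed
    input shape for illustration; it is not git's in-memory format.
    """
    # The "reverse index": the same entries, ordered by offset.
    by_offset = sorted(offsets_by_name.items(), key=lambda item: item[1])
    end_of_objects = pack_size - PACK_TRAILER_LEN

    lengths = {}
    for i, (name, offset) in enumerate(by_offset):
        if i + 1 < len(by_offset):
            next_offset = by_offset[i + 1][1]
        else:
            next_offset = end_of_objects
        lengths[name] = next_offset - offset
    return lengths
```

The sort is what makes this "not so cheap": it touches every entry in
the index before a single length can be answered.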

It might be a reason to define a new .idx format.  We could lift
the 32-bit offset limitation while we are at it.  Each entry could
have a 20-byte hash, a 64-bit offset into the corresponding .pack,
and a 32-bit deflated length (heh, why not make it 64-bit while we
are at it).  Luckily, .idx _is_ a local matter, so we can even
have a flag day and tell people to run the updated index-pack on
existing packfiles to regenerate .idx.
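
For illustration only, the proposed entry could be laid out like this
with Python's struct module.  The field order follows the description
above; network byte order, the absence of padding, and the helper names
are assumptions, not a defined format.

```python
import struct

# 20-byte hash, 64-bit offset, 32-bit deflated length.
# ">" selects big-endian with no padding (an assumption).
IDX_ENTRY = struct.Struct(">20sQI")


def pack_idx_entry(sha1: bytes, offset: int, deflated_len: int) -> bytes:
    """Serialize one hypothetical .idx entry."""
    if len(sha1) != 20:
        raise ValueError("hash must be exactly 20 bytes")
    return IDX_ENTRY.pack(sha1, offset, deflated_len)


def unpack_idx_entry(raw: bytes) -> tuple:
    """Return (sha1, offset, deflated_len) from one serialized entry."""
    return IDX_ENTRY.unpack(raw)
```

With these widths an entry is 32 bytes; widening the length field to
64 bits (format ">20sQQ") would make it 36.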

> Using an offset instead of a sha1 to reference a delta base object is 
> certainly a good idea though.  But I'd use the same variable encoding as 
> the object size to avoid the 32-bit limit issue.  When generating a thin 
> pack the real sha1 of the delta object could be substituted for the 
> offset quite easily if the base object is not sent as part of the same 
> pack.

That sounds like quite a reasonable suggestion.  I love this kind
of moment when I find us very fortunate to have bright people on
the list ;-).
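
A sketch of what a variable-length offset could look like: a generic
scheme with 7 payload bits per byte and the high bit as a "more bytes
follow" flag, least significant group first.  This is in the spirit of
the object size encoding mentioned above but simplified (the real pack
header shares its first byte with the object type), and the function
names are hypothetical.  It is not a proposal for the exact on-disk
format.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer, 7 bits per byte, low bits first."""
    if value < 0:
        raise ValueError("offset must be non-negative")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: last byte
            return bytes(out)


def decode_varint(buf: bytes, pos: int = 0) -> tuple:
    """Decode one integer starting at `pos`; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return value, pos
```

Small offsets stay small (anything below 128 fits in one byte), and
nothing in the scheme caps the value at 32 bits.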
