From: Nicolas Pitre <nico@cam.org>
To: Linus Torvalds <torvalds@osdl.org>
Cc: Sergey Vlasov <vsu@altlinux.ru>, Junio C Hamano <junkio@cox.net>,
git@vger.kernel.org
Subject: Re: heads-up: git-index-pack in "next" is broken
Date: Tue, 17 Oct 2006 17:21:39 -0400 (EDT) [thread overview]
Message-ID: <Pine.LNX.4.64.0610171706260.1971@xanadu.home> (raw)
In-Reply-To: <Pine.LNX.4.64.0610171339030.3962@g5.osdl.org>
On Tue, 17 Oct 2006, Linus Torvalds wrote:
>
>
> On Tue, 17 Oct 2006, Nicolas Pitre wrote:
> > On Tue, 17 Oct 2006, Sergey Vlasov wrote:
> > >
> > > Yes, on x86_64 this is 24 because of 8-byte alignment for longs:
> >
> > Ah bummer. Then this is most likely the cause. And here's a simple
> > fix (Junio please confirm):
>
> Why do you use "unsigned long" in the first place?
Because offsets into packs are expressed as unsigned long everywhere
else (except in the current pack index on-disk format).
> For some structure like this, it sounds positively wrong. Pack-files
> should be architecture-neutral, which means that they shouldn't depend on
> word-size, and they should be in some neutral byte-order.
But they do. Please consider this code:
case OBJ_OFS_DELTA:
memset(delta_base, 0, sizeof(*delta_base));
c = pack_base[pos++];
base_offset = c & 127;
while (c & 128) {
base_offset += 1;
if (!base_offset || base_offset & ~(~0UL >> 7))
bad_object(offset, "offset value overflow for delta base object");
if (pos >= pack_limit)
bad_object(offset, "object extends past end of pack");
c = pack_base[pos++];
base_offset = (base_offset << 7) + (c & 127);
}
delta_base->offset = offset - base_offset;
if (delta_base->offset >= offset)
bad_object(offset, "delta base offset is out of bound");
break;
Do you see anything inerently wrong in this code? The above is already
64-bit ready such that it'll just work on 64-bit archs and will display
a sensible message if a 32-bit arch encounter a pack larger than 4GB.
But the on-disk pack format has no limitation what so ever.
> Quite frankly, this all makes me go "Eww..". The original pack-file (well,
> v2) format was well-defined and had none of these issues. In contrast, the
> new code in 'next' is just _ugly_.
I beg to differ. Please reconsider in light of the above.
> And maybe it's just me, but I consider unions to be bug-prone on their
> own. The "master" branch has exactly two unions: the "grep_expr" structure
> contains one (where the union member is clearly defined by the node type
> in that structure), and object.c has a "union any_object" that _literally_
> exists as purely an allocation size issue (ie it is used _only_ to
> allocate the maximum size of any of the possible structures).
>
> In contrast, the new union introduced in "next" is just horrid. There's
> not even any way to know which member to use, except apparently that it
> expects that a SHA1 is never zero in the last 12 bytes. Which is probably
> true, but still - that's some ugly stuff.
This union should be looked at just like a sortable hash pointing to a
base object so that deltas with the same base object can be sorted
together. And the field to use is well defined of course: deltas with
sha1 to base use the sha1 member, deltas with offset to base use the
offset member. This hash, together with the delta type, constitute a
tuple guaranteed to be unique so there can't be any confusion.
> Is this something you want to bet a big project on?
I don't see why not.
Nicolas
next prev parent reply other threads:[~2006-10-17 21:22 UTC|newest]
Thread overview: 33+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-10-17 4:55 heads-up: git-index-pack in "next" is broken Junio C Hamano
2006-10-17 15:39 ` Nicolas Pitre
2006-10-17 16:07 ` Junio C Hamano
2006-10-17 17:00 ` Nicolas Pitre
2006-10-17 18:11 ` Junio C Hamano
2006-10-17 18:47 ` Nicolas Pitre
2006-10-17 19:36 ` Sergey Vlasov
2006-10-17 20:10 ` Junio C Hamano
2006-10-17 20:25 ` Nicolas Pitre
2006-10-17 20:23 ` Nicolas Pitre
2006-10-17 20:51 ` Linus Torvalds
2006-10-17 21:21 ` Nicolas Pitre [this message]
2006-10-17 21:46 ` Linus Torvalds
2006-10-18 0:20 ` Nicolas Pitre
2006-10-18 0:57 ` Linus Torvalds
2006-10-18 2:08 ` Nicolas Pitre
2006-10-18 3:12 ` Linus Torvalds
2006-10-18 6:09 ` Davide Libenzi
2006-10-18 14:56 ` Linus Torvalds
2006-10-18 16:17 ` Davide Libenzi
2006-10-18 16:52 ` Linus Torvalds
2006-10-18 21:21 ` Davide Libenzi
2006-10-18 21:48 ` Linus Torvalds
2006-10-18 22:34 ` Davide Libenzi
2006-10-18 1:30 ` Junio C Hamano
2006-10-18 2:23 ` Nicolas Pitre
2006-10-18 4:16 ` Junio C Hamano
2006-10-18 5:07 ` Junio C Hamano
2006-10-18 10:00 ` Johannes Schindelin
2006-10-18 13:13 ` Nicolas Pitre
2006-10-18 13:02 ` Nicolas Pitre
2006-10-17 21:54 ` Junio C Hamano
2006-10-18 1:38 ` Nicolas Pitre
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Pine.LNX.4.64.0610171706260.1971@xanadu.home \
--to=nico@cam.org \
--cc=git@vger.kernel.org \
--cc=junkio@cox.net \
--cc=torvalds@osdl.org \
--cc=vsu@altlinux.ru \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).