git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Nicolas Pitre <nico@cam.org>
To: Linus Torvalds <torvalds@osdl.org>
Cc: Sergey Vlasov <vsu@altlinux.ru>, Junio C Hamano <junkio@cox.net>,
	git@vger.kernel.org
Subject: Re: heads-up: git-index-pack in "next" is broken
Date: Tue, 17 Oct 2006 17:21:39 -0400 (EDT)	[thread overview]
Message-ID: <Pine.LNX.4.64.0610171706260.1971@xanadu.home> (raw)
In-Reply-To: <Pine.LNX.4.64.0610171339030.3962@g5.osdl.org>

On Tue, 17 Oct 2006, Linus Torvalds wrote:

> 
> 
> On Tue, 17 Oct 2006, Nicolas Pitre wrote:
> > On Tue, 17 Oct 2006, Sergey Vlasov wrote:
> > > 
> > > Yes, on x86_64 this is 24 because of 8-byte alignment for longs:
> > 
> > Ah bummer.  Then this is most likely the cause.  And here's a simple 
> > fix (Junio please confirm):
> 
> Why do you use "unsigned long" in the first place?

Because offsets into packs are expressed as unsigned long everywhere 
else (except in the current pack index on-disk format).

> For some structure like this, it sounds positively wrong. Pack-files 
> should be architecture-neutral, which means that they shouldn't depend on 
> word-size, and they should be in some neutral byte-order.

But they do.  Please consider this code:

	case OBJ_OFS_DELTA:
		memset(delta_base, 0, sizeof(*delta_base));
		c = pack_base[pos++];
		base_offset = c & 127;
		while (c & 128) {
			base_offset += 1;
			if (!base_offset || base_offset & ~(~0UL >> 7))
				bad_object(offset, "offset value overflow for delta base object");
			if (pos >= pack_limit)
				bad_object(offset, "object extends past end of pack");
			c = pack_base[pos++];
			base_offset = (base_offset << 7) + (c & 127);
		}
		delta_base->offset = offset - base_offset;
		if (delta_base->offset >= offset)
			bad_object(offset, "delta base offset is out of bound");
		break;

Do you see anything inerently wrong in this code?  The above is already 
64-bit ready such that it'll just work on 64-bit archs and will display 
a sensible message if a 32-bit arch encounter a pack larger than 4GB.  
But the on-disk pack format has no limitation what so ever.

> Quite frankly, this all makes me go "Eww..". The original pack-file (well, 
> v2) format was well-defined and had none of these issues. In contrast, the 
> new code in 'next' is just _ugly_.

I beg to differ.  Please reconsider in light of the above.

> And maybe it's just me, but I consider unions to be bug-prone on their 
> own. The "master" branch has exactly two unions: the "grep_expr" structure 
> contains one (where the union member is clearly defined by the node type 
> in that structure), and object.c has a "union any_object" that _literally_ 
> exists as purely an allocation size issue (ie it is used _only_ to 
> allocate the maximum size of any of the possible structures).
> 
> In contrast, the new union introduced in "next" is just horrid. There's 
> not even any way to know which member to use, except apparently that it 
> expects that a SHA1 is never zero in the last 12 bytes. Which is probably 
> true, but still - that's some ugly stuff.

This union should be looked at just like a sortable hash pointing to a 
base object so that deltas with the same base object can be sorted 
together.  And the field to use is well defined of course: deltas with 
sha1 to base use the sha1 member, deltas with offset to base use the 
offset member.  This hash, together with the delta type, constitute a 
tuple guaranteed to be unique so there can't be any confusion.

> Is this something you want to bet a big project on?

I don't see why not.


Nicolas

  reply	other threads:[~2006-10-17 21:22 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-10-17  4:55 heads-up: git-index-pack in "next" is broken Junio C Hamano
2006-10-17 15:39 ` Nicolas Pitre
2006-10-17 16:07   ` Junio C Hamano
2006-10-17 17:00     ` Nicolas Pitre
2006-10-17 18:11       ` Junio C Hamano
2006-10-17 18:47         ` Nicolas Pitre
2006-10-17 19:36           ` Sergey Vlasov
2006-10-17 20:10             ` Junio C Hamano
2006-10-17 20:25               ` Nicolas Pitre
2006-10-17 20:23             ` Nicolas Pitre
2006-10-17 20:51               ` Linus Torvalds
2006-10-17 21:21                 ` Nicolas Pitre [this message]
2006-10-17 21:46                   ` Linus Torvalds
2006-10-18  0:20                     ` Nicolas Pitre
2006-10-18  0:57                       ` Linus Torvalds
2006-10-18  2:08                         ` Nicolas Pitre
2006-10-18  3:12                           ` Linus Torvalds
2006-10-18  6:09                             ` Davide Libenzi
2006-10-18 14:56                               ` Linus Torvalds
2006-10-18 16:17                                 ` Davide Libenzi
2006-10-18 16:52                                   ` Linus Torvalds
2006-10-18 21:21                                     ` Davide Libenzi
2006-10-18 21:48                                       ` Linus Torvalds
2006-10-18 22:34                                         ` Davide Libenzi
2006-10-18  1:30                       ` Junio C Hamano
2006-10-18  2:23                         ` Nicolas Pitre
2006-10-18  4:16                           ` Junio C Hamano
2006-10-18  5:07                             ` Junio C Hamano
2006-10-18 10:00                               ` Johannes Schindelin
2006-10-18 13:13                                 ` Nicolas Pitre
2006-10-18 13:02                               ` Nicolas Pitre
2006-10-17 21:54                 ` Junio C Hamano
2006-10-18  1:38                   ` Nicolas Pitre

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Pine.LNX.4.64.0610171706260.1971@xanadu.home \
    --to=nico@cam.org \
    --cc=git@vger.kernel.org \
    --cc=junkio@cox.net \
    --cc=torvalds@osdl.org \
    --cc=vsu@altlinux.ru \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).