* heads-up: git-index-pack in "next" is broken @ 2006-10-17 4:55 Junio C Hamano 2006-10-17 15:39 ` Nicolas Pitre 0 siblings, 1 reply; 33+ messages in thread From: Junio C Hamano @ 2006-10-17 4:55 UTC (permalink / raw) To: git; +Cc: Nicolas Pitre I'm still a bit under the weather and do not have enough concentration to dig into the problem tonight, but I noticed that something in "next", most likely the delta-base-offset patchset, broke git-index-pack: $ X=ec0c3491753e115e1775256f6b7bd1bce4dea7cd $ wget http://www.kernel.org/pub/scm/git/git.git/objects/pack/pack-$X.pack $ ~/git-master/bin/git-index-pack pack-$X.pack ec0c3491753e115e1775256f6b7bd1bce4dea7cd $ git-index-pack pack-$X.pack fatal: packfile 'pack-ec0c3491753e115e1775256f6b7bd1bce4dea7cd.pack' has unresolved deltas ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: heads-up: git-index-pack in "next" is broken 2006-10-17 4:55 heads-up: git-index-pack in "next" is broken Junio C Hamano @ 2006-10-17 15:39 ` Nicolas Pitre 2006-10-17 16:07 ` Junio C Hamano 0 siblings, 1 reply; 33+ messages in thread From: Nicolas Pitre @ 2006-10-17 15:39 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On Mon, 16 Oct 2006, Junio C Hamano wrote: > I'm still a bit under the weather and do not have enough > concentration to dig into the problem tonight, but I noticed > that something in "next", most likely the delta-base-offset > patchset, broke git-index-pack: > > $ X=ec0c3491753e115e1775256f6b7bd1bce4dea7cd > $ wget http://www.kernel.org/pub/scm/git/git.git/objects/pack/pack-$X.pack > $ ~/git-master/bin/git-index-pack pack-$X.pack > ec0c3491753e115e1775256f6b7bd1bce4dea7cd > $ git-index-pack pack-$X.pack > fatal: packfile 'pack-ec0c3491753e115e1775256f6b7bd1bce4dea7cd.pack' has unresolved deltas Using the tip of the "next" branch (git version 1.4.2.4.gf9fe) I just cannot reproduce this problem at all. I always get a good index and ec0c3491753e115e1775256f6b7bd1bce4dea7cd back. Nicolas ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: heads-up: git-index-pack in "next" is broken 2006-10-17 15:39 ` Nicolas Pitre @ 2006-10-17 16:07 ` Junio C Hamano 2006-10-17 17:00 ` Nicolas Pitre 0 siblings, 1 reply; 33+ messages in thread From: Junio C Hamano @ 2006-10-17 16:07 UTC (permalink / raw) To: Nicolas Pitre; +Cc: git Nicolas Pitre <nico@cam.org> writes: > On Mon, 16 Oct 2006, Junio C Hamano wrote: > >> I'm still a bit under the weather and do not have enough >> concentration to dig into the problem tonight, but I noticed >> that something in "next", most likely the delta-base-offset >> patchset, broke git-index-pack: >> >> $ X=ec0c3491753e115e1775256f6b7bd1bce4dea7cd >> $ wget http://www.kernel.org/pub/scm/git/git.git/objects/pack/pack-$X.pack >> $ ~/git-master/bin/git-index-pack pack-$X.pack >> ec0c3491753e115e1775256f6b7bd1bce4dea7cd >> $ git-index-pack pack-$X.pack >> fatal: packfile 'pack-ec0c3491753e115e1775256f6b7bd1bce4dea7cd.pack' has unresolved deltas > > Using the tip of the "next" branch (git version 1.4.2.4.gf9fe) I just > cannot reproduce this problem at all. I always get a good index and > ec0c3491753e115e1775256f6b7bd1bce4dea7cd back. Hmph. I just got exactly the same breakage; could this be another 64-bit breakage? My breakage was on x86-64. ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: heads-up: git-index-pack in "next" is broken 2006-10-17 16:07 ` Junio C Hamano @ 2006-10-17 17:00 ` Nicolas Pitre 2006-10-17 18:11 ` Junio C Hamano 0 siblings, 1 reply; 33+ messages in thread From: Nicolas Pitre @ 2006-10-17 17:00 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On Tue, 17 Oct 2006, Junio C Hamano wrote: > Nicolas Pitre <nico@cam.org> writes: > > > On Mon, 16 Oct 2006, Junio C Hamano wrote: > > > >> I'm still a bit under the weather and do not have enough > >> concentration to dig into the problem tonight, but I noticed > >> that something in "next", most likely the delta-base-offset > >> patchset, broke git-index-pack: > >> > >> $ X=ec0c3491753e115e1775256f6b7bd1bce4dea7cd > >> $ wget http://www.kernel.org/pub/scm/git/git.git/objects/pack/pack-$X.pack > >> $ ~/git-master/bin/git-index-pack pack-$X.pack > >> ec0c3491753e115e1775256f6b7bd1bce4dea7cd > >> $ git-index-pack pack-$X.pack > >> fatal: packfile 'pack-ec0c3491753e115e1775256f6b7bd1bce4dea7cd.pack' has unresolved deltas > > > > Using the tip of the "next" branch (git version 1.4.2.4.gf9fe) I just > > cannot reproduce this problem at all. I always get a good index and > > ec0c3491753e115e1775256f6b7bd1bce4dea7cd back. > > Hmph. I just got exactly the same breakage; could this be > another 64-bit breakage? My breakage was on x86-64. I've been suspecting that since then as well. I indeed tested on i386. But reviewing the code I just can't find any obvious spot where 64-bit would be an issue, especially since your pack does not have any OFS_DELTA objects. Could you instrument the code at the end of index-pack.c:parse_pack_objects() to display how many deltas were actually resolved and how many were not? IOW is it a case of all or nothing, or is there an isolated case of corruption lurking somewhere? Nicolas ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: heads-up: git-index-pack in "next" is broken 2006-10-17 17:00 ` Nicolas Pitre @ 2006-10-17 18:11 ` Junio C Hamano 2006-10-17 18:47 ` Nicolas Pitre 0 siblings, 1 reply; 33+ messages in thread From: Junio C Hamano @ 2006-10-17 18:11 UTC (permalink / raw) To: Nicolas Pitre; +Cc: git Nicolas Pitre <nico@cam.org> writes: > On Tue, 17 Oct 2006, Junio C Hamano wrote: > >> Nicolas Pitre <nico@cam.org> writes: >> >> > On Mon, 16 Oct 2006, Junio C Hamano wrote: >> > >> >> I'm still a bit under the weather and do not have enough >> >> concentration to dig into the problem tonight, but I noticed >> >> that something in "next", most likely the delta-base-offset >> >> patchset, broke git-index-pack: >> >> >> >> $ X=ec0c3491753e115e1775256f6b7bd1bce4dea7cd >> >> $ wget http://www.kernel.org/pub/scm/git/git.git/objects/pack/pack-$X.pack >> >> $ ~/git-master/bin/git-index-pack pack-$X.pack >> >> ec0c3491753e115e1775256f6b7bd1bce4dea7cd >> >> $ git-index-pack pack-$X.pack >> >> fatal: packfile 'pack-ec0c3491753e115e1775256f6b7bd1bce4dea7cd.pack' has unresolved deltas >> > >> > Using the tip of the "next" branch (git version 1.4.2.4.gf9fe) I just >> > cannot reproduce this problem at all. I always get a good index and >> > ec0c3491753e115e1775256f6b7bd1bce4dea7cd back. >> >> Hmph. I just got exactly the same breakage; could this be >> another 64-bit breakage? My breakage was on x86-64. > > I've been suspecting that since then as well. I indeed tested on i386. > But reviewing the code I just can't find any obvious spot where 64-bit > would be an issue, especially since your pack does not have any > OFS_DELTA objects. > > Could you instrument the code at the end of > index-pack.c:parse_pack_objects() to display how many deltas were > actually resolved and how many were not? IOW is it a case of all or > nothing, or is there an isolated case of corruption lurking somewhere? fatal: packfile 'pack-ec0c3491753e115e1775256f6b7bd1bce4dea7cd.pack' has 18915 unresolved ref-deltas and 0 ofs-deltas among 21205 By the way, "Gaaaah". Is this find_delta() called from find_delta_children() doing the right thing? I wonder if this is open to accidental collisions?. If you have an object name whose last 12-bytes are all NUL and you have a pack offset whose bytes happens to be a good prefix for an object, what happens? ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: heads-up: git-index-pack in "next" is broken 2006-10-17 18:11 ` Junio C Hamano @ 2006-10-17 18:47 ` Nicolas Pitre 2006-10-17 19:36 ` Sergey Vlasov 0 siblings, 1 reply; 33+ messages in thread From: Nicolas Pitre @ 2006-10-17 18:47 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On Tue, 17 Oct 2006, Junio C Hamano wrote: > Nicolas Pitre <nico@cam.org> writes: > > > Could you instrument the code at the end of > > index-pack.c:parse_pack_objects() to display how many deltas were > > actually resolved and how many were not? IOW is it a case of all or > > nothing, or is there an isolated case of corruption lurking somewhere? > > fatal: packfile 'pack-ec0c3491753e115e1775256f6b7bd1bce4dea7cd.pack' has > 18915 unresolved ref-deltas and 0 ofs-deltas among 21205 Hmmm.... Interesting. Is it possible that sizeof(union delta_base) might not be equal to 20 for you? > By the way, "Gaaaah". Is this find_delta() called from > find_delta_children() doing the right thing? I wonder if this > is open to accidental collisions?. If you have an object name > whose last 12-bytes are all NUL and you have a pack offset whose > bytes happens to be a good prefix for an object, what happens? It is filtered out later thusly: ... for (j = ref_first; j <= ref_last; j++) if (deltas[j].obj->type == OBJ_REF_DELTA) resolve_delta(&deltas[j], data, ...); So if a collision happens the object won't be of the right type and it is simply skipped. Nicolas ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: heads-up: git-index-pack in "next" is broken 2006-10-17 18:47 ` Nicolas Pitre @ 2006-10-17 19:36 ` Sergey Vlasov 2006-10-17 20:10 ` Junio C Hamano 2006-10-17 20:23 ` Nicolas Pitre 0 siblings, 2 replies; 33+ messages in thread From: Sergey Vlasov @ 2006-10-17 19:36 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Junio C Hamano, git [-- Attachment #1: Type: text/plain, Size: 1054 bytes --] On Tue, 17 Oct 2006 14:47:16 -0400 (EDT) Nicolas Pitre wrote: > On Tue, 17 Oct 2006, Junio C Hamano wrote: > > > Nicolas Pitre <nico@cam.org> writes: > > > > > Could you instrument the code at the end of > > > index-pack.c:parse_pack_objects() to display how many deltas were > > > actually resolved and how many were not? IOW is it a case of all or > > > nothing, or is there an isolated case of corruption lurking somewhere? > > > > fatal: packfile 'pack-ec0c3491753e115e1775256f6b7bd1bce4dea7cd.pack' has > > 18915 unresolved ref-deltas and 0 ofs-deltas among 21205 > > Hmmm.... Interesting. > > Is it possible that sizeof(union delta_base) might not be equal to 20 > for you? Yes, on x86_64 this is 24 because of 8-byte alignment for longs: $ echo 'unsigned x = sizeof(union{unsigned char sha1[20]; long offset;});' | gcc -S -x c -m64 -o - - | grep long .long 24 $ echo 'unsigned x = sizeof(union{unsigned char sha1[20]; long offset;});' | gcc -S -x c -m32 -o - - | grep long .long 20 [-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: heads-up: git-index-pack in "next" is broken 2006-10-17 19:36 ` Sergey Vlasov @ 2006-10-17 20:10 ` Junio C Hamano 2006-10-17 20:25 ` Nicolas Pitre 2006-10-17 20:23 ` Nicolas Pitre 1 sibling, 1 reply; 33+ messages in thread From: Junio C Hamano @ 2006-10-17 20:10 UTC (permalink / raw) To: Nicolas Pitre; +Cc: git, Sergey Vlasov Sergey Vlasov <vsu@altlinux.ru> writes: > On Tue, 17 Oct 2006 14:47:16 -0400 (EDT) Nicolas Pitre wrote: > >> Is it possible that sizeof(union delta_base) might not be equal to 20 >> for you? > > Yes, on x86_64 this is 24 because of 8-byte alignment for longs: Enough eyeballs made this bug shallow ;-) Thanks. diff --git a/index-pack.c b/index-pack.c index fffddd2..49b6efe 100644 --- a/index-pack.c +++ b/index-pack.c @@ -166,6 +166,7 @@ static void *unpack_raw_entry(unsigned l case OBJ_REF_DELTA: if (pos + 20 >= pack_limit) bad_object(offset, "object extends past end of pack"); + memset(delta_base, 0, sizeof(*delta_base)); hashcpy(delta_base->sha1, pack_base + pos); pos += 20; break; @@ -290,6 +291,7 @@ static void resolve_delta(struct delta_e bad_object(obj->offset, "failed to apply delta"); sha1_object(result, result_size, type, obj->sha1); + memset(&delta_base, 0, sizeof(delta_base)); hashcpy(delta_base.sha1, obj->sha1); if (!find_delta_childs(&delta_base, &first, &last)) { for (j = first; j <= last; j++) @@ -365,6 +367,7 @@ static void parse_pack_objects(void) if (obj->type == OBJ_REF_DELTA || obj->type == OBJ_OFS_DELTA) continue; + memset(&base, 0, sizeof(base)); hashcpy(base.sha1, obj->sha1); ref = !find_delta_childs(&base, &ref_first, &ref_last); memset(&base, 0, sizeof(base)); ^ permalink raw reply related [flat|nested] 33+ messages in thread
* Re: heads-up: git-index-pack in "next" is broken 2006-10-17 20:10 ` Junio C Hamano @ 2006-10-17 20:25 ` Nicolas Pitre 0 siblings, 0 replies; 33+ messages in thread From: Nicolas Pitre @ 2006-10-17 20:25 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Sergey Vlasov On Tue, 17 Oct 2006, Junio C Hamano wrote: > Sergey Vlasov <vsu@altlinux.ru> writes: > > > On Tue, 17 Oct 2006 14:47:16 -0400 (EDT) Nicolas Pitre wrote: > > > >> Is it possible that sizeof(union delta_base) might not be equal to 20 > >> for you? > > > > Yes, on x86_64 this is 24 because of 8-byte alignment for longs: > > Enough eyeballs made this bug shallow ;-) Thanks. > > diff --git a/index-pack.c b/index-pack.c > index fffddd2..49b6efe 100644 > --- a/index-pack.c > +++ b/index-pack.c > @@ -166,6 +166,7 @@ static void *unpack_raw_entry(unsigned l > case OBJ_REF_DELTA: > if (pos + 20 >= pack_limit) > bad_object(offset, "object extends past end of pack"); > + memset(delta_base, 0, sizeof(*delta_base)); > hashcpy(delta_base->sha1, pack_base + pos); Yeah.... but this is a bit wasteful since (especially on 32-bit archs) you just write to the same memory twice in a row. Nicolas ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: heads-up: git-index-pack in "next" is broken 2006-10-17 19:36 ` Sergey Vlasov 2006-10-17 20:10 ` Junio C Hamano @ 2006-10-17 20:23 ` Nicolas Pitre 2006-10-17 20:51 ` Linus Torvalds 1 sibling, 1 reply; 33+ messages in thread From: Nicolas Pitre @ 2006-10-17 20:23 UTC (permalink / raw) To: Sergey Vlasov; +Cc: Junio C Hamano, git On Tue, 17 Oct 2006, Sergey Vlasov wrote: > On Tue, 17 Oct 2006 14:47:16 -0400 (EDT) Nicolas Pitre wrote: > > > Is it possible that sizeof(union delta_base) might not be equal to 20 > > for you? > > Yes, on x86_64 this is 24 because of 8-byte alignment for longs: Ah bummer. Then this is most likely the cause. And here's a simple fix (Junio please confirm): diff --git a/index-pack.c b/index-pack.c index fffddd2..56c590e 100644 --- a/index-pack.c +++ b/index-pack.c @@ -23,6 +23,12 @@ union delta_base { unsigned long offset; }; +/* + * Even if sizeof(union delta_base) == 24 on 64-bit archs, we really want + * to memcmp() only the first 20 bytes. + */ +#define UNION_BASE_SZ 20 + struct delta_entry { struct object_entry *obj; @@ -211,7 +217,7 @@ static int find_delta(const union delta_ struct delta_entry *delta = &deltas[next]; int cmp; - cmp = memcmp(base, &delta->base, sizeof(*base)); + cmp = memcmp(base, &delta->base, UNION_BASE_SZ); if (!cmp) return next; if (cmp < 0) { @@ -232,9 +238,9 @@ static int find_delta_childs(const union if (first < 0) return -1; - while (first > 0 && !memcmp(&deltas[first - 1].base, base, sizeof(*base))) + while (first > 0 && !memcmp(&deltas[first - 1].base, base, UNION_BASE_SZ)) --first; - while (last < end && !memcmp(&deltas[last + 1].base, base, sizeof(*base))) + while (last < end && !memcmp(&deltas[last + 1].base, base, UNION_BASE_SZ)) ++last; *first_index = first; *last_index = last; @@ -312,7 +318,7 @@ static int compare_delta_entry(const voi { const struct delta_entry *delta_a = a; const struct delta_entry *delta_b = b; - return memcmp(&delta_a->base, &delta_b->base, sizeof(union delta_base)); + return memcmp(&delta_a->base, &delta_b->base, UNION_BASE_SZ); } static void parse_pack_objects(void) Nicolas ^ permalink raw reply related [flat|nested] 33+ messages in thread
* Re: heads-up: git-index-pack in "next" is broken 2006-10-17 20:23 ` Nicolas Pitre @ 2006-10-17 20:51 ` Linus Torvalds 2006-10-17 21:21 ` Nicolas Pitre 2006-10-17 21:54 ` Junio C Hamano 0 siblings, 2 replies; 33+ messages in thread From: Linus Torvalds @ 2006-10-17 20:51 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Sergey Vlasov, Junio C Hamano, git On Tue, 17 Oct 2006, Nicolas Pitre wrote: > On Tue, 17 Oct 2006, Sergey Vlasov wrote: > > > > Yes, on x86_64 this is 24 because of 8-byte alignment for longs: > > Ah bummer. Then this is most likely the cause. And here's a simple > fix (Junio please confirm): Why do you use "unsigned long" in the first place? For some structure like this, it sounds positively wrong. Pack-files should be architecture-neutral, which means that they shouldn't depend on word-size, and they should be in some neutral byte-order. Quite frankly, this all makes me go "Eww..". The original pack-file (well, v2) format was well-defined and had none of these issues. In contrast, the new code in 'next' is just _ugly_. And the thing about "ugly" is that it also tends to mean "fragile" and "buggy" and "hard to extend later". And maybe it's just me, but I consider unions to be bug-prone on their own. The "master" branch has exactly two unions: the "grep_expr" structure contains one (where the union member is clearly defined by the node type in that structure), and object.c has a "union any_object" that _literally_ exists as purely an allocation size issue (ie it is used _only_ to allocate the maximum size of any of the possible structures). In contrast, the new union introduced in "next" is just horrid. There's not even any way to know which member to use, except apparently that it expects that a SHA1 is never zero in the last 12 bytes. Which is probably true, but still - that's some ugly stuff. Is this something you want to bet a big project on? Linus ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: heads-up: git-index-pack in "next" is broken 2006-10-17 20:51 ` Linus Torvalds @ 2006-10-17 21:21 ` Nicolas Pitre 2006-10-17 21:46 ` Linus Torvalds 2006-10-17 21:54 ` Junio C Hamano 1 sibling, 1 reply; 33+ messages in thread From: Nicolas Pitre @ 2006-10-17 21:21 UTC (permalink / raw) To: Linus Torvalds; +Cc: Sergey Vlasov, Junio C Hamano, git On Tue, 17 Oct 2006, Linus Torvalds wrote: > > > On Tue, 17 Oct 2006, Nicolas Pitre wrote: > > On Tue, 17 Oct 2006, Sergey Vlasov wrote: > > > > > > Yes, on x86_64 this is 24 because of 8-byte alignment for longs: > > > > Ah bummer. Then this is most likely the cause. And here's a simple > > fix (Junio please confirm): > > Why do you use "unsigned long" in the first place? Because offsets into packs are expressed as unsigned long everywhere else (except in the current pack index on-disk format). > For some structure like this, it sounds positively wrong. Pack-files > should be architecture-neutral, which means that they shouldn't depend on > word-size, and they should be in some neutral byte-order. But they do. Please consider this code: case OBJ_OFS_DELTA: memset(delta_base, 0, sizeof(*delta_base)); c = pack_base[pos++]; base_offset = c & 127; while (c & 128) { base_offset += 1; if (!base_offset || base_offset & ~(~0UL >> 7)) bad_object(offset, "offset value overflow for delta base object"); if (pos >= pack_limit) bad_object(offset, "object extends past end of pack"); c = pack_base[pos++]; base_offset = (base_offset << 7) + (c & 127); } delta_base->offset = offset - base_offset; if (delta_base->offset >= offset) bad_object(offset, "delta base offset is out of bound"); break; Do you see anything inerently wrong in this code? The above is already 64-bit ready such that it'll just work on 64-bit archs and will display a sensible message if a 32-bit arch encounter a pack larger than 4GB. But the on-disk pack format has no limitation what so ever. > Quite frankly, this all makes me go "Eww..". The original pack-file (well, > v2) format was well-defined and had none of these issues. In contrast, the > new code in 'next' is just _ugly_. I beg to differ. Please reconsider in light of the above. > And maybe it's just me, but I consider unions to be bug-prone on their > own. The "master" branch has exactly two unions: the "grep_expr" structure > contains one (where the union member is clearly defined by the node type > in that structure), and object.c has a "union any_object" that _literally_ > exists as purely an allocation size issue (ie it is used _only_ to > allocate the maximum size of any of the possible structures). > > In contrast, the new union introduced in "next" is just horrid. There's > not even any way to know which member to use, except apparently that it > expects that a SHA1 is never zero in the last 12 bytes. Which is probably > true, but still - that's some ugly stuff. This union should be looked at just like a sortable hash pointing to a base object so that deltas with the same base object can be sorted together. And the field to use is well defined of course: deltas with sha1 to base use the sha1 member, deltas with offset to base use the offset member. This hash, together with the delta type, constitute a tuple guaranteed to be unique so there can't be any confusion. > Is this something you want to bet a big project on? I don't see why not. Nicolas ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: heads-up: git-index-pack in "next" is broken 2006-10-17 21:21 ` Nicolas Pitre @ 2006-10-17 21:46 ` Linus Torvalds 2006-10-18 0:20 ` Nicolas Pitre 0 siblings, 1 reply; 33+ messages in thread From: Linus Torvalds @ 2006-10-17 21:46 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Sergey Vlasov, Junio C Hamano, git On Tue, 17 Oct 2006, Nicolas Pitre wrote: > > Because offsets into packs are expressed as unsigned long everywhere > else (except in the current pack index on-disk format). Until your work, that "unsigned long" was totally just an internal thing that didn't actually bleed into anything else. > > For some structure like this, it sounds positively wrong. Pack-files > > should be architecture-neutral, which means that they shouldn't depend on > > word-size, and they should be in some neutral byte-order. > > But they do. Please consider this code: Right. The pack-file itself. But the code that actually _generates_ it mixes things in alarming ways. > > In contrast, the new union introduced in "next" is just horrid. There's > > not even any way to know which member to use, except apparently that it > > expects that a SHA1 is never zero in the last 12 bytes. Which is probably > > true, but still - that's some ugly stuff. > > This union should be looked at just like a sortable hash pointing to a > base object so that deltas with the same base object can be sorted > together. .. and it sorts _differently_ on a big-endian vs little-endian thing, doesn't it? So now the sort order depends on endianness and/or wordsize. That just sounds really really wrong. Linus ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: heads-up: git-index-pack in "next" is broken 2006-10-17 21:46 ` Linus Torvalds @ 2006-10-18 0:20 ` Nicolas Pitre 2006-10-18 0:57 ` Linus Torvalds 2006-10-18 1:30 ` Junio C Hamano 0 siblings, 2 replies; 33+ messages in thread From: Nicolas Pitre @ 2006-10-18 0:20 UTC (permalink / raw) To: Linus Torvalds; +Cc: Sergey Vlasov, Junio C Hamano, git On Tue, 17 Oct 2006, Linus Torvalds wrote: > > > On Tue, 17 Oct 2006, Nicolas Pitre wrote: > > > > Because offsets into packs are expressed as unsigned long everywhere > > else (except in the current pack index on-disk format). > > Until your work, that "unsigned long" was totally just an internal thing > that didn't actually bleed into anything else. And would you please explain how my work changes that state of affairs? Sorry but I don't follow you here. Still _I_ wrote that code. > > > For some structure like this, it sounds positively wrong. Pack-files > > > should be architecture-neutral, which means that they shouldn't depend on > > > word-size, and they should be in some neutral byte-order. > > > > But they do. Please consider this code: > > Right. The pack-file itself. But the code that actually _generates_ it > mixes things in alarming ways. ??? > > > In contrast, the new union introduced in "next" is just horrid. There's > > > not even any way to know which member to use, except apparently that it > > > expects that a SHA1 is never zero in the last 12 bytes. Which is probably > > > true, but still - that's some ugly stuff. > > > > This union should be looked at just like a sortable hash pointing to a > > base object so that deltas with the same base object can be sorted > > together. > > .. and it sorts _differently_ on a big-endian vs little-endian thing, > doesn't it? Sure. But who cares? The sorting is just there to 1) perform binary searches on the list of deltas based from a given object, and 2) find a list of all deltas with the same base object. > So now the sort order depends on endianness and/or wordsize. That just > sounds really really wrong. Again, who cares? That ordering doesn't influence any data produced by the tool. It is an internal and private strategy to speed up the _local_ _searching_ process. It could be replaced by a dumb linear list walk if you wish and the end result i.e. the produced pack index would be exactly the same (with a significant slowdown notwitstanding). So let me summarize: - the union is a hash. - the hash is either an offset value or a sha1 digest. - this hash is used for fast object lookup _only_. - it does sort differently on big vs little endian machines. - but we don't care at all because - it is a private algorithmic thing that doesn't "bleed" into any _real_ data structure, and - it doesn't have any influence on the format of the end result. - it is only a runtime abstraction and nothing else. - It never gets into the pack nor the pack index themselves. Do you still have issues with that? Nicolas ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: heads-up: git-index-pack in "next" is broken 2006-10-18 0:20 ` Nicolas Pitre @ 2006-10-18 0:57 ` Linus Torvalds 2006-10-18 2:08 ` Nicolas Pitre 2006-10-18 1:30 ` Junio C Hamano 1 sibling, 1 reply; 33+ messages in thread From: Linus Torvalds @ 2006-10-18 0:57 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Sergey Vlasov, Junio C Hamano, git On Tue, 17 Oct 2006, Nicolas Pitre wrote: > > > > .. and it sorts _differently_ on a big-endian vs little-endian thing, > > doesn't it? > > Sure. But who cares? The sorting is just there to 1) perform binary > searches on the list of deltas based from a given object, and 2) find a > list of all deltas with the same base object. _I_ care. The new code is messy. It's fragile, and already showed one very fundamental bug which depended on architectures. These things matter. We have had very few bugs in git, and one of the reasons is (I believe) that we haven't had ad-hoc code. I get _very_ nervous when you mix up SHA1 names with somethign totally different without even a flag to say which one it is. That's just nasty. The fact that the code then behaves (and behave_d_) differently on different architectures is just a sign of the problem. "Who cares?" is not a good question to ask for a SCM. Linus ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: heads-up: git-index-pack in "next" is broken 2006-10-18 0:57 ` Linus Torvalds @ 2006-10-18 2:08 ` Nicolas Pitre 2006-10-18 3:12 ` Linus Torvalds 0 siblings, 1 reply; 33+ messages in thread From: Nicolas Pitre @ 2006-10-18 2:08 UTC (permalink / raw) To: Linus Torvalds; +Cc: Sergey Vlasov, Junio C Hamano, git On Tue, 17 Oct 2006, Linus Torvalds wrote: > > > On Tue, 17 Oct 2006, Nicolas Pitre wrote: > > > > > > .. and it sorts _differently_ on a big-endian vs little-endian thing, > > > doesn't it? > > > > Sure. But who cares? The sorting is just there to 1) perform binary > > searches on the list of deltas based from a given object, and 2) find a > > list of all deltas with the same base object. > > _I_ care. OK, So I do. > The new code is messy. It's fragile, and already showed one very > fundamental bug which depended on architectures. My stance is that it is not fragile. Sure it had one bug that depended on an architecture difference, but so was commit ac58c7b18e about. The code also has many consistency checks all over so that it doesn't write out garbage if such bugs arise. And that fundamental bug was actually a trivial one that was caught right away by such consistency check. > These things matter. We have had very few bugs in git, and one of the > reasons is (I believe) that we haven't had ad-hoc code. I get _very_ > nervous when you mix up SHA1 names with somethign totally different > without even a flag to say which one it is. That's just nasty. But there _is_ a flag for damn sake. Did you at least try to understand the code and not just skim over it from 10000 feet above? It is really simple: - if the found union content matches with a reference union initialized through the sha1 member then deltas[j].obj->type == OBJ_REF_DELTA must be true. - if the found union content matches with a reference union initialized through the sha1 member then deltas[j].obj->type == OBJ_OFS_DELTA must be true. - For all deltas with deltas[j].obj->type == OBJ_REF_DELTA there can not be more than one of them with the same union value. - For all deltas with deltas[j].obj->type == OBJ_OFS_DELTA there can not be more than one of them with the same union value. There is _no_ confusion possible. > The fact that the code then behaves (and behave_d_) differently on > different architectures is just a sign of the problem. Does this mean that, with your own change to xdiff that has just been committed, you actually created a "problem"? Because this is a change that creates different behaviors whether a 32-bit or 64-bit architecture is used, Right? But of course not. We want it to behave differently on 64-bit than 32-bit. My code is in the _same_ camp since I explicitly want it to sort numbers differently whether it is a little endian or big endian machine. So this is not a problem this is a feature, and very by design. > "Who cares?" is not a good question to ask for a SCM. Please just try to understand why I'm claming this is not important in this very case. Please do me this favor. Nicolas ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: heads-up: git-index-pack in "next" is broken 2006-10-18 2:08 ` Nicolas Pitre @ 2006-10-18 3:12 ` Linus Torvalds 2006-10-18 6:09 ` Davide Libenzi 0 siblings, 1 reply; 33+ messages in thread From: Linus Torvalds @ 2006-10-18 3:12 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Sergey Vlasov, Junio C Hamano, git On Tue, 17 Oct 2006, Nicolas Pitre wrote: > > But there _is_ a flag for damn sake. Did you at least try to understand > the code and not just skim over it from 10000 feet above? I only looked at it from the patches, and the actual data structure, and they didn't have it, so.. > There is _no_ confusion possible. Ok. Good. > Does this mean that, with your own change to xdiff that has just been > committed, you actually created a "problem"? Because this is a change > that creates different behaviors whether a 32-bit or 64-bit architecture > is used, Right? If you go back to that discussion, I actually pointed out several times that the whole bug _was_ actually introduced exactly because the xdiff code used things that behave differently depending on word-size. My suggestion for a _proper_ fix was to not use "unsigned long" for that, and the patch I suggested (and eventually got merged) was to use the _low_ bits of the hash, exactly because the low bits are the ones that act the same, regardless of wordsize. > But of course not. We want it to behave differently on 64-bit than > 32-bit. No, we actually don't. Not for xdiff, at least. The last thing you want is for different architectures to get different results. It's horrible. It means that bugs are hard to reproduce, and it means that even code that is "tested" is actually tested only for a particular architecture. So the bug in xdiff was _exactly_ that somebody - totally incorrectly - thought it should work "better" on 64-bits. > Please just try to understand why I'm claming this is not important in > this very case. Please do me this favor. Maybe the code is fine. Maybe the particular detail wasn't important. But the original code didn't have _any_ dependencies on things like structure alignment that caused it to do strange things. And dammit, the fact is, I think the new format is just worse. I think it was a good thing to have the full SHA1 in the pack-file. I think the code got less understandable, and had more special cases, just because now we have two totally different kinds of deltas. So maybe I'm reacting to the fact that I think the bug happened in the first place for a very simple reason: the data structure wasn't unambiguous any more. Linus ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: heads-up: git-index-pack in "next" is broken 2006-10-18 3:12 ` Linus Torvalds @ 2006-10-18 6:09 ` Davide Libenzi 2006-10-18 14:56 ` Linus Torvalds 0 siblings, 1 reply; 33+ messages in thread From: Davide Libenzi @ 2006-10-18 6:09 UTC (permalink / raw) To: Linus Torvalds; +Cc: Nicolas Pitre, Sergey Vlasov, Junio C Hamano, git On Tue, 17 Oct 2006, Linus Torvalds wrote: > > Does this mean that, with your own change to xdiff that has just been > > committed, you actually created a "problem"? Because this is a change > > that creates different behaviors whether a 32-bit or 64-bit architecture > > is used, Right? > > If you go back to that discussion, I actually pointed out several times > that the whole bug _was_ actually introduced exactly because the xdiff > code used things that behave differently depending on word-size. Ehm, I think there's a little bit of confusion. The incorrect golden ratio prime selection for 64 bits machines was coalescing hash indexes into a very limited number of buckets, hence creating very bad performance on diff operations. The result of the diff would have been exacly the same, just coming out after the time for a cup of coffee and a croissant ;) > So the bug in xdiff was _exactly_ that somebody - totally incorrectly - > thought it should work "better" on 64-bits. Haha, not exactly. I had just forgot about the incoming 64 bits world at the time. - Davide ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: heads-up: git-index-pack in "next" is broken 2006-10-18 6:09 ` Davide Libenzi @ 2006-10-18 14:56 ` Linus Torvalds 2006-10-18 16:17 ` Davide Libenzi 0 siblings, 1 reply; 33+ messages in thread From: Linus Torvalds @ 2006-10-18 14:56 UTC (permalink / raw) To: Davide Libenzi; +Cc: Nicolas Pitre, Sergey Vlasov, Junio C Hamano, git On Tue, 17 Oct 2006, Davide Libenzi wrote: > > Ehm, I think there's a little bit of confusion. The incorrect golden ratio > prime selection for 64 bits machines was coalescing hash indexes into a > very limited number of buckets, hence creating very bad performance on diff > operations. The result of the diff would have been exacly the same, just > coming out after the time for a cup of coffee and a croissant ;) But my point is, you would have been better off _without_ an algorithm that cared about the word-size at all, or with just using "uint32_t". See? Yes, a "unsigned long" has more bits for hashing on a 64-bit architecture. But that's totally the wrong way of thinking about it. YOU DO NOT WANT MORE BITS! You want the same damn answer regardless of architecture! A diff algorithm that gives different answers on a 32-bit LE architecture than on a 64-bit BE architecture is BROKEN. If I run on x86-64, I want the same answers I got on x86-32, and the same ones I get on ppc32. Anything else is SIMPLY NOT ACCEPTABLE! So the whole idea that you should have used 64-bit values was broken, broken, broken. You should never have had anything that cared, because anything that cares is by definition buggy. This is why we should use the _low_ bits. Never the high bits. Linus ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: heads-up: git-index-pack in "next" is broken 2006-10-18 14:56 ` Linus Torvalds @ 2006-10-18 16:17 ` Davide Libenzi 2006-10-18 16:52 ` Linus Torvalds 0 siblings, 1 reply; 33+ messages in thread From: Davide Libenzi @ 2006-10-18 16:17 UTC (permalink / raw) To: Linus Torvalds; +Cc: Nicolas Pitre, Sergey Vlasov, Junio C Hamano, git On Wed, 18 Oct 2006, Linus Torvalds wrote: > On Tue, 17 Oct 2006, Davide Libenzi wrote: > > > > Ehm, I think there's a little bit of confusion. The incorrect golden ratio > > prime selection for 64 bits machines was coalescing hash indexes into a > > very limited number of buckets, hence creating very bad performance on diff > > operations. The result of the diff would have been exacly the same, just > > coming out after the time for a cup of coffee and a croissant ;) > > But my point is, you would have been better off _without_ an algorithm > that cared about the word-size at all, or with just using "uint32_t". Yes. At the time I picked Knuth's hash function because it was simple and fast enough. But it needed special handling for different word sizes, exactly like kernel's hash_long() does. > A diff algorithm that gives different answers on a 32-bit LE architecture > than on a 64-bit BE architecture is BROKEN. If I run on x86-64, I want the > same answers I got on x86-32, and the same ones I get on ppc32. Anything > else is SIMPLY NOT ACCEPTABLE! Speaking in general, seen at the hash function level, of course an interface should not give different result for different word sizes or word endianess. Considering the diff algorithm as interface, as I said, the output was unaffected by the 64 bits word size. It was just very slow. - Davide ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: heads-up: git-index-pack in "next" is broken 2006-10-18 16:17 ` Davide Libenzi @ 2006-10-18 16:52 ` Linus Torvalds 2006-10-18 21:21 ` Davide Libenzi 0 siblings, 1 reply; 33+ messages in thread From: Linus Torvalds @ 2006-10-18 16:52 UTC (permalink / raw) To: Davide Libenzi; +Cc: Nicolas Pitre, Sergey Vlasov, Junio C Hamano, git On Wed, 18 Oct 2006, Davide Libenzi wrote: > > Speaking in general, seen at the hash function level, of course an interface > should not give different result for different word sizes or word endianess. > Considering the diff algorithm as interface, as I said, the output was > unaffected by the 64 bits word size. It was just very slow. Well, even the output may actually be affected, in the case of _real_ hash collisions (as opposed to just the hash _list_ collision that XDL_HASHLONG caused). So I actually think it would be better to have "uint32_t" as the hash value - because that would mean that all diffs (or, in the case of the block-algorithm, the deltas) are guaranteed to give the same results regardless of architecture. Right now, we actually generate a 64-bit hash value (BUT: for short lines, it's likely only _interesting_ in the low bits, so the high bits tend to have a very high likelihood of being zero). So hash collisions are different: on a 32-bit architecture, two lines may have the same hash, while on a 64-bit one, they are different. And together with some of the limiters we have (eg XDL_MAX_EQLIMIT) hash collisions can sometimes affect the output. Admittedly, in _practice_ this is really unlikely to affect anything (you'd get a valid diff in either case, they'd just possibly be subtly different, and the input data must be _really_ strange to even see that case), but I do think that the hash algorithm can matter. NOTE! I'm not talking about XDL_HASHLONG(), I'm talking about the xdl_hash_record() hash, which returns differently-sized hash results on 32-bit and 64-bit. And there are cases where we _only_ compare the hashes, and don't actually double-check the contents. So I think that in _practice_ you can't see differences between a 32-bit version and a 64-bit one, but the possibility is there. Using "uint32_t" instead of "unsigned long" to keep track of hashes would avoid that theoretical problem (and might actually make for better performance on 64-bit archtiectures, if only because of denser data structures and thus better cache behaviour). Linus ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: heads-up: git-index-pack in "next" is broken 2006-10-18 16:52 ` Linus Torvalds @ 2006-10-18 21:21 ` Davide Libenzi 2006-10-18 21:48 ` Linus Torvalds 0 siblings, 1 reply; 33+ messages in thread From: Davide Libenzi @ 2006-10-18 21:21 UTC (permalink / raw) To: Linus Torvalds; +Cc: Nicolas Pitre, Sergey Vlasov, Junio C Hamano, git On Wed, 18 Oct 2006, Linus Torvalds wrote: > On Wed, 18 Oct 2006, Davide Libenzi wrote: > > > > Speaking in general, seen at the hash function level, of course an interface > > should not give different result for different word sizes or word endianess. > > Considering the diff algorithm as interface, as I said, the output was > > unaffected by the 64 bits word size. It was just very slow. > > Well, even the output may actually be affected, in the case of _real_ hash > collisions (as opposed to just the hash _list_ collision that XDL_HASHLONG > caused). > > So I actually think it would be better to have "uint32_t" as the hash > value - because that would mean that all diffs (or, in the case of the > block-algorithm, the deltas) are guaranteed to give the same results > regardless of architecture. > > Right now, we actually generate a 64-bit hash value (BUT: for short lines, > it's likely only _interesting_ in the low bits, so the high bits tend to > have a very high likelihood of being zero). So hash collisions are > different: on a 32-bit architecture, two lines may have the same hash, > while on a 64-bit one, they are different. > > And together with some of the limiters we have (eg XDL_MAX_EQLIMIT) hash > collisions can sometimes affect the output. > > Admittedly, in _practice_ this is really unlikely to affect anything > (you'd get a valid diff in either case, they'd just possibly be subtly > different, and the input data must be _really_ strange to even see that > case), but I do think that the hash algorithm can matter. > > NOTE! I'm not talking about XDL_HASHLONG(), I'm talking about the > xdl_hash_record() hash, which returns differently-sized hash results on > 32-bit and 64-bit. And there are cases where we _only_ compare the hashes, > and don't actually double-check the contents. The hash value (hence the hash bucket index) simply directs you to the bucket where a real record-compare loop is performed. Collisions here means only performance loss (you don't spread buckets enough), nothing more. So the different behaviour does not apply to the diff algo. But, it would apply if the knowledge of the hash indexing would be "exported", for example inside an external file. Think about a trivial DB file where you store hash buckets on file an you want to lookup records based on the store hash layout. In that case, of course, if the hash function that generated the DB bucket layout is different from the one that you use to get the bucket index at lookup time, you're in trouble. IOW if the hash function result is not "exported" is some way, it doesn't really matter if it's an 'unsigned long' or a 'uint32_t'. In the same way you cannot export binary structures using 'int' or 'long', and expect to be compatible over different architectures. - Davide ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: heads-up: git-index-pack in "next" is broken 2006-10-18 21:21 ` Davide Libenzi @ 2006-10-18 21:48 ` Linus Torvalds 2006-10-18 22:34 ` Davide Libenzi 0 siblings, 1 reply; 33+ messages in thread From: Linus Torvalds @ 2006-10-18 21:48 UTC (permalink / raw) To: Davide Libenzi; +Cc: Nicolas Pitre, Sergey Vlasov, Junio C Hamano, git On Wed, 18 Oct 2006, Davide Libenzi wrote: > > The hash value (hence the hash bucket index) simply directs you to the > bucket where a real record-compare loop is performed. As far as I can tell not all loops do a real "record-compare" thing. Some of the hash loops _only_ look at the hash, and as such a bad hash will do more than just cause bad performance, it will actually degrade the diff itself. Isn't that what XDL_MAX_EQLIMIT effectively does? Btw, the binary delta generator doesn't seem to have this issue at all: it uses "unsigned int" for the hash values, so the xdiff delta generation will give the same exact results on 32-bit and 64-bit architectures. Or was that one of the changes by Nico? (I only looked at the git version of that code) Linus ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: heads-up: git-index-pack in "next" is broken 2006-10-18 21:48 ` Linus Torvalds @ 2006-10-18 22:34 ` Davide Libenzi 0 siblings, 0 replies; 33+ messages in thread From: Davide Libenzi @ 2006-10-18 22:34 UTC (permalink / raw) To: Linus Torvalds; +Cc: Nicolas Pitre, Sergey Vlasov, Junio C Hamano, git On Wed, 18 Oct 2006, Linus Torvalds wrote: > > > On Wed, 18 Oct 2006, Davide Libenzi wrote: > > > > The hash value (hence the hash bucket index) simply directs you to the > > bucket where a real record-compare loop is performed. > > As far as I can tell not all loops do a real "record-compare" thing. > > Some of the hash loops _only_ look at the hash, and as such a bad hash > will do more than just cause bad performance, it will actually degrade the > diff itself. Isn't that what XDL_MAX_EQLIMIT effectively does? The XDL_MAX_EQLIMIT is used to limit the search for equal records, in the record-discard phase. Note though, that at that point that "ha" value is a record-class ID (every different record/line in the input has a unique ID). Look at what xdl_classify_record() does. So in that case, XDL_HASHLONG can really simply be a bitmask. So comparing "ha" in the loop in there, does actually the right thing in any case (equal "ha" means really equal record). > Btw, the binary delta generator doesn't seem to have this issue at all: it > uses "unsigned int" for the hash values, so the xdiff delta generation > will give the same exact results on 32-bit and 64-bit architectures. > > Or was that one of the changes by Nico? (I only looked at the git version > of that code) The binary diff in libxdiff uses a chaining hash, so even in that case it wouldn't have made a difference. I think Nico changed the hash to be a coalesced hash, and in that case it does change the output. - Davide ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: heads-up: git-index-pack in "next" is broken 2006-10-18 0:20 ` Nicolas Pitre 2006-10-18 0:57 ` Linus Torvalds @ 2006-10-18 1:30 ` Junio C Hamano 2006-10-18 2:23 ` Nicolas Pitre 1 sibling, 1 reply; 33+ messages in thread From: Junio C Hamano @ 2006-10-18 1:30 UTC (permalink / raw) To: Nicolas Pitre; +Cc: git Nicolas Pitre <nico@cam.org> writes: > So let me summarize: > > - the union is a hash. > > - the hash is either an offset value or a sha1 digest. > > - this hash is used for fast object lookup _only_. > > - it does sort differently on big vs little endian machines. > > - but we don't care at all because > > - it is a private algorithmic thing that doesn't "bleed" into any > _real_ data structure, and > > - it doesn't have any influence on the format of the end result. > > - it is only a runtime abstraction and nothing else. > > - It never gets into the pack nor the pack index themselves. > > Do you still have issues with that? The part you pointed out to me about "accidental collision" still bothers me somewhat. Right now we do not produce ref-delta and ofs-delta in the same stream, but if somebody did so then it would mean a disaster to have an accidental collision of an 8-byte offset value plus 12-byte traiing NUL and another base object whose object name happens to match that pattern. I am actually Ok if we say the code assumes one stream has only ref-delta or ofs-delta and never both. But then I suspect the first pass of parse_pack_objects() should make sure that assumption holds true for the pack being inspected and barf if it is not. Also the second pass do not have to run two find_delta_childs() calls per delta object because by that time we know which kind would never appear in the packfile. By the way can we call that find_delta_children() pretty please? ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: heads-up: git-index-pack in "next" is broken 2006-10-18 1:30 ` Junio C Hamano @ 2006-10-18 2:23 ` Nicolas Pitre 2006-10-18 4:16 ` Junio C Hamano 0 siblings, 1 reply; 33+ messages in thread From: Nicolas Pitre @ 2006-10-18 2:23 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On Tue, 17 Oct 2006, Junio C Hamano wrote: > The part you pointed out to me about "accidental collision" > still bothers me somewhat. I'll try to clear them away. > Right now we do not produce ref-delta and ofs-delta in the same > stream, It is fully supported nevertheless. > but if somebody did so then it would mean a disaster to have an > accidental collision of an 8-byte offset value plus 12-byte traiing > NUL and another base object whose object name happens to match that > pattern. Not really. The only effect that would have on the sorted list of delta entries -- such sorting used to bring all deltas with the same base object contigously -- is that those deltas might not be perfectly contigous wrt their base object. This is why there is a test to skip deltas if they happen not to be of the expected type. > I am actually Ok if we say the code assumes one stream has only > ref-delta or ofs-delta and never both. I'm perfectly OK with both types completely randomized. > But then I suspect the first pass of parse_pack_objects() should > make sure that assumption holds true for the pack being > inspected and barf if it is not. This is an unnecessary restriction though. > Also the second pass do not have to run two find_delta_childs() calls > per delta object because by that time we know which kind would never > appear in the packfile. True, but the flexibility is worth having I think. It makes the thing more robust instead of less. > By the way can we call that find_delta_children() pretty please? I have no problem with that. Nicolas ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: heads-up: git-index-pack in "next" is broken 2006-10-18 2:23 ` Nicolas Pitre @ 2006-10-18 4:16 ` Junio C Hamano 2006-10-18 5:07 ` Junio C Hamano 0 siblings, 1 reply; 33+ messages in thread From: Junio C Hamano @ 2006-10-18 4:16 UTC (permalink / raw) To: Nicolas Pitre; +Cc: git Nicolas Pitre <nico@cam.org> writes: >> but if somebody did so then it would mean a disaster to have an >> accidental collision of an 8-byte offset value plus 12-byte traiing >> NUL and another base object whose object name happens to match that >> pattern. > > Not really. The only effect that would have on the sorted list of > delta entries -- such sorting used to bring all deltas with the same > base object contigously -- is that those deltas might not be perfectly > contigous wrt their base object. This is why there is a test to skip > deltas if they happen not to be of the expected type. Ah, I misread the code that uses union actually checks the type in struct delta_entry (which embeds the union). There won't be any collision problem and you support both types at the same time just fine. And your patch to compare only the first 20-bytes makes sense (assuming ulong is always shorter than 20-bytes which I think is safe to assume). ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: heads-up: git-index-pack in "next" is broken 2006-10-18 4:16 ` Junio C Hamano @ 2006-10-18 5:07 ` Junio C Hamano 2006-10-18 10:00 ` Johannes Schindelin 2006-10-18 13:02 ` Nicolas Pitre 0 siblings, 2 replies; 33+ messages in thread From: Junio C Hamano @ 2006-10-18 5:07 UTC (permalink / raw) To: Nicolas Pitre; +Cc: git, Linus Torvalds Junio C Hamano <junkio@cox.net> writes: > Ah, I misread the code that uses union actually checks the type > in struct delta_entry (which embeds the union). There won't be > any collision problem and you support both types at the same > time just fine. > > And your patch to compare only the first 20-bytes makes sense > (assuming ulong is always shorter than 20-bytes which I think is > safe to assume). Does this sound fair (the code is yours, just asking about the log message)? If we really wanted to be purist, we could run comparison with the union and obj->type as two keys, but I do not think it is worth it. -- >8 -- From: Nicolas Pitre <nico@cam.org> Date: Tue, 17 Oct 2006 16:23:26 -0400 Subject: [PATCH] index-pack: compare the first 20-bytes of the key. The "union delta_base" is a strange beast. It is a 20-byte binary blob key to search a binary searchable deltas[] array, each element of which uses it to represent its base object with either a full 20-byte SHA-1 or an offset in the pack. Which representation is used is determined by another field of the deltas[] array element, obj->type, so there is no room for confusion, as long as we make sure we compare the keys for the same type only with appropriate length. The code compared the full union with memcmp(). When storing the in-pack offset, the union was first cleared before storing an unsigned long, so comparison worked fine. On 64-bit architectures, however, the union typically is 24-byte long; the code did not clear the remaining 4-byte alignment padding when storing a full 20-byte SHA-1 representation. Using memcmp() to compare the whole union was wrong. This fixes the comparison to look at the first 20-bytes of the union, regardless of the architecture. As long as ulong is smaller than 20-bytes this works fine. Signed-off-by: Junio C Hamano <junkio@cox.net> --- index-pack.c | 14 ++++++++++---- 1 files changed, 10 insertions(+), 4 deletions(-) diff --git a/index-pack.c b/index-pack.c index fffddd2..56c590e 100644 --- a/index-pack.c +++ b/index-pack.c @@ -23,6 +23,12 @@ union delta_base { unsigned long offset; }; +/* + * Even if sizeof(union delta_base) == 24 on 64-bit archs, we really want + * to memcmp() only the first 20 bytes. + */ +#define UNION_BASE_SZ 20 + struct delta_entry { struct object_entry *obj; @@ -211,7 +217,7 @@ static int find_delta(const union delta_ struct delta_entry *delta = &deltas[next]; int cmp; - cmp = memcmp(base, &delta->base, sizeof(*base)); + cmp = memcmp(base, &delta->base, UNION_BASE_SZ); if (!cmp) return next; if (cmp < 0) { @@ -232,9 +238,9 @@ static int find_delta_childs(const union if (first < 0) return -1; - while (first > 0 && !memcmp(&deltas[first - 1].base, base, sizeof(*base))) + while (first > 0 && !memcmp(&deltas[first - 1].base, base, UNION_BASE_SZ)) --first; - while (last < end && !memcmp(&deltas[last + 1].base, base, sizeof(*base))) + while (last < end && !memcmp(&deltas[last + 1].base, base, UNION_BASE_SZ)) ++last; *first_index = first; *last_index = last; @@ -312,7 +318,7 @@ static int compare_delta_entry(const voi { const struct delta_entry *delta_a = a; const struct delta_entry *delta_b = b; - return memcmp(&delta_a->base, &delta_b->base, sizeof(union delta_base)); + return memcmp(&delta_a->base, &delta_b->base, UNION_BASE_SZ); } static void parse_pack_objects(void) -- 1.4.2.4.gf9fe ^ permalink raw reply related [flat|nested] 33+ messages in thread
* Re: heads-up: git-index-pack in "next" is broken 2006-10-18 5:07 ` Junio C Hamano @ 2006-10-18 10:00 ` Johannes Schindelin 2006-10-18 13:13 ` Nicolas Pitre 2006-10-18 13:02 ` Nicolas Pitre 1 sibling, 1 reply; 33+ messages in thread From: Johannes Schindelin @ 2006-10-18 10:00 UTC (permalink / raw) To: Junio C Hamano; +Cc: Nicolas Pitre, git, Linus Torvalds Hi, On Tue, 17 Oct 2006, Junio C Hamano wrote: > +/* > + * Even if sizeof(union delta_base) == 24 on 64-bit archs, we really want > + * to memcmp() only the first 20 bytes. > + */ > +#define UNION_BASE_SZ 20 Excuse me for joining the game, but why don't you just use the recently introduced hashcmp() for that purpose? AFAIU you do exactly that, you compare hashes. Ciao, Dscho ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: heads-up: git-index-pack in "next" is broken 2006-10-18 10:00 ` Johannes Schindelin @ 2006-10-18 13:13 ` Nicolas Pitre 0 siblings, 0 replies; 33+ messages in thread From: Nicolas Pitre @ 2006-10-18 13:13 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Junio C Hamano, git, Linus Torvalds On Wed, 18 Oct 2006, Johannes Schindelin wrote: > Hi, > > On Tue, 17 Oct 2006, Junio C Hamano wrote: > > > +/* > > + * Even if sizeof(union delta_base) == 24 on 64-bit archs, we really want > > + * to memcmp() only the first 20 bytes. > > + */ > > +#define UNION_BASE_SZ 20 > > Excuse me for joining the game, but why don't you just use the > recently introduced hashcmp() for that purpose? AFAIU you do exactly that, > you compare hashes. Yes, and that is what I did originally. But that could lead to false assumptions (and this thread already proved this code has its share of false assumption leads already). The thing is that the memory chunk that is being compared is not always the same kind of hash as usually used with hashcmp(). Throughout the code hashcmp() is always used with a 20-byte sha1 digest. In this case it can be either a 20-byte sha1 digest, or a long offset value. And by using hashcmp() I would be afraid someone else could assume the hash is always a sha1 digest which it is not. Nicolas ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: heads-up: git-index-pack in "next" is broken 2006-10-18 5:07 ` Junio C Hamano 2006-10-18 10:00 ` Johannes Schindelin @ 2006-10-18 13:02 ` Nicolas Pitre 1 sibling, 0 replies; 33+ messages in thread From: Nicolas Pitre @ 2006-10-18 13:02 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Linus Torvalds On Tue, 17 Oct 2006, Junio C Hamano wrote: > Junio C Hamano <junkio@cox.net> writes: > > > Ah, I misread the code that uses union actually checks the type > > in struct delta_entry (which embeds the union). There won't be > > any collision problem and you support both types at the same > > time just fine. > > > > And your patch to compare only the first 20-bytes makes sense > > (assuming ulong is always shorter than 20-bytes which I think is > > safe to assume). > > Does this sound fair (the code is yours, just asking about the > log message)? > > If we really wanted to be purist, we could run comparison with > the union and obj->type as two keys, but I do not think it is > worth it. > > -- >8 -- > From: Nicolas Pitre <nico@cam.org> > Date: Tue, 17 Oct 2006 16:23:26 -0400 > Subject: [PATCH] index-pack: compare the first 20-bytes of the key. > > The "union delta_base" is a strange beast. It is a 20-byte > binary blob key to search a binary searchable deltas[] array, > each element of which uses it to represent its base object with > either a full 20-byte SHA-1 or an offset in the pack. Which > representation is used is determined by another field of the > deltas[] array element, obj->type, so there is no room for > confusion, as long as we make sure we compare the keys for the > same type only with appropriate length. The code compared the > full union with memcmp(). > > When storing the in-pack offset, the union was first cleared > before storing an unsigned long, so comparison worked fine. > > On 64-bit architectures, however, the union typically is 24-byte > long; the code did not clear the remaining 4-byte alignment > padding when storing a full 20-byte SHA-1 representation. Using > memcmp() to compare the whole union was wrong. > > This fixes the comparison to look at the first 20-bytes of the > union, regardless of the architecture. As long as ulong is > smaller than 20-bytes this works fine. > > Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Nicolas Pitre <nico@cam.org> ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: heads-up: git-index-pack in "next" is broken 2006-10-17 20:51 ` Linus Torvalds 2006-10-17 21:21 ` Nicolas Pitre @ 2006-10-17 21:54 ` Junio C Hamano 2006-10-18 1:38 ` Nicolas Pitre 1 sibling, 1 reply; 33+ messages in thread From: Junio C Hamano @ 2006-10-17 21:54 UTC (permalink / raw) To: Linus Torvalds; +Cc: Nicolas Pitre, Sergey Vlasov, git Linus Torvalds <torvalds@osdl.org> writes: > On Tue, 17 Oct 2006, Nicolas Pitre wrote: >> On Tue, 17 Oct 2006, Sergey Vlasov wrote: >> > >> > Yes, on x86_64 this is 24 because of 8-byte alignment for longs: >> >> Ah bummer. Then this is most likely the cause. And here's a simple >> fix (Junio please confirm): > > Why do you use "unsigned long" in the first place? > > For some structure like this, it sounds positively wrong. Pack-files > should be architecture-neutral, which means that they shouldn't depend on > word-size, and they should be in some neutral byte-order. This is an in-core structure if I am not mistaken, and is never written out in the resulting pack file. The program is reading from .pack and building .idx that corresponds to it. I agree that it is ugly, but I do not hink using neutral byte-order in this part of the processing buys us anything in this particular case. > In contrast, the new union introduced in "next" is just horrid. There's > not even any way to know which member to use, except apparently that it > expects that a SHA1 is never zero in the last 12 bytes. Which is probably > true, but still - that's some ugly stuff. Again I agree with the ugliness objection, and I raised the "expecting no collision" issue which Nico refuted later. The code is probably safe (not just safe in practice but safe in theory as well), although that would not change the fact that it is ugly X-<. Nico, could we have an un-uglied version please? ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: heads-up: git-index-pack in "next" is broken 2006-10-17 21:54 ` Junio C Hamano @ 2006-10-18 1:38 ` Nicolas Pitre 0 siblings, 0 replies; 33+ messages in thread From: Nicolas Pitre @ 2006-10-18 1:38 UTC (permalink / raw) To: Junio C Hamano; +Cc: Linus Torvalds, Sergey Vlasov, git On Tue, 17 Oct 2006, Junio C Hamano wrote: > Linus Torvalds <torvalds@osdl.org> writes: > > > On Tue, 17 Oct 2006, Nicolas Pitre wrote: > >> On Tue, 17 Oct 2006, Sergey Vlasov wrote: > >> > > >> > Yes, on x86_64 this is 24 because of 8-byte alignment for longs: > >> > >> Ah bummer. Then this is most likely the cause. And here's a simple > >> fix (Junio please confirm): > > > > Why do you use "unsigned long" in the first place? > > > > For some structure like this, it sounds positively wrong. Pack-files > > should be architecture-neutral, which means that they shouldn't depend on > > word-size, and they should be in some neutral byte-order. > > This is an in-core structure if I am not mistaken, and is never > written out in the resulting pack file. The program is reading > from .pack and building .idx that corresponds to it. I agree > that it is ugly, but I do not hink using neutral byte-order in > this part of the processing buys us anything in this particular > case. Exact. > > In contrast, the new union introduced in "next" is just horrid. There's > > not even any way to know which member to use, except apparently that it > > expects that a SHA1 is never zero in the last 12 bytes. Which is probably > > true, but still - that's some ugly stuff. > > Again I agree with the ugliness objection, and I raised the > "expecting no collision" issue which Nico refuted later. The > code is probably safe (not just safe in practice but safe in > theory as well), although that would not change the fact that it > is ugly X-<. The problem is that I just don't see what you find "ugly" in the thing. Maybe it is just a matter of few missing comments? Please let me walk you through bits of the code. I hope you'll understand better why it is like it is after that. In parse_pack_objects() you can read this: /* * First pass: * - find locations of all objects; * - calculate SHA1 of all non-delta objects; * - remember base SHA1 for all deltas. */ So we reads through the whole pack to find where objects are located. This information is stored in the objects[] array for all objects. Entries for that array are: struct object_entry { unsigned long offset; enum object_type type; enum object_type real_type; unsigned char sha1[20]; }; ON that first pass only the offset and type can be determined for all objects. The sha1 can be determined for non delta objects only. There is a second array, deltas[], to record information about deltas as they are found: struct delta_entry { struct object_entry *obj; union delta_base base; }; The "obj" member points to the entry in the objects[] array for given delta entry. And the "base" member, as you may guess, is a reference to the base object for that delta. So here we are with that dreaded union. Now why isn't it a second struct object_entry pointer instead? Because one kind of delta allows for its base object to appear later in the pack and all we've got for identifying that base object is a sha1 reference since we cannot predict where the base object will be. With the other object type we _could_ find out about the location of the base object right away. But that would mean adding a special case for those objects through a different code path unnecessarily. Instead, the base offset is treated just like a very special sha1 value and the same array is used. So maybe the comment that says "remember base SHA1 for all deltas" could be adjusted to mention "base sha1 or offset" to be exact. Now what next: /* Sort deltas by base SHA1/offset for fast searching */ qsort(deltas, nr_deltas, sizeof(struct delta_entry), compare_delta_entry); This comment, though, is extremely accurate. We sort delta entries "for fast searching". There is no other use for the union blob. To avoid the union there could have been two separate delta arrays: one for OBJ_REF_DELTA and another for OBJ_OFS_DELTA. Then two compare_delta_entry functions whould have been needed instead of only one. But because we only use the union blob for searching given another such union blob for target, we actually don't care if the union content is a sha1 or an offset, and we don't care if that offset is 32-bit, 64-bit, little endian or big endian. All that we care about is that the delta entries are sorted according to that union blob for fast binary search. That is really all there is about it and therefore a single qsort() does the trick. Next: /* * Second pass: * - for all non-delta objects, look if it is used as a base for * deltas; * - if used as a base, uncompress the object and apply all deltas, * recursively checking if the resulting object is used as a base * for some more deltas. */ Here we want to fill out the blanks in the objects[] array for delta objects. It is of course much more efficient to start from a given base object, inflate it, and use it successively on all delta objects that depend on it. And let's do it recursively if the delta is also a base object itself while at it. So how does the dreaded union comes into play? It is simple. For each non-delta objects, we create two union blobs. One with the sha1 member, the other with the offset member. Those blobs are used as the key to search for a range of deltas that have that non-delta object for base. This has the possibility to find two ranges: one for deltas with base as sha1, and the other range for deltas with offsets as sha1. Once the delta ranges are known, the deltas are applied on that base object, the resulting object's sha1 is computed and the whole thing is repeated for those resulting objects as potential base objects for more deltas. Junio mentioned the possibility for collision between a union value that should be treated as a sha1 and a union value that should be treated as a delta base offset. Although such a collision is extremely improbable, it is still possible and if so the wrong delta type could be applied on top of the wrong base object. To avoid this issue, a test is made on the delta type before it is replayed. So if the search key was created using the sha1 union member, we make sure that it is used with OBJ_REF_DELTA objects only. And ditto for the search key created with the offset union member. This way there just can not be any mistake even if there happens to be a collision in the base object reference namespace. Finally, the objects[] array is completely populated with objects sha1 and offsets. The dreaded union is completely forgotten (the deltas array is actually freed). And only then the pack index is written out to disk. > Nico, could we have an un-uglied version please? I'm really sorry to say that I don't know how I could make it less "ugly". To me it is beautiful and efficient as is and I don't see why it needs to be done another way. If you still think otherwise then your guidance would be highly appreciated. Nicolas ^ permalink raw reply [flat|nested] 33+ messages in thread
end of thread, other threads:[~2006-10-18 22:34 UTC | newest] Thread overview: 33+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2006-10-17 4:55 heads-up: git-index-pack in "next" is broken Junio C Hamano 2006-10-17 15:39 ` Nicolas Pitre 2006-10-17 16:07 ` Junio C Hamano 2006-10-17 17:00 ` Nicolas Pitre 2006-10-17 18:11 ` Junio C Hamano 2006-10-17 18:47 ` Nicolas Pitre 2006-10-17 19:36 ` Sergey Vlasov 2006-10-17 20:10 ` Junio C Hamano 2006-10-17 20:25 ` Nicolas Pitre 2006-10-17 20:23 ` Nicolas Pitre 2006-10-17 20:51 ` Linus Torvalds 2006-10-17 21:21 ` Nicolas Pitre 2006-10-17 21:46 ` Linus Torvalds 2006-10-18 0:20 ` Nicolas Pitre 2006-10-18 0:57 ` Linus Torvalds 2006-10-18 2:08 ` Nicolas Pitre 2006-10-18 3:12 ` Linus Torvalds 2006-10-18 6:09 ` Davide Libenzi 2006-10-18 14:56 ` Linus Torvalds 2006-10-18 16:17 ` Davide Libenzi 2006-10-18 16:52 ` Linus Torvalds 2006-10-18 21:21 ` Davide Libenzi 2006-10-18 21:48 ` Linus Torvalds 2006-10-18 22:34 ` Davide Libenzi 2006-10-18 1:30 ` Junio C Hamano 2006-10-18 2:23 ` Nicolas Pitre 2006-10-18 4:16 ` Junio C Hamano 2006-10-18 5:07 ` Junio C Hamano 2006-10-18 10:00 ` Johannes Schindelin 2006-10-18 13:13 ` Nicolas Pitre 2006-10-18 13:02 ` Nicolas Pitre 2006-10-17 21:54 ` Junio C Hamano 2006-10-18 1:38 ` Nicolas Pitre
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).