git.vger.kernel.org archive mirror
From: Nicolas Pitre <nico@cam.org>
To: Linus Torvalds <torvalds@osdl.org>
Cc: Sergey Vlasov <vsu@altlinux.ru>, Junio C Hamano <junkio@cox.net>,
	git@vger.kernel.org
Subject: Re: heads-up: git-index-pack in "next" is broken
Date: Tue, 17 Oct 2006 20:20:09 -0400 (EDT)
Message-ID: <Pine.LNX.4.64.0610171959070.1971@xanadu.home>
In-Reply-To: <Pine.LNX.4.64.0610171440080.3962@g5.osdl.org>

On Tue, 17 Oct 2006, Linus Torvalds wrote:

> 
> 
> On Tue, 17 Oct 2006, Nicolas Pitre wrote:
> > 
> > Because offsets into packs are expressed as unsigned long everywhere 
> > else (except in the current pack index on-disk format).
> 
> Until your work, that "unsigned long" was totally just an internal thing 
> that didn't actually bleed into anything else.

And would you please explain how my work changes that state of affairs?
Sorry, but I don't follow you here, even though _I_ wrote that code.

> > > For some structure like this, it sounds positively wrong. Pack-files 
> > > should be architecture-neutral, which means that they shouldn't depend on 
> > > word-size, and they should be in some neutral byte-order.
> > 
> > But they do.  Please consider this code:
> 
> Right. The pack-file itself. But the code that actually _generates_ it 
> mixes things in alarming ways.

???

> > > In contrast, the new union introduced in "next" is just horrid. There's 
> > > not even any way to know which member to use, except apparently that it 
> > > expects that a SHA1 is never zero in the last 12 bytes. Which is probably 
> > > true, but still - that's some ugly stuff.
> > 
> > This union should be looked at just like a sortable hash pointing to a 
> > base object so that deltas with the same base object can be sorted 
> > together.
> 
> .. and it sorts _differently_ on a big-endian vs little-endian thing, 
> doesn't it?

Sure.  But who cares?  The sorting is just there to 1) perform binary 
searches on the list of deltas based on a given object, and 2) find a 
list of all deltas with the same base object.
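
To make those two uses concrete, here is a minimal C sketch. It is an
illustration under assumptions, not the code under discussion: all type
and function names below (delta_base, delta_entry, set_offset_key,
set_sha1_key, sort_deltas, find_deltas) are hypothetical, and keys are
compared as 20 raw bytes so that an offset key and a SHA-1 key can share
one comparator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sort key: a base object named by digest or by pack offset. */
union delta_base {
	unsigned char sha1[20];
	unsigned long offset;
};

struct delta_entry {
	union delta_base base;  /* key identifying the base object */
	int obj_no;             /* illustrative payload: index of the delta */
};

/* Zero-fill first so the bytes past the offset compare equal. */
void set_offset_key(union delta_base *key, unsigned long offset)
{
	memset(key, 0, sizeof(*key));
	memcpy(key->sha1, &offset, sizeof(offset));
}

void set_sha1_key(union delta_base *key, const unsigned char *sha1)
{
	memset(key, 0, sizeof(*key));
	memcpy(key->sha1, sha1, 20);
}

/* Raw byte comparison; the resulting order depends on byte order. */
int key_cmp(const union delta_base *a, const union delta_base *b)
{
	return memcmp(a->sha1, b->sha1, 20);
}

int entry_cmp(const void *a, const void *b)
{
	const struct delta_entry *da = a, *db = b;
	return key_cmp(&da->base, &db->base);
}

/* Sort once so that deltas sharing a base end up adjacent. */
void sort_deltas(struct delta_entry *deltas, size_t nr)
{
	qsort(deltas, nr, sizeof(*deltas), entry_cmp);
}

/*
 * Binary-search a sorted array for the run of deltas whose key equals
 * `base`. Returns the run length; *first receives the run's start index.
 */
size_t find_deltas(const struct delta_entry *deltas, size_t nr,
		   const union delta_base *base, size_t *first)
{
	size_t lo = 0, hi = nr, end;

	assert(first != NULL);
	while (lo < hi) {               /* lower bound */
		size_t mid = lo + (hi - lo) / 2;
		if (key_cmp(&deltas[mid].base, base) < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	*first = lo;
	for (end = lo; end < nr && !key_cmp(&deltas[end].base, base); end++)
		;                       /* extend over equal keys */
	return end - lo;
}
```

A caller would fill the array, call sort_deltas() once, and then call
find_deltas() for each base object. Which of two different bases sorts
first never reaches the caller; only equality of keys does.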

> So now the sort order depends on endianness and/or wordsize. That just 
> sounds really really wrong.

Again, who cares?  That ordering doesn't influence any data produced by 
the tool.  It is an internal and private strategy to speed up the 
_local_ _searching_ process.  It could be replaced by a dumb linear 
list walk if you wish, and the end result (i.e. the produced pack index) 
would be exactly the same, a significant slowdown notwithstanding.
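
The equivalence claimed here can be checked with a small C sketch. It
rests on simplifying assumptions: deltas are keyed by a plain offset
only, and every name in it (struct delta, cmp_numeric, cmp_bytes,
count_linear, count_sorted) is hypothetical. The two comparators stand
in for the different sort orders one might get on machines with
different byte order.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative stand-in: a delta keyed only by its base's offset. */
struct delta {
	unsigned long base_offset;
	int obj_no;
};

/* Order 1: numeric order of the offset. */
int cmp_numeric(const void *a, const void *b)
{
	unsigned long x = ((const struct delta *)a)->base_offset;
	unsigned long y = ((const struct delta *)b)->base_offset;
	return (x > y) - (x < y);
}

/* Order 2: raw in-memory byte order, which varies with endianness. */
int cmp_bytes(const void *a, const void *b)
{
	return memcmp(&((const struct delta *)a)->base_offset,
		      &((const struct delta *)b)->base_offset,
		      sizeof(unsigned long));
}

/* Baseline: count deltas on `base` with a dumb linear walk. */
size_t count_linear(const struct delta *d, size_t nr, unsigned long base)
{
	size_t i, n = 0;

	for (i = 0; i < nr; i++)
		if (d[i].base_offset == base)
			n++;
	return n;
}

/* Count deltas on `base` in an array already sorted with `cmp`. */
size_t count_sorted(const struct delta *d, size_t nr, unsigned long base,
		    int (*cmp)(const void *, const void *))
{
	struct delta key = { base, -1 };
	const struct delta *hit, *lo, *hi;

	assert(cmp != NULL);
	hit = bsearch(&key, d, nr, sizeof(*d), cmp);
	if (!hit)
		return 0;
	/* Equal keys are adjacent under any total order; widen both ways. */
	for (lo = hit; lo > d && !cmp(lo - 1, &key); lo--)
		;
	for (hi = hit; hi + 1 < d + nr && !cmp(hi + 1, &key); hi++)
		;
	return (size_t)(hi - lo) + 1;
}
```

Sorting one copy of the list with cmp_numeric and another with
cmp_bytes may arrange the entries differently, yet count_sorted()
agrees with count_linear() for every base in both cases. The order is
only a search aid; the answers do not depend on it.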

So let me summarize:

 - the union is a hash.

 - the hash is either an offset value or a sha1 digest.

 - this hash is used for fast object lookup _only_.

 - it does sort differently on big vs little endian machines.

 - but we don't care at all because:

    - it is a private algorithmic thing that doesn't "bleed" into any 
      _real_ data structure,

    - it doesn't have any influence on the format of the end result,

    - it is only a runtime abstraction and nothing else, and

    - it never gets into the pack nor the pack index themselves.

Do you still have issues with that?


Nicolas
