git.vger.kernel.org archive mirror
From: Junio C Hamano <gitster@pobox.com>
To: "Shawn O. Pearce" <spearce@spearce.org>
Cc: Mike Hommey <mh@glandium.org>, David Kastrup <dak@gnu.org>,
	git@vger.kernel.org
Subject: Re: confused about preserved permissions
Date: Mon, 20 Aug 2007 19:06:27 -0700
Message-ID: <7vejhxproc.fsf@gitster.siamese.dyndns.org>
In-Reply-To: <20070821013541.GC27913@spearce.org> (Shawn O. Pearce's message of "Mon, 20 Aug 2007 21:35:41 -0400")

"Shawn O. Pearce" <spearce@spearce.org> writes:

> Mike Hommey <mh@glandium.org> wrote:
>> > >> > sprintf "%06o %s\0%s", $mode, $file, pack("H[40]", $sha1)
>> 
>> The question here was why the permissions are encoded with "%06o" while
>> the hash is packed. Anyways, it's just a boring detail.
>
> Because text format is simple and pretty much everyone understands
> it, especially when you are talking about UNIX mode/permission
> bits in octal, the name is "text", and then oh, wait, those 40
> bytes of hex is a lot of data - we'll just make that 20 bytes of
> binary instead.  :-)

That is almost true, but there is one factual error I need to
correct.

The in-tree representation of the mode is not actually "%06o" but
just "%o".  In the very early days of git, we used to have an extra
leading "0" in trees (e.g. "040000"), but that is something
modern fsck even warns about.  IOW, it is not the norm.
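
As a rough illustration of the entry layout discussed above (mode
printed with "%o", a space, the name, a NUL, then the 20-byte binary
hash), here is a minimal Python sketch.  The function name
encode_tree_entry is made up for this example and is not part of git.

```python
def encode_tree_entry(mode: int, name: bytes, sha1_hex: str) -> bytes:
    """Build one tree entry: "<octal mode> <name>\\0<20 raw hash bytes>"."""
    raw = bytes.fromhex(sha1_hex)  # 40 hex digits become 20 bytes
    if len(raw) != 20:
        raise ValueError("expected a 40-digit hex SHA-1")
    # "%o" style: octal with no zero padding, so a directory is
    # written as "40000", not "040000".
    mode_text = format(mode, "o").encode("ascii")
    return mode_text + b" " + name + b"\0" + raw


entry = encode_tree_entry(0o40000, b"dir", "00" * 20)
print(entry[:10])  # the textual part, up to and including the NUL
```

Note that the directory mode comes out as five characters, which is
the point of the "%o" versus "%06o" distinction.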

It is represented as text because we _can_ add any number of
bits to the data later if we wanted to.  Basic tree objects that
contain only the kind of data we traditionally used will
continue to work, while trees that contain (yet to be invented)
new types that are represented with longer mode bits may of
course not be read by older tools.
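
To make the extensibility argument concrete, here is a small sketch
of a reader that does not assume any particular width for the mode
field: it simply reads octal digits up to the first space.  The name
parse_tree is hypothetical, and the seven-digit mode in the usage
lines is an invented value used only to show that a longer mode
parses the same way.

```python
def parse_tree(data: bytes):
    """Split raw tree content into (mode, name, sha1_hex) tuples."""
    entries = []
    pos = 0
    while pos < len(data):
        space = data.index(b" ", pos)        # end of the textual mode
        nul = data.index(b"\0", space + 1)   # end of the name
        mode = int(data[pos:space], 8)       # any number of octal digits
        name = data[space + 1:nul]
        sha1 = data[nul + 1:nul + 21]        # fixed 20-byte binary hash
        if len(sha1) != 20:
            raise ValueError("truncated tree entry")
        entries.append((mode, name, sha1.hex()))
        pos = nul + 21
    return entries


raw = (b"40000 sub\0" + b"\x11" * 20 +
       b"1000000 future\0" + b"\x22" * 20)   # invented longer mode
for mode, name, sha1 in parse_tree(raw):
    print(format(mode, "o"), name, sha1)
```

A reader written this way would still need to know what to do with
a mode it does not recognize; the sketch only shows that the framing
itself does not break.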

On the other hand, by definition, SHA-1 is 20 bytes.  If we
wanted to be able to replace the hash function, we _could_ have used
a hashtype + length + data format (with the length represented in
either text or the "7-bit-per-byte plus stop bit" format as in the
pack format), but there was no reason to.
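
For readers unfamiliar with the "7-bit-per-byte plus stop bit" idea,
the following is a generic sketch of such a variable-length integer:
seven payload bits per byte, least significant group first, with the
high bit set on every byte except the last.  It is an illustration
of the general technique only.  The actual pack object header also
stores type bits in its first byte, which this sketch does not
reproduce, and the function names are invented for the example.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer, 7 bits per byte, low group first."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)         # high bit clear: last byte
            return bytes(out)


def decode_varint(data: bytes, pos: int = 0):
    """Return (value, position just past the encoded integer)."""
    value = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return value, pos


print(encode_varint(300).hex())
print(decode_varint(encode_varint(2 ** 70)))
```

Like the textual form, this encoding places no fixed upper bound on
the length field.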

The same logic applies to the loose object header -- length is
not "network byte order 4-byte integer" (or 8-byte), but just a
textual representation of an unsigned integer of unspecified
length.  The current code happens to use "%lu" with ulong only
because that is the largest integral type that can be used
portably and is not cumbersome to use.  On future architectures
with larger word size, we do not have to update the data
structure definition nor even the code to read and write loose
objects.  Using a blob that is longer than 2^64 bytes may or may
not be possible depending on how long your ulong is, of course.
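
As a small sketch of the loose object header described above: the
header is the object type, a space, the length as decimal text, and
a NUL, and the object name is the SHA-1 of that header followed by
the content.  The helper names below are invented for this example.
Python integers are arbitrary precision, so the parser accepts a
length of any number of digits, which mirrors the "unspecified
length" point.

```python
import hashlib


def loose_header(obj_type: str, size: int) -> bytes:
    """Header of a loose object: "<type> <decimal size>\\0"."""
    return f"{obj_type} {size}".encode("ascii") + b"\0"


def parse_loose_header(data: bytes):
    """Return (type, size, offset of the first content byte)."""
    nul = data.index(b"\0")
    obj_type, size_text = data[:nul].split(b" ", 1)
    # The size is plain decimal text, so no integer width is implied.
    return obj_type.decode("ascii"), int(size_text), nul + 1


def object_id(obj_type: str, content: bytes) -> str:
    """SHA-1 over header plus content, as a hex string."""
    header = loose_header(obj_type, len(content))
    return hashlib.sha1(header + content).hexdigest()


print(loose_header("blob", 12))
print(parse_loose_header(b"blob 18446744073709551617\0..."))
print(object_id("blob", b""))
```

The second call parses a length of 2^64 + 1 without any change to
the format, even though an implementation using a 64-bit ulong could
not hold that value.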


