Re: Lets avoid the SHA-1 term (was [doc] User Manual Suggestion)

From: Michael J Gruber <git@drmicha.warpmail.net>
To: Felipe Contreras <felipe.contreras@gmail.com>
Cc: "Björn Steinbrink" <B.Steinbrink@gmx.de>,
	"David Abrahams" <dave@boostpro.com>,
	"Michael Witten" <mfwitten@gmail.com>,
	"Jeff King" <peff@peff.net>,
	"Daniel Barkalow" <barkalow@iabervon.org>,
	"Johan Herland" <johan@herland.net>,
	git@vger.kernel.org, "J. Bruce Fields" <bfields@fieldses.org>,
	"Johannes Sixt" <j.sixt@viscovery.net>,
	"Wincent Colaiuta" <win@wincent.com>,
	"Junio C Hamano" <gitster@pobox.com>,
	"Dmitry Potapov" <dpotapov@gmail.com>
Subject: Re: Lets avoid the SHA-1 term (was [doc] User Manual Suggestion)
Date: Mon, 27 Apr 2009 14:06:25 +0200
Message-ID: <49F59FC1.5020708@drmicha.warpmail.net>
In-Reply-To: <94a0d4530904261638o6cbda368p4f3aa641505a6768@mail.gmail.com>

Felipe Contreras venit, vidit, dixit 27.04.2009 01:38:
> 2009/4/27 Björn Steinbrink <B.Steinbrink@gmx.de>:
>> On 2009.04.24 20:48:57 -0400, David Abrahams wrote:
>>>
>>> On Apr 24, 2009, at 8:01 PM, Michael Witten wrote:
>>>
>>>>> What's wrong with just calling the object name "object name"?
>>>>
>>>> What's wrong with calling the object address "object address"?
>>>
>>> Neither captures the connection to the object's contents.  I think
>>> "value ID" would be closer, but it's probably too horrible.
>>
>> I think I asked this in another mail, but I'm quite tired, so just to
>> make sure: What do you mean by "value"? I might be weird (I'm not a
>> native speaker, so I probably make funny and wrong connotations from
>> time to time), but while I can accept "content" to include the type and
>> size of the object, the term "value" makes me want to exclude those
>> pieces of meta data. So "value" somehow feels wrong to me, as the hash
>> covers those two fields.
> 
> Just to summarize.
> 
> Do you agree that SHA-1 is not the proper term to choose?
> 
> Do you agree that either 'id' or 'hash' would work fine?
> 
> Personally I think there's an advantage of choosing 'hash'; if we pick
> 'id' then the user might think that he can change the contents of the
> object while keeping the same id, if we pick 'hash' then it's obvious
> the 'id' is tied to the content and why.
> 

Apparently a branch of that thread touched the "[PATCH 0/2] Unify use of
[sha,SHA][,-]1" thread, so I'll do a cc merge, feeling entitled to
summarize the latter:

- There are two SHA-1ish things we talk about: the SHA-1 hash
algorithm/function on the one hand and git object names on the other hand.

- The object name of a file is not the SHA-1 checksum of its contents:
That's more or less obvious because there are no files in git, only
objects. The object name is the SHA-1 of a representation of an object
(which, for blobs, consists of header + content).

- There seemed to be an implicit claim that the Doc uses SHA-1 for the
algorithm and sha1/SHA1 for the object name. That claim is not supported
by the facts (see below), and such a distinction is not practical.

- The glossary defines SHA1 to be equivalent to the object name and does
not mention any other spelling.
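
To illustrate the second point above, here is a minimal sketch (not taken
from git's source) that contrasts the plain SHA-1 of some content with the
object name of the blob holding that content. It assumes the usual blob
representation, i.e. the header "blob <size>" followed by a NUL byte and
then the content. The function names are made up for this example.

```python
import hashlib


def plain_sha1(content: bytes) -> str:
    """SHA-1 checksum of the raw content, as sha1sum would compute it."""
    return hashlib.sha1(content).hexdigest()


def blob_object_name(content: bytes) -> str:
    """Object name of a blob: SHA-1 over header + content.

    Assumed representation: b"blob <size in bytes>\\0" + content.
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    data = b"hello world\n"
    # The two values differ because the second one also covers the header.
    print("plain SHA-1:     ", plain_sha1(data))
    print("blob object name:", blob_object_name(data))
```

The second value should agree with what "git hash-object" reports for a
file with the same content, which is one way to check the sketch.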

The stats (line counts for simplicity) and facts for Documentation/ are:

SHA-1: 56
Used exclusively for the object name.

SHA1: 73
Used mostly for the object name, but also for the patch-id (SHA-1
checksum of patch), in the tutorial, and pack-format, i.e. in places
where the actual hash algorithm/function is mentioned.

sha1: 102
Used all over the place, mostly for the object name and when quoting
from the source. I don't think it's used for the hash algorithm/function.

sha-1: 0
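
For anyone who wants to redo such counts, here is a rough sketch of a
case-sensitive line count per spelling. It is only an assumption about how
one might count, not necessarily how the numbers above were produced; the
function name and the directory argument are hypothetical.

```python
import os

# The four spellings compared above; matching is case-sensitive.
SPELLINGS = ("SHA-1", "SHA1", "sha1", "sha-1")


def count_lines(root: str) -> dict:
    """Count, per spelling, the lines under root that contain it.

    A line containing several spellings is counted once for each of them.
    """
    counts = {s: 0 for s in SPELLINGS}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # errors="replace" keeps the walk going on non-UTF-8 files.
            with open(path, encoding="utf-8", errors="replace") as fh:
                for line in fh:
                    for s in SPELLINGS:
                        if s in line:
                            counts[s] += 1
    return counts
```

Calling count_lines("Documentation") from the top of a git checkout would
give one number per spelling; the results depend on the revision checked
out and on which files are included.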

So, the current confusion is mostly due to the fact that 3 different
names are used for the same thing (object name) and to a much lesser
degree to the fact that the same name (SHA1) is used for 2 different
things (hash algorithm/function vs. object name).

My patch tried to lessen the confusion by giving one thing one name only
(SHA-1). It continued the tradition of identifying the object name with
the hash algorithm which is used in forming that name. I don't think it
matters much (confusion-wise) which of those 3 we choose. It would be
easy to rewrite the patch to use SHA1 or sha1 instead of SHA-1 (and I'd
be willing to), as long as the chosen spelling is used consistently.

An alternative patch would substitute most occurrences of the above by
X, X being the future term for "object name" to be agreed upon, and use,
say, SHA-1 in the very few places where the actual algorithm is
mentioned. I just don't want to bet on that agreement and patch happening.

Michael
