From: Michael J Gruber <git@drmicha.warpmail.net>
To: Felipe Contreras <felipe.contreras@gmail.com>
Cc: "Björn Steinbrink" <B.Steinbrink@gmx.de>,
"David Abrahams" <dave@boostpro.com>,
"Michael Witten" <mfwitten@gmail.com>,
"Jeff King" <peff@peff.net>,
"Daniel Barkalow" <barkalow@iabervon.org>,
"Johan Herland" <johan@herland.net>,
git@vger.kernel.org, "J. Bruce Fields" <bfields@fieldses.org>,
"Johannes Sixt" <j.sixt@viscovery.net>,
"Wincent Colaiuta" <win@wincent.com>,
"Junio C Hamano" <gitster@pobox.com>,
"Dmitry Potapov" <dpotapov@gmail.com>
Subject: Re: Lets avoid the SHA-1 term (was [doc] User Manual Suggestion)
Date: Mon, 27 Apr 2009 14:06:25 +0200
Message-ID: <49F59FC1.5020708@drmicha.warpmail.net>
In-Reply-To: <94a0d4530904261638o6cbda368p4f3aa641505a6768@mail.gmail.com>
Felipe Contreras venit, vidit, dixit 27.04.2009 01:38:
> 2009/4/27 Björn Steinbrink <B.Steinbrink@gmx.de>:
>> On 2009.04.24 20:48:57 -0400, David Abrahams wrote:
>>>
>>> On Apr 24, 2009, at 8:01 PM, Michael Witten wrote:
>>>
>>>>> What's wrong with just calling the object name "object name"?
>>>>
>>>> What's wrong with calling the object address "object address"?
>>>
>>> Neither captures the connection to the object's contents. I think
>>> "value ID" would be closer, but it's probably too horrible.
>>
>> I think I asked this in another mail, but I'm quite tired, so just to
>> make sure: What do you mean by "value"? I might be weird (I'm not a
>> native speaker, so I probably make funny and wrong connotations from
>> time to time), but while I can accept "content" to include the type and
>> size of the object, the term "value" makes me want to exclude those
>> pieces of meta data. So "value" somehow feels wrong to me, as the hash
>> covers those two fields.
>
> Just to summarize.
>
> Do you agree that SHA-1 is not the proper term to choose?
>
> Do you agree that either 'id' or 'hash' would work fine?
>
> Personally I think there's an advantage of choosing 'hash'; if we pick
> 'id' then the user might think that he can change the contents of the
> object while keeping the same id, if we pick 'hash' then it's obvious
> the 'id' is tied to the content and why.
>
Apparently a branch of that thread touched the "[PATCH 0/2] Unify use of
[sha,SHA][,-]1", so I'll do a cc merge, feeling entitled to summarize
the latter:

- There are two SHA-1ish things we talk about: the SHA-1 hash
  algorithm/function on the one hand and git object names on the other
  hand.
- The object name of a file is not the SHA-1 checksum of its contents.
  That's more or less obvious because there are no files in git, only
  objects. The object name is the SHA-1 of a representation of an object
  (which, for blobs, consists of header + content).
- There seemed to be an implicit claim that the Doc uses SHA-1 for the
  algorithm and sha1/SHA1 for the object name. That's not supported by
  the facts (see below) and is not practical.
- The glossary defines SHA1 to be equivalent to the object name and does
  not mention any other spelling.
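
To illustrate the second point, here is a minimal sketch in Python of how a blob's object name differs from the plain SHA-1 of the same bytes. It assumes git's usual blob representation for SHA-1 repositories (the word "blob", a space, the content size in decimal, a NUL byte, then the content). The function names are hypothetical and chosen only for this example.

```python
import hashlib


def blob_object_name(content: bytes) -> str:
    """Object name of a blob: SHA-1 over header + content."""
    # Header assumed here: b"blob <size in decimal>\0"
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def plain_sha1(content: bytes) -> str:
    """Plain SHA-1 checksum of the bytes, with no git header."""
    return hashlib.sha1(content).hexdigest()


if __name__ == "__main__":
    data = b"hello\n"
    print("object name:", blob_object_name(data))
    print("plain SHA-1:", plain_sha1(data))
```

The two printed values differ, which is why "SHA-1 of the file" is not an accurate description of the object name. For an empty blob, the function returns e69de29bb2d1d6434b8b29ae775ad8c2e48c5391, whereas the plain SHA-1 of empty input is da39a3ee5e6b4b0d3255bfef95601890afd80709.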

The stats (line counts for simplicity) and facts for Documentation/ are:

SHA-1: 56
  Used exclusively for the object name.
SHA1: 73
  Used mostly for the object name, but also for the patch-id (SHA-1
  checksum of patch), in the tutorial, and pack-format, i.e. in places
  where the actual hash algorithm/function is mentioned.
sha1: 102
  Used all over the place, mostly for the object name and when quoting
  from the source. I don't think it's used for the hash algorithm/function.
sha-1: 0

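
The message does not say how these line counts were produced. As one assumed approach, the sketch below counts, for each spelling, the lines that contain it as a case-sensitive substring. The names `SPELLINGS` and `count_lines_per_spelling` are hypothetical, and the result would not necessarily reproduce the numbers above.

```python
from collections import Counter
from typing import Iterable

# The four spellings tallied in the message.
SPELLINGS = ("SHA-1", "SHA1", "sha1", "sha-1")


def count_lines_per_spelling(lines: Iterable[str]) -> Counter:
    """Count lines containing each spelling (case-sensitive substring match)."""
    counts = Counter({spelling: 0 for spelling in SPELLINGS})
    for line in lines:
        for spelling in SPELLINGS:
            # A line counts once per spelling, however often it occurs.
            if spelling in line:
                counts[spelling] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "the SHA-1 of the object",
        "SHA1 and sha1 on one line",
        "see sha1_file.c",
    ]
    for spelling, n in count_lines_per_spelling(sample).items():
        print(f"{spelling}: {n}")
```

A line that uses two spellings is counted once under each, so the per-spelling totals can add up to more than the number of matching lines.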
So, the current confusion is mostly due to the fact that 3 different
names are used for the same thing (object name) and to a much lesser
degree to the fact that the same name (SHA1) is used for 2 different
things (hash algorithm/function vs. object name).

My patch tried to lessen the confusion by giving one thing one name
only (SHA-1). It continued the tradition of identifying the object name
with the hash algorithm which is used in forming that name. I don't
think it matters much (confusion-wise) which of those 3 we choose. It
would be easy to rewrite the patch to use SHA1 or sha1 instead of SHA-1
(and I'd be willing to), as long as the choice is applied consistently.

An alternative patch would substitute most occurrences of the above by
X, X being the future term for "object name" to be agreed upon, and go
for, say, SHA-1 in the very few places where the actual algorithm is
mentioned. I just don't want to bet on that agreement and patch happening.

Michael