git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Michael Haggerty <mhagger@alum.mit.edu>
To: "brian m. carlson" <sandals@crustytoothpaste.net>, git@vger.kernel.org
Subject: Re: [RFC PATCH 0/9] Use a structure for object IDs.
Date: Sun, 04 May 2014 08:35:00 +0200	[thread overview]
Message-ID: <5365DF94.9060707@alum.mit.edu> (raw)
In-Reply-To: <1399147942-165308-1-git-send-email-sandals@crustytoothpaste.net>

On 05/03/2014 10:12 PM, brian m. carlson wrote:
> This is a preliminary RFC patch series to move all the relevant uses of
> unsigned char [20] to struct object_id.  It should not be applied to any
> branch yet.
> 
> The goal of this series to improve type-checking in the codebase and to
> make it easier to move to a different hash function if the project
> decides to do that.  This series does not convert all of the codebase,
> but only parts.  I'm looking for feedback to see if there is consensus
> that this is the right direction before investing a large amount of
> time.
> 
> Certain parts of the code have to be converted before others to keep the
> patch sizes small, maintainable, and bisectable, so functions and
> structures that are used across the codebase (e.g. hashcmp and struct
> object) will be converted later.  Conversion has been done in a roughly
> alphabetical order by name of file.
> 
> The constants for raw and hex sizes of SHA-1 values are maintained.
> These constants are used where the quantity is the size of an SHA-1
> value, and sizeof(struct object_id) is used wherever memory is to be
> allocated.  This is done to permit the struct to turn into a union later
> if multiple hashes are supported.  I left the names at GIT_OID_RAWSZ and
> GIT_OID_HEXSZ because that's what libgit2 uses and what Junio seemed to
> prefer, but they can be changed later if there's a desire to do that.
> 
> I called the structure member "oid" because it was easily grepable and
> distinct from the rest of the codebase.  It, too, can be changed if we
> decide on a better name.  I specifically did not choose "sha1" since it
> looks weird to have "sha1->sha1" and I didn't want to rename lots of
> variables.

That means that we will have sha1->oid all over the place, right?
That's unfortunate, because it is exactly backwards from what we would
want in a hypothetical future where OIDs are not necessarily SHA-1s.  In
that future we would certainly have to support SHA-1s in parallel with
the new hash.  So (in that hypothetical future) we will probably want
these expressions to look like oid->sha1, to allow, say, a second struct
or union field oid->sha256 [1].

If that future would come to pass, then we would also want to have
distinct constants like GIT_SHA1_RAWSZ and GIT_SHA256_RAWSZ rather than
the generically-named GIT_OID_RAWSZ.

I think that this patch series will improve the code clarity and type
safety independent of thoughts about supporting different hash
algorithms, so I'm not objecting to your naming decision.  But *if* such
support is part of your long-term hope, then you might ease the future
transition by choosing different names now.

(Maybe renaming local variables "sha1 -> oid" might be a handy way of
making clear which code has been converted to the new style.)

Just to be clear, the above are just some random thoughts for your
consideration, but feel free to disregard them.

In any case, it sure will be a lot of code churn.  If you succeed in
this project, then "git blame" will probably consider you the author of
about 2/3 of git :-)

Michael

[1] I'm certainly not advocating that we want to support a different
hash, let alone that that hash should be SHA-256; these examples are
just for illustration.

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

  parent reply	other threads:[~2014-05-04  6:35 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-05-03 20:12 [RFC PATCH 0/9] Use a structure for object IDs brian m. carlson
2014-05-03 20:12 ` [PATCH 1/9] Define " brian m. carlson
2014-05-04  6:07   ` Michael Haggerty
2014-05-04  9:21     ` Johannes Sixt
2014-05-04  9:43       ` David Kastrup
2014-05-04 10:55       ` Andreas Schwab
2014-05-04 20:18         ` Johannes Sixt
2014-05-04 21:31           ` Andreas Schwab
2014-05-05  5:19             ` David Kastrup
2014-05-05  9:23               ` Andreas Schwab
2014-05-05  9:33                 ` James Denholm
2014-05-05  9:50                 ` David Kastrup
2014-05-05 10:52                   ` Michael Haggerty
2014-05-05 11:05                   ` Andreas Schwab
2014-05-05 11:23                     ` David Kastrup
2014-05-05 18:16                     ` Felipe Contreras
2014-05-04 12:29     ` Duy Nguyen
2014-05-04 16:07     ` brian m. carlson
2014-05-04 16:48       ` Andreas Schwab
2014-05-04 17:07         ` David Kastrup
2014-05-04 17:24           ` Andreas Schwab
2014-05-04 17:44             ` David Kastrup
2014-05-04 18:01               ` Andreas Schwab
2014-05-03 20:12 ` [PATCH 2/9] bisect.c: convert to use struct object_id brian m. carlson
2014-05-03 20:12 ` [PATCH 3/9] archive.c: " brian m. carlson
2014-05-03 20:12 ` [PATCH 4/9] zip: use GIT_OID_HEXSZ for trailers brian m. carlson
2014-05-03 20:12 ` [PATCH 5/9] branch.c: convert to use struct object_id brian m. carlson
2014-05-03 20:12 ` [PATCH 6/9] bulk-checkin.c: " brian m. carlson
2014-05-03 20:12 ` [PATCH 7/9] bundle.c: convert leaf functions to " brian m. carlson
2014-05-06 14:42   ` Michael Haggerty
2014-05-03 20:12 ` [PATCH 8/9] cache-tree: convert struct cache_tree to use object_id brian m. carlson
2014-05-06 14:53   ` Michael Haggerty
2014-05-06 15:02   ` Michael Haggerty
2014-05-03 20:12 ` [PATCH 9/9] diff: convert struct combine_diff_path to object_id brian m. carlson
2014-05-06 15:08   ` Michael Haggerty
2014-05-03 22:49 ` [RFC PATCH 0/9] Use a structure for object IDs brian m. carlson
2014-05-04  6:35 ` Michael Haggerty [this message]
2014-05-04  9:19   ` Johannes Sixt
2014-05-04 17:54   ` brian m. carlson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5365DF94.9060707@alum.mit.edu \
    --to=mhagger@alum.mit.edu \
    --cc=git@vger.kernel.org \
    --cc=sandals@crustytoothpaste.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).