From: Junio C Hamano <gitster@pobox.com>
To: Jeff King <peff@peff.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>,
	"git\@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: Migrating away from SHA-1?
Date: Tue, 12 Apr 2016 18:03:02 -0700
Message-ID: <xmqqlh4imibd.fsf@gitster.mtv.corp.google.com>
In-Reply-To: <20160412234251.GB2210@sigill.intra.peff.net> (Jeff King's message of "Tue, 12 Apr 2016 19:42:52 -0400")

Jeff King <peff@peff.net> writes:

> So a slightly nicer thing is to parameterize the algorithm for every
> object name reference. So commits look like:
>
>   tree sha256:1234abcd...
>   parent sha256:1234abcd...
>
> and so on. Of course trees don't have any space for this; they have a
> fixed length for the hash part of each record, which is basically:
>
>   <mode> <name> NUL <20-byte-sha1>
>
> So we'd probably need a "treev2" object type that gives room for an
> algorithm byte (or we'd have to try to shove it into the mode, but since
> old versions won't know the new algorithm anyway, I don't think it
> solves that much...). Or you can just define for the whole tree object
> (either implicit in its type, or in a header) that it always uses
> algorithm X.
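
To make the fixed-length constraint concrete, here is a minimal
sketch of a reader for the record layout quoted above. The names
parse_tree and make_entry are hypothetical helpers for illustration,
not git's own functions. The point is that the reader has to know the
digest width up front, because nothing in a record says how long the
hash part is or which algorithm produced it.

```python
import hashlib

SHA1_LEN = 20  # raw SHA-1 digest size in bytes; hard-wired into the format


def parse_tree(body: bytes):
    """Yield (mode, name, raw_id) for each '<mode> <name> NUL <20-byte-sha1>' record."""
    pos = 0
    while pos < len(body):
        space = body.index(b" ", pos)        # end of the octal mode
        nul = body.index(b"\0", space + 1)   # end of the entry name
        end = nul + 1 + SHA1_LEN             # the id has no length prefix or tag
        if end > len(body):
            raise ValueError("truncated tree entry")
        mode = body[pos:space].decode("ascii")
        name = body[space + 1:nul].decode("utf-8", "surrogateescape")
        yield mode, name, body[nul + 1:end]
        pos = end


def make_entry(mode: str, name: str, raw_id: bytes) -> bytes:
    """Hypothetical helper: serialize one record in the layout above."""
    return mode.encode("ascii") + b" " + name.encode("utf-8") + b"\0" + raw_id


# The object name of an empty blob: SHA-1 over the header "blob 0\0".
blob_id = hashlib.sha1(b"blob 0\0").digest()
body = make_entry("100644", "README", blob_id) + make_entry("100644", "TODO", blob_id)
entries = list(parse_tree(body))
```

A 32-byte digest written into the same layout would be misread by
this loop, which is why the quoted text talks about a new object
type or a per-tree declaration of the algorithm.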

This will hurt performance a lot during the transition period.
Today, most fine-grained commits change only a small part of the
tree, and we can cheaply avoid descending into trees that haven't
changed, because the corresponding tree objects in the pre- and
post-change trees have the same object name.  Once the same tree can
be named under more than one algorithm, we can no longer rely on
that optimization.  But we cannot avoid it.
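
The optimization in question can be sketched roughly as follows.
This is an illustration only, not git's tree-diff code: the in-memory
store, the made-up object names, and the diff_trees function are all
assumptions for the example. Each tree maps an entry name to a
(kind, object name) pair, and the walk skips any entry whose pair is
identical on both sides.

```python
def diff_trees(store, old_id, new_id, prefix="", visited=None):
    """Return paths whose object names differ, descending only where needed."""
    if visited is not None:
        visited.append(prefix or "/")   # record every tree we had to open
    old, new = store[old_id], store[new_id]
    changed = []
    for name in sorted(old.keys() | new.keys()):
        o, n = old.get(name), new.get(name)
        if o == n:
            continue  # same kind and same object name: nothing below differs
        path = prefix + name
        if o and n and o[0] == "tree" and n[0] == "tree":
            changed += diff_trees(store, o[1], n[1], path + "/", visited)
        else:
            changed.append(path)
    return changed


# Hypothetical object store; the ids are placeholders, not real hashes.
store = {
    "T_docs": {"guide.txt": ("blob", "b1")},
    "T_src_old": {"main.c": ("blob", "b2")},
    "T_src_new": {"main.c": ("blob", "b3")},
    "T_root_old": {"docs": ("tree", "T_docs"), "src": ("tree", "T_src_old")},
    "T_root_new": {"docs": ("tree", "T_docs"), "src": ("tree", "T_src_new")},
    # Same docs content, but named under a different (assumed) algorithm.
    "N_docs": {"guide.txt": ("blob", "sha256:b1")},
    "T_root_mixed": {"docs": ("tree", "N_docs"), "src": ("tree", "T_src_old")},
}

visited = []
changed = diff_trees(store, "T_root_old", "T_root_new", visited=visited)

# Across the transition the names no longer match, so the walk must
# open docs/ even though its content is unchanged.
visited_mixed = []
changed_mixed = diff_trees(store, "T_root_old", "T_root_mixed", visited=visited_mixed)
```

In the first comparison only the root and src/ are opened. In the
second, the name comparison alone cannot tell that docs/ is
unchanged; a real implementation would have to compare contents or
translate names, which is where the extra cost comes from.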

> Transitioning to that would be something like:
>
>   0. Overhaul all of the git code to handle arbitrary-sized object ids.
>
>   1. Decide on the new algorithm and implement it in git.
>
>   2. Recognize parameterized object ids in commits and tags (designing
>      format, implementing the reading side).
>
>   3. Recognize parameterized object ids somehow in trees (designing
>      format, implementing the reading side).
>
>   4. Teach the object database to index objects by the new algorithm (or
>      possibly both algorithms).
>
>   5. Add a protocol extension so that both sides can decide which
>      algorithm is being used when they talk about oids.
>
>   6. Add a config option to write references in objects using the new
>      algorithm.
>
>   7. After a while, flip the config option on. Hopefully the readers
>      from steps 1-5 have percolated to the masses by then, and it's not
>      a horrible flag day.
>
> We're basically on step 0 right now. I'm sure I'm missing some
> subtleties in there, too.
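
As a rough illustration of the reading side in step 2, the sketch
below parses "tree" and "parent" headers that may or may not carry
an algorithm prefix. The "algo:hex" spelling follows the example
quoted earlier. The function names, the table of known algorithms,
and the rule that an unprefixed id means SHA-1 are assumptions made
for the example, not a decided format; sha256 appears only because
the quoted example uses it.

```python
# Hex digest lengths for the algorithms this sketch knows about (assumed set).
HEX_LEN = {"sha1": 40, "sha256": 64}
HEX_DIGITS = set("0123456789abcdef")


def parse_oid(text, default_algo="sha1"):
    """Split 'algo:hex' (or a bare legacy hex id) into (algo, hex)."""
    algo, sep, hexpart = text.partition(":")
    if not sep:
        # No prefix: assume a legacy id in the default algorithm.
        algo, hexpart = default_algo, text
    if algo not in HEX_LEN:
        raise ValueError("unknown hash algorithm: " + algo)
    if len(hexpart) != HEX_LEN[algo] or not set(hexpart) <= HEX_DIGITS:
        raise ValueError("malformed object id: " + text)
    return algo, hexpart


def parse_commit_refs(commit_text):
    """Collect (header, algo, hex) for tree/parent lines in a commit's header block."""
    refs = []
    for line in commit_text.split("\n"):
        if line == "":
            break  # the header block ends at the first blank line
        key, _, value = line.partition(" ")
        if key in ("tree", "parent"):
            refs.append((key, *parse_oid(value)))
    return refs


# A made-up commit spanning the transition: new-style tree, legacy parent.
commit = (
    "tree sha256:" + "ab" * 32 + "\n"
    "parent " + "cd" * 20 + "\n"
    "author A U Thor <author@example.com> 0 +0000\n"
    "\n"
    "tree sha1:this line is in the message body and is ignored\n"
)
refs = parse_commit_refs(commit)
```

A reader like this accepts both kinds of name side by side, which
is what lets old and new references coexist in one commit.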

One subtlety is in step 7: "not a flag day" may not be a good thing.

There has to be a section of the history that spans the transition,
a set of commits and trees that have pointers to both kinds of
object names.  The narrower that section of the history is, the more
pleasant the result of the transition will be to use.

It is a good thing that different projects can have their own flag
days at their own pace, though, so the above observation does not
invalidate your transition plan.
