git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Johan Herland <johan@herland.net>,
	Jon Seymour <jon.seymour@gmail.com>,
	git@vger.kernel.org
Subject: Re: A generalization of git notes from blobs to trees - git metadata?
Date: Sun, 7 Feb 2010 00:02:55 -0500	[thread overview]
Message-ID: <20100207050255.GA17049@coredump.intra.peff.net> (raw)
In-Reply-To: <7v1vgxlr9q.fsf@alter.siamese.dyndns.org>

On Sat, Feb 06, 2010 at 06:21:37PM -0800, Junio C Hamano wrote:

> Johan Herland <johan@herland.net> writes:
> 
> > Furthermore, although we currently assume that all note objects are blobs, 
> > someone (who?) has already suggested (as mentioned in the notes TODO list) 
> > that a note object could also be a _tree_ object that can be unpacked/read 
> > to reveal further "sub-notes".
> 
> I would advice you not to go there.  How would you even _merge_ such a
> thing with other notes attached to the same object?  What determines the
> path in that tree object?
> 
> Clueless ones can freely make misguided suggestions without thinking
> things through and make things unnecessarily complex without real gain.
> You do not have to listen to every one of them.

I think I may have been the one to suggest trees or notes at one point.
But let me clarify that this is not exactly what the OP is proposing in
this thread.

My suggestion was that some use cases may have many key/value pairs of
notes for a single sha1. We basically have two options:

  1. store each in a separate notes ref, with each sha1 mapping to
     a blob. The note "name" is the name of the ref.

  2. store notes in a single notes ref, with each sha1 mapping to a
     tree with named sub-notes. The note "name" is the combination of
     ref-name and tree entry name.

The advantage of (1) is that notes are not bound tightly to each other.
I can distribute the notes tree for one "name" independent of the
others.  The advantage of (2) is that it is faster and smaller. In (1),
each note has a separate index, and we must traverse each note index
separately.

In practice, I would expect to use (1) for logically separate datasets.
For example, automatic bug-tracking notes would go in a different ref
from human annotations. But I would expect to use (2) if I had, say, 5
different pieces of bug tracking information and I wanted an easy way to
refer to them individually.

And a specialized merge for that is straightforward. In the simplest
case, you simply say "notes of this ref are tree-type, or they are
blob-type" and then you have no merge problems. But if you want to get
fancy, you can say that a conflict between "sha1/blob" and
"sha1/tree/key" should automatically "promote" the first one into
"sha1/tree/default" or some other canonical name.

Note that all of this is my pie-in-the-sky "here is what I was thinking
of when I looked at notes a long time ago". I don't care strongly if it
gets implemented or not at this point; I just wanted to add some context
to what Johan had in his notes todo list (or maybe I am wrong, and what
is in his todo list was based on something totally different said by
somebody else, and I have just confused the issue more. :) ).

With respect to the idea of storing an arbitrary tree, I agree it is
probably too complex with respect to merging. In addition, it makes
things like "git log --format=%N" confusing. I think you would do better
to simply store a tree sha1 inside the note blob, and callers who were
interested in the tree contents could then dereference it and examine as
they saw fit.  The only caveat is that you need some way of telling git
that the referenced trees are reachable and not to be pruned.

-Peff

  reply	other threads:[~2010-02-07  5:03 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-02-06 13:32 A generalization of git notes from blobs to trees - git metadata? Jon Seymour
2010-02-07  1:36 ` Johan Herland
2010-02-07  2:21   ` Junio C Hamano
2010-02-07  5:02     ` Jeff King [this message]
2010-02-07  5:36       ` Jon Seymour
2010-02-07  9:15         ` Jakub Narebski
2010-02-07  9:41           ` Jon Seymour
2010-02-07 10:15             ` Jon Seymour
2010-02-07 19:33         ` Jeff King
2010-02-07 20:25           ` Junio C Hamano
2010-02-08  2:03             ` Steven E. Harris
2010-02-10  5:09             ` Jeff King
2010-02-10  5:23               ` Junio C Hamano
2010-02-10  5:29                 ` Jeff King
2010-02-07 18:48       ` Junio C Hamano
2010-02-07 19:18         ` Jeff King
2010-02-07 22:46       ` Johan Herland
2010-02-07  3:27   ` Jon Seymour
2010-02-07  4:32     ` Jon Seymour

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100207050255.GA17049@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=johan@herland.net \
    --cc=jon.seymour@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).