[RFC] [PATCH 0/5] Implement 'prior' commit object links

git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [RFC] [PATCH 0/5] Implement 'prior' commit object links
@ 2006-04-25  3:54 Sam Vilain
  2006-04-25  4:31 ` [PATCH 2/5] git-merge-base: follow 'prior' links to find merge bases Sam Vilain
                   ` (8 more replies)
  0 siblings, 9 replies; 63+ messages in thread
From: Sam Vilain @ 2006-04-25  3:54 UTC (permalink / raw)
  To: git

This patch series implements "prior" links in commit objects.  A
'prior' link on a commit represents its historical precedent, as
opposed to the previous commit(s) that this commit builds upon.

This is a proof of concept only; there is an outstanding bug (I put
the prior header right after parent, when it should really go after
author/committer), and room for improvement no doubt remain elsewhere.
Not to mention my shocking C coding style ;)

Examples of use cases this helps:

 1. heads that represent topic branch merges

    This is the "pu" branch case, where the head is a merge of several
    topic branches that is continually moved forward.

    topic branches     head
      ,___.   ,___.
     | TA1 | | TB1 |
      `---'   `---'    ,__.
         ^\_____^\____| H1 |
                       `--'

    + some topic branch changes and a republish:

      ,___.   ,___.
     | TA1 | | TB1 |
      `---'   `---'^   ,__.
        |^\_____^\____| H1 |
        |       |      `--'
      ,_|_.   ,_|_.      P
     | TA2 | | TB2 |     |
      `---'   `---'^     |
        ^       ^        |
      ,_|_.     |        |
     | TA3 |    |        |
      `---'     |      ,__.
         ^\______\____| H2 |
                       `--'

    key:  ^ = parent   P = prior

 2. revising published commits / re-basing

    This is what "stg" et al do.  The tools allow you to commit,
    rewind, revise, recommit, fast forward, etc.

    In this case, the "prior" link would point to the last revision of
    a patch.  Tools would probably

 3. sub-projects

    In this case, the commit on the "main" commit line would have a
    "prior" link to the commit on the sub-project.  The sub-project
    would effectively be its own head with copied commits objects on
    the main head.

 4. tracking cherry picking

    In this case, the "prior" link just points to the commit that was
    cherry picked.  This is perhaps a little different, but an idea
    that somebody else had for this feature.

Sam.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [PATCH 2/5] git-merge-base: follow 'prior' links to find merge bases
  2006-04-25  3:54 [RFC] [PATCH 0/5] Implement 'prior' commit object links Sam Vilain
@ 2006-04-25  4:31 ` Sam Vilain
  2006-04-25  5:19   ` Junio C Hamano
  2006-04-25  4:31 ` [PATCH 1/5] add 'prior' link in commit structure Sam Vilain
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 63+ messages in thread
From: Sam Vilain @ 2006-04-25  4:31 UTC (permalink / raw)
  To: git

From: Sam Vilain <sam.vilain@catalyst.net.nz>

It is possible that a good merge base may be found looking via "prior"
links as well.  We follow them where possible.
---

 merge-base.c |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/merge-base.c b/merge-base.c
index 07f5ab4..ed6d18c 100644
--- a/merge-base.c
+++ b/merge-base.c
@@ -207,6 +207,18 @@ static int merge_base(struct commit *rev
 			p->object.flags |= flags;
 			insert_by_date(p, &list);
 		}
+		/* If the commit has a "prior" reference, add it */
+		if (commit->prior) {
+			struct commit *prior;
+			prior = lookup_commit_reference_gently(commit->prior, 1);
+			if (prior) {
+				if ((prior->object.flags & flags) != flags) {
+					parse_commit(prior);
+					prior->object.flags |= flags;
+					insert_by_date(prior, &list);
+				}
+			}
+		}
 	}
 
 	if (!result)

^ permalink raw reply related	[flat|nested] 63+ messages in thread

* Re: [PATCH 2/5] git-merge-base: follow 'prior' links to find merge bases
  2006-04-25  4:31 ` [PATCH 2/5] git-merge-base: follow 'prior' links to find merge bases Sam Vilain
@ 2006-04-25  5:19   ` Junio C Hamano
  0 siblings, 0 replies; 63+ messages in thread
From: Junio C Hamano @ 2006-04-25  5:19 UTC (permalink / raw)
  To: Sam Vilain; +Cc: git

Sam Vilain <sam.vilain@catalyst.net.nz> writes:

> From: Sam Vilain <sam.vilain@catalyst.net.nz>
>
> It is possible that a good merge base may be found looking via "prior"
> links as well.  We follow them where possible.

You need to define what "prior" means before making decision
like that.  If "prior" can mean cherry-picked one from unrelated
line of development, the above reasoning does not apply.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [PATCH 1/5] add 'prior' link in commit structure
  2006-04-25  3:54 [RFC] [PATCH 0/5] Implement 'prior' commit object links Sam Vilain
  2006-04-25  4:31 ` [PATCH 2/5] git-merge-base: follow 'prior' links to find merge bases Sam Vilain
@ 2006-04-25  4:31 ` Sam Vilain
  2006-04-25  5:18   ` Junio C Hamano
  2006-04-25  4:31 ` [PATCH 3/5] commit.c: parse 'prior' link Sam Vilain
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 63+ messages in thread
From: Sam Vilain @ 2006-04-25  4:31 UTC (permalink / raw)
  To: git

From: Sam Vilain <sam.vilain@catalyst.net.nz>

Add a space in the commit for a prior commit that forms this commit's
historical, not substantial, precedent.

For now this is just recorded as a char* pointer, as it is not an
error condition for the commit not to be present locally.
---

 commit.h |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/commit.h b/commit.h
index de142af..b00a6b9 100644
--- a/commit.h
+++ b/commit.h
@@ -13,6 +13,7 @@ struct commit {
 	struct object object;
 	unsigned long date;
 	struct commit_list *parents;
+	char *prior;
 	struct tree *tree;
 	char *buffer;
 };

^ permalink raw reply related	[flat|nested] 63+ messages in thread

* Re: [PATCH 1/5] add 'prior' link in commit structure
  2006-04-25  4:31 ` [PATCH 1/5] add 'prior' link in commit structure Sam Vilain
@ 2006-04-25  5:18   ` Junio C Hamano
  0 siblings, 0 replies; 63+ messages in thread
From: Junio C Hamano @ 2006-04-25  5:18 UTC (permalink / raw)
  To: git

Sam Vilain <sam.vilain@catalyst.net.nz> writes:

> For now this is just recorded as a char* pointer, as it is not an
> error condition for the commit not to be present locally.

Object ancestry is parsed lazily, so you should not have to do this.
Just point at another commit if you are to have only one (I
recommend against it) or have another commit_list, but when you
instantiate you may want to have a flag in the commit object
itself that says "this need not exist".

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [PATCH 3/5] commit.c: parse 'prior' link
  2006-04-25  3:54 [RFC] [PATCH 0/5] Implement 'prior' commit object links Sam Vilain
  2006-04-25  4:31 ` [PATCH 2/5] git-merge-base: follow 'prior' links to find merge bases Sam Vilain
  2006-04-25  4:31 ` [PATCH 1/5] add 'prior' link in commit structure Sam Vilain
@ 2006-04-25  4:31 ` Sam Vilain
  2006-04-25  4:31 ` [PATCH 5/5] git-commit: add --prior to set prior link Sam Vilain
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 63+ messages in thread
From: Sam Vilain @ 2006-04-25  4:31 UTC (permalink / raw)
  To: git

From: Sam Vilain <sam.vilain@catalyst.net.nz>

Parse for the 'prior' link in a commit
---

 commit.c |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/commit.c b/commit.c
index 2717dd8..e4bc396 100644
--- a/commit.c
+++ b/commit.c
@@ -260,6 +260,18 @@ int parse_commit_buffer(struct commit *i
 			n_refs++;
 		}
 	}
+	if (!memcmp(bufptr, "prior ", 6)) {
+		unsigned char prior[20];
+		if (get_sha1_hex(bufptr + 6, prior) || bufptr[46] != '\n')
+			return error("bad prior in commit %s", sha1_to_hex(item->object.sha1));
+		bufptr += 47;
+
+		item->prior = xmalloc(21);
+		strncpy(item->prior, (char*)&prior, 20);
+		item->prior[20] = '\0';
+	} else {
+		item->prior = 0;
+	}
 	if (graft) {
 		int i;
 		struct commit *new_parent;

^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH 5/5] git-commit: add --prior to set prior link
  2006-04-25  3:54 [RFC] [PATCH 0/5] Implement 'prior' commit object links Sam Vilain
                   ` (2 preceding siblings ...)
  2006-04-25  4:31 ` [PATCH 3/5] commit.c: parse 'prior' link Sam Vilain
@ 2006-04-25  4:31 ` Sam Vilain
  2006-04-25  4:31 ` [PATCH 4/5] git-commit-tree: add support for prior Sam Vilain
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 63+ messages in thread
From: Sam Vilain @ 2006-04-25  4:31 UTC (permalink / raw)
  To: git

From: Sam Vilain <sam.vilain@catalyst.net.nz>

Add command-line support for --prior and add a description to the
ASCIIDOC
---

 Documentation/git-commit.txt |   10 ++++++++++
 git-commit.sh                |   19 +++++++++++++++++--
 2 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-commit.txt b/Documentation/git-commit.txt
index 6f2c495..ca5073c 100644
--- a/Documentation/git-commit.txt
+++ b/Documentation/git-commit.txt
@@ -10,6 +10,7 @@ SYNOPSIS
 [verse]
 'git-commit' [-a] [-s] [-v] [(-c | -C) <commit> | -F <file> | -m <msg>]
 	   [--no-verify] [--amend] [-e] [--author <author>]
+           [-p <commit>]
 	   [--] [[-i | -o ]<file>...]
 
 DESCRIPTION
@@ -106,6 +107,15 @@ but can be used to amend a merge commit.
 	index and the latest commit does not match on the
 	specified paths to avoid confusion.
 
+-p|--prior <commit>::
+	Specify a commit that this new commit is the next version of.
+        Use when you want a branch to supercede another branch, but
+        with a new commit history.  It is also use for sub-projects,
+        where commits on the parent tree mirror commits in the
+        sub-project.  <commit> does not have to exist in the local
+        repository, if it is specified as a full 40-digit hex SHA1
+        sum.  Otherwise it is parsed as a local revision.
+
 --::
 	Do not interpret any more arguments as options.
 
diff --git a/git-commit.sh b/git-commit.sh
index 26cd7ca..3feb60d 100755
--- a/git-commit.sh
+++ b/git-commit.sh
@@ -3,7 +3,7 @@ #
 # Copyright (c) 2005 Linus Torvalds
 # Copyright (c) 2006 Junio C Hamano
 
-USAGE='[-a] [-s] [-v] [--no-verify] [-m <message> | -F <logfile> | (-C|-c) <commit>) [--amend] [-e] [--author <author>] [[-i | -o] <path>...]'
+USAGE='[-a] [-s] [-v] [--no-verify] [-m <message> | -F <logfile> | (-C|-c) <commit>) [--amend] [-e] [--author <author>] [-p <commit>] [[-i | -o] <path>...]'
 SUBDIRECTORY_OK=Yes
 . git-sh-setup
 
@@ -200,6 +200,7 @@ log_given=
 log_message=
 verify=t
 verbose=
+prior=
 signoff=
 force_author=
 only_include_assumed=
@@ -344,6 +345,19 @@ do
       shift
       break
       ;;
+  -p|--p|--pr|--pri|--prio|--prior)
+      shift
+      prior="$1"
+      if echo $prior | perl -ne 'exit 1 unless /^[0-9a-f]{40}$/i'
+      then
+          prior=`echo "$prior" | tr '[A-Z]' '[a-z]'`
+      else
+	  prior=`git-rev-parse "$prior"`
+	  [ -n "$prior" ] || exit 1
+      fi
+      PRIOR="-r $prior"
+      shift
+      ;;
   -*)
       usage
       ;;
@@ -602,6 +616,7 @@ then
 		PARENTS=$(git-cat-file commit HEAD |
 			sed -n -e '/^$/q' -e 's/^parent /-p /p')
 	fi
+	
 	current=$(git-rev-parse --verify HEAD)
 else
 	if [ -z "$(git-ls-files)" ]; then
@@ -673,7 +688,7 @@ then
 		tree=$(GIT_INDEX_FILE="$TMP_INDEX" git-write-tree) &&
 		rm -f "$TMP_INDEX"
 	fi &&
-	commit=$(cat "$GIT_DIR"/COMMIT_MSG | git-commit-tree $tree $PARENTS) &&
+	commit=$(cat "$GIT_DIR"/COMMIT_MSG | git-commit-tree $tree $PARENTS $PRIOR) &&
 	git-update-ref HEAD $commit $current &&
 	rm -f -- "$GIT_DIR/MERGE_HEAD" &&
 	if test -f "$NEXT_INDEX"

^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH 4/5] git-commit-tree: add support for prior
  2006-04-25  3:54 [RFC] [PATCH 0/5] Implement 'prior' commit object links Sam Vilain
                   ` (3 preceding siblings ...)
  2006-04-25  4:31 ` [PATCH 5/5] git-commit: add --prior to set prior link Sam Vilain
@ 2006-04-25  4:31 ` Sam Vilain
  2006-04-25  4:34 ` [RFC] [PATCH 0/5] Implement 'prior' commit object links Sam Vilain
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 63+ messages in thread
From: Sam Vilain @ 2006-04-25  4:31 UTC (permalink / raw)
  To: git

From: Sam Vilain <sam.vilain@catalyst.net.nz>

Add support in git-commit-tree for -r as well as associated
documentation.
---

 Documentation/git-commit-tree.txt |    6 ++++++
 commit-tree.c                     |   26 +++++++++++++++++++++-----
 2 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-commit-tree.txt b/Documentation/git-commit-tree.txt
index 27b3d12..e11ba1f 100644
--- a/Documentation/git-commit-tree.txt
+++ b/Documentation/git-commit-tree.txt
@@ -20,6 +20,9 @@ A commit object usually has 1 parent (a 
 to 16 parents.  More than one parent represents a merge of branches
 that led to them.
 
+A commit object can have 1 prior commit.  This represents the previous
+commit that this one replaces (including history).
+
 While a tree represents a particular directory state of a working
 directory, a commit represents that state in "time", and explains how
 to get there.
@@ -38,6 +41,8 @@ OPTIONS
 -p <parent commit>::
 	Each '-p' indicates the id of a parent commit object.
 	
+-r <other commit>::
+	One '-r' indicates the id of a prior commit object.
 
 Commit Information
 ------------------
@@ -45,6 +50,7 @@ Commit Information
 A commit encapsulates:
 
 - all parent object ids
+- a prior object id (optional)
 - author name, email and date
 - committer name and email and the commit time.
 
diff --git a/commit-tree.c b/commit-tree.c
index 2d86518..6660b01 100644
--- a/commit-tree.c
+++ b/commit-tree.c
@@ -61,8 +61,9 @@ static void check_valid(unsigned char *s
  */
 #define MAXPARENT (16)
 static unsigned char parent_sha1[MAXPARENT][20];
+static unsigned char prior_sha1[21] = "\0";
 
-static const char commit_tree_usage[] = "git-commit-tree <sha1> [-p <sha1>]* < changelog";
+static const char commit_tree_usage[] = "git-commit-tree <sha1> [-p <sha1>]* [-r <sha1>] < changelog";
 
 static int new_parent(int idx)
 {
@@ -99,11 +100,22 @@ int main(int argc, char **argv)
 	for (i = 2; i < argc; i += 2) {
 		char *a, *b;
 		a = argv[i]; b = argv[i+1];
-		if (!b || strcmp(a, "-p") || get_sha1(b, parent_sha1[parents]))
+		if (!b)
 			usage(commit_tree_usage);
-		check_valid(parent_sha1[parents], commit_type);
-		if (new_parent(parents))
-			parents++;
+		if (!strcmp(a, "-p")) {
+			if (get_sha1(b, parent_sha1[parents]) < 0)
+				usage(commit_tree_usage);
+			check_valid(parent_sha1[parents], commit_type);
+			if (new_parent(parents))
+				parents++;
+		}
+		else if (!strcmp(a, "-r")) {
+			if (strcmp(&prior_sha1, "") || get_sha1(b, &prior_sha1) < 0)
+				usage(commit_tree_usage);
+		}
+		else {
+			usage(commit_tree_usage);
+		}
 	}
 	if (!parents)
 		fprintf(stderr, "Committing initial tree %s\n", argv[1]);
@@ -118,6 +130,10 @@ int main(int argc, char **argv)
 	 */
 	for (i = 0; i < parents; i++)
 		add_buffer(&buffer, &size, "parent %s\n", sha1_to_hex(parent_sha1[i]));
+	if (strcmp(&prior_sha1, "")) {
+		fprintf(stderr, "Setting prior to %s\n", sha1_to_hex(&prior_sha1));
+		add_buffer(&buffer, &size, "prior %s\n", sha1_to_hex(&prior_sha1));
+	}
 
 	/* Person/date information */
 	add_buffer(&buffer, &size, "author %s\n", git_author_info(1));

^ permalink raw reply related	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links
  2006-04-25  3:54 [RFC] [PATCH 0/5] Implement 'prior' commit object links Sam Vilain
                   ` (4 preceding siblings ...)
  2006-04-25  4:31 ` [PATCH 4/5] git-commit-tree: add support for prior Sam Vilain
@ 2006-04-25  4:34 ` Sam Vilain
  2006-04-25  5:16 ` Junio C Hamano
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 63+ messages in thread
From: Sam Vilain @ 2006-04-25  4:34 UTC (permalink / raw)
  To: git

Sam Vilain wrote:

>    In this case, the "prior" link would point to the last revision of
>    a patch.  Tools would probably
>  
>
... support only doing this for selected, "published" patch chains

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links
  2006-04-25  3:54 [RFC] [PATCH 0/5] Implement 'prior' commit object links Sam Vilain
                   ` (5 preceding siblings ...)
  2006-04-25  4:34 ` [RFC] [PATCH 0/5] Implement 'prior' commit object links Sam Vilain
@ 2006-04-25  5:16 ` Junio C Hamano
  2006-04-25 23:19   ` Sam Vilain
  2006-04-25  6:44 ` [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas) Jakub Narebski
  2006-04-25 15:10 ` [RFC] [PATCH 0/5] Implement 'prior' commit object links Linus Torvalds
  8 siblings, 1 reply; 63+ messages in thread
From: Junio C Hamano @ 2006-04-25  5:16 UTC (permalink / raw)
  To: Sam Vilain; +Cc: git

Sam Vilain <sam.vilain@catalyst.net.nz> writes:

> Examples of use cases this helps:

My reaction to this patch series is that you try to cover quite
different and unrelated things, without thinking things through,
and end up covering nothing usefully.  What is missing in these
"use cases" is a coherent semantics.

What the "prior" means to humans and tools.  And my *guess* of
what they mean suggests you are trying to make it mean many
unrelated concepts.

>  1. heads that represent topic branch merges
>
>     This is the "pu" branch case, where the head is a merge of several
>     topic branches that is continually moved forward.

For usage like "pu", the previous "pu" head could be recorded as
one of the parents; you do not need anything special.

The reason I do not include the previous head when I reconstruct
"pu" is because I explicitly *want* to drop history -- not
having to carry forward a failed experiment is what is desired
there.  Otherwise I would manage "pu" just like I currently do
"next" and "master".  So this is not a justification to add
something new.

>  2. revising published commits / re-basing
>
>     This is what "stg" et al do.  The tools allow you to commit,
>     rewind, revise, recommit, fast forward, etc.

stg wants to have a link to the fork-point commit.  I do not
know if it is absolutely necessary (you might be able to figure
it out using merge-base, I dunno).

>     In this case, the "prior" link would point to the last revision of
>     a patch.  Tools would probably

Probably what...???

>  3. sub-projects
>
>     In this case, the commit on the "main" commit line would have a
>     "prior" link to the commit on the sub-project.  The sub-project
>     would effectively be its own head with copied commits objects on
>     the main head.

You say you can have only one "prior" per commit, which makes
this unsuitable to bind multiple subprojects into a larger
project (the earlier "bind" proposal allows zero or more).

When you, a human, see a "prior" link in "git cat-file commit"
output, what does that tell you?  Is it "the previous commit
this thing replaces?"  Or is it a commit in a different line of
development which is its subproject?  Or is it a commit that was
cherry-picked from a different line?  How would you tell?  And
assuming you _could_ somehow tell, how would it help you to know
it?

When the Plumbing and the Porcelain sees a "prior" link, what
should they do?  It hugely depends on what that link means.  You
have a patch to merge-base to include the prior commit of the
commit in question in the ancestry chain, but that is probably
valid only for case 1. and perhaps 2. If the link points at a
commit of otherwise unrelated subproject head, you would _never_
want to include that in the merge-base computation.  Neither the
"this commit was taken out of context from otherwise unrelated
branch" link you envision to use for 4.  I think including
"prior" to ancestry list for case 1. and 2. makes some sense in
the merge-base example only because (1) it does not have to be any
different from an ordinary "parent" to begin with for case 1.,
and (2) it points at fork-point which is sort of a merge-base
already.

There may be some narrower concrete use case for which you can
devise coherent semantics, and teach tools and humans how to
interpret such inter-commit relationship that are _not_
parent-child ancestry.  For example, if you have one special
link to point at a "cherry-picked" commit, rebasing _could_ take
advantage of it.  When your side branch tip is at D, and commit
D has "this was cherry-picked from commit E" note, and if you
are rebasing your work on top of F:

        A---B---C---D
       /
  o---o---E---F

the tool can notice that F can reach E and carry forward only A,
B, and C on top of F, omitting D.  So having such a link might
be useful.  But if that is what you are going to do, I do not
think you would want to conflate that with other inter-commit
relationships, such as "previous hydra cap".

Oh, and you would need an update to rev-list --objects and
fsck-objects if you are to add any new link to commit objects.
Otherwise fetch/push would not get the related commits prior
points at, and prune will happily discard them.  But before even
bothering it, you need to come up with a semantics first.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links
  2006-04-25  5:16 ` Junio C Hamano
@ 2006-04-25 23:19   ` Sam Vilain
  2006-04-26  5:06     ` Jakub Narebski
  0 siblings, 1 reply; 63+ messages in thread
From: Sam Vilain @ 2006-04-25 23:19 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio C Hamano wrote:

>> 2. revising published commits / re-basing
>>
>>    This is what "stg" et al do.  The tools allow you to commit,
>>    rewind, revise, recommit, fast forward, etc.
>>    
>>
>
>stg wants to have a link to the fork-point commit.  I do not
>know if it is absolutely necessary (you might be able to figure
>it out using merge-base, I dunno).
>  
>

"stg pull" and "stg pick" could conceivably link individual patches in a
patchset to their precedent in a previous series. This would make
looking at the evolution of individual patches over time more feasible.

>>    In this case, the "prior" link would point to the last revision of
>>    a patch.  Tools would probably
>>    
>>
>
>Probably what...???
>  
>

...probably support this as an explicit operation - ie "publish", so
that winding whilst developing is not tracked.

>> 3. sub-projects
>>
>>    In this case, the commit on the "main" commit line would have a
>>    "prior" link to the commit on the sub-project.  The sub-project
>>    would effectively be its own head with copied commits objects on
>>    the main head.
>>    
>>
>
>You say you can have only one "prior" per commit, which makes
>this unsuitable to bind multiple subprojects into a larger
>project (the earlier "bind" proposal allows zero or more).
>  
>

It would still support that. Each commit to the sub-project involves a
change to the tree of the "main" commit line (a copy of the commit into
a sub-directory of it). The advantage is that the "tree" in the main
commit is the combined tree, you don't need to treat the case specially
to just get the contents out.

This is kind of like how SVK works by default - you have one local
repository, inside which you track remote repositories. Each commit on
the upstream repository is copied individually into your own repository.
So your local repository numbers easily reach into tens of thousands
(small numbers in git land, I know) while the upstream revisions are
just in the thousands.

>There may be some narrower concrete use case for which you can
>devise coherent semantics, and teach tools and humans how to
>interpret such inter-commit relationship that are _not_
>parent-child ancestry.  For example, if you have one special
>link to point at a "cherry-picked" commit, rebasing _could_ take
>advantage of it.  When your side branch tip is at D, and commit
>D has "this was cherry-picked from commit E" note, and if you
>are rebasing your work on top of F:
>
>        A---B---C---D
>       /
>  o---o---E---F
>
>the tool can notice that F can reach E and carry forward only A,
>B, and C on top of F, omitting D.  So having such a link might
>be useful.  But if that is what you are going to do, I do not
>think you would want to conflate that with other inter-commit
>relationships, such as "previous hydra cap".
>  
>

Right, I see the problem, a strong argument for a more generic solution
as you presented.

Sam.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links
  2006-04-25 23:19   ` Sam Vilain
@ 2006-04-26  5:06     ` Jakub Narebski
  2006-04-26  5:22       ` Jakub Narebski
  2006-04-26  6:51       ` Sam Vilain
  0 siblings, 2 replies; 63+ messages in thread
From: Jakub Narebski @ 2006-04-26  5:06 UTC (permalink / raw)
  To: git

Sam Vilain wrote:

> Junio C Hamano wrote:

>>> 3. sub-projects
>>>
>>>    In this case, the commit on the "main" commit line would have a
>>>    "prior" link to the commit on the sub-project.  The sub-project
>>>    would effectively be its own head with copied commits objects on
>>>    the main head.
>>>
>>
>>You say you can have only one "prior" per commit, which makes
>>this unsuitable to bind multiple subprojects into a larger
>>project (the earlier "bind" proposal allows zero or more).
> 
> It would still support that. Each commit to the sub-project involves a
> change to the tree of the "main" commit line (a copy of the commit into
> a sub-directory of it). The advantage is that the "tree" in the main
> commit is the combined tree, you don't need to treat the case specially
> to just get the contents out.

As far as I understand, for subproject commit "bind" link (and perhaps the
keyword/name "link" or "ref" would be better than "related") point to other
subprojects commits (trees), while the Sam's "prior (3)" example link would
point to the toplevel project (gathering all subprojects) commit, and it
would probably be named/noted "toplevel", not "prior".

Am I correct?

-- 
Jakub Narebski
Warsaw, Poland

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links
  2006-04-26  5:06     ` Jakub Narebski
@ 2006-04-26  5:22       ` Jakub Narebski
  2006-04-26  5:36         ` [OT] " Junio C Hamano
  2006-04-26  6:51       ` Sam Vilain
  1 sibling, 1 reply; 63+ messages in thread
From: Jakub Narebski @ 2006-04-26  5:22 UTC (permalink / raw)
  To: git

Jakub Narebski wrote:

> [...] Sam's "prior (3)" example
> link would point to the toplevel project (gathering all subprojects)
> commit, and it would probably be named/noted "toplevel", not "prior".

Or "master" (like "master document" in DTP).

-- 
Jakub Narebski
Warsaw, Poland

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [OT] Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links
  2006-04-26  5:22       ` Jakub Narebski
@ 2006-04-26  5:36         ` Junio C Hamano
  2006-04-26  6:35           ` Jakub Narebski
  0 siblings, 1 reply; 63+ messages in thread
From: Junio C Hamano @ 2006-04-26  5:36 UTC (permalink / raw)
  To: git; +Cc: jnareb

Jakub Narebski <jnareb@gmail.com> writes:

> Jakub Narebski wrote:
>
>> [...] Sam's "prior (3)" example
>> link would point to the toplevel project (gathering all subprojects)
>> commit, and it would probably be named/noted "toplevel", not "prior".
>
> Or "master" (like "master document" in DTP).

(Offtopic) isn't "master" in DTP more like template?

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [OT] Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links
  2006-04-26  5:36         ` [OT] " Junio C Hamano
@ 2006-04-26  6:35           ` Jakub Narebski
  2006-04-26  6:50             ` Junio C Hamano
  0 siblings, 1 reply; 63+ messages in thread
From: Jakub Narebski @ 2006-04-26  6:35 UTC (permalink / raw)
  To: git

Junio C Hamano wrote:

> Jakub Narebski <jnareb@gmail.com> writes:
> 
>> Jakub Narebski wrote:
>>
>>> [...] Sam's "prior (3)" example
>>> link would point to the toplevel project (gathering all subprojects)
>>> commit, and it would probably be named/noted "toplevel", not "prior".
>>
>> Or "master" (like "master document" in DTP).
> 
> (Offtopic) isn't "master" in DTP more like template?

Well, in (La)TeX "master document" is a document on it's own rights,
subdocuments are transcluded using some kind of "include"-like command.

-- 
Jakub Narebski
Warsaw, Poland

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links
  2006-04-26  6:35           ` Jakub Narebski
@ 2006-04-26  6:50             ` Junio C Hamano
  2006-04-26  7:22               ` Jakub Narebski
  0 siblings, 1 reply; 63+ messages in thread
From: Junio C Hamano @ 2006-04-26  6:50 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Jakub Narebski <jnareb@gmail.com> writes:

> Junio C Hamano wrote:
>
>> Jakub Narebski <jnareb@gmail.com> writes:
>> 
>>> Jakub Narebski wrote:
>>>
>>>> [...] Sam's "prior (3)" example
>>>> link would point to the toplevel project (gathering all subprojects)
>>>> commit, and it would probably be named/noted "toplevel", not "prior".
>>>
>>> Or "master" (like "master document" in DTP).
>> 
>> (Offtopic) isn't "master" in DTP more like template?
>
> Well, in (La)TeX "master document" is a document on it's own rights,
> subdocuments are transcluded using some kind of "include"-like command.

(Offtopic) Ah, the hard-core stuff.  I had something else in
mind ("master page" in "DTP for dummies"), sorry for the
confusion.

(On topic again)

Link from subproject commit back to the toplevel might work for
some kind of subprojects, but it would not work for the
subproject support that frequently comes up on this list.  The
development of an embedded Linux device, where a Linux kernel
source tree is grafted at kernel/ subdirectory of the toplevel
project.  The "prior" link would be placed in the commit that
belong to the kernel subproject, but that would never be merged
to the Linus kernel (why should he care about one particular
embedded device's development history).  The link must go from
the toplevel to generic parts reusable out of the context of the
combined project.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links
  2006-04-26  6:50             ` Junio C Hamano
@ 2006-04-26  7:22               ` Jakub Narebski
  2006-04-26  7:50                 ` Junio C Hamano
  0 siblings, 1 reply; 63+ messages in thread
From: Jakub Narebski @ 2006-04-26  7:22 UTC (permalink / raw)
  To: git

Junio C Hamano wrote:

> (On topic again)
> 
> Link from subproject commit back to the toplevel might work for
> some kind of subprojects, but it would not work for the
> subproject support that frequently comes up on this list.  The
> development of an embedded Linux device, where a Linux kernel
> source tree is grafted at kernel/ subdirectory of the toplevel
> project.  The "prior" link would be placed in the commit that
> belong to the kernel subproject, but that would never be merged
> to the Linus kernel (why should he care about one particular
> embedded device's development history).  The link must go from
> the toplevel to generic parts reusable out of the context of the
> combined project.

Yes, I guess subproject support is most needed for the "third-party embedded
(sub)project", when one sometimes have to modify (sub)project files, and
perhaps have to watch for the (sub)project version. Hmmm... if one used
Tailor (to allow for projects not managed under GIT, though I wonder if it
would be possible to link up project without [externally available] SCM)
one could use this approach for managing distribution packages, like RPMS
or debs...

Do I understand correctly that toplevel (master project) commits have tree
which points to combined tree, and "bind" links which points to the
subprojects commits whose trees make up the overall tree, or does the
master tree points to tree containing only toplevel files (overall Makefile
for example, INSTALL or README for the whole project including
subprojects,...)?

BTW. I have lately stumbled upon (somewhat Vault and Subversion biased)
 http://software.ericsink.com/Beyond_CheckOut_and_CheckIn.html
Read about Share and Pin -- it's about subprojects (when you edit out the
flawed "branch as folder" approach of author). I wonder if it could be
easily implemented in "subprojects for GIT" proposal... Of course we can do
better, i.e. original subproject repository doesn't need to be on the same
machine, we can use remote repository.

-- 
Jakub Narebski
Warsaw, Poland

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links
  2006-04-26  7:22               ` Jakub Narebski
@ 2006-04-26  7:50                 ` Junio C Hamano
  2006-04-26  8:44                   ` Jakub Narebski
  2006-04-26  9:28                   ` Jakub Narebski
  0 siblings, 2 replies; 63+ messages in thread
From: Junio C Hamano @ 2006-04-26  7:50 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Jakub Narebski <jnareb@gmail.com> writes:

> Do I understand correctly that toplevel (master project) commits have tree
> which points to combined tree, and "bind" links which points to the
> subprojects commits whose trees make up the overall tree, or does the
> master tree points to tree containing only toplevel files (overall Makefile
> for example, INSTALL or README for the whole project including
> subprojects,...)?

The plan for "bind commit" was to have the toplevel commit to
contain:

	tree -- this covers the whole tree including subprojects
        parent -- list of parents in the toplevel project
        bind -- commit object name of subproject, plus which
	        directory to graft its tree onto.

And a subproject commit, unless it contains subsubproject, would
look like just an ordinary commit.  Its tree would match the
entry in the tree the toplevel commit at the path in "bind" line
of the top-level commit.

Some reading material, from newer to older:

  * http://www.kernel.org/git/?p=git/git.git;a=blob;hb=todo;f=Subpro.txt

  This talks about the overall "vision" on how the user-level
  interaction might look like, with a sketch on how the core-level
  would help Porcelain to implement that interaction.  Most of the
  core-level support described there is in the "bind commit"
  changes, except "update-index --bind/-unbind" to record the
  information on bound subprojects in the index file.

  * http://thread.gmane.org/gmane.comp.version-control.git/15072

  This was the thread that led to the above proposal.

  * http://thread.gmane.org/gmane.comp.version-control.git/14486

  This is older.  It touches an alternative "gitlink" approach,
  which I meant to prototype but never got around to.

  Surprisingly, these two threads are mostly noise-free and
  literally every message is worth reading.

Some old but working core-side code is available at jc/bind
branch of public git.git repository.

> BTW. I have lately stumbled upon (somewhat Vault and Subversion biased)
>  http://software.ericsink.com/Beyond_CheckOut_and_CheckIn.html
> Read about Share and Pin -- it's about subprojects (when you edit out the
> flawed "branch as folder" approach of author).

Not really.  You can easily do that by checking out another
project in a separate subdirectory.

My private working area for git.git is structured like this:

	/home/junio/git.junio/.git
        		      Makefile
                              COPYING
                              Documentation/
                              ...
                              Meta/.git
                              Meta/TODO
                              Meta/Make
                              Meta/TO
                              Meta/WI
                              ...

Notice two .git directories?  That's right.  

The top-level .git repository has the familiar branches like
"maint", "master", "next", "pu", in addition to various topic
branches.

Meta/.git is a separate repository that is a clone of "todo"
branch of git.git repository.  The top-level .git repository
does not even have "todo" branch.  I just happen to push into
the same public repository git.git at kernel.org from these two
separate repositories.

The Meta/ repository is "pinned" to a specific version, without
having any funky "Pin feature", no thank you, because I have
full control of when I update what is checked out in the Meta/
directory.

What you _might_ want is a reverse of Pinning.  Sometimes, you
would want to make sure subproject part is at least this version
or later to build other parts of the whole.

But for my particular "Meta/" directory, I do not need such a
linkage.  The major reason I do not keep TODO in the main
project is because it is supposed to be a task list for me
across "maint", "master" and "next".  I do not want it to
fluctuate whenever I work on different branches.

-jc

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links
  2006-04-26  7:50                 ` Junio C Hamano
@ 2006-04-26  8:44                   ` Jakub Narebski
  2006-04-26  9:21                     ` Junio C Hamano
  2006-04-26  9:28                   ` Jakub Narebski
  1 sibling, 1 reply; 63+ messages in thread
From: Jakub Narebski @ 2006-04-26  8:44 UTC (permalink / raw)
  To: git

Junio C Hamano wrote:

> Jakub Narebski <jnareb@gmail.com> writes:
> 
>> BTW. I have lately stumbled upon (somewhat Vault and Subversion biased)
>>  http://software.ericsink.com/Beyond_CheckOut_and_CheckIn.html
>> Read about Share and Pin -- it's about subprojects (when you edit out the
>> flawed "branch as folder" approach of author).

By the way I mentioned this link only because it *might* be interesting what
others need subproject support for and how others think of it and implement
it.

> Not really.  You can easily do that by checking out another
> project in a separate subdirectory.
> 
> My private working area for git.git is structured like this:
> 
> /home/junio/git.junio/.git
>         Makefile
>                               COPYING
>                               Documentation/
>                               ...
>                               Meta/.git
>                               Meta/TODO
>                               Meta/Make
>                               Meta/TO
>                               Meta/WI
>                               ...
> 
> Notice two .git directories?  That's right.
[...] 
> Meta/.git is a separate repository that is a clone of "todo"
> branch of git.git repository.  The top-level .git repository
> does not even have "todo" branch.  I just happen to push into
> the same public repository git.git at kernel.org from these two
> separate repositories.

And top-level .git repository is told to ignore Meta directory?

Interesting idea...

-- 
Jakub Narebski
Warsaw, Poland

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links
  2006-04-26  8:44                   ` Jakub Narebski
@ 2006-04-26  9:21                     ` Junio C Hamano
  0 siblings, 0 replies; 63+ messages in thread
From: Junio C Hamano @ 2006-04-26  9:21 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Jakub Narebski <jnareb@gmail.com> writes:

>> Notice two .git directories?  That's right.
> [...] 
>> Meta/.git is a separate repository that is a clone of "todo"
>> branch of git.git repository.  The top-level .git repository
>> does not even have "todo" branch.  I just happen to push into
>> the same public repository git.git at kernel.org from these two
>> separate repositories.
>
> And top-level .git repository is told to ignore Meta directory?

Yes, I have .git/info/exclude that says something like this:

/.mailmap
*~
/Meta
+*

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links
  2006-04-26  7:50                 ` Junio C Hamano
  2006-04-26  8:44                   ` Jakub Narebski
@ 2006-04-26  9:28                   ` Jakub Narebski
  1 sibling, 0 replies; 63+ messages in thread
From: Jakub Narebski @ 2006-04-26  9:28 UTC (permalink / raw)
  To: git

Junio C Hamano wrote:

> And a subproject commit, unless it contains subsubproject, would
> look like just an ordinary commit.  Its tree would match the
> entry in the tree the toplevel commit at the path in "bind" line
> of the top-level commit.
> 
> Some reading material, from newer to older:
> 
>   * http://www.kernel.org/git/?p=git/git.git;a=blob;hb=todo;f=Subpro.txt
> 
>   This talks about the overall "vision" on how the user-level
>   interaction might look like, with a sketch on how the core-level
>   would help Porcelain to implement that interaction.  Most of the
>   core-level support described there is in the "bind commit"
>   changes, except "update-index --bind/-unbind" to record the
>   information on bound subprojects in the index file.

By the way, this file talks about (1) "using"/"userspace"/"embedder"
subproject holding 'appliance/', and toplevel (master) holding toplevel
Makefile, or (2) 'using' subproject holding both 'appliance/' and toplevel
Makefile with the help of --exclude. 

Another option would be to have only "embedded"/"used"/"requirement" be
subproject holding 'kernel-2.6', and 'appliance/' hold by toplevel (master)
commit.  Perhaps not the best solution for 'kernel + userspace tools'
example, but might be better workflow for 'application + library' or
'application + engine' example. 

-- 
Jakub Narebski
Warsaw, Poland

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links
  2006-04-26  5:06     ` Jakub Narebski
  2006-04-26  5:22       ` Jakub Narebski
@ 2006-04-26  6:51       ` Sam Vilain
  1 sibling, 0 replies; 63+ messages in thread
From: Sam Vilain @ 2006-04-26  6:51 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Jakub Narebski wrote:

>>It would still support that. Each commit to the sub-project involves a
>>change to the tree of the "main" commit line (a copy of the commit into
>>a sub-directory of it). The advantage is that the "tree" in the main
>>commit is the combined tree, you don't need to treat the case specially
>>to just get the contents out.
>>    
>>
>
>As far as I understand, for subproject commit "bind" link (and perhaps the
>keyword/name "link" or "ref" would be better than "related") point to other
>subprojects commits (trees), while the Sam's "prior (3)" example link would
>point to the toplevel project (gathering all subprojects) commit, and it
>would probably be named/noted "toplevel", not "prior".
>
>Am I correct?
>  
>

I don't think you quite get my meaning.

What I'm saying is that with the right kind of general purpose relation
between commits, you don't need "bind" at all.

Firstly, you would have your sub-project as its own commit line. That is
a fairly straightforward thing.

Secondly, the project that includes it has a corresponding commit for
each commit on the sub-project. This commit changes the portion of the
outer project's tree where the sub-project is bound.

This means that you don't need to understand this "bind" relation to be
able to extract the tree, and keeps the model simple at the expense of
an extra tree object or three per commit. It also does not restrict the
manner of the "binding", porcelains or users are free to do it
selectively, for instance.

Actually there is large similarity to this and cherry-picking. In
essence you're cherry picking every single commit from a different
commit heirarchy, except that you are applying the patches into a
sub-directory.

Sam.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
  2006-04-25  3:54 [RFC] [PATCH 0/5] Implement 'prior' commit object links Sam Vilain
                   ` (6 preceding siblings ...)
  2006-04-25  5:16 ` Junio C Hamano
@ 2006-04-25  6:44 ` Jakub Narebski
  2006-04-25  7:29   ` Junio C Hamano
  2006-04-25 15:10 ` [RFC] [PATCH 0/5] Implement 'prior' commit object links Linus Torvalds
  8 siblings, 1 reply; 63+ messages in thread
From: Jakub Narebski @ 2006-04-25  6:44 UTC (permalink / raw)
  To: git

Sam Vilain wrote:

> This patch series implements "prior" links in commit objects.  A
> 'prior' link on a commit represents its historical precedent, as
> opposed to the previous commit(s) that this commit builds upon.
> 
> This is a proof of concept only; there is an outstanding bug (I put
> the prior header right after parent, when it should really go after
> author/committer), and room for improvement no doubt remain elsewhere.
> Not to mention my shocking C coding style ;)

I think "prior" link concept is to generic and is used for quite unrelated
things

> Examples of use cases this helps:
> 
>  1. heads that represent topic branch merges
> 
>     This is the "pu" branch case, where the head is a merge of several
>     topic branches that is continually moved forward.
> 
>     topic branches     head
>       ,___.   ,___.
>      | TA1 | | TB1 |
>       `---'   `---'    ,__.
>          ^\_____^\____| H1 |
>                        `--'
> 
>     + some topic branch changes and a republish:
> 
>       ,___.   ,___.
>      | TA1 | | TB1 |
>       `---'   `---'^   ,__.
>         |^\_____^\____| H1 |
>         |       |      `--'
>       ,_|_.   ,_|_.      P
>      | TA2 | | TB2 |     |
>       `---'   `---'^     |
>         ^       ^        |
>       ,_|_.     |        |
>      | TA3 |    |        |
>       `---'     |      ,__.
>          ^\______\____| H2 |
>                        `--'
> 
>     key:  ^ = parent   P = prior

This case is clear. You want to record previous head of "pu"-like branch,
but you also want to drop the history, so you don't want to record it as
one of parents. I'm not sure if this link would be informative only, or if
it could be usefull e.g. in merge computing.

>  2. revising published commits / re-basing
> 
>     This is what "stg" et al do.  The tools allow you to commit,
>     rewind, revise, recommit, fast forward, etc.
> 
>     In this case, the "prior" link would point to the last revision of
>     a patch.  Tools would probably support only doing this for selected, 
>     "published" patch chains 

This case is quite different. If I understand it correctly prior either
points to the previous patch in patch stack, or the bottom of the
stack/patch stack attachment point. If this cannot be computed easily, it
could I guess be added, but perhaps using other name for link.

>  3. sub-projects
> 
>     In this case, the commit on the "main" commit line would have a
>     "prior" link to the commit on the sub-project.  The sub-project
>     would effectively be its own head with copied commits objects on
>     the main head.
>
>  4. tracking cherry picking
> 
>     In this case, the "prior" link just points to the commit that was
>     cherry picked.  This is perhaps a little different, but an idea
>     that somebody else had for this feature.

Those two are yet another case altogether, the "prior" link pointing to "the
same" commit in another history line. I agree with Junio that for (3)
"bind" proposal (if I understand correctly it points to tree rather than to
commit) is more clean way to go. As to cherry picking (and perhaps
"cherry-pick on steroids" aka rebase), there is truly 0-1 relation (either
this link is not needed at all, or there is only one commit to link to),
but I don't think it should have the same name as in case (1), as this is
very different. And there is a problem that the link might be dangling if
we deleted the branch we cherry-picked commit from, or did some history
rewrite. Perhaps "cherry" would be better name for this link :-)

Additionally for each of those cases we have to consider how to compute the
link and which commands should be modified, which commands can make use of
the link and should be modified, should the link be to commit, tag, tree or
blob, what we want to do with link when pulling/pushing/cloning into
another repository and which commands should be modified. Not only use case
scenarios.

-- 
Jakub Narebski
Warsaw, Poland

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
  2006-04-25  6:44 ` [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas) Jakub Narebski
@ 2006-04-25  7:29   ` Junio C Hamano
  2006-04-25  7:43     ` Jakub Narebski
                       ` (2 more replies)
  0 siblings, 3 replies; 63+ messages in thread
From: Junio C Hamano @ 2006-04-25  7:29 UTC (permalink / raw)
  To: git; +Cc: jnareb

Jakub Narebski <jnareb@gmail.com> writes:

> Additionally for each of those cases we have to consider how to compute the
> link and which commands should be modified, which commands can make use of
> the link and should be modified, should the link be to commit, tag, tree or
> blob, what we want to do with link when pulling/pushing/cloning into
> another repository and which commands should be modified. Not only use case
> scenarios.

This last paragraph is a very good suggestion.  The alleged "use
cases" are just laudary list of wishes, if they are not
accompanied by descriptions on what the modified data structure
and added attribute _means_ and how they are _used_.

Here is a related but not necessarily competing idle thought.

How about an ability to "attach" arbitrary objects to commit
objects?  The commit object would look like:

    tree 0aaa3fecff73ab428999cb9156f8abc075516abe
    parent 5a6a8c0e012137a3f0059be40ec7b2f4aa614355
    parent e1cbc46d12a0524fd5e710cbfaf3f178fc3da504
    related a0e7d36193b96f552073558acf5fcc1f10528917 key
    related 0032d548db56eac9ea09b4ba05843365f6325b85 cherrypick
    author Junio C Hamano <junkio@cox.net> 1145943079 -0700
    committer Junio C Hamano <junkio@cox.net> 1145943079 -0700

    Merge branch 'pb/config' into next

    * pb/config:
      Deprecate usage of git-var -l for getting config vars list
      git-repo-config --list support

The format of "related" attribute is, keyword "related", SP, 40-byte
hexadecimal object name, SP, and arbitrary sequence of bytes
except LF and NUL.  Let's call this arbitrary sequence of bytes
"the nature of relation".

The semantics I would attach to these "related" links are as
follows:

 * To the "core" level git, they do not mean anything other than
   "you must to have these objects, and objects reachable from
   them, if you are going to have this commit and claim your
   repository is without missing objects".

That means "git-rev-list --objects" needs to list these objects
(and if they are tags, commits, and trees, then what are
reachable from them), and "git-fsck" needs to consider these
related objects and objects reachable from them are reachable
from this commit.  NOTHING ELSE NEEDS TO BE DONE by the core
(obviously, cat-file needs to show them, and commit-tree needs to
record them, but that goes without saying).

Then porcelains can agree on what different kinds of nature of
relation mean and do sensible things.  The earlier "omit the
cherry-picked ones" example I gave can examine "cherrypick".

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
  2006-04-25  7:29   ` Junio C Hamano
@ 2006-04-25  7:43     ` Jakub Narebski
       [not found]       ` <20060425043436.2ff53318.seanlkml@sympatico.ca>
  2006-04-25 15:21     ` Linus Torvalds
  2006-04-25 23:18     ` Sam Vilain
  2 siblings, 1 reply; 63+ messages in thread
From: Jakub Narebski @ 2006-04-25  7:43 UTC (permalink / raw)
  To: git

Junio C Hamano wrote:

> Here is a related but not necessarily competing idle thought.
> 
> How about an ability to "attach" arbitrary objects to commit
> objects?  The commit object would look like:
> 
>     tree 0aaa3fecff73ab428999cb9156f8abc075516abe
>     parent 5a6a8c0e012137a3f0059be40ec7b2f4aa614355
>     parent e1cbc46d12a0524fd5e710cbfaf3f178fc3da504
>     related a0e7d36193b96f552073558acf5fcc1f10528917 key
>     related 0032d548db56eac9ea09b4ba05843365f6325b85 cherrypick
>     author Junio C Hamano <junkio@cox.net> 1145943079 -0700
>     committer Junio C Hamano <junkio@cox.net> 1145943079 -0700
> 
>     Merge branch 'pb/config' into next
> 
>     * pb/config:
>       Deprecate usage of git-var -l for getting config vars list
>       git-repo-config --list support
> 
> The format of "related" attribute is, keyword "related", SP, 40-byte
> hexadecimal object name, SP, and arbitrary sequence of bytes
> except LF and NUL.  Let's call this arbitrary sequence of bytes
> "the nature of relation".
> 
> The semantics I would attach to these "related" links are as
> follows:
> 
>  * To the "core" level git, they do not mean anything other than
>    "you must to have these objects, and objects reachable from
>    them, if you are going to have this commit and claim your
>    repository is without missing objects".
> 
> That means "git-rev-list --objects" needs to list these objects
> (and if they are tags, commits, and trees, then what are
> reachable from them), and "git-fsck" needs to consider these
> related objects and objects reachable from them are reachable
> from this commit.  NOTHING ELSE NEEDS TO BE DONE by the core
> (obviously, cat-file needs to show them, and commit-tree needs to
> record them, but that goes without saying).

Perhaps there should be an option to specify that the link is optional, and
the object pointed can be gone missing. For example for cherrypick the
original cherry-picked commit can either be removed completely, e.g. when
the original branch is deleted, or it can be modified breaking link when we
rewrite history up to original commit on original branch.

Also all other commands which show commit (commit messsage at least) should
be considered for including "related" links...

-- 
Jakub Narebski
Warsaw, Poland

^ permalink raw reply	[flat|nested] 63+ messages in thread

[parent not found: <20060425043436.2ff53318.seanlkml@sympatico.ca>]

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
       [not found]       ` <20060425043436.2ff53318.seanlkml@sympatico.ca>
@ 2006-04-25  8:34         ` sean
       [not found]         ` <20060425045752.0c6fbc21.seanlkml@sympatico.ca>
  1 sibling, 0 replies; 63+ messages in thread
From: sean @ 2006-04-25  8:34 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

On Tue, 25 Apr 2006 09:43:33 +0200
Jakub Narebski <jnareb@gmail.com> wrote:

> Perhaps there should be an option to specify that the link is optional, and
> the object pointed can be gone missing. For example for cherrypick the
> original cherry-picked commit can either be removed completely, e.g. when
> the original branch is deleted, or it can be modified breaking link when we
> rewrite history up to original commit on original branch.
> 
> Also all other commands which show commit (commit messsage at least) should
> be considered for including "related" links...

If you're cherry-picking from a disposable branch, then you don't want to 
include a link to it in your new commit.  Once you include the link, the 
source commit should be protected from pruning just like any other piece 
of history.  Otherwise there's no way for fsck-objects to know if a missing 
object means corruption or not.  So you need a way at commit time to
request the explicit linkage.

This might be useful for bug tracking front ends that could automatically 
show a hot fix migrating from devel, to testing, to release branches.  With 
Junio's proposal, perhaps there's even a better keyword for these particular 
linkages.

Sean.

^ permalink raw reply	[flat|nested] 63+ messages in thread

[parent not found: <20060425045752.0c6fbc21.seanlkml@sympatico.ca>]

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
       [not found]         ` <20060425045752.0c6fbc21.seanlkml@sympatico.ca>
@ 2006-04-25  8:57           ` sean
  2006-04-25  9:10             ` Jakub Narebski
  0 siblings, 1 reply; 63+ messages in thread
From: sean @ 2006-04-25  8:57 UTC (permalink / raw)
  To: jnareb, git

On Tue, 25 Apr 2006 04:34:36 -0400
sean <seanlkml@sympatico.ca> wrote:

> If you're cherry-picking from a disposable branch, then you don't want to 
> include a link to it in your new commit.  Once you include the link, the 
> source commit should be protected from pruning just like any other piece 
> of history.  Otherwise there's no way for fsck-objects to know if a missing 
> object means corruption or not.  So you need a way at commit time to
> request the explicit linkage.

Actually this implies that anyone pulling just this branch would potentially
also end up pulling large portions of other branches too.   So maybe making
them optional is The Right Thing.  In which case, we'd just have to accept 
these as weaker than the parentage links and fsck-objects et. al. would have 
to tolerate such missing commits.

So now that i've clearly come down in favor of both sides of this argument,
i'll leave the decision to smarter people than me.

Sean

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
  2006-04-25  8:57           ` sean
@ 2006-04-25  9:10             ` Jakub Narebski
  2006-04-25  9:58               ` Junio C Hamano
  0 siblings, 1 reply; 63+ messages in thread
From: Jakub Narebski @ 2006-04-25  9:10 UTC (permalink / raw)
  To: git

sean wrote:

> On Tue, 25 Apr 2006 04:34:36 -0400
> sean <seanlkml@sympatico.ca> wrote:
> 
>> If you're cherry-picking from a disposable branch, then you don't want to
>> include a link to it in your new commit.  Once you include the link, the
>> source commit should be protected from pruning just like any other piece
>> of history.  Otherwise there's no way for fsck-objects to know if a
>> missing
>> object means corruption or not.  So you need a way at commit time to
>> request the explicit linkage.
> 
> Actually this implies that anyone pulling just this branch would
> potentially
> also end up pulling large portions of other branches too.   So maybe
> making
> them optional is The Right Thing.  In which case, we'd just have to accept
> these as weaker than the parentage links and fsck-objects et. al. would
> have to tolerate such missing commits.

Actually, this can be resolved using automatic history grafts to the remote
repository we pulled from, if the commit is not present on local side (and
removing graft when commit appears on local side).

I was more concerned about size of repository required by keeping some parts
of history which would be purged without those "related" links. But your
concern (pulling) is more important.

-- 
Jakub Narebski
Warsaw, Poland

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
  2006-04-25  9:10             ` Jakub Narebski
@ 2006-04-25  9:58               ` Junio C Hamano
  2006-04-25 10:08                 ` Jakub Narebski
  0 siblings, 1 reply; 63+ messages in thread
From: Junio C Hamano @ 2006-04-25  9:58 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: jnareb, git

Jakub Narebski <jnareb@gmail.com> writes:

> Actually, this can be resolved using automatic history grafts to the remote
> repository we pulled from, if the commit is not present on local side (and
> removing graft when commit appears on local side).

You do not even need history grafts.  The "cherry-pick source"
was a bad example.  Maybe using "related" as a way to implement
"bind" would have been a better example -- we want inter-commit
relationship that requires connectivity but without ancestry for
them.

You can just have two kinds of 'related'.  One that means
connectivity, the other that does not.

At that point, the latter does not even have to belong to the
core.  The Porcelains can make use of it as long as they agree
on a common convention and use that information consistently.
It does not even have to be "related" (which implies what comes
after "related" is an object name) -- it could be an arbitrary
metainformation that the core does not have to care.  So an
updated suggestion is to have optional 0-or-more "note" and
"related" fields.  'note' is followed by one token and
additional information.  'related' is followed by an object name
that needs the additional connectivity, and and additional
information.  For example:

    tree 0aaa3fecff73ab428999cb9156f8abc075516abe
    parent 5a6a8c0e012137a3f0059be40ec7b2f4aa614355
    parent e1cbc46d12a0524fd5e710cbfaf3f178fc3da504
    related a0e7d36193b96f552073558acf5fcc1f10528917 bind linux-2.6
    note cherrypick v1.3.0~12
    note origin "next" branch at junio's repository
    note rename "foobar" to "barboz"
    author Junio C Hamano <junkio@cox.net> 1145943079 -0700
    committer Junio C Hamano <junkio@cox.net> 1145943079 -0700

    Merge branch 'pb/config' into next

The core side can say "Oh, this is a 'note' so I do not care
what it is -- I'd just skip to the end of line", while
Porcelains that "cat-file commit" this object can grep for
"note" and look at the first token to figure out what to do with
it.  The core needs to be aware of the 'related' ones and does
the connectivity crud using the object name, and Porcelains can
use the rest of the line to do intelligent things.

Now, it is debatable that such an extra information like 'note'
belongs to the header that the core deals with.  IIRC, Linus
argued that he does not want to have arbitrary cruft in the
header and instead to have it as a comment in the message part
when somebody talked about recording renames in the commit.

We have the author and the committer fields that is not used by
the core (only half of the committer field is used by the core
to date-order the commit list).  But I suspect most of the time
such metainformation is useless to the end-user humans, so if I
have to vote I'd rather put them in the header, have the UI
layer filter them out unless asked when presenting the commit to
the humans, and give Porcelains freedom to do whatever they
wish.

Things are easier to filter out when they properly follow some
structure, so I'd rather have "cruft" in the header.  Right now,
git-cherry-pick ends the commit message with "(cherry picked
from $commit commit)".  In theory, rebase can notice by parsing
commit log message, but it certainly would be easier and more
robust if we had a 'note' facility and a well established
convention to use it.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
  2006-04-25  9:58               ` Junio C Hamano
@ 2006-04-25 10:08                 ` Jakub Narebski
  2006-04-29 14:59                   ` Jakub Narebski
  0 siblings, 1 reply; 63+ messages in thread
From: Jakub Narebski @ 2006-04-25 10:08 UTC (permalink / raw)
  To: git

Junio C Hamano wrote:

> Jakub Narebski <jnareb@gmail.com> writes:
> 
>> Actually, this can be resolved using automatic history grafts to the
>> remote repository we pulled from, if the commit is not present on local
>> side (and removing graft when commit appears on local side).
> 
> You do not even need history grafts.  The "cherry-pick source"
> was a bad example.  Maybe using "related" as a way to implement
> "bind" would have been a better example -- we want inter-commit
> relationship that requires connectivity but without ancestry for
> them.
> 
> You can just have two kinds of 'related'.  One that means
> connectivity, the other that does not.

Good idea.

Another problem for core git, but I think orthogonal to the "related"/"note"
distinction is if the relation (or note) should be used as helper in
merges, perhaps by some agreed upon convention on the
comment/description/value part (e.g. "mergehelper" or "mergeinfo").

BTW. in your first example, what "key" relation should mean?
"cherrypick" (which should be "note" as we don't need connectivity) is
quite obvious (or equivalent "origin" if rebase wouldn't destroy the branch
picked from).

-- 
Jakub Narebski
Warsaw, Poland

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
  2006-04-25 10:08                 ` Jakub Narebski
@ 2006-04-29 14:59                   ` Jakub Narebski
  0 siblings, 0 replies; 63+ messages in thread
From: Jakub Narebski @ 2006-04-29 14:59 UTC (permalink / raw)
  To: git

Jakub Narebski wrote:

> Junio C Hamano wrote:
> 
>> Jakub Narebski <jnareb@gmail.com> writes:
>> 
>>> Actually, this can be resolved using automatic history grafts to the
>>> remote repository we pulled from, if the commit is not present on local
>>> side (and removing graft when commit appears on local side).
>> 
>> You do not even need history grafts.  The "cherry-pick source"
>> was a bad example.  Maybe using "related" as a way to implement
>> "bind" would have been a better example -- we want inter-commit
>> relationship that requires connectivity but without ancestry for
>> them.
>> 
>> You can just have two kinds of 'related'.  One that means
>> connectivity, the other that does not.
> 
> Good idea.
> 
> Another problem for core git, but I think orthogonal to the
> "related"/"note" distinction is if the relation (or note) should be used
> as helper in merges, perhaps by some agreed upon convention on the
> comment/description/value part (e.g. "mergehelper" or "mergeinfo").

Scratch that. It would be better for merge strategy just to check for
defined set of "links" and "notes", e.g. "prior" (pu-prior) and
"original" (cherrypick).

But there would be problem with connectivity provided by "link" relations,
namely info/grafts file, which deal only with parents. For example when we
cauterize history using grafts (e.g. for shallow clone) the "link" like
"prior" reaching to the cut-off part of the history might make your day ;-)

Well, we could always drop all the connectivity, and make
  link sha1 description...
shortcut for
  note link sha1 description...

-- 
Jakub Narebski
Warsaw, Poland

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
  2006-04-25  7:29   ` Junio C Hamano
  2006-04-25  7:43     ` Jakub Narebski
@ 2006-04-25 15:21     ` Linus Torvalds
  2006-04-25 15:40       ` Linus Torvalds
  2006-04-25 23:18     ` Sam Vilain
  2 siblings, 1 reply; 63+ messages in thread
From: Linus Torvalds @ 2006-04-25 15:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, jnareb

On Tue, 25 Apr 2006, Junio C Hamano wrote:
> 
> How about an ability to "attach" arbitrary objects to commit
> objects?  The commit object would look like:
> 
>     tree 0aaa3fecff73ab428999cb9156f8abc075516abe
>     parent 5a6a8c0e012137a3f0059be40ec7b2f4aa614355
>     parent e1cbc46d12a0524fd5e710cbfaf3f178fc3da504
>     related a0e7d36193b96f552073558acf5fcc1f10528917 key
>     related 0032d548db56eac9ea09b4ba05843365f6325b85 cherrypick

This would at the face of it seem a bit better, but the fact is, it's not.

Without _semantics_ for the different cases, it's just random crud.

What does any of the fields _mean_ to git? In particular, if you cannot 
come up with an _exact_ definition of what they mean for fsck, pull, push, 
and any other random thing (how to show them for logging? How do they 
affect merge bases?), then it's still just random free-form text, and it 
should go into the random free-form section.

> The semantics I would attach to these "related" links are as
> follows:
> 
>  * To the "core" level git, they do not mean anything other than
>    "you must to have these objects, and objects reachable from
>    them, if you are going to have this commit and claim your
>    repository is without missing objects".

Ok, a real semantic meaning. However:

THAT IS COMPLETELY USELESS.

It sure isn't useful for cherry-picking, which so far is one of the only 
"real examples" of where this would actually be used. 

It isn't useful for much anything else either, because you really have two 
cases:

 - the "related" commit is an indirect parent _anyway_ (for things like 
   "revert", this would obviously be the case, since it doesn't generally 
   make a lot of sense to revert something that has never touched your 
   history). In this case, the git semantics end up being NULL, and you 
   just have another relationship that doesn't actually add any new 
   information to the tree.

 - the "related" commit is not actually in the set of _real_ parenthood at 
   all, and actually points to a different branch (or possibly even 
   different project).

   This case I'd sure as hell hate to have for the kernel, at least. I 
   would have to add crap to my workflow to make sure that people do _not_ 
   have these kinds of linkages that link in random parts of their project 
   that doesn't actually have anything to do with the history I'm pulling.

Those are the only two possible cases. Either it's an indirect parent, or 
it isn't. Neither one makes any sense: the first one is a no-op from your 
semantic definition, and the second one is just crazy and you'll just find 
that people have to protect themselves from other developers doing 
something crazy by mistake.

I want the git objects to have clear and unambiguous semantics. I want 
people to be able to explain exactly what the fields _mean_. No "this 
random field could be used this random way" crud, please.

			Linus

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
  2006-04-25 15:21     ` Linus Torvalds
@ 2006-04-25 15:40       ` Linus Torvalds
       [not found]         ` <20060425121700.2d1a0032.seanlkml@sympatico.ca>
  2006-04-25 16:27         ` Jakub Narebski
  0 siblings, 2 replies; 63+ messages in thread
From: Linus Torvalds @ 2006-04-25 15:40 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, jnareb

On Tue, 25 Apr 2006, Linus Torvalds wrote:
> 
> I want the git objects to have clear and unambiguous semantics. I want 
> people to be able to explain exactly what the fields _mean_. No "this 
> random field could be used this random way" crud, please.

Btw, if the whole point is a "leave random porcelain a field that they can 
use any way they want", then I say "Hell NO!".

Random porcelain can already just maintain their own lists of "related" 
stuff, any way they want: you can keep it in a file in ".git/porcelain", 
called "list-commit-relationships", or you could use a git blob for it and 
have a reference to it in .git/refs/porcelain/relationships or whatever. 

If it has no clear and real semantic meaning for core git, then it 
shouldn't be in the core git objects.

The absolute last thing we want is a "random out" that starts to mean 
different things to different people, groups and porcelains.

That's just crazy, and it's how you end up with a backwards compatibility 
mess five years from now that is totally unresolvable, because different 
projects end up having different meanings or uses for the fields, so 
converting the database (if we ever find a better format, or somebody 
notices that SHA1 can be broken by a five-year-old-with-a-crayon).

There's a reason "minimalist" actually ends up _working_. I'll take a UNIX 
"system calls have meanings" approach over a Windows "there's fifteen 
different flavors of 'open()', and we also support magic filenames with 
specific meaning" kind of thing.

			Linus

^ permalink raw reply	[flat|nested] 63+ messages in thread

[parent not found: <20060425121700.2d1a0032.seanlkml@sympatico.ca>]

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
       [not found]         ` <20060425121700.2d1a0032.seanlkml@sympatico.ca>
@ 2006-04-25 16:17           ` sean
  2006-04-25 17:04             ` Linus Torvalds
  2006-04-26 11:25             ` Andreas Ericsson
  0 siblings, 2 replies; 63+ messages in thread
From: sean @ 2006-04-25 16:17 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: junkio, git, jnareb

On Tue, 25 Apr 2006 08:40:25 -0700 (PDT)
Linus Torvalds <torvalds@osdl.org> wrote:

> On Tue, 25 Apr 2006, Linus Torvalds wrote:
> > 
> > I want the git objects to have clear and unambiguous semantics. I want 
> > people to be able to explain exactly what the fields _mean_. No "this 
> > random field could be used this random way" crud, please.
> 
> Btw, if the whole point is a "leave random porcelain a field that they can 
> use any way they want", then I say "Hell NO!".
> 
> Random porcelain can already just maintain their own lists of "related" 
> stuff, any way they want: you can keep it in a file in ".git/porcelain", 
> called "list-commit-relationships", or you could use a git blob for it and 
> have a reference to it in .git/refs/porcelain/relationships or whatever. 
> 
> If it has no clear and real semantic meaning for core git, then it 
> shouldn't be in the core git objects.
> 
> The absolute last thing we want is a "random out" that starts to mean 
> different things to different people, groups and porcelains.
> 
> That's just crazy, and it's how you end up with a backwards compatibility 
> mess five years from now that is totally unresolvable, because different 
> projects end up having different meanings or uses for the fields, so 
> converting the database (if we ever find a better format, or somebody 
> notices that SHA1 can be broken by a five-year-old-with-a-crayon).
> 
> There's a reason "minimalist" actually ends up _working_. I'll take a UNIX 
> "system calls have meanings" approach over a Windows "there's fifteen 
> different flavors of 'open()', and we also support magic filenames with 
> specific meaning" kind of thing.
> 

It's a fair point.  But adding a separate database to augment the core 
information has some downsides.  That is, that information isn't pulled, 
cloned, or pushed automatically; it doesn't get to ride for free on top 
of the core.

Accommodating extra git headers (or "note"'s in Junio's example) would allow
a developer to record the fact that he is integrating a patch taken 
from a commit in the devel branch and backporting it to the release 
branch.   Either by adding a note that references the bug tracking #, or 
a commit sha1 from the devel branch that is already associated with the bug.

Of course that information could be embedded in the free text area, but 
you yourself have argued vigorously that it is brain damaged to try and rely
on parsing free form text for these types of situations.  Most of the potential 
uses aren't really meant for a human to read while looking at the log anyway, 
they just get in the way.  Another option that you alluded to, was to 
stuff the information in another git object.   But such an object would have 
to embed a reference to the original commit, thus you haven't really made 
changing the SHA1 algorithm any easier.  And then you also then have to jump 
through hoops to make sure that you pull the proper extra blobs that contain 
information about the real commits you just pulled.

But if the information is in the actual commit header it gets to tag along
for free with never any worry it will be separated from the commit in question.
So when the developer above updates his official repo the bug tracker system 
can notice that the bug referenced in its system has had a patch backported 
and take whatever action is desired.  

Of course there are other ways to do this, but integrating it into git means it
gets a free ride on the core, and it shouldn't really get in the way of core 
any more than email X- headers get in the way of email flowing.

Sean

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
  2006-04-25 16:17           ` sean
@ 2006-04-25 17:04             ` Linus Torvalds
  2006-04-26 11:25             ` Andreas Ericsson
  1 sibling, 0 replies; 63+ messages in thread
From: Linus Torvalds @ 2006-04-25 17:04 UTC (permalink / raw)
  To: sean; +Cc: junkio, git, jnareb

On Tue, 25 Apr 2006, sean wrote:
> 
> It's a fair point.  But adding a separate database to augment the core 
> information has some downsides.  That is, that information isn't pulled, 
> cloned, or pushed automatically; it doesn't get to ride for free on top 
> of the core.

But the point is, we don't generally _want_ to pull, push, or clone this 
crud.

I for one would literally have to add code to say "if any commit we poll 
has this random field, I refuse to pull". 

There's two ways to have true interoperability (and in a distributed 
system, that's the thing that matters):

 - keep on piling on the sh*t
 - keep it simple so that people know exactly what the rules are.

Guess which one I am religiously in favour of.

That's my whole point: the "rules" for this suggested "prior" or "related" 
field simply don't exist, and it doesn't even seem to be the case that 
people can agree what it _means_ in that nobody has actually explained 
what the thing would do and why you would use it.

If you cannot explain to the other side what a field is used for, then 
that field - by definition - is not useful for the other side. It will 
just result in confusion, because different users will have different 
notions of what to do with the field (if anything).

So some users might consider it to have meaning, and actually do different 
things when it exists. Others would ignore it entirely. Yet thirds would 
ignore it, but consider it a link that must exist - which would break 
whenever those people would interact with the people who ignore it, and 
think that it's superfluous.

This is why it has to have real meaning. If there are no rules, things 
will break. Some things will pull them, others won't, yet third things 
will do random things.

If you just want to have something that "follows" an archive, it's easy 
enough to do: have a totally separate ref, that is a real branch, but may 
not even contain any files at all. You can - perfectly validly - have a 
chain of commits where all the information is in the "free-form" text area 
as far as git is concerned, but where the trees are all empty.

You'll find that all git users can pull such a commit, and you can use all 
the normal git ops on them, and you can hide your own metadata in there. 
And it would still be a valid git tree - your metadata would be your 
private thing, and you can keep it along-side the "normal" git data, and 
you can have your own "extended fsck", and "git pull/push" still continues 
to work. 

Junio does something like that with the "todo" branch, for example (it's 
human-readable, not automated, but that doesn't really change anything). 
You can do

	git ls-tree todo
	git cat-file blob todo:Porcelainistas | less -S

and in general do anything you damn well please there. WITHOUT making 
up any new (and unnecessary) format semantics that nobody else cares 
about and that don't have very well-specified meaning.

		Linus

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
  2006-04-25 16:17           ` sean
  2006-04-25 17:04             ` Linus Torvalds
@ 2006-04-26 11:25             ` Andreas Ericsson
  2006-04-26 12:01               ` Jakub Narebski
  1 sibling, 1 reply; 63+ messages in thread
From: Andreas Ericsson @ 2006-04-26 11:25 UTC (permalink / raw)
  To: sean; +Cc: Linus Torvalds, junkio, git, jnareb

sean wrote:
> On Tue, 25 Apr 2006 08:40:25 -0700 (PDT)
> Linus Torvalds <torvalds@osdl.org> wrote:
> 
> 
>>On Tue, 25 Apr 2006, Linus Torvalds wrote:
>>
>>>I want the git objects to have clear and unambiguous semantics. I want 
>>>people to be able to explain exactly what the fields _mean_. No "this 
>>>random field could be used this random way" crud, please.
>>
>>Btw, if the whole point is a "leave random porcelain a field that they can 
>>use any way they want", then I say "Hell NO!".
>>
>>Random porcelain can already just maintain their own lists of "related" 
>>stuff, any way they want: you can keep it in a file in ".git/porcelain", 
>>called "list-commit-relationships", or you could use a git blob for it and 
>>have a reference to it in .git/refs/porcelain/relationships or whatever. 
>>
>>If it has no clear and real semantic meaning for core git, then it 
>>shouldn't be in the core git objects.
>>
>>The absolute last thing we want is a "random out" that starts to mean 
>>different things to different people, groups and porcelains.
>>
>>That's just crazy, and it's how you end up with a backwards compatibility 
>>mess five years from now that is totally unresolvable, because different 
>>projects end up having different meanings or uses for the fields, so 
>>converting the database (if we ever find a better format, or somebody 
>>notices that SHA1 can be broken by a five-year-old-with-a-crayon).
>>
>>There's a reason "minimalist" actually ends up _working_. I'll take a UNIX 
>>"system calls have meanings" approach over a Windows "there's fifteen 
>>different flavors of 'open()', and we also support magic filenames with 
>>specific meaning" kind of thing.
>>
> 
> 
> It's a fair point.  But adding a separate database to augment the core 
> information has some downsides.  That is, that information isn't pulled, 
> cloned, or pushed automatically; it doesn't get to ride for free on top 
> of the core.
> 
> Accommodating extra git headers (or "note"'s in Junio's example) would allow
> a developer to record the fact that he is integrating a patch taken 
> from a commit in the devel branch and backporting it to the release 
> branch.   Either by adding a note that references the bug tracking #, or 
> a commit sha1 from the devel branch that is already associated with the bug.
> 

This information is something I, as a human, would definitely want to 
read. What's the point of recording it in the commit-header if we're not 
going to show it to users anyway? I'm with Linus on this one. Keep 
headers as simple as possible.

> Of course that information could be embedded in the free text area, but 
> you yourself have argued vigorously that it is brain damaged to try and rely
> on parsing free form text for these types of situations.

Why would there be a need to parse it? The entire *point* of history is 
to present it to readers in an as accessible and understandable way as 
possible. Git's sha1 hashes mean absolutely nothing, so a note saying 
something was cherry-picked from commit 
"89987987ad987aef987987aff987987d" on branch "devel" will be pointless 
unless the one doing the committing states the why as well as the what 
in the commit-message anyways.

Besides, only developers will likely ever look at the commit-messages, 
and they will likely only ever do it when they are bisecting or looking 
for the implementation date of a certain feature or other.

>  Most of the potential 
> uses aren't really meant for a human to read while looking at the log anyway, 
> they just get in the way.

I still fail to see a use case for this. Could you give me some examples 
to when information recorded isn't meant for being presented to the user?

> 
> But if the information is in the actual commit header it gets to tag along
> for free with never any worry it will be separated from the commit in question.
> So when the developer above updates his official repo the bug tracker system 
> can notice that the bug referenced in its system has had a patch backported 
> and take whatever action is desired.  
> 

We already have something like this. All commits with a top-line message 
containing "bug #" followed by a number automatically updates our 
bugtracking system with the commit-message in its entirety. If the word 
before "bug #" matches "fix.*" then the status of the bug is set to that.

This might seem cumbersome to some but it's really very straightforward, 
and for a couple of reasons it's a very good solution:
1. Devs who Do It Right don't have to fiddle with their browser just to 
enter the info twice, so they learn fast. :)
2. BT history (viewed by non-devs too) gets updated with accurate 
information promptly.
3. No matter how you solve the problem you're going to need to write a 
custom commit/update hook anyway, so this is as good as having the info 
in the note.
4. The info going to the BT is easily modifiable, so if someone screws 
up they can fix it later. Fixing an already written git commit takes 
some doing if there are commits on top.

> Of course there are other ways to do this, but integrating it into git means it
> gets a free ride on the core, and it shouldn't really get in the way of core 
> any more than email X- headers get in the way of email flowing.
> 

True. I've suggested before that arbitrary headers could be added to git 
commits by prefixing them with X- (preferrably followed by an abbrev of 
the porcelain name adding the note). This way it's easy to filter, you 
get the free ride, and porcelains can do whatever they want while core 
git can strip everything following the sequence "\nX-" up to and 
including the next newline.

This way you have only one special byte-sequence with special meaning 
that the plumbing has to know it should ignore, which is a lot more 
extensible (not to mention easier to code).

In addition, if those X- lines aren't included in the sha1 computation 
they can easily be removed and added to without affecting the ancestry 
chain. This would probably have quite a performance impact though.

That said, I don't think even "X-" headers is a very good idea. Perhaps 
i've just got poor imagination but I can't think of a good use for them.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
  2006-04-26 11:25             ` Andreas Ericsson
@ 2006-04-26 12:01               ` Jakub Narebski
  0 siblings, 0 replies; 63+ messages in thread
From: Jakub Narebski @ 2006-04-26 12:01 UTC (permalink / raw)
  To: git

Andreas Ericsson wrote:

> I've suggested before that arbitrary headers could be added to git
> commits by prefixing them with X- (preferrably followed by an abbrev of
> the porcelain name adding the note). This way it's easy to filter, you
> get the free ride, and porcelains can do whatever they want while core
> git can strip everything following the sequence "\nX-" up to and
> including the next newline.
> 
> This way you have only one special byte-sequence with special meaning
> that the plumbing has to know it should ignore, which is a lot more
> extensible (not to mention easier to code).
> 
> In addition, if those X- lines aren't included in the sha1 computation
> they can easily be removed and added to without affecting the ancestry
> chain. This would probably have quite a performance impact though.
> 
> That said, I don't think even "X-" headers is a very good idea. Perhaps
> i've just got poor imagination but I can't think of a good use for them.

Well, the "note" headers are just that, but instead of prefixing 'extra'
headers with "X-" you prefix them with "note ".

I think that the "note" (or X-) headers should be included in calculating
sha1, as the free-form of commit (the comment) is.

As to use: for now 'git cherry-pick' and 'git revert' records the commit
picked or commit reverted in free form. It could be recorded in "note"
header, or additionally as "note" header. 'git rebase' could also record
the original commit e.g. as "note original <branchname> <sha1-of-commit>".

And it would be the place for Porcelain to record simple information which
is of use to them, but usualy not interesting to user, so it would be
better if it wouldn't pollute free-form/comment area.

The "prior" (for saving "pu"-like branches previous state) and "bind" (for
managing subprojects) I think should be rather of "related"/"link" kind.

-- 
Jakub Narebski
Warsaw, Poland

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
  2006-04-25 15:40       ` Linus Torvalds
       [not found]         ` <20060425121700.2d1a0032.seanlkml@sympatico.ca>
@ 2006-04-25 16:27         ` Jakub Narebski
  2006-04-25 17:11           ` Linus Torvalds
  1 sibling, 1 reply; 63+ messages in thread
From: Jakub Narebski @ 2006-04-25 16:27 UTC (permalink / raw)
  To: git

Linus Torvalds wrote:

> On Tue, 25 Apr 2006, Linus Torvalds wrote:
>> 
>> I want the git objects to have clear and unambiguous semantics. I want
>> people to be able to explain exactly what the fields _mean_. No "this
>> random field could be used this random way" crud, please.
> 
> Btw, if the whole point is a "leave random porcelain a field that they can
> use any way they want", then I say "Hell NO!".

The generic commit links "related" which is fsck-able at least and "note"
which is not. It is idea somewhat on the level of providing _extended
attributes_ in VFS in Linux kernel, IMVHO.

"note" can be considere cruft, "related" is fsck-able and pull-able so has
meaning for core (even if not all "note" and/or "related" links have any
repercussion for merging for example).

So far there are following core git ideas of using this feature (akin to
using extended attributes for ACL, or SELinux properties):

1. "related" link "bind" for better support of subprojects. Useful if some
parts of project are developed independently (e.g. lm_sensors or ALSA was
in Linux kernel, xdiff for git, somelibrary or somemodule for someproject
etc.).
2. "note" link "cherrypicked" for cherry-picking, rebase etc., for example
to not apply the same commit twice. Useful in merging after cherry picking.

Additionally there are following less certain ideas

3. "prior" link in the sense of prior state of frequently rebased branch
like git's "pu" (case (1) in first post in this thread)
4. "depend" link for creating darc-esque dependency partial ordering of
commits (patches), for better merge perhaps
5. "note" link "rename" (or more generic "contents related") for remembering
renames/file moving, file splitting, contents moving and copying, including
correcting automatic "rename" detection at merge (i.e. remembering false
positives and false negatives). Useful in subsequent merges and information
commands (log, whatchanged, annotate/blame, diff).
6. "note" link "origin" to remember for where the commit was pulled.

Note that none of those are non-core Porcelain ideas.

-- 
Jakub Narebski
Warsaw, Poland

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
  2006-04-25 16:27         ` Jakub Narebski
@ 2006-04-25 17:11           ` Linus Torvalds
  2006-04-25 17:36             ` Jakub Narebski
       [not found]             ` <20060425135250.5fd889f4.seanlkml@sympatico.ca>
  0 siblings, 2 replies; 63+ messages in thread
From: Linus Torvalds @ 2006-04-25 17:11 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

On Tue, 25 Apr 2006, Jakub Narebski wrote:
> 
> The generic commit links "related" which is fsck-able at least and "note"
> which is not. It is idea somewhat on the level of providing _extended
> attributes_ in VFS in Linux kernel, IMVHO.

And nobody actually uses extended attributes either, do they?

Plus it's _not_ fsck'able, since the thing doesn't even have any valid 
semantics. You guys can't even agree on whether the object must exist or 
not. 

Anyway, I'm not interested. I'm violently opposed to the mess that is 
darcs and other crapola. The WHOLE point of git is to have well-defined 
semantics and get away from the horrors that other systems have done, 
where they have allowed any random crap to "make sense". 

If you want darcs-like semantics where there are no rules, just use darcs, 
for chrissake! And if you want to base it on git because you've noticed 
that git is (a) stable, (b) fast and (c) has developed remarkably well, 
then think for a second _why_ git is stable, fast, and well-developed. 
It's that exactly because it has clear semantics, and no room for random 
crud.

Git tracks contents, and the well-defined history of how those contents 
came to be. Git does NOT track "additional notes" left by the developer 
that have weak semantics. Git does not track when a developer says "I 
renamed a file".

For exactly the same reason, git should not track it when a developer says 
"I think this commit is related to that commit". It's not hard data, that 
has hard and clear semantics.

Once you start adding data that has no clear semantics, you're screwed. At 
that point, it's a "track guesses" game, not a "track contents" game.

			Linus

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
  2006-04-25 17:11           ` Linus Torvalds
@ 2006-04-25 17:36             ` Jakub Narebski
  2006-04-25 17:57               ` Linus Torvalds
       [not found]             ` <20060425135250.5fd889f4.seanlkml@sympatico.ca>
  1 sibling, 1 reply; 63+ messages in thread
From: Jakub Narebski @ 2006-04-25 17:36 UTC (permalink / raw)
  To: git

Linus Torvalds wrote:

> On Tue, 25 Apr 2006, Jakub Narebski wrote:
>> 
>> The generic commit links "related" which is fsck-able at least and "note"
>> which is not. It is idea somewhat on the level of providing _extended
>> attributes_ in VFS in Linux kernel, IMVHO.
> 
> And nobody actually uses extended attributes either, do they?

Fedora's SELinux does use them, IIRC.

Well, people do use X-* headers in mail (sean's example), and some of them
got promoted from X-* to ordinary mail header status.

> Plus it's _not_ fsck'able, since the thing doesn't even have any valid
> semantics. You guys can't even agree on whether the object must exist or
> not.

Erm, further on we did agree 
  http://permalink.gmane.org/gmane.comp.version-control.git/19142
  (Message-Id: <7vmzeax9gj.fsf@assigned-by-dhcp.cox.net>). 
"related" links means that object must exist. "note" is what name says, just
note and doesn't even need to point to object.

> For exactly the same reason, git should not track it when a developer says
> "I think this commit is related to that commit". It's not hard data, that
> has hard and clear semantics.
> 
> Once you start adding data that has no clear semantics, you're screwed. At
> that point, it's a "track guesses" game, not a "track contents" game.

Well, the best example, i.e. remembering cherry picking has well defined
semantic (added when cherry-picking, used when merging, object does need
not to exist) but not well defined form. Currently the convention for
free-form is used, which has its advantages and disadvantages as pointed
out by Junio.

[somewhat unrelated note]
> Git tracks contents, and the well-defined history of how those contents
> came to be. Git does NOT track "additional notes" left by the developer
> that have weak semantics. Git does not track when a developer says "I
> renamed a file".

But I'd like Git to remember when I corrected false positives in "rename"
detection during merge, and added undetected automatically renames/file
contents copying and/or moving. Whether it would be done by saving the
information in commit header, commit free-for, or somewhere else...

-- 
Jakub Narebski
Warsaw, Poland

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
  2006-04-25 17:36             ` Jakub Narebski
@ 2006-04-25 17:57               ` Linus Torvalds
  2006-04-25 18:06                 ` Linus Torvalds
  0 siblings, 1 reply; 63+ messages in thread
From: Linus Torvalds @ 2006-04-25 17:57 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

On Tue, 25 Apr 2006, Jakub Narebski wrote:
> 
> Erm, further on we did agree 

Hell no "we" didn't.

Since I totally refuse to touch anything like that.

I even told you exactly why, for things like the suggested "cherry-pick" 
thing.

Which still remains the "best" example. And I say "best", because as an 
example it totally sucks. Again, for reasons I made very clear.

The fact is, there is _zero_ reason for this field to exist. Nobody has 
actually mentioned a single use that is really valid and that people can 
agree on across different uses.

So here's the challenge: name _one_ thing that people actually can agree 
on, and that adds real measurable _value_ from a core git standpoint. 
Something where the semantics actually change what git does.

The "track it with pull/push" thing is NOT one such thing, however much 
you protest. We already _have_ that thing. It's called a "ref", and it's 
really really easy to create anywhere in .git/refs/, and the tools already 
know how to use it.

		Linus

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
  2006-04-25 17:57               ` Linus Torvalds
@ 2006-04-25 18:06                 ` Linus Torvalds
  2006-04-25 18:24                   ` Jakub Narebski
  0 siblings, 1 reply; 63+ messages in thread
From: Linus Torvalds @ 2006-04-25 18:06 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

On Tue, 25 Apr 2006, Linus Torvalds wrote:
> 
> The "track it with pull/push" thing is NOT one such thing, however much 
> you protest. We already _have_ that thing. It's called a "ref", and it's 
> really really easy to create anywhere in .git/refs/, and the tools already 
> know how to use it.

Btw, there are other cases for that. For example, "parent" is a 
well-specified thing that actually has very clear and unambiguous meaning. 

And we had a much better proposals (in the sense that it had real 
suggested _meaning_ and semantics) over the last few months for things 
like sub-projects (trees that point to other commits) or last year a 
discussion about "container objects" (like the current tags, but listing 
multiple objects instead of just one).

All of which had clear and unambiguous semantics (but were not done for 
other reasons - maybe the sub-project still remains on the horizon, the 
"container objects" thing doesn't seem to have gone anywhere).

			Linus

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
  2006-04-25 18:06                 ` Linus Torvalds
@ 2006-04-25 18:24                   ` Jakub Narebski
  0 siblings, 0 replies; 63+ messages in thread
From: Jakub Narebski @ 2006-04-25 18:24 UTC (permalink / raw)
  To: git

Linus Torvalds wrote:

> On Tue, 25 Apr 2006, Linus Torvalds wrote:
>> 
>> The "track it with pull/push" thing is NOT one such thing, however much
>> you protest. We already _have_ that thing. It's called a "ref", and it's
>> really really easy to create anywhere in .git/refs/, and the tools
>> already know how to use it.

I agree(d) that tracking pull/push with extra commit header fields is not a
good example.
 
> Btw, there are other cases for that. For example, "parent" is a
> well-specified thing that actually has very clear and unambiguous meaning.

In single parent case, "parent" means that we modified tree pointed by the
parent. Multiple parent case suggests that we combined trees pointed by
parents, most probable by merge. I'd rather we not use parent for anything
else.

> And we had a much better proposals (in the sense that it had real
> suggested _meaning_ and semantics) over the last few months for things
> like sub-projects (trees that point to other commits)

Wasn't it commits pointing to other trees (or to commits)? "bind" field
proposal suggests it. And it could be implemented using 'X-*' "related"
headers in commit.

   related a0e7d36193b96f552073558acf5fcc1f10528917 bind linux-2.6

vs. proposed

   bind f6a8248420395bc9febd66194252fc9957b0052d linux/

-- 
Jakub Narebski
Warsaw, Poland

^ permalink raw reply	[flat|nested] 63+ messages in thread

[parent not found: <20060425135250.5fd889f4.seanlkml@sympatico.ca>]

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
       [not found]             ` <20060425135250.5fd889f4.seanlkml@sympatico.ca>
@ 2006-04-25 17:52               ` sean
  2006-04-25 18:08                 ` Linus Torvalds
  0 siblings, 1 reply; 63+ messages in thread
From: sean @ 2006-04-25 17:52 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: jnareb, git

On Tue, 25 Apr 2006 10:11:13 -0700 (PDT)
Linus Torvalds <torvalds@osdl.org> wrote:

> Once you start adding data that has no clear semantics, you're screwed. At 
> that point, it's a "track guesses" game, not a "track contents" game.

Then shouldn't Git stop tracking commit comments; they're just developer
guesses. ;o)   Adding a free-form header is no different than adding a 
few more lines of free form text at the bottom of the commit message, in 
neither case does it change the nice clean git semantics.

Sean

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
  2006-04-25 17:52               ` sean
@ 2006-04-25 18:08                 ` Linus Torvalds
       [not found]                   ` <20060425141412.5c115f51.seanlkml@sympatico.ca>
  0 siblings, 1 reply; 63+ messages in thread
From: Linus Torvalds @ 2006-04-25 18:08 UTC (permalink / raw)
  To: sean; +Cc: jnareb, git

On Tue, 25 Apr 2006, sean wrote:

> On Tue, 25 Apr 2006 10:11:13 -0700 (PDT)
> Linus Torvalds <torvalds@osdl.org> wrote:
> 
> > Once you start adding data that has no clear semantics, you're screwed. At 
> > that point, it's a "track guesses" game, not a "track contents" game.
> 
> Then shouldn't Git stop tracking commit comments; they're just developer
> guesses. ;o)

No, they are pure content, and git doesn't actually give them any semantic 
meaning.

WHICH IS OK. I even suggested that you put this thing into that "pure 
content" part.

> Adding a free-form header is no different than adding a few more lines 
> of free form text at the bottom of the commit message, in neither case 
> does it change the nice clean git semantics.

Which is exactly what I told you to do. Just don't make it a git header. 

We do that already. Look at "git revert". Ooh. Aah. It works today.

Just don't make it something that changes semantics, and that git parses 
and "understands". Because git clearly doesn't understand it at all, since 
you didn't define it to have any meaning that _can_ be understood.

		Linus

^ permalink raw reply	[flat|nested] 63+ messages in thread

[parent not found: <20060425141412.5c115f51.seanlkml@sympatico.ca>]

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
       [not found]                   ` <20060425141412.5c115f51.seanlkml@sympatico.ca>
@ 2006-04-25 18:14                     ` sean
  2006-04-25 18:26                       ` Linus Torvalds
  2006-04-25 18:34                       ` Jakub Narebski
  0 siblings, 2 replies; 63+ messages in thread
From: sean @ 2006-04-25 18:14 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: jnareb, git

On Tue, 25 Apr 2006 11:08:31 -0700 (PDT)
Linus Torvalds <torvalds@osdl.org> wrote:

> Which is exactly what I told you to do. Just don't make it a git header. 

Well I just don't see how making it a header, or plopping it at the
end of a commit message makes an iota of difference to git, while it 
can help porcelain.

> We do that already. Look at "git revert". Ooh. Aah. It works today.

Nice.  Gotta love git.

> Just don't make it something that changes semantics, and that git parses 
> and "understands". Because git clearly doesn't understand it at all, since 
> you didn't define it to have any meaning that _can_ be understood.

But that's exactly the point, it's no different than extending git to be
able to store more than one comment.   Comment1 Comment2 Comment3.  
Pure content that git need not give any semantic meaning.  Git has a 
limitation of only a single comment today, there's no semantic damage
to extending git to allow multiple comments.   And there are a few 
applications, like bug tracking etc, which could use such a feature 
to good effect.

Sean

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
  2006-04-25 18:14                     ` sean
@ 2006-04-25 18:26                       ` Linus Torvalds
  2006-04-25 18:41                         ` Jakub Narebski
                                           ` (2 more replies)
  2006-04-25 18:34                       ` Jakub Narebski
  1 sibling, 3 replies; 63+ messages in thread
From: Linus Torvalds @ 2006-04-25 18:26 UTC (permalink / raw)
  To: sean; +Cc: jnareb, git



On Tue, 25 Apr 2006, sean wrote:

> On Tue, 25 Apr 2006 11:08:31 -0700 (PDT)
> Linus Torvalds <torvalds@osdl.org> wrote:
> 
> > Which is exactly what I told you to do. Just don't make it a git header. 
> 
> Well I just don't see how making it a header, or plopping it at the
> end of a commit message makes an iota of difference to git, while it 
> can help porcelain.

It can't help porcelain.

If we have undefined or bad semantics for it, the only thing it can do is 
_hurt_ porcelain, because it will cause confusion down the line.

Semantics for data objects are _the_ most important part of a SCM. Pretty 
much any project, in fact. 

And bad or weakly defined semantics will invariably cause problems later.

> But that's exactly the point, it's no different than extending git to be
> able to store more than one comment.

So why argue for it?

Just use the existing comment field.

		Linus

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
  2006-04-25 18:26                       ` Linus Torvalds
@ 2006-04-25 18:41                         ` Jakub Narebski
  2006-04-25 18:52                           ` Linus Torvalds
       [not found]                         ` <20060425144525.3ef957cf.seanlkml@sympatico.ca>
  2006-04-25 19:00                         ` Junio C Hamano
  2 siblings, 1 reply; 63+ messages in thread
From: Jakub Narebski @ 2006-04-25 18:41 UTC (permalink / raw)
  To: git

Linus Torvalds wrote:

> So why argue for it?
> 
> Just use the existing comment field.

For the same reason there exist X-* _header_ fields in email.

Additionally, in "related" links we require that object exist (core git),
regardless of detailed semantics.

-- 
Jakub Narebski
Warsaw, Poland

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
  2006-04-25 18:41                         ` Jakub Narebski
@ 2006-04-25 18:52                           ` Linus Torvalds
  2006-04-25 19:00                             ` Jakub Narebski
  0 siblings, 1 reply; 63+ messages in thread
From: Linus Torvalds @ 2006-04-25 18:52 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git



On Tue, 25 Apr 2006, Jakub Narebski wrote:
> 
> Additionally, in "related" links we require that object exist (core git),
> regardless of detailed semantics.

And as I've now mentioned a hundred times, that's just unacceptable to me. 
No suggested use of this has actually been useful, that I can tell.

		Linus

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
  2006-04-25 18:52                           ` Linus Torvalds
@ 2006-04-25 19:00                             ` Jakub Narebski
  2006-04-25 22:17                               ` Jason Riedy
  0 siblings, 1 reply; 63+ messages in thread
From: Jakub Narebski @ 2006-04-25 19:00 UTC (permalink / raw)
  To: git

Linus Torvalds wrote:

> 
> 
> On Tue, 25 Apr 2006, Jakub Narebski wrote:
>> 
>> Additionally, in "related" links we require that object exist (core git),
>> regardless of detailed semantics.

And history browsers (gitk, qgit) can use it, drawing line, regardless of
semantics.

> And as I've now mentioned a hundred times, that's just unacceptable to me.
> No suggested use of this has actually been useful, that I can tell.

I don't mean we shouldn't define semantic for each use of "related" or
"note" header. Just like email X-* headres have detailed form and semantic
(long, long time ago Sender was X-Sender for example ;-). It's just a
toolkit.

As to suggested "related" (requiring object to exists) headers: "bind",
"prior", and perhaps "revert".

-- 
Jakub Narebski
Warsaw, Poland

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
  2006-04-25 19:00                             ` Jakub Narebski
@ 2006-04-25 22:17                               ` Jason Riedy
  0 siblings, 0 replies; 63+ messages in thread
From: Jason Riedy @ 2006-04-25 22:17 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

And Jakub Narebski writes:
 - I don't mean we shouldn't define semantic for each use of "related" or
 - "note" header. Just like email X-* headres have detailed form and semantic
 - (long, long time ago Sender was X-Sender for example ;-). It's just a
 - toolkit.

You just proved Linus's point.  Ever have to parse
archives of old mail?  There are many different ways
of saying the same thing, and many of the same way
of saying different things.  It's pure hell.

And people expect you to get the X-* headers correct
for whatever definition of correct they happen to have
at the moment.  ugh.  You have many de-facto semantics
for the same headers, and no way to disambiguate them.

People will need to parse and understand git archives
thirty+ years from now.  Don't place this curse on
them.

Jason

^ permalink raw reply	[flat|nested] 63+ messages in thread

[parent not found: <20060425144525.3ef957cf.seanlkml@sympatico.ca>]

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
       [not found]                         ` <20060425144525.3ef957cf.seanlkml@sympatico.ca>
@ 2006-04-25 18:45                           ` sean
  2006-04-25 19:00                             ` Linus Torvalds
  0 siblings, 1 reply; 63+ messages in thread
From: sean @ 2006-04-25 18:45 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: jnareb, git

On Tue, 25 Apr 2006 11:26:25 -0700 (PDT)
Linus Torvalds <torvalds@osdl.org> wrote:

> It can't help porcelain.
> 
> If we have undefined or bad semantics for it, the only thing it can do is 
> _hurt_ porcelain, because it will cause confusion down the line.
> 
> Semantics for data objects are _the_ most important part of a SCM. Pretty 
> much any project, in fact. 
> 
> And bad or weakly defined semantics will invariably cause problems later.

Take your example of how git-revert works today, it copies the comment from 
the original, thus keeping this semantic-free meta-data intact between
related commits.  However, you'd have to jump through hoops to accomplish
this same simple task with any third party meta data, unless it was 
burried inside the commit message text.

> So why argue for it?
> 
> Just use the existing comment field.

The last argument you and I had was me taking the other side, saying that 
it was fine for git to parse the free form text area to extract information; 
you rightfully showed me why that was wrong.

It's no different for a bug tracker or other 3rd party software that wants
to interface with git, it's bad design to force them to parse a single
free form text comment into individual pieces to extract their meta data.
Especially when git could easily add the ability to add multple comments
to each commit.  

Sean

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
  2006-04-25 18:45                           ` sean
@ 2006-04-25 19:00                             ` Linus Torvalds
  2006-04-25 19:18                               ` Junio C Hamano
  0 siblings, 1 reply; 63+ messages in thread
From: Linus Torvalds @ 2006-04-25 19:00 UTC (permalink / raw)
  To: sean; +Cc: jnareb, git

On Tue, 25 Apr 2006, sean wrote:
> 
> It's no different for a bug tracker or other 3rd party software that wants
> to interface with git, it's bad design to force them to parse a single
> free form text comment into individual pieces to extract their meta data.
> Especially when git could easily add the ability to add multple comments
> to each commit.  

Git _does_ make that easy. It's called the "tree". It's where you add any 
arbitrary files to a commit.

The point here is that core git should do one thing, and one thing only. 
You can then build up any policy you want on top of that. But in order for 
core git to be stable, it has to have nice rules about what it cares 
about, and what it does not.

And the rule is: git cares about the commit header, but not about the 
free-form. Which means that anything it doesn't care about, it goes into 
the free-form section, not into some "X-header" section.

Whatever you build on TOP of git can have its own rules in that free-form 
section. For example, the kernel project has this "X-header" thing called 
the "sign-off", and git itself picked it up. There's even some support to 
add it automatically to commits (the same way we add the "revert" info 
automatically to commits), but nobody claims that git should "parse" that 
information, or that it should be part of the "header".

		Linus

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
  2006-04-25 19:00                             ` Linus Torvalds
@ 2006-04-25 19:18                               ` Junio C Hamano
  2006-04-25 19:34                                 ` Linus Torvalds
  2006-04-26 12:42                                 ` Jakub Narebski
  0 siblings, 2 replies; 63+ messages in thread
From: Junio C Hamano @ 2006-04-25 19:18 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> And the rule is: git cares about the commit header, but not about the 
> free-form. Which means that anything it doesn't care about, it goes into 
> the free-form section, not into some "X-header" section.
>
> Whatever you build on TOP of git can have its own rules in that free-form 
> section. For example, the kernel project has this "X-header" thing called 
> the "sign-off", and git itself picked it up. There's even some support to 
> add it automatically to commits (the same way we add the "revert" info 
> automatically to commits), but nobody claims that git should "parse" that 
> information, or that it should be part of the "header".

Then we should drop the author header and make it part of free
form text.  The core does not give any meaning to it.  And the
name <email> part of the commit header as well.  The only thing
used by the core is the timestamp of the commit.

My initial 'related' without 'note' was flawed - it used
cherry-pick as an example of 'related' when it clearly should
have been 'note' (no connectivitiy required).

Having said what I wanted to say about 'note', let's clarify
what I have in mind about the 'related' that _means_
connectivity.  As I said, I am far less convinced it is a good
thing than I am about 'note' by now, but just for the sake of
completeness of the discussion.

I tend to agree with you that ability to misuse 'related' (I'd
call it 'link' to make it clear that it means connectivity) to
fetch/push "related" objects, with an unclear definition of
related-ness, is a bad thing.  Even if we fetched the objects
that are claimed to be related to the main project, if we do not
know what to do with them, it is not useful.

And for well defined connectivity, we could give separate names,
just like we have 'tree' and 'parent' in the commit header.
That's how "bind commit" was initially proposed.  It was not
'link bind'.

The suggestion of 'link bind' came primarily from the pain I
experienced when I taught rev-list --objects and fsck-objects
about it in the jc/bind branch.  If the only thing asked to the
core by 'link' is to make sure the related objects are made
available, and Porcelains take responsibility after they are
made available, we would be better off teaching the commit
parser how to parse 'link' (regardless of its nature of linkage)
and teach rev-list --objects and fsck-objects to do connectivity
just once, rather than adding 'bind' now and then having to do
the same backward incompatible change when adding something else
that requires connectivity.

There definitely needs to be an ability to specify a list of
"nature of links this repository accepts", if we were to do
'link'.  It probably should default to an empty set.  rev-list
--objects would include objects pointed by 'link' only when the
repository wants such links to be honored.  fsck-objects will
declare an object that is reachable only by a 'link' that is not
accepted by the repository "uninteresting" and let git-prune
remove it.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
  2006-04-25 19:18                               ` Junio C Hamano
@ 2006-04-25 19:34                                 ` Linus Torvalds
  2006-04-25 19:51                                   ` Junio C Hamano
  2006-04-26 12:42                                 ` Jakub Narebski
  1 sibling, 1 reply; 63+ messages in thread
From: Linus Torvalds @ 2006-04-25 19:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git



On Tue, 25 Apr 2006, Junio C Hamano wrote:
> 
> Then we should drop the author header and make it part of free
> form text.  The core does not give any meaning to it.

Sure it does. It's an integral part of logging: we not only verify the 
format, we also have multiple different ways of showing it. So it 
definitely changes the way we "act", very fundamentally.

		Linus

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
  2006-04-25 19:34                                 ` Linus Torvalds
@ 2006-04-25 19:51                                   ` Junio C Hamano
  2006-04-25 19:58                                     ` Linus Torvalds
  0 siblings, 1 reply; 63+ messages in thread
From: Junio C Hamano @ 2006-04-25 19:51 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> On Tue, 25 Apr 2006, Junio C Hamano wrote:
>> 
>> Then we should drop the author header and make it part of free
>> form text.  The core does not give any meaning to it.
>
> Sure it does. It's an integral part of logging: we not only verify the 
> format, we also have multiple different ways of showing it. So it 
> definitely changes the way we "act", very fundamentally.

Unfair ;-).  I'd consider "git log" semi-Porcelain and consider
rev-list and cat-file the true core level.

But you already made it clear that you are not opposed to 'note'
with a clear semantics "we _ignore_ it", the point was moot.

Sorry for the noise.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
  2006-04-25 19:51                                   ` Junio C Hamano
@ 2006-04-25 19:58                                     ` Linus Torvalds
  0 siblings, 0 replies; 63+ messages in thread
From: Linus Torvalds @ 2006-04-25 19:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git



On Tue, 25 Apr 2006, Junio C Hamano wrote:
> >
> > Sure it does. It's an integral part of logging: we not only verify the 
> > format, we also have multiple different ways of showing it. So it 
> > definitely changes the way we "act", very fundamentally.
> 
> Unfair ;-).  I'd consider "git log" semi-Porcelain and consider
> rev-list and cat-file the true core level.

Well, "git log" is really just "git-rev-list --pretty", so whichever way 
you turn, it's there.

I come from a slightly different background, where "core git" in many ways 
originally was about "what I use" and the whole "porcelain" side ends up 
being "what people who need hand-holding use" ;)

Of course, it expanded a bit from that original definition ;)

		Linus

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
  2006-04-25 19:18                               ` Junio C Hamano
  2006-04-25 19:34                                 ` Linus Torvalds
@ 2006-04-26 12:42                                 ` Jakub Narebski
  1 sibling, 0 replies; 63+ messages in thread
From: Jakub Narebski @ 2006-04-26 12:42 UTC (permalink / raw)
  To: git

Junio C Hamano wrote:

> My initial 'related' without 'note' was flawed - it used
> cherry-pick as an example of 'related' when it clearly should
> have been 'note' (no connectivitiy required).
[...]
> There definitely needs to be an ability to specify a list of
> "nature of links this repository accepts", if we were to do
> 'link'.  It probably should default to an empty set.  rev-list
> --objects would include objects pointed by 'link' only when the
> repository wants such links to be honored.  fsck-objects will
> declare an object that is reachable only by a 'link' that is not
> accepted by the repository "uninteresting" and let git-prune
> remove it.

I think that perhaps connectivity should be more fine-grained than this.
Namely we might want links which are not fsck-able nor pulled (and can be
dangling), but will prevent object pointed from being pruned. The
"original" (or "cherrypick") relation comes to mind.

Of course that can be configured per repository...

-- 
Jakub Narebski
Warsaw, Poland

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
  2006-04-25 18:26                       ` Linus Torvalds
  2006-04-25 18:41                         ` Jakub Narebski
       [not found]                         ` <20060425144525.3ef957cf.seanlkml@sympatico.ca>
@ 2006-04-25 19:00                         ` Junio C Hamano
  2006-04-25 19:09                           ` Linus Torvalds
  2 siblings, 1 reply; 63+ messages in thread
From: Junio C Hamano @ 2006-04-25 19:00 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> On Tue, 25 Apr 2006, sean wrote:
>
>> On Tue, 25 Apr 2006 11:08:31 -0700 (PDT)
>> Linus Torvalds <torvalds@osdl.org> wrote:
>> 
>> > Which is exactly what I told you to do. Just don't make it a git header. 
>> 
>> Well I just don't see how making it a header, or plopping it at the
>> end of a commit message makes an iota of difference to git, while it 
>> can help porcelain.
>
> It can't help porcelain.
>
> If we have undefined or bad semantics for it, the only thing it can do is 
> _hurt_ porcelain, because it will cause confusion down the line.
>
> Semantics for data objects are _the_ most important part of a SCM. Pretty 
> much any project, in fact. 
>
> And bad or weakly defined semantics will invariably cause problems later.
>
>> But that's exactly the point, it's no different than extending git to be
>> able to store more than one comment.
>
> So why argue for it?
>
> Just use the existing comment field.

Actually, it does help Porcelain to be able to mark unrelated
crud as 'note'.  Sane people (including git barebone
Porcelainish) would just ignore it.  Unless --pretty=raw is used
the 'note' headers will not be shown.  It would unclutter
things for us.

If different Porcelains use "the existing comment field" by
defining certain mark-up to embed their own data, it has the
same "weak semantics causing confusion down the line" issue,
_and_ the crud will be shown to the end user by "git log".

So I am starting to be actually in favor of the 'note' header.

Earlier somebody wondered if that has impact on merge semantics.
I think we do _not_ care.  The core level does not track how
things changed (the operation to make preimage to postimage),
but tracks what the results of changes are (the content).

Some "misguided" set of Porcelains may come up with a convention
to record renames and token-replaces in the 'note' header to
say:

	tree 0000000000000000000000000000000000000000
        parent 0000000000000000000000000000000000000000
	author A U Thor <author@example.com> 000000000 +0000
	committer C O Mitter <comitter@example.com> 000000000 +0000
	note rename hello.c world.c
        note token-replace s/cache/index/

        Replaced old nomenclature 'cache' to 'index'.  Oh, while
        at it, I renamed hello.c to world.c.

But unlike systems that records the transformation from preimage
to postimage, we record the postimage (on "tree" header) and
preimage (by the way of "parent" header).  We (as the core and
Porcelain that do not use "note") do not even need to look at
what 'note' says.  The Porcelains that _do_ look at the note may
try to take advantage of it, and if they make better result that
would be a good thing.  I suspect such 'note rename' provided by
the end user is not trustworthy at times, so a Porcelain that
relies on that may make silent mismerge.  You may claim that is
the reason why you do not want to pull from a tree managed with
such a Porcelain.

But at the end of the day what matters is the content, and
people.

You will not be using such a Porcelain yourself, but when you
fetch the above commit, which records its tree and its parents,
git barebone Porcelainish merge will just do what it has always
done, without even looking at 'note'.  It's not like use of
'note' on the other end is forcing you to take a note on them.

Refusing to merge from a tree that is managed with a Porcelain
that uses the information in 'note rename' for its own operation
(maybe because we believe such Porcelain tends to make silent
mismerges more often) does not make much more sense than
refusing to merge from a tree whose developer uses vi (because
it tends to lose "missing LF at the end of file").  The content
matters, so you would check the merge result; and 'note' thing
is opt-in, which we opt out.

Also you ultimately trust people -- "I will pull from his tree,
because I know he is careful and has good taste".  Now the tool
they use _may_ be part of their taste, but any tool can be
misused (remember you stayed away from pulling things that have
Octopus?)

I am less (a lot less) sure about the 'related' header now,
which will be the topic of a separate message.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
  2006-04-25 19:00                         ` Junio C Hamano
@ 2006-04-25 19:09                           ` Linus Torvalds
  0 siblings, 0 replies; 63+ messages in thread
From: Linus Torvalds @ 2006-04-25 19:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Tue, 25 Apr 2006, Junio C Hamano wrote:
> 
> Actually, it does help Porcelain to be able to mark unrelated
> crud as 'note'. 

A "note" header that explicitly has no meaning _what-so-ever_ for git 
would be fine. Then the semantics are well-defined, and they really do 
boil down to: random strings that git will ignore, and that won't normally 
be shown by "git log".

Those are actually real semantics, the same way the current "content" is 
real semantics: we don't care about it at all, and we _guarantee_ that we 
don't care about it.

The problem with the proposed "related" thing was that it was somethign 
that git was supposed to care about, but since it had no sane semantics, 
there was no way to _make_ git care about it sanely. That was the problem.

So I'm not objecting to adding headers. I'm objecting to adding headers 
that have insane or badly defined semantics where we might be asked to do 
something for them and different versions of git migth do different 
things. 

			Linus

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
  2006-04-25 18:14                     ` sean
  2006-04-25 18:26                       ` Linus Torvalds
@ 2006-04-25 18:34                       ` Jakub Narebski
  1 sibling, 0 replies; 63+ messages in thread
From: Jakub Narebski @ 2006-04-25 18:34 UTC (permalink / raw)
  To: git

sean wrote:

> On Tue, 25 Apr 2006 11:08:31 -0700 (PDT)
> Linus Torvalds <torvalds@osdl.org> wrote:
> 
>> Which is exactly what I told you to do. Just don't make it a git header.
> 
> Well I just don't see how making it a header, or plopping it at the
> end of a commit message makes an iota of difference to git, while it 
> [storing information in X-* like header] can help porcelain.

And [graphical] history browsers like gitk or qgit.

-- 
Jakub Narebski
Warsaw, Poland

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
  2006-04-25  7:29   ` Junio C Hamano
  2006-04-25  7:43     ` Jakub Narebski
  2006-04-25 15:21     ` Linus Torvalds
@ 2006-04-25 23:18     ` Sam Vilain
  2 siblings, 0 replies; 63+ messages in thread
From: Sam Vilain @ 2006-04-25 23:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, jnareb

Junio C Hamano wrote:

>Here is a related but not necessarily competing idle thought.
>
>How about an ability to "attach" arbitrary objects to commit
>objects?  The commit object would look like:
>
>    tree 0aaa3fecff73ab428999cb9156f8abc075516abe
>    parent 5a6a8c0e012137a3f0059be40ec7b2f4aa614355
>    parent e1cbc46d12a0524fd5e710cbfaf3f178fc3da504
>    related a0e7d36193b96f552073558acf5fcc1f10528917 key
>    related 0032d548db56eac9ea09b4ba05843365f6325b85 cherrypick
>    author Junio C Hamano <junkio@cox.net> 1145943079 -0700
>    committer Junio C Hamano <junkio@cox.net> 1145943079 -0700
>  
>

I agree with the criticisms of the patchset, and I think this is
probably a more comprehensive and less ambiguous solution. I originally
thought that the use cases were close enough together that they could be
called the same thing, but I see now that they are not.

IMHO one important goal is to stop "parent" from meaning anything other
than:

1. for a regular commit, the base for this change. The change consists
of the differences between the two trees.
2. for a "merge", the merge parents for this change. The change consists
of all differences between the index merges (allowing duplicate blobs at
each location) and the final merged tree.

If you were to, for a moving merge head, just record the previous merge
as a "parent", then it would make it difficult to look at the commit
history to figure out which parent links represent the last merge, and
which represent the merge bases.

This suggestion fixes that problem nicely, while being nice and flexible
for solving the other problems too.

>    Merge branch 'pb/config' into next
>
>    * pb/config:
>      Deprecate usage of git-var -l for getting config vars list
>      git-repo-config --list support
>
>The format of "related" attribute is, keyword "related", SP, 40-byte
>hexadecimal object name, SP, and arbitrary sequence of bytes
>except LF and NUL.  Let's call this arbitrary sequence of bytes
>"the nature of relation".
>
>The semantics I would attach to these "related" links are as
>follows:
>
> * To the "core" level git, they do not mean anything other than
>   "you must to have these objects, and objects reachable from
>   them, if you are going to have this commit and claim your
>   repository is without missing objects".
>  
>

This is essentially correct, however you have already described a use
case where you want the behaviour to be to lose the previous commit chain:

>The reason I do not include the previous head when I reconstruct
>"pu" is because I explicitly *want* to drop history -- not
>having to carry forward a failed experiment is what is desired
>there.  Otherwise I would manage "pu" just like I currently do
>"next" and "master".  So this is not a justification to add
>something new.
>  
>

In this case, I think that there are types of relations that are more
along the lines of "don't bother following this link by default, but
warn/fail if it is unavailable depending on the user preferences".

git-fsck could then have options to prune (or archive) certain types of
optional relations. This way people can still record complete history if
they like. And people who want to mark portions of history as bad (such
as, violating copyright law) have a clear way to state that intent.

>That means "git-rev-list --objects" needs to list these objects
>(and if they are tags, commits, and trees, then what are
>reachable from them), and "git-fsck" needs to consider these
>related objects and objects reachable from them are reachable
>from this commit.  NOTHING ELSE NEEDS TO BE DONE by the core
>(obviously, cat-file needs to show them, and commit-tree needs to
>record them, but that goes without saying).
>  
>

Ok, I'll investigate that.

>Then porcelains can agree on what different kinds of nature of
>relation mean and do sensible things.  The earlier "omit the
>cherry-picked ones" example I gave can examine "cherrypick".
>  
>

Sounds good. Let things evolve.

Sam.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links
  2006-04-25  3:54 [RFC] [PATCH 0/5] Implement 'prior' commit object links Sam Vilain
                   ` (7 preceding siblings ...)
  2006-04-25  6:44 ` [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas) Jakub Narebski
@ 2006-04-25 15:10 ` Linus Torvalds
  8 siblings, 0 replies; 63+ messages in thread
From: Linus Torvalds @ 2006-04-25 15:10 UTC (permalink / raw)
  To: Sam Vilain; +Cc: git

On Tue, 25 Apr 2006, Sam Vilain wrote:
>
> This patch series implements "prior" links in commit objects.  A
> 'prior' link on a commit represents its historical precedent, as
> opposed to the previous commit(s) that this commit builds upon.

I really don't think this is worth it.

We already have a very useful notion of "prior" commit that is used daily 
(well, weekly) for the Linux kernel, and it's used for one of the few 
places where this really makes unequivocal sense. "git revert".

It's also implemented in the only way that has clear and unambiguous 
semantics: by putting the prior link into the free-form part. The reason 
this is clear and unambiguous is that it makes it clear that it has no 
actual technical impact on any serious git strategy, ie there is never any 
question of "What does it _mean_?".

At the same time, it gives exactly what you actually _want_ for a prior 
link: it makes it easy to look up the commit that was replaced, or fixed, 
or that is related, or just any random semantics that you can explain 
easily in the text.

Both gitk and qgit already support it, and it's trivially 
cut-and-pasteable from any log message to see what it is when you work on 
the command line too.

In contrast, adding a new header is serious trouble:

 - What does it _mean_ from a technical angle? 

   Does it matter for merging? One of your patches seems to make it so, 
   which is _really_ confusing. Why should it? And does it affect anything 
   else that git does?

   Does "prior" have any meaning for "git-fsck-objects" and/or for object 
   pruning? For "git fetch/pull"?

 - What does it mean from a semantic standpoint?

   Is "prior" a note that something was reverted? Fixed? Changed? 
   Cherry-picked? And if it is Cherry-picked, than I would flat-out refuse 
   to ever merge with a tree that has it, because it pretty much by 
   definition means that the object that "prior" points to simply doesn't 
   _exist_ in my tree (since it was cherry-picked from somebody elses 
   tree). Or that it means that my history got tangled up with the history 
   of the failed branch that needed cherry-picking to clean up..

 - You say that there is just one "prior" parent, but why just one? 
   There's no way to even _think_ about this, since it seems to have no 
   actual semantic meaning.

I think all the problems really boil down to "What does this mean?"

Without an answer to that question, it's just a random feature. It's 
something that you can use and mis-use, but that has no "meaning". It only 
has whatever meaning you personally assign to it, but that implies that 
git shouldn't parse it, and shouldn't care about it.

Which again says that it should act like the current free-form thing does 
so well - it has no meaning, but it allows easy lookups.

		Linus

^ permalink raw reply	[flat|nested] 63+ messages in thread

end of thread, other threads:[~2006-04-29 14:58 UTC | newest]

Thread overview: 63+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-04-25  3:54 [RFC] [PATCH 0/5] Implement 'prior' commit object links Sam Vilain
2006-04-25  4:31 ` [PATCH 2/5] git-merge-base: follow 'prior' links to find merge bases Sam Vilain
2006-04-25  5:19   ` Junio C Hamano
2006-04-25  4:31 ` [PATCH 1/5] add 'prior' link in commit structure Sam Vilain
2006-04-25  5:18   ` Junio C Hamano
2006-04-25  4:31 ` [PATCH 3/5] commit.c: parse 'prior' link Sam Vilain
2006-04-25  4:31 ` [PATCH 5/5] git-commit: add --prior to set prior link Sam Vilain
2006-04-25  4:31 ` [PATCH 4/5] git-commit-tree: add support for prior Sam Vilain
2006-04-25  4:34 ` [RFC] [PATCH 0/5] Implement 'prior' commit object links Sam Vilain
2006-04-25  5:16 ` Junio C Hamano
2006-04-25 23:19   ` Sam Vilain
2006-04-26  5:06     ` Jakub Narebski
2006-04-26  5:22       ` Jakub Narebski
2006-04-26  5:36         ` [OT] " Junio C Hamano
2006-04-26  6:35           ` Jakub Narebski
2006-04-26  6:50             ` Junio C Hamano
2006-04-26  7:22               ` Jakub Narebski
2006-04-26  7:50                 ` Junio C Hamano
2006-04-26  8:44                   ` Jakub Narebski
2006-04-26  9:21                     ` Junio C Hamano
2006-04-26  9:28                   ` Jakub Narebski
2006-04-26  6:51       ` Sam Vilain
2006-04-25  6:44 ` [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas) Jakub Narebski
2006-04-25  7:29   ` Junio C Hamano
2006-04-25  7:43     ` Jakub Narebski
     [not found]       ` <20060425043436.2ff53318.seanlkml@sympatico.ca>
2006-04-25  8:34         ` sean
     [not found]         ` <20060425045752.0c6fbc21.seanlkml@sympatico.ca>
2006-04-25  8:57           ` sean
2006-04-25  9:10             ` Jakub Narebski
2006-04-25  9:58               ` Junio C Hamano
2006-04-25 10:08                 ` Jakub Narebski
2006-04-29 14:59                   ` Jakub Narebski
2006-04-25 15:21     ` Linus Torvalds
2006-04-25 15:40       ` Linus Torvalds
     [not found]         ` <20060425121700.2d1a0032.seanlkml@sympatico.ca>
2006-04-25 16:17           ` sean
2006-04-25 17:04             ` Linus Torvalds
2006-04-26 11:25             ` Andreas Ericsson
2006-04-26 12:01               ` Jakub Narebski
2006-04-25 16:27         ` Jakub Narebski
2006-04-25 17:11           ` Linus Torvalds
2006-04-25 17:36             ` Jakub Narebski
2006-04-25 17:57               ` Linus Torvalds
2006-04-25 18:06                 ` Linus Torvalds
2006-04-25 18:24                   ` Jakub Narebski
     [not found]             ` <20060425135250.5fd889f4.seanlkml@sympatico.ca>
2006-04-25 17:52               ` sean
2006-04-25 18:08                 ` Linus Torvalds
     [not found]                   ` <20060425141412.5c115f51.seanlkml@sympatico.ca>
2006-04-25 18:14                     ` sean
2006-04-25 18:26                       ` Linus Torvalds
2006-04-25 18:41                         ` Jakub Narebski
2006-04-25 18:52                           ` Linus Torvalds
2006-04-25 19:00                             ` Jakub Narebski
2006-04-25 22:17                               ` Jason Riedy
     [not found]                         ` <20060425144525.3ef957cf.seanlkml@sympatico.ca>
2006-04-25 18:45                           ` sean
2006-04-25 19:00                             ` Linus Torvalds
2006-04-25 19:18                               ` Junio C Hamano
2006-04-25 19:34                                 ` Linus Torvalds
2006-04-25 19:51                                   ` Junio C Hamano
2006-04-25 19:58                                     ` Linus Torvalds
2006-04-26 12:42                                 ` Jakub Narebski
2006-04-25 19:00                         ` Junio C Hamano
2006-04-25 19:09                           ` Linus Torvalds
2006-04-25 18:34                       ` Jakub Narebski
2006-04-25 23:18     ` Sam Vilain
2006-04-25 15:10 ` [RFC] [PATCH 0/5] Implement 'prior' commit object links Linus Torvalds

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).