Git development


* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links
From: Junio C Hamano @ 2006-04-25  5:16 UTC
  To: Sam Vilain; +Cc: git
In-Reply-To: <20060425035421.18382.51677.stgit@localhost.localdomain>

Sam Vilain <sam.vilain@catalyst.net.nz> writes:

> Examples of use cases this helps:

My reaction to this patch series is that you try to cover quite
different and unrelated things, without thinking things through,
and end up covering nothing usefully.  What is missing in these
"use cases" is a coherent semantics.

That is, what "prior" means to humans and tools.  And my *guess*
at what it means suggests you are trying to make it cover many
unrelated concepts.

>  1. heads that represent topic branch merges
>
>     This is the "pu" branch case, where the head is a merge of several
>     topic branches that is continually moved forward.

For usage like "pu", the previous "pu" head could be recorded as
one of the parents; you do not need anything special.

The reason I do not include the previous head when I reconstruct
"pu" is because I explicitly *want* to drop history -- not
having to carry forward a failed experiment is what is desired
there.  Otherwise I would manage "pu" just like I currently do
"next" and "master".  So this is not a justification to add
something new.

>  2. revising published commits / re-basing
>
>     This is what "stg" et al do.  The tools allow you to commit,
>     rewind, revise, recommit, fast forward, etc.

stg wants to have a link to the fork-point commit.  I do not
know if it is absolutely necessary (you might be able to figure
it out using merge-base, I dunno).

>     In this case, the "prior" link would point to the last revision of
>     a patch.  Tools would probably

Probably what...???

>  3. sub-projects
>
>     In this case, the commit on the "main" commit line would have a
>     "prior" link to the commit on the sub-project.  The sub-project
>     would effectively be its own head with copied commit objects on
>     the main head.

You say you can have only one "prior" per commit, which makes
this unsuitable to bind multiple subprojects into a larger
project (the earlier "bind" proposal allows zero or more).

When you, a human, see a "prior" link in "git cat-file commit"
output, what does that tell you?  Is it "the previous commit
this thing replaces?"  Or is it a commit in a different line of
development which is its subproject?  Or is it a commit that was
cherry-picked from a different line?  How would you tell?  And
assuming you _could_ somehow tell, how would it help you to know
it?

When the Plumbing and the Porcelain see a "prior" link, what
should they do?  It hugely depends on what that link means.  You
have a patch to merge-base to include the prior commit of the
commit in question in the ancestry chain, but that is probably
valid only for case 1. and perhaps 2.  If the link points at a
commit on an otherwise unrelated subproject head, you would _never_
want to include that in the merge-base computation.  Nor would you
want to include the "this commit was taken out of context from an
otherwise unrelated branch" link you envision for case 4.  I think
including "prior" in the ancestry list for cases 1. and 2. makes
some sense in the merge-base example only because (1) it does not
have to be any different from an ordinary "parent" to begin with
for case 1., and (2) for case 2. it points at the fork-point, which
is sort of a merge-base already.

There may be some narrower concrete use case for which you can
devise coherent semantics, and teach tools and humans how to
interpret such inter-commit relationships that are _not_
parent-child ancestry.  For example, if you have one special
link to point at a "cherry-picked" commit, rebasing _could_ take
advantage of it.  When your side branch tip is at D, and commit
D has "this was cherry-picked from commit E" note, and if you
are rebasing your work on top of F:

        A---B---C---D
       /
  o---o---E---F

the tool can notice that F can reach E and carry forward only A,
B, and C on top of F, omitting D.  So having such a link might
be useful.  But if that is what you are going to do, I do not
think you would want to conflate that with other inter-commit
relationships, such as "previous hydra cap".
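
A minimal sketch of the rebase idea above, under stated assumptions:
toy_commit, picked_from, reaches() and commits_to_replay() are
hypothetical names for illustration, not git's data structures, and
each toy commit has at most one parent.  It walks the side branch from
its tip back to the fork point and drops any commit whose cherry-pick
source is already reachable from the new base.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy commit: one parent at most, plus an optional
 * "cherry-picked from" link.  This is not git's struct commit. */
struct toy_commit {
	const char *name;
	const struct toy_commit *parent;      /* NULL for a root commit */
	const struct toy_commit *picked_from; /* NULL if not a cherry-pick */
};

/* Return 1 if 'target' is 'from' itself or one of its ancestors. */
static int reaches(const struct toy_commit *from,
		   const struct toy_commit *target)
{
	for (; from; from = from->parent)
		if (from == target)
			return 1;
	return 0;
}

/*
 * Collect the commits between 'fork' (exclusive) and 'tip' (inclusive)
 * that still need replaying on top of 'onto'.  A commit is skipped when
 * the commit it was cherry-picked from is reachable from 'onto'.
 * Results are stored oldest first; the count is returned.
 */
static size_t commits_to_replay(const struct toy_commit *tip,
				const struct toy_commit *fork,
				const struct toy_commit *onto,
				const struct toy_commit **out, size_t max)
{
	const struct toy_commit *c;
	size_t n = 0, i;

	assert(out != NULL || max == 0);
	for (c = tip; c && c != fork; c = c->parent) {
		if (c->picked_from && reaches(onto, c->picked_from))
			continue; /* already present in the new base */
		if (n < max)
			out[n++] = c;
	}
	/* The walk went newest to oldest; reverse it for replaying. */
	for (i = 0; i < n / 2; i++) {
		const struct toy_commit *tmp = out[i];
		out[i] = out[n - 1 - i];
		out[n - 1 - i] = tmp;
	}
	return n;
}
```

For the graph drawn above (D picked from E, rebasing onto F) the
function would hand back A, B and C, leaving D out.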

Oh, and you would need an update to rev-list --objects and
fsck-objects if you are to add any new link to commit objects.
Otherwise fetch/push would not get the related commits prior
points at, and prune will happily discard them.  But before even
bothering with that, you need to come up with a semantics first.
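
To illustrate why reachability walkers would need to learn about any
new link, here is a toy mark phase.  It assumes a hypothetical toy_obj
with a "prior" field; none of this is git's real object code.  With
follow_prior off (roughly how an unmodified walker would behave), a
commit that only a "prior" link points at is never marked, so a
prune-like sweep would treat it as garbage.

```c
#include <stddef.h>

#define TOY_MAX_PARENTS 2

/* Hypothetical toy object, for illustration only. */
struct toy_obj {
	struct toy_obj *parents[TOY_MAX_PARENTS]; /* unused slots are NULL */
	struct toy_obj *prior;                    /* the proposed extra link */
	int reachable;                            /* set by mark_reachable() */
};

/*
 * Mark everything reachable from 'c'.  Parent links are always
 * followed; the "prior" link is followed only when follow_prior is set.
 */
static void mark_reachable(struct toy_obj *c, int follow_prior)
{
	int i;

	if (!c || c->reachable)
		return;
	c->reachable = 1;
	for (i = 0; i < TOY_MAX_PARENTS; i++)
		mark_reachable(c->parents[i], follow_prior);
	if (follow_prior)
		mark_reachable(c->prior, follow_prior);
}
```

Anything left unmarked after the walk is what a transfer would fail to
send and what a prune would be free to delete.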


* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links
From: Sam Vilain @ 2006-04-25  4:34 UTC
  To: git
In-Reply-To: <20060425035421.18382.51677.stgit@localhost.localdomain>

Sam Vilain wrote:

>    In this case, the "prior" link would point to the last revision of
>    a patch.  Tools would probably
>  
>
... support only doing this for selected, "published" patch chains


* [RFC] [PATCH 0/5] Implement 'prior' commit object links
From: Sam Vilain @ 2006-04-25  3:54 UTC
  To: git

This patch series implements "prior" links in commit objects.  A
'prior' link on a commit represents its historical precedent, as
opposed to the previous commit(s) that this commit builds upon.

This is a proof of concept only; there is an outstanding bug (I put
the prior header right after parent, when it should really go after
author/committer), and room for improvement no doubt remains elsewhere.
Not to mention my shocking C coding style ;)

Examples of use cases this helps:

 1. heads that represent topic branch merges

    This is the "pu" branch case, where the head is a merge of several
    topic branches that is continually moved forward.

    topic branches     head
      ,___.   ,___.
     | TA1 | | TB1 |
      `---'   `---'    ,__.
         ^\_____^\____| H1 |
                       `--'

    + some topic branch changes and a republish:

      ,___.   ,___.
     | TA1 | | TB1 |
      `---'   `---'^   ,__.
        |^\_____^\____| H1 |
        |       |      `--'
      ,_|_.   ,_|_.      P
     | TA2 | | TB2 |     |
      `---'   `---'^     |
        ^       ^        |
      ,_|_.     |        |
     | TA3 |    |        |
      `---'     |      ,__.
         ^\______\____| H2 |
                       `--'

    key:  ^ = parent   P = prior

 2. revising published commits / re-basing

    This is what "stg" et al do.  The tools allow you to commit,
    rewind, revise, recommit, fast forward, etc.

    In this case, the "prior" link would point to the last revision of
    a patch.  Tools would probably

 3. sub-projects

    In this case, the commit on the "main" commit line would have a
    "prior" link to the commit on the sub-project.  The sub-project
    would effectively be its own head with copied commit objects on
    the main head.

 4. tracking cherry picking

    In this case, the "prior" link just points to the commit that was
    cherry picked.  This is perhaps a little different, but an idea
    that somebody else had for this feature.

Sam.


* [PATCH 4/5] git-commit-tree: add support for prior
From: Sam Vilain @ 2006-04-25  4:31 UTC
  To: git
In-Reply-To: <20060425035421.18382.51677.stgit@localhost.localdomain>

From: Sam Vilain <sam.vilain@catalyst.net.nz>

Add support in git-commit-tree for -r as well as associated
documentation.
---

 Documentation/git-commit-tree.txt |    6 ++++++
 commit-tree.c                     |   26 +++++++++++++++++++++-----
 2 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-commit-tree.txt b/Documentation/git-commit-tree.txt
index 27b3d12..e11ba1f 100644
--- a/Documentation/git-commit-tree.txt
+++ b/Documentation/git-commit-tree.txt
@@ -20,6 +20,9 @@ A commit object usually has 1 parent (a 
 to 16 parents.  More than one parent represents a merge of branches
 that led to them.
 
+A commit object can have 1 prior commit.  This represents the previous
+commit that this one replaces (including history).
+
 While a tree represents a particular directory state of a working
 directory, a commit represents that state in "time", and explains how
 to get there.
@@ -38,6 +41,8 @@ OPTIONS
 -p <parent commit>::
 	Each '-p' indicates the id of a parent commit object.
 	
+-r <other commit>::
+	One '-r' indicates the id of a prior commit object.
 
 Commit Information
 ------------------
@@ -45,6 +50,7 @@ Commit Information
 A commit encapsulates:
 
 - all parent object ids
+- a prior object id (optional)
 - author name, email and date
 - committer name and email and the commit time.
 
diff --git a/commit-tree.c b/commit-tree.c
index 2d86518..6660b01 100644
--- a/commit-tree.c
+++ b/commit-tree.c
@@ -61,8 +61,9 @@ static void check_valid(unsigned char *s
  */
 #define MAXPARENT (16)
 static unsigned char parent_sha1[MAXPARENT][20];
+static unsigned char prior_sha1[21] = "\0";
 
-static const char commit_tree_usage[] = "git-commit-tree <sha1> [-p <sha1>]* < changelog";
+static const char commit_tree_usage[] = "git-commit-tree <sha1> [-p <sha1>]* [-r <sha1>] < changelog";
 
 static int new_parent(int idx)
 {
@@ -99,11 +100,22 @@ int main(int argc, char **argv)
 	for (i = 2; i < argc; i += 2) {
 		char *a, *b;
 		a = argv[i]; b = argv[i+1];
-		if (!b || strcmp(a, "-p") || get_sha1(b, parent_sha1[parents]))
+		if (!b)
 			usage(commit_tree_usage);
-		check_valid(parent_sha1[parents], commit_type);
-		if (new_parent(parents))
-			parents++;
+		if (!strcmp(a, "-p")) {
+			if (get_sha1(b, parent_sha1[parents]) < 0)
+				usage(commit_tree_usage);
+			check_valid(parent_sha1[parents], commit_type);
+			if (new_parent(parents))
+				parents++;
+		}
+		else if (!strcmp(a, "-r")) {
+			if (strcmp(&prior_sha1, "") || get_sha1(b, &prior_sha1) < 0)
+				usage(commit_tree_usage);
+		}
+		else {
+			usage(commit_tree_usage);
+		}
 	}
 	if (!parents)
 		fprintf(stderr, "Committing initial tree %s\n", argv[1]);
@@ -118,6 +130,10 @@ int main(int argc, char **argv)
 	 */
 	for (i = 0; i < parents; i++)
 		add_buffer(&buffer, &size, "parent %s\n", sha1_to_hex(parent_sha1[i]));
+	if (strcmp(&prior_sha1, "")) {
+		fprintf(stderr, "Setting prior to %s\n", sha1_to_hex(&prior_sha1));
+		add_buffer(&buffer, &size, "prior %s\n", sha1_to_hex(&prior_sha1));
+	}
 
 	/* Person/date information */
 	add_buffer(&buffer, &size, "author %s\n", git_author_info(1));
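
One thing worth noting about the diff above: it tests whether a prior
was given with strcmp() on the raw 20-byte object name.  A raw name is
binary, so any name whose first byte happens to be zero would look like
the empty string.  The sketch below shows the effect and one possible
alternative, an explicit flag.  prior_arg, set_prior() and
looks_unset_strcmp() are hypothetical names, not part of the patch.

```c
#include <string.h>

#define TOY_SHA1_LEN 20

/* Mimics the strcmp()-against-"" test: true whenever byte 0 is zero. */
static int looks_unset_strcmp(const unsigned char *sha1)
{
	return strcmp((const char *)sha1, "") == 0;
}

/* Hypothetical holder that tracks "was -r given" separately. */
struct prior_arg {
	unsigned char sha1[TOY_SHA1_LEN];
	int given;
};

/*
 * Record the prior object name.  Returns 0 on success, or -1 if a
 * prior was already recorded (the caller would print usage).
 */
static int set_prior(struct prior_arg *p, const unsigned char *sha1)
{
	if (p->given)
		return -1;
	memcpy(p->sha1, sha1, TOY_SHA1_LEN);
	p->given = 1;
	return 0;
}
```

With the flag, a name such as 00abcd... is still written to the commit
and a second -r is still rejected, whatever the bytes are.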


* [PATCH 1/5] add 'prior' link in commit structure
From: Sam Vilain @ 2006-04-25  4:31 UTC
  To: git
In-Reply-To: <20060425035421.18382.51677.stgit@localhost.localdomain>

From: Sam Vilain <sam.vilain@catalyst.net.nz>

Add a space in the commit for a prior commit that forms this commit's
historical, not substantial, precedent.

For now this is just recorded as a char* pointer, as it is not an
error condition for the commit not to be present locally.
---

 commit.h |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/commit.h b/commit.h
index de142af..b00a6b9 100644
--- a/commit.h
+++ b/commit.h
@@ -13,6 +13,7 @@ struct commit {
 	struct object object;
 	unsigned long date;
 	struct commit_list *parents;
+	char *prior;
 	struct tree *tree;
 	char *buffer;
 };


* [PATCH 3/5] commit.c: parse 'prior' link
From: Sam Vilain @ 2006-04-25  4:31 UTC
  To: git
In-Reply-To: <20060425035421.18382.51677.stgit@localhost.localdomain>

From: Sam Vilain <sam.vilain@catalyst.net.nz>

Parse for the 'prior' link in a commit
---

 commit.c |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/commit.c b/commit.c
index 2717dd8..e4bc396 100644
--- a/commit.c
+++ b/commit.c
@@ -260,6 +260,18 @@ int parse_commit_buffer(struct commit *i
 			n_refs++;
 		}
 	}
+	if (!memcmp(bufptr, "prior ", 6)) {
+		unsigned char prior[20];
+		if (get_sha1_hex(bufptr + 6, prior) || bufptr[46] != '\n')
+			return error("bad prior in commit %s", sha1_to_hex(item->object.sha1));
+		bufptr += 47;
+
+		item->prior = xmalloc(21);
+		memcpy(item->prior, prior, 20);
+		item->prior[20] = '\0';
+	} else {
+		item->prior = 0;
+	}
 	if (graft) {
 		int i;
 		struct commit *new_parent;
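
A stand-alone sketch of the header parsing this hunk performs, for
readers who want to try it outside git.  parse_prior_line() and
hex_val() are made-up helpers, not git functions.  The layout assumed
is the one the diff checks for: "prior ", 40 hex digits, then a
newline, 47 bytes in all.  Object names are raw bytes, so the sketch
copies them with memcpy() rather than a string function, which would
stop at the first zero byte.

```c
#include <string.h>

/* Value of one hex digit, or -1 if 'c' is not a hex digit. */
static int hex_val(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Parse "prior <40 hex digits>\n" at the start of 'line', of which
 * 'len' bytes are readable.  Returns 47 (bytes consumed) on success,
 * 0 if the line is not a prior header, and -1 if it is malformed.
 * 'out' is written only on success.
 */
static int parse_prior_line(const char *line, size_t len,
			    unsigned char out[20])
{
	unsigned char tmp[20];
	int i;

	if (len < 6 || memcmp(line, "prior ", 6))
		return 0;
	if (len < 47 || line[46] != '\n')
		return -1;
	for (i = 0; i < 20; i++) {
		int hi = hex_val(line[6 + 2 * i]);
		int lo = hex_val(line[7 + 2 * i]);
		if (hi < 0 || lo < 0)
			return -1;
		tmp[i] = (unsigned char)((hi << 4) | lo);
	}
	memcpy(out, tmp, sizeof(tmp));
	return 47;
}
```

Returning the number of bytes consumed lets the caller advance its
buffer pointer, much as the hunk does with "bufptr += 47".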


* [PATCH 2/5] git-merge-base: follow 'prior' links to find merge bases
From: Sam Vilain @ 2006-04-25  4:31 UTC
  To: git
In-Reply-To: <20060425035421.18382.51677.stgit@localhost.localdomain>

From: Sam Vilain <sam.vilain@catalyst.net.nz>

It is possible that a good merge base may be found looking via "prior"
links as well.  We follow them where possible.
---

 merge-base.c |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/merge-base.c b/merge-base.c
index 07f5ab4..ed6d18c 100644
--- a/merge-base.c
+++ b/merge-base.c
@@ -207,6 +207,18 @@ static int merge_base(struct commit *rev
 			p->object.flags |= flags;
 			insert_by_date(p, &list);
 		}
+		/* If the commit has a "prior" reference, add it */
+		if (commit->prior) {
+			struct commit *prior;
+			prior = lookup_commit_reference_gently(commit->prior, 1);
+			if (prior) {
+				if ((prior->object.flags & flags) != flags) {
+					parse_commit(prior);
+					prior->object.flags |= flags;
+					insert_by_date(prior, &list);
+				}
+			}
+		}
 	}
 
 	if (!result)


* [PATCH 5/5] git-commit: add --prior to set prior link
From: Sam Vilain @ 2006-04-25  4:31 UTC
  To: git
In-Reply-To: <20060425035421.18382.51677.stgit@localhost.localdomain>

From: Sam Vilain <sam.vilain@catalyst.net.nz>

Add command-line support for --prior and add a description to the
ASCIIDOC
---

 Documentation/git-commit.txt |   10 ++++++++++
 git-commit.sh                |   19 +++++++++++++++++--
 2 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-commit.txt b/Documentation/git-commit.txt
index 6f2c495..ca5073c 100644
--- a/Documentation/git-commit.txt
+++ b/Documentation/git-commit.txt
@@ -10,6 +10,7 @@ SYNOPSIS
 [verse]
 'git-commit' [-a] [-s] [-v] [(-c | -C) <commit> | -F <file> | -m <msg>]
 	   [--no-verify] [--amend] [-e] [--author <author>]
+           [-p <commit>]
 	   [--] [[-i | -o ]<file>...]
 
 DESCRIPTION
@@ -106,6 +107,15 @@ but can be used to amend a merge commit.
 	index and the latest commit does not match on the
 	specified paths to avoid confusion.
 
+-p|--prior <commit>::
+	Specify a commit that this new commit is the next version of.
+        Use when you want a branch to supersede another branch, but
+        with a new commit history.  It is also used for sub-projects,
+        where commits on the parent tree mirror commits in the
+        sub-project.  <commit> does not have to exist in the local
+        repository, if it is specified as a full 40-digit hex SHA1
+        sum.  Otherwise it is parsed as a local revision.
+
 --::
 	Do not interpret any more arguments as options.
 
diff --git a/git-commit.sh b/git-commit.sh
index 26cd7ca..3feb60d 100755
--- a/git-commit.sh
+++ b/git-commit.sh
@@ -3,7 +3,7 @@ #
 # Copyright (c) 2005 Linus Torvalds
 # Copyright (c) 2006 Junio C Hamano
 
-USAGE='[-a] [-s] [-v] [--no-verify] [-m <message> | -F <logfile> | (-C|-c) <commit>) [--amend] [-e] [--author <author>] [[-i | -o] <path>...]'
+USAGE='[-a] [-s] [-v] [--no-verify] [-m <message> | -F <logfile> | (-C|-c) <commit>) [--amend] [-e] [--author <author>] [-p <commit>] [[-i | -o] <path>...]'
 SUBDIRECTORY_OK=Yes
 . git-sh-setup
 
@@ -200,6 +200,7 @@ log_given=
 log_message=
 verify=t
 verbose=
+prior=
 signoff=
 force_author=
 only_include_assumed=
@@ -344,6 +345,19 @@ do
       shift
       break
       ;;
+  -p|--p|--pr|--pri|--prio|--prior)
+      shift
+      prior="$1"
+      if echo "$prior" | perl -ne 'exit 1 unless /^[0-9a-f]{40}$/i'
+      then
+          prior=`echo "$prior" | tr '[A-Z]' '[a-z]'`
+      else
+	  prior=`git-rev-parse "$prior"`
+	  [ -n "$prior" ] || exit 1
+      fi
+      PRIOR="-r $prior"
+      shift
+      ;;
   -*)
       usage
       ;;
@@ -602,6 +616,7 @@ then
 		PARENTS=$(git-cat-file commit HEAD |
 			sed -n -e '/^$/q' -e 's/^parent /-p /p')
 	fi
+	
 	current=$(git-rev-parse --verify HEAD)
 else
 	if [ -z "$(git-ls-files)" ]; then
@@ -673,7 +688,7 @@ then
 		tree=$(GIT_INDEX_FILE="$TMP_INDEX" git-write-tree) &&
 		rm -f "$TMP_INDEX"
 	fi &&
-	commit=$(cat "$GIT_DIR"/COMMIT_MSG | git-commit-tree $tree $PARENTS) &&
+	commit=$(cat "$GIT_DIR"/COMMIT_MSG | git-commit-tree $tree $PARENTS $PRIOR) &&
 	git-update-ref HEAD $commit $current &&
 	rm -f -- "$GIT_DIR/MERGE_HEAD" &&
 	if test -f "$NEXT_INDEX"
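
The option handling in the diff above takes <commit> verbatim only when
it is a full 40-digit hex name, folding it to lower case, and otherwise
hands it to git-rev-parse.  A rough C equivalent of just that first
check, as a sketch; normalize_full_sha1() is a made-up name, not an
existing git function.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * If 's' consists of exactly 40 hex digits, store its lower-case form
 * (NUL-terminated) in 'out' and return 1.  Otherwise return 0; 'out'
 * may then hold a partial copy and should be ignored.
 */
static int normalize_full_sha1(const char *s, char out[41])
{
	size_t i;

	for (i = 0; i < 40; i++) {
		unsigned char c = (unsigned char)s[i];
		if (!isxdigit(c))
			return 0; /* also stops at an early NUL */
		out[i] = (char)tolower(c);
	}
	if (s[40] != '\0')
		return 0; /* longer than 40 characters */
	out[40] = '\0';
	return 1;
}
```

A caller would fall back to revision parsing whenever this returns 0,
which is where names like "HEAD" or abbreviated hashes end up.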


* [PATCH] split the diff-delta interface
From: Nicolas Pitre @ 2006-04-25  3:07 UTC
  To: Junio C Hamano; +Cc: git

This patch splits the diff-delta interface into index creation and delta
generation.  A wrapper is provided to preserve the diff-delta() call.

This will allow for an optimization in pack-objects.c where the source
object could be fixed and a full window of objects tentatively tried
against that same source object without recomputing the source index
each time.

This patch only restructures things, plus a couple of cleanups for good
measure.  There is no performance change yet.

Signed-off-by: Nicolas Pitre <nico@cam.org>
---


diff --git a/delta.h b/delta.h
index 9464f3e..9ef44c1 100644
--- a/delta.h
+++ b/delta.h
@@ -1,12 +1,73 @@
 #ifndef DELTA_H
 #define DELTA_H
 
-/* handling of delta buffers */
-extern void *diff_delta(void *from_buf, unsigned long from_size,
-			void *to_buf, unsigned long to_size,
-		        unsigned long *delta_size, unsigned long max_size);
-extern void *patch_delta(void *src_buf, unsigned long src_size,
-			 void *delta_buf, unsigned long delta_size,
+/* opaque object for delta index */
+struct delta_index;
+
+/*
+ * create_delta_index: compute index data from given buffer
+ *
+ * This returns a pointer to a struct delta_index that should be passed to
+ * subsequent create_delta() calls, or to free_delta_index().  A NULL pointer
+ * is returned on failure.  The given buffer must not be freed nor altered
+ * before free_delta_index() is called.  The returned pointer must be freed
+ * using free_delta_index().
+ */
+extern struct delta_index *
+create_delta_index(const void *buf, unsigned long bufsize);
+
+/*
+ * free_delta_index: free the index created by create_delta_index()
+ */
+extern void free_delta_index(struct delta_index *index);
+
+/*
+ * create_delta: create a delta from given index for the given buffer
+ *
+ * This function may be called multiple times with different buffers using
+ * the same delta_index pointer.  If max_delta_size is non-zero and the
+ * resulting delta is to be larger than max_delta_size then NULL is returned.
+ * On success, a non-NULL pointer to the buffer with the delta data is
+ * returned and *delta_size is updated with its size.  The returned buffer
+ * must be freed by the caller.
+ */
+extern void *
+create_delta(const struct delta_index *index,
+	     const void *buf, unsigned long bufsize,
+	     unsigned long *delta_size, unsigned long max_delta_size);
+
+/*
+ * diff_delta: create a delta from source buffer to target buffer
+ *
+ * If max_delta_size is non-zero and the resulting delta is to be larger
+ * than max_delta_size then NULL is returned.  On success, a non-NULL
+ * pointer to the buffer with the delta data is returned and *delta_size is
+ * updated with its size.  The returned buffer must be freed by the caller.
+ */
+static inline void *
+diff_delta(const void *src_buf, unsigned long src_bufsize,
+	   const void *trg_buf, unsigned long trg_bufsize,
+	   unsigned long *delta_size, unsigned long max_delta_size)
+{
+	struct delta_index *index = create_delta_index(src_buf, src_bufsize);
+	if (index) {
+		void *delta = create_delta(index, trg_buf, trg_bufsize, 
+					   delta_size, max_delta_size);
+		free_delta_index(index);
+		return delta;
+	}
+	return NULL;
+}
+
+/*
+ * patch_delta: recreate target buffer given source buffer and delta data
+ *
+ * On success, a non-NULL pointer to the target buffer is returned and
+ * *trg_bufsize is updated with its size.  On failure a NULL pointer is
+ * returned.  The returned buffer must be freed by the caller.
+ */
+extern void *patch_delta(const void *src_buf, unsigned long src_size,
+			 const void *delta_buf, unsigned long delta_size,
 			 unsigned long *dst_size);
 
 /* the smallest possible delta size is 4 bytes */
@@ -14,7 +75,7 @@ #define DELTA_SIZE_MIN	4
 
 /*
  * This must be called twice on the delta data buffer, first to get the
- * expected reference buffer size, and again to get the result buffer size.
+ * expected source buffer size, and again to get the target buffer size.
  */
 static inline unsigned long get_delta_hdr_size(const unsigned char **datap,
 					       const unsigned char *top)
diff --git a/diff-delta.c b/diff-delta.c
index 1188b31..fdedf94 100644
--- a/diff-delta.c
+++ b/diff-delta.c
@@ -27,53 +27,70 @@ #include "delta.h"
 /* block size: min = 16, max = 64k, power of 2 */
 #define BLK_SIZE 16
 
-#define MIN(a, b) ((a) < (b) ? (a) : (b))
+/* maximum hash entry list for the same hash bucket */
+#define HASH_LIMIT 64
 
 #define GR_PRIME 0x9e370001
 #define HASH(v, shift) (((unsigned int)(v) * GR_PRIME) >> (shift))
 
-struct index {
+struct index_entry {
 	const unsigned char *ptr;
 	unsigned int val;
-	struct index *next;
+	struct index_entry *next;
 };
 
-static struct index ** delta_index(const unsigned char *buf,
-				   unsigned long bufsize,
-				   unsigned long trg_bufsize,
-				   unsigned int *hash_shift)
+struct delta_index {
+	const void *src_buf;
+	unsigned long src_size;
+	unsigned int hash_shift;
+	struct index_entry *hash[0];
+};
+
+struct delta_index * create_delta_index(const void *buf, unsigned long bufsize)
 {
-	unsigned int i, hsize, hshift, hlimit, entries, *hash_count;
-	const unsigned char *data;
-	struct index *entry, **hash;
+	unsigned int i, hsize, hshift, entries, *hash_count;
+	const unsigned char *data, *buffer = buf;
+	struct delta_index *index;
+	struct index_entry *entry, **hash;
 	void *mem;
 
+	if (!buf || !bufsize)
+		return NULL;
+
 	/* determine index hash size */
 	entries = bufsize  / BLK_SIZE;
 	hsize = entries / 4;
 	for (i = 4; (1 << i) < hsize && i < 31; i++);
 	hsize = 1 << i;
 	hshift = 32 - i;
-	*hash_shift = hshift;
 
 	/* allocate lookup index */
-	mem = malloc(hsize * sizeof(*hash) + entries * sizeof(*entry));
+	mem = malloc(sizeof(*index) +
+		     sizeof(*hash) * hsize +
+		     sizeof(*entry) * entries);
 	if (!mem)
 		return NULL;
+	index = mem;
+	mem = index + 1;
 	hash = mem;
-	entry = mem + hsize * sizeof(*hash);
+	mem = hash + hsize;
+	entry = mem;
+
+	index->src_buf = buf;
+	index->src_size = bufsize;
+	index->hash_shift = hshift;
 	memset(hash, 0, hsize * sizeof(*hash));
 
 	/* allocate an array to count hash entries */
 	hash_count = calloc(hsize, sizeof(*hash_count));
 	if (!hash_count) {
-		free(hash);
+		free(index);
 		return NULL;
 	}
 
 	/* then populate the index */
-	data = buf + entries * BLK_SIZE - BLK_SIZE;
-	while (data >= buf) {
+	data = buffer + entries * BLK_SIZE - BLK_SIZE;
+	while (data >= buffer) {
 		unsigned int val = adler32(0, data, BLK_SIZE);
 		i = HASH(val, hshift);
 		entry->ptr = data;
@@ -91,27 +108,18 @@ static struct index ** delta_index(const
 	 * bucket that would bring us to O(m*n) computing costs (m and n
 	 * corresponding to reference and target buffer sizes).
 	 *
-	 * The more the target buffer is large, the more it is important to
-	 * have small entry lists for each hash buckets.  With such a limit
-	 * the cost is bounded to something more like O(m+n).
-	 */
-	hlimit = (1 << 26) / trg_bufsize;
-	if (hlimit < 4*BLK_SIZE)
-		hlimit = 4*BLK_SIZE;
-
-	/*
-	 * Now make sure none of the hash buckets has more entries than
+	 * Make sure none of the hash buckets has more entries than
 	 * we're willing to test.  Otherwise we cull the entry list
 	 * uniformly to still preserve a good repartition across
 	 * the reference buffer.
 	 */
 	for (i = 0; i < hsize; i++) {
-		if (hash_count[i] < hlimit)
+		if (hash_count[i] < HASH_LIMIT)
 			continue;
 		entry = hash[i];
 		do {
-			struct index *keep = entry;
-			int skip = hash_count[i] / hlimit / 2;
+			struct index_entry *keep = entry;
+			int skip = hash_count[i] / HASH_LIMIT / 2;
 			do {
 				entry = entry->next;
 			} while(--skip && entry);
@@ -120,7 +128,12 @@ static struct index ** delta_index(const
 	}
 	free(hash_count);
 
-	return hash;
+	return index;
+}
+
+void free_delta_index(struct delta_index *index)
+{
+	free(index);
 }
 
 /* provide the size of the copy opcode given the block offset and size */
@@ -131,21 +144,17 @@ #define COPYOP_SIZE(o, s) \
 /* the maximum size for any opcode */
 #define MAX_OP_SIZE COPYOP_SIZE(0xffffffff, 0xffffffff)
 
-void *diff_delta(void *from_buf, unsigned long from_size,
-		 void *to_buf, unsigned long to_size,
-		 unsigned long *delta_size,
-		 unsigned long max_size)
+void *
+create_delta(const struct delta_index *index,
+	     const void *trg_buf, unsigned long trg_size,
+	     unsigned long *delta_size, unsigned long max_size)
 {
 	unsigned int i, outpos, outsize, hash_shift;
 	int inscnt;
 	const unsigned char *ref_data, *ref_top, *data, *top;
 	unsigned char *out;
-	struct index *entry, **hash;
 
-	if (!from_size || !to_size)
-		return NULL;
-	hash = delta_index(from_buf, from_size, to_size, &hash_shift);
-	if (!hash)
+	if (!trg_buf || !trg_size)
 		return NULL;
 
 	outpos = 0;
@@ -153,60 +162,55 @@ void *diff_delta(void *from_buf, unsigne
 	if (max_size && outsize >= max_size)
 		outsize = max_size + MAX_OP_SIZE + 1;
 	out = malloc(outsize);
-	if (!out) {
-		free(hash);
+	if (!out)
 		return NULL;
-	}
-
-	ref_data = from_buf;
-	ref_top = from_buf + from_size;
-	data = to_buf;
-	top = to_buf + to_size;
 
 	/* store reference buffer size */
-	out[outpos++] = from_size;
-	from_size >>= 7;
-	while (from_size) {
-		out[outpos - 1] |= 0x80;
-		out[outpos++] = from_size;
-		from_size >>= 7;
+	i = index->src_size;
+	while (i >= 0x80) {
+		out[outpos++] = i | 0x80;
+		i >>= 7;
 	}
+	out[outpos++] = i;
 
 	/* store target buffer size */
-	out[outpos++] = to_size;
-	to_size >>= 7;
-	while (to_size) {
-		out[outpos - 1] |= 0x80;
-		out[outpos++] = to_size;
-		to_size >>= 7;
+	i = trg_size;
+	while (i >= 0x80) {
+		out[outpos++] = i | 0x80;
+		i >>= 7;
 	}
+	out[outpos++] = i;
 
+	ref_data = index->src_buf;
+	ref_top = ref_data + index->src_size;
+	data = trg_buf;
+	top = trg_buf + trg_size;
+	hash_shift = index->hash_shift;
 	inscnt = 0;
 
 	while (data < top) {
 		unsigned int moff = 0, msize = 0;
-		if (data + BLK_SIZE <= top) {
-			unsigned int val = adler32(0, data, BLK_SIZE);
-			i = HASH(val, hash_shift);
-			for (entry = hash[i]; entry; entry = entry->next) {
-				const unsigned char *ref = entry->ptr;
-				const unsigned char *src = data;
-				unsigned int ref_size = ref_top - ref;
-				if (entry->val != val)
-					continue;
-				if (ref_size > top - src)
-					ref_size = top - src;
-				if (ref_size > 0x10000)
-					ref_size = 0x10000;
-				if (ref_size <= msize)
-					break;
-				while (ref_size-- && *src++ == *ref)
-					ref++;
-				if (msize < ref - entry->ptr) {
-					/* this is our best match so far */
-					msize = ref - entry->ptr;
-					moff = entry->ptr - ref_data;
-				}
+		struct index_entry *entry;
+		unsigned int val = adler32(0, data, BLK_SIZE);
+		i = HASH(val, hash_shift);
+		for (entry = index->hash[i]; entry; entry = entry->next) {
+			const unsigned char *ref = entry->ptr;
+			const unsigned char *src = data;
+			unsigned int ref_size = ref_top - ref;
+			if (entry->val != val)
+				continue;
+			if (ref_size > top - src)
+				ref_size = top - src;
+			if (ref_size > 0x10000)
+				ref_size = 0x10000;
+			if (ref_size <= msize)
+				break;
+			while (ref_size-- && *src++ == *ref)
+				ref++;
+			if (msize < ref - entry->ptr) {
+				/* this is our best match so far */
+				msize = ref - entry->ptr;
+				moff = entry->ptr - ref_data;
 			}
 		}
 
@@ -271,7 +275,6 @@ void *diff_delta(void *from_buf, unsigne
 				out = realloc(out, outsize);
 			if (!out) {
 				free(tmp);
-				free(hash);
 				return NULL;
 			}
 		}
@@ -280,7 +283,6 @@ void *diff_delta(void *from_buf, unsigne
 	if (inscnt)
 		out[outpos - inscnt - 1] = inscnt;
 
-	free(hash);
 	*delta_size = outpos;
 	return out;
 }
diff --git a/patch-delta.c b/patch-delta.c
index d95f0d9..8f318ed 100644
--- a/patch-delta.c
+++ b/patch-delta.c
@@ -13,8 +13,8 @@ #include <stdlib.h>
 #include <string.h>
 #include "delta.h"
 
-void *patch_delta(void *src_buf, unsigned long src_size,
-		  void *delta_buf, unsigned long delta_size,
+void *patch_delta(const void *src_buf, unsigned long src_size,
+		  const void *delta_buf, unsigned long delta_size,
 		  unsigned long *dst_size)
 {
 	const unsigned char *data, *top;
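
The two size headers that create_delta() writes at the start of a delta
use a simple variable-length encoding: seven bits per byte, least
significant group first, with the high bit set on every byte except the
last.  A self-contained sketch of that encoding follows, with a matching
decoder in the spirit of get_delta_hdr_size().  put_size() and
get_size() are illustrative names, not functions from the patch.

```c
#include <stddef.h>

/*
 * Encode 'size' into 'out', 7 bits per byte, low bits first; 0x80 on
 * a byte means more bytes follow.  Returns the number of bytes
 * written.  'out' must have room (10 bytes cover a 64-bit value).
 */
static size_t put_size(unsigned char *out, unsigned long size)
{
	size_t n = 0;

	while (size >= 0x80) {
		out[n++] = (unsigned char)(size | 0x80);
		size >>= 7;
	}
	out[n++] = (unsigned char)size;
	return n;
}

/*
 * Decode one size starting at *datap and advance *datap past it.
 * The caller must ensure at least one byte is readable before 'top'.
 */
static unsigned long get_size(const unsigned char **datap,
			      const unsigned char *top)
{
	const unsigned char *data = *datap;
	unsigned long size = 0;
	unsigned int shift = 0;
	unsigned char c;

	do {
		c = *data++;
		size |= (unsigned long)(c & 0x7f) << shift;
		shift += 7;
	} while ((c & 0x80) && data < top && shift < sizeof(size) * 8);
	*datap = data;
	return size;
}
```

For example, 300 encodes as the two bytes 0xac 0x02.  The sketch keeps
the size in an unsigned long throughout; the patch copies it into an
unsigned int first, which would only matter for buffers beyond 4 GiB on
platforms where long is wider than int.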


* Re: [BUG] gitk draws a wrong line
From: Paul Mackerras @ 2006-04-25  1:31 UTC
  To: Uwe Zeisberger; +Cc: git
In-Reply-To: <20060418104014.GA2299@informatik.uni-freiburg.de>

Uwe Zeisberger writes:

> and then going to commit 10c2df65060e1ab57b2f75e0749de0ee9b8f4810, 
> I see a small superfluous line between the two commits under 10c2df.
> 
> But still worse, if I select the line going down from 10c2df and then
> select its parent (i.e. c76b6b) I get a big line ending in the commit
> descriptions and four lines ending in midair.

That is an X server bug, it seems.  Tk already clips vertices that it
sends to the X server to be within a box that is no more than 32000
pixels wide or high, but that seems not to be enough with some X
servers.  What X server version are you using and what sort of video
card?

If you're feeling adventurous, you can rebuild Tk with the patch below
(courtesy of D. Richard Hipp) and see if that fixes it.  If it does it
proves that it is an X server bug.

Paul.

--- tkCanvUtil.c.orig   2006-02-08 08:51:31.859761208 -0500
+++ tkCanvUtil.c        2006-02-08 08:57:11.744090936 -0500
@@ -1657,25 +1657,27 @@
 
     /*
     ** Constrain all vertices of the path to be within a box that is no
-    ** larger than 32000 pixels wide or height.  The top-left corner of
+    ** larger than 16000 pixels wide or height.  The top-left corner of
     ** this clipping box is 1000 pixels above and to the left of the top
     ** left corner of the window on which the canvas is displayed.
     **
     ** This means that a canvas will not display properly on a canvas
-    ** window that is larger than 31000 pixels wide or high.  That is no
+    ** window that is larger than 14000 pixels wide or high.  That is no
     ** a problem today, but might someday become a factor for ultra-high
     ** resolutions displays.
     **
     ** The X11 protocol allows us (in theory) to expand the size of the
     ** clipping box to 32767 pixels.  But we have found experimentally that
-    ** XFree86 sometimes fails to draw lines correctly if they are longe
-    ** than about 32500 pixels.  So we have left a little margin in the
-    ** size to mask that bug.
+    ** XFree86 has problems with sizes bigger than 32500 pixels and we
+    ** have received reports of other X servers running into trouble
+    ** at around 29000 pixels.  So we are going to play it safe and limit
+    ** pixel values to 14 bits: 16384.  That is still sufficient for
+    ** a 4x4 ft display at 300 dpi.
     */
     lft = canvPtr->xOrigin - 1000.0;
     top = canvPtr->yOrigin - 1000.0;
-    rgh = lft + 32000.0;
-    btm = top + 32000.0;
+    rgh = lft + 16383.0;
+    btm = top + 16383.0;
 
     /* Try the common case first - no clipping.  Loop over the input
     ** coordinates and translate them into appropriate output coordinates.


* [PATCH] rev-parse: better error message for ambiguous arguments
From: Paul Mackerras @ 2006-04-25  0:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, torvalds

Currently, if git-rev-parse encounters an argument that is neither a
recognizable revision name nor the name of an existing file or
directory, and it hasn't encountered a "--" argument, it prints an
error message saying "No such file or directory".  This can be
confusing for users, including users of programs such as gitk that
use git-rev-parse, who may then think that they can't ask about the
history of files that no longer exist.

This makes it print a better error message, one that points out the
ambiguity and tells the user what to do to fix it.

Signed-off-by: Paul Mackerras <paulus@samba.org>
---
diff --git a/rev-parse.c b/rev-parse.c
index e956cd5..7f66ae2 100644
--- a/rev-parse.c
+++ b/rev-parse.c
@@ -160,6 +160,14 @@ static int show_file(const char *arg)
 	return 0;
 }

+static void die_badfile(const char *arg)
+{
+	if (errno != ENOENT)
+		die("'%s': %s", arg, strerror(errno));
+	die("'%s' is ambiguous - revision name or file/directory name?\n"
+	    "Please put '--' before the list of filenames.", arg);
+}
+
 int main(int argc, char **argv)
 {
 	int i, as_is = 0, verify = 0;
@@ -176,7 +184,7 @@ int main(int argc, char **argv)
 		if (as_is) {
 			if (show_file(arg) && as_is < 2)
 				if (lstat(arg, &st) < 0)
-					die("'%s': %s", arg, strerror(errno));
+					die_badfile(arg);
 			continue;
 		}
 		if (!strcmp(arg,"-n")) {
@@ -343,7 +351,7 @@ int main(int argc, char **argv)
 		if (verify)
 			die("Needed a single revision");
 		if (lstat(arg, &st) < 0)
-			die("'%s': %s", arg, strerror(errno));
+			die_badfile(arg);
 	}
 	show_default();
 	if (verify && revs_count != 1)


* Re: lstat() call in rev-parse.c
From: Paul Mackerras @ 2006-04-24 23:23 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0604230906370.3701@g5.osdl.org>

Linus Torvalds writes:

> So the rule is: if you don't give that "--", then we have to be able to 
> confirm that the filenames are really files. Not a misspelled revision 
> name, or a revision name that was correctly spelled, but for the wrong 
> project, because you were in the wrong subdirectory ;)

OK, fair enough.  In that case we need a better error message, so I
don't get people complaining that gitk can't show the history of files
that don't exist any more.  How about something like:

Argument "foo" is ambiguous - revision name or file/directory name?
Please put "--" before the list of filenames.

I'll hack up a patch to this effect.

Paul.


* [PATCH 4/4] Document the configuration file
From: Petr Baudis @ 2006-04-24 22:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <20060424225925.14086.97825.stgit@machine.or.cz>

This patch adds a Documentation/config.txt file, included by
git-repo-config(1), that briefly describes the file format and aggregates
(hopefully) all the available git plumbing / core porcelain configuration
variables.

It also updates an outdated bit of the example in git-repo-config(1).

Signed-off-by: Petr Baudis <pasky@suse.cz>
---

 Documentation/config.txt          |  181 +++++++++++++++++++++++++++++++++++++
 Documentation/git-repo-config.txt |   29 +++---
 config.c                          |    2 
 3 files changed, 198 insertions(+), 14 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
new file mode 100644
index 0000000..6cd9670
--- /dev/null
+++ b/Documentation/config.txt
@@ -0,0 +1,181 @@
+CONFIGURATION FILE
+------------------
+
+The git configuration file contains a number of variables that affect
+the git commands' behaviour. They can be used by both the git plumbing
+and the porcelains. The variables are divided into sections, where
+in the fully qualified variable name the variable itself is the last
+dot-separated segment and the section name is everything before the last
+dot. The variable names are case-insensitive and only alphanumeric
+characters are allowed. Some variables may appear multiple times.
+
+The syntax is fairly flexible and permissive; whitespace is mostly
+ignored. The '#' and ';' characters begin comments to the end of line,
+blank lines are ignored, lines containing strings enclosed in square
+brackets start sections and all the other lines are recognized
+as setting variables, in the form 'name = value'. If there is no equal
+sign on the line, the entire line is taken as 'name' and the variable
+is recognized as boolean "true". String values may be entirely or partially
+enclosed in double quotes; some variables may require special value format.
+
+Example
+~~~~~~~
+
+	# Core variables
+	[core]
+		; Don't trust file modes
+		filemode = false
+	
+	# Our diff algorithm
+	[diff]
+		external = "/usr/local/bin/gnu-diff -u"
+		renames = true
+
+Variables
+~~~~~~~~~
+
+Note that this list is not necessarily complete.
+For command-specific variables, you will find a more detailed description
+in the appropriate manual page. You will find descriptions of non-core
+porcelain configuration variables in the respective porcelain documentation.
+
+core.fileMode::
+	If false, the executable bit differences between the index and
+	the working copy are ignored; useful on broken filesystems like FAT.
+	See gitlink:git-update-index[1]. True by default.
+
+core.gitProxy::
+	A "proxy command" to execute (as 'command host port') instead
+	of establishing a direct connection to the remote server when
+	using the git protocol for fetching. If the variable value is
+	in the "COMMAND for DOMAIN" format, the command is applied only
+	to hostnames ending with the specified domain string. This variable
+	may be set multiple times and is matched in the given order;
+	the first match wins.
+
+	Can be overridden by the 'GIT_PROXY_COMMAND' environment variable
+	(which always applies universally, without the special "for"
+	handling).
+
+core.ignoreStat::
+	The working copy files are assumed to stay unchanged until you
+	mark them otherwise manually - Git will not detect the file changes
+	by lstat() calls. This is useful on systems where those are very
+	slow, such as Microsoft Windows.  See gitlink:git-update-index[1].
+	False by default.
+
+core.onlyUseSymrefs::
+	Always use the "symref" format instead of symbolic links for HEAD
+	and other symbolic reference files. True by default.
+
+core.repositoryFormatVersion::
+	Internal variable identifying the repository format and layout
+	version.
+
+core.sharedRepository::
+	If true, the repository is made shareable between several users
+	in a group (making sure all the files and objects are group-writable).
+	See gitlink:git-init-db[1]. False by default.
+
+core.warnAmbiguousRefs::
+	If true, git will warn you if the ref name you passed it is ambiguous
+	and might match multiple refs in the .git/refs/ tree. True by default.
+
+apply.whitespace::
+	Tells `git-apply` how to handle whitespace, in the same way
+	as the '--whitespace' option. See gitlink:git-apply[1].
+
+diff.renameLimit::
+	The number of files to consider when performing the copy/rename
+	detection; equivalent to the git diff option '-l'.
+
+format.headers::
+	Additional email headers to include in a patch to be submitted
+	by mail.  See gitlink:git-format-patch[1].
+
+gitcvs.enabled::
+	Whether the cvs pserver interface is enabled for this repository.
+	See gitlink:git-cvsserver[1].
+
+gitcvs.logfile::
+	Path to a log file where the cvs pserver interface writes its
+	log messages. See gitlink:git-cvsserver[1].
+
+http.sslVerify::
+	Whether to verify the SSL certificate when fetching or pushing
+	over HTTPS. Can be overridden by the 'GIT_SSL_NO_VERIFY' environment
+	variable.
+
+http.sslCert::
+	File containing the SSL certificate when fetching or pushing
+	over HTTPS. Can be overridden by the 'GIT_SSL_CERT' environment
+	variable.
+
+http.sslKey::
+	File containing the SSL private key when fetching or pushing
+	over HTTPS. Can be overridden by the 'GIT_SSL_KEY' environment
+	variable.
+
+http.sslCAInfo::
+	File containing the certificates to verify the peer with when
+	fetching or pushing over HTTPS. Can be overridden by the
+	'GIT_SSL_CAINFO' environment variable.
+
+http.sslCAPath::
+	Path containing files with the CA certificates to verify the peer
+	with when fetching or pushing over HTTPS. Can be overridden
+	by the 'GIT_SSL_CAPATH' environment variable.
+
+http.maxRequests::
+	How many HTTP requests to launch in parallel. Can be overridden
+	by the 'GIT_HTTP_MAX_REQUESTS' environment variable. Default is 5.
+
+http.lowSpeedLimit, http.lowSpeedTime::
+	If the HTTP transfer speed is less than 'http.lowSpeedLimit'
+	for longer than 'http.lowSpeedTime' seconds, the transfer is aborted.
+	Can be overridden by the 'GIT_HTTP_LOW_SPEED_LIMIT' and
+	'GIT_HTTP_LOW_SPEED_TIME' environment variables.
+
+i18n.commitEncoding::
+	Character encoding the commit messages are stored in; git itself
+	does not care per se, but this information is necessary e.g. when
+	importing commits from emails or in the gitk graphical history
+	browser (and possibly at other places in the future or in other
+	porcelains). See e.g. gitlink:git-mailinfo[1]. Defaults to 'utf-8'.
+
+merge.summary::
+	Whether to include summaries of merged commits in newly created
+	merge commit messages. False by default.
+
+pull.octopus::
+	The default merge strategy to use when pulling multiple branches
+	at once.
+
+pull.twohead::
+	The default merge strategy to use when pulling a single branch.
+
+show.difftree::
+	The default gitlink:git-diff-tree[1] arguments to be used
+	for gitlink:git-show[1].
+
+showbranch.default::
+	The default set of branches for gitlink:git-show-branch[1].
+	See gitlink:git-show-branch[1].
+
+user.email::
+	Your email address to be recorded in any newly created commits.
+	Can be overridden by the 'GIT_AUTHOR_EMAIL' and 'GIT_COMMITTER_EMAIL'
+	environment variables.  See gitlink:git-commit-tree[1].
+
+user.name::
+	Your full name to be recorded in any newly created commits.
+	Can be overridden by the 'GIT_AUTHOR_NAME' and 'GIT_COMMITTER_NAME'
+	environment variables.  See gitlink:git-commit-tree[1].
+
+whatchanged.difftree::
+	The default gitlink:git-diff-tree[1] arguments to be used
+	for gitlink:git-whatchanged[1].
+
+imap::
+	The configuration variables in the 'imap' section are described
+	in gitlink:git-imap-send[1].
diff --git a/Documentation/git-repo-config.txt b/Documentation/git-repo-config.txt
index c08ab77..566cfa1 100644
--- a/Documentation/git-repo-config.txt
+++ b/Documentation/git-repo-config.txt
@@ -91,11 +91,11 @@ Given a .git/config like this:
 		renames = true
 
 	; Proxy settings
-	[proxy]
-		command="ssh" for "ssh://kernel.org/"
-		command="proxy-command" for kernel.org
-		command="myprotocol-command" for "my://"
-		command=default-proxy ; for all the rest
+	[core]
+		gitproxy="ssh" for "ssh://kernel.org/"
+		gitproxy="proxy-command" for kernel.org
+		gitproxy="myprotocol-command" for "my://"
+		gitproxy=default-proxy ; for all the rest
 
 you can set the filemode to true with
 
@@ -108,7 +108,7 @@ to what URL they apply. Here is how to c
 to "ssh".
 
 ------------
-% git repo-config proxy.command '"ssh" for kernel.org' 'for kernel.org$'
+% git repo-config core.gitproxy '"ssh" for kernel.org' 'for kernel.org$'
 ------------
 
 This makes sure that only the key/value pair for kernel.org is replaced.
@@ -119,7 +119,7 @@ To delete the entry for renames, do
 % git repo-config --unset diff.renames
 ------------
 
-If you want to delete an entry for a multivar (like proxy.command above),
+If you want to delete an entry for a multivar (like core.gitproxy above),
 you have to provide a regex matching the value of exactly one line.
 
 To query the value for a given key, do
@@ -137,27 +137,27 @@ or
 or, to query a multivar:
 
 ------------
-% git repo-config --get proxy.command "for kernel.org$"
+% git repo-config --get core.gitproxy "for kernel.org$"
 ------------
 
 If you want to know all the values for a multivar, do:
 
 ------------
-% git repo-config --get-all proxy.command
+% git repo-config --get-all core.gitproxy
 ------------
 
-If you like to live dangerous, you can replace *all* proxy.commands by a
+If you like to live dangerous, you can replace *all* core.gitproxy by a
 new one with
 
 ------------
-% git repo-config --replace-all proxy.command ssh
+% git repo-config --replace-all core.gitproxy ssh
 ------------
 
 However, if you really only want to replace the line for the default proxy,
 i.e. the one without a "for ..." postfix, do something like this:
 
 ------------
-% git repo-config proxy.command ssh '! for '
+% git repo-config core.gitproxy ssh '! for '
 ------------
 
 To actually match only values with an exclamation mark, you have to
@@ -167,13 +167,16 @@ To actually match only values with an ex
 ------------
 
 
+include::config.txt[]
+
+
 Author
 ------
 Written by Johannes Schindelin <Johannes.Schindelin@gmx.de>
 
 Documentation
 --------------
-Documentation by Johannes Schindelin.
+Documentation by Johannes Schindelin, Petr Baudis and the git-list <git@vger.kernel.org>.
 
 GIT
 ---
diff --git a/config.c b/config.c
index 7ea8a73..4e1f0c2 100644
--- a/config.c
+++ b/config.c
@@ -252,7 +252,7 @@ int git_default_config(const char *var, 
 		return 0;
 	}
 
-	/* Add other config variables here.. */
+	/* Add other config variables here and to Documentation/config.txt. */
 	return 0;
 }
 


* [PATCH 3/4] Deprecate usage of git-var -l for getting config vars list
From: Petr Baudis @ 2006-04-24 22:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <20060424225925.14086.97825.stgit@machine.or.cz>

This has been an unfortunate detour in the evolution of the git API.
We use git-repo-config for all the other .git/config interaction
so let's also use git-repo-config -l for the variable listing.

Signed-off-by: Petr Baudis <pasky@suse.cz>
---

 Documentation/git-var.txt |    3 ++-
 git-cvsserver.perl        |    6 +++---
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-var.txt b/Documentation/git-var.txt
index 379571e..a5b1a0d 100644
--- a/Documentation/git-var.txt
+++ b/Documentation/git-var.txt
@@ -19,7 +19,8 @@ OPTIONS
 -l::
 	Cause the logical variables to be listed. In addition, all the
 	variables of the git configuration file .git/config are listed
-	as well.
+	as well. (However, the configuration variables listing functionality
+	is deprecated in favor of `git-repo-config -l`.)
 
 EXAMPLE
 --------
diff --git a/git-cvsserver.perl b/git-cvsserver.perl
index 7d3f78e..0b37d26 100755
--- a/git-cvsserver.perl
+++ b/git-cvsserver.perl
@@ -171,11 +171,11 @@ sub req_Root
        return 0;
     }
 
-    my @gitvars = `git-var -l`;
+    my @gitvars = `git-repo-config -l`;
     if ($?) {
-       print "E problems executing git-var on the server -- this is not a git repository or the PATH is not set correcly.\n";
+       print "E problems executing git-repo-config on the server -- this is not a git repository or the PATH is not set correctly.\n";
         print "E \n";
-        print "error 1 - problem executing git-var\n";
+        print "error 1 - problem executing git-repo-config\n";
        return 0;
     }
     foreach my $line ( @gitvars )


* [PATCH 2/4] Document git-var -l listing also configuration variables
From: Petr Baudis @ 2006-04-24 22:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <20060424225925.14086.97825.stgit@machine.or.cz>

Signed-off-by: Petr Baudis <pasky@suse.cz>
---

 Documentation/git-var.txt |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/Documentation/git-var.txt b/Documentation/git-var.txt
index 90cb157..379571e 100644
--- a/Documentation/git-var.txt
+++ b/Documentation/git-var.txt
@@ -17,7 +17,9 @@ Prints a git logical variable.
 OPTIONS
 -------
 -l::
-	Cause the logical variables to be listed.
+	Cause the logical variables to be listed. In addition, all the
+	variables of the git configuration file .git/config are listed
+	as well.
 
 EXAMPLE
 --------
@@ -46,6 +48,7 @@ See Also
 --------
 gitlink:git-commit-tree[1]
 gitlink:git-tag[1]
+gitlink:git-repo-config[1]
 
 Author
 ------


* [PATCH 1/4] git-repo-config --list support
From: Petr Baudis @ 2006-04-24 22:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

This adds git-repo-config --list (or git-repo-config -l) support,
similar to what git-var -l does now (to be phased out so that we
have a single sane interface to the config file instead of a
fragmented and confusing API).

Signed-off-by: Petr Baudis <pasky@suse.cz>
---

 Documentation/git-repo-config.txt |    4 ++++
 repo-config.c                     |   16 ++++++++++++++--
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-repo-config.txt b/Documentation/git-repo-config.txt
index 26759a8..c08ab77 100644
--- a/Documentation/git-repo-config.txt
+++ b/Documentation/git-repo-config.txt
@@ -15,6 +15,7 @@ SYNOPSIS
 'git-repo-config' [type] --get-all name [value_regex]
 'git-repo-config' [type] --unset name [value_regex]
 'git-repo-config' [type] --unset-all name [value_regex]
+'git-repo-config' -l | --list
 
 DESCRIPTION
 -----------
@@ -64,6 +65,9 @@ OPTIONS
 --unset-all::
 	Remove all matching lines from .git/config.
 
+-l, --list::
+	List all variables set in .git/config.
+
 
 EXAMPLE
 -------
diff --git a/repo-config.c b/repo-config.c
index c5ebb76..fa8aba7 100644
--- a/repo-config.c
+++ b/repo-config.c
@@ -2,7 +2,7 @@ #include "cache.h"
 #include <regex.h>
 
 static const char git_config_set_usage[] =
-"git-repo-config [ --bool | --int ] [--get | --get-all | --replace-all | --unset | --unset-all] name [value [value_regex]]";
+"git-repo-config [ --bool | --int ] [--get | --get-all | --replace-all | --unset | --unset-all] name [value [value_regex]] | --list";
 
 static char* key = NULL;
 static char* value = NULL;
@@ -12,6 +12,15 @@ static int do_not_match = 0;
 static int seen = 0;
 static enum { T_RAW, T_INT, T_BOOL } type = T_RAW;
 
+static int show_all_config(const char *key_, const char *value_)
+{
+	if (value_)
+		printf("%s=%s\n", key_, value_);
+	else
+		printf("%s\n", key_);
+	return 0;
+}
+
 static int show_config(const char* key_, const char* value_)
 {
 	if (value_ == NULL)
@@ -67,7 +76,7 @@ static int get_value(const char* key_, c
 		}
 	}
 
-	i = git_config(show_config);
+	git_config(show_config);
 	if (value) {
 		printf("%s\n", value);
 		free(value);
@@ -99,6 +108,9 @@ int main(int argc, const char **argv)
 		argv++;
 	}
 
+	if (!strcmp(argv[1], "--list") || !strcmp(argv[1], "-l"))
+		return git_config(show_all_config);
+
 	switch (argc) {
 	case 2:
 		return get_value(argv[1], NULL);


* Re: maintenance of cache-tree data
From: Junio C Hamano @ 2006-04-24 22:34 UTC (permalink / raw)
  To: git; +Cc: Linus Torvalds
In-Reply-To: <7vodyq64p7.fsf_-_@assigned-by-dhcp.cox.net>

Junio C Hamano <junkio@cox.net> writes:

> The number one reason was because the current index file format
> is pretty dense, and I did not find an obvious hole in the
> front, in the middle or at the tail to sneak extra data in
> without upsetting existing code and without updating index file
> version.

Well, I was blind ;-).  As long as the whole-file SHA1 matches,
read_cache() does not care if we have extra data after the
series of active_nr cache entry data in the index file.

I'm working on a patch now.


* maintenance of cache-tree data
From: Junio C Hamano @ 2006-04-24 21:31 UTC (permalink / raw)
  To: git; +Cc: Linus Torvalds
In-Reply-To: <7vvesz8r8o.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano <junkio@cox.net> writes:

>  (1) When git-write-tree writes out trees from the index, we
>      store <directory, tree SHA1> pair for all the trees we
>      compute, in .git/index.aux file.

There are two reasons I did not make this extra information part
of the index file.

The number one reason was because the current index file format
is pretty dense, and I did not find an obvious hole in the
front, in the middle or at the tail to sneak extra data in
without upsetting existing code and without updating index file
version.  If this were a change to add a great new feature, that
might have warranted bumping the version up, but the cache-tree
is an optimization and if you lose that information, all you
have lost is that now your write-tree needs to recompute the
whole tree as before (IOW, not much).

The second reason was I wanted to do this step-by-step, and
wanted to do a demonstration of an end-to-end workflow that
gains from this set of changes (apply followed by write-tree
was an obvious minimum set of commands), and while doing so I
did not have to disrupt other commands that are unaware of this
extension.

Having said that, cache-tree.c has an internal interface to
serialize the cache-tree data in-core, primarily because I was
unsure if I wanted to append this information somewhere inside
the main index file or have it external when I did it; so we
could later push it into the index if we wanted to.

Ideally, everybody who writes index should be converted to
update cache-tree to help eventual write-tree.  Right now, if a
command updates index without updating a matching cache-tree,
the whole thing is invalidated; this way, you do not risk using
stale "cached" data.

Currently the command that primes the cache-tree is write-tree.
This may be counterintuitive -- even I would expect read-tree to
prime it, the various index-updaters to invalidate the subtrees
they touch, and write-tree to use the surviving parts to speed up
what it needs to do and then write out an updated, fully valid
cache-tree.  I did not do it that way only because it was not
necessary for the "apply then write-tree" cycle, but read-tree
should be taught about cache-tree to help others, _and_ to help
itself.
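
To make the "index-updaters invalidate the subtrees they touch" part
concrete, here is a minimal sketch in C.  It is not code from
cache-tree.c; the struct and function names are made up for
illustration, and it assumes the cache-tree is kept as a flat list of
directory records:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical record: one cached tree per directory in the index. */
struct dir_entry {
	const char *path;  /* directory without trailing slash; "" is the root */
	int valid;         /* nonzero while the cached tree SHA1 can be trusted */
};

/* Return 1 if `dir` is the root or a leading directory of `path`. */
static int is_leading_dir(const char *dir, const char *path)
{
	size_t n = strlen(dir);

	if (n == 0)
		return 1;
	return !strncmp(dir, path, n) && path[n] == '/';
}

/*
 * Mark every still-valid cached directory that contains `path` as stale.
 * Returns the number of records that were invalidated by this call.
 */
int invalidate_path(struct dir_entry *dirs, int nr, const char *path)
{
	int i, hit = 0;

	assert(dirs && path);
	for (i = 0; i < nr; i++) {
		if (dirs[i].valid && is_leading_dir(dirs[i].path, path)) {
			dirs[i].valid = 0;
			hit++;
		}
	}
	return hit;
}
```

With records for "", "fs", "fs/ext2" and "net", touching
"fs/ext2/inode.c" would mark the first three stale and leave "net"
usable by a later write-tree.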

When "read-tree -m O A B" merges three trees, we iterate over
all index entries, even when only a small part of the tree has
changed.  This could be helped in a big way if the current index
has valid cache-tree information for the parts unaffected by the
merge.  If all three trees have an identical tree in a higher-level
subdirectory (e.g. "fs/" in the kernel source), and if the index
has not touched anything in "fs/" since it read from our tree
(i.e. "A"), then we do not even have to descend into that
directory in the working tree to see which index entries are
dirty.  We can just keep index entries that begin with "fs/"
intact, and keep the cache-tree entry for that directory as it was
read from "A".  This would be a big win -- we do not have to
read tree objects under "fs/" (there are 62 trees under "fs/",
so we save uncompressing and undeltifying 180 objects).  This
operation would need to invalidate entries in cache-tree that
are involved in the actual merge.  The branch-switching two-tree
form "read-tree -m OLD NEW" can probably benefit from the same
optimization.
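
The shortcut for the three-way case boils down to a per-directory test
like the following sketch (C, hypothetical names; read-tree has no such
function today, and `struct oid` merely stands in for a 20-byte SHA1):

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a 20-byte SHA1 object name. */
struct oid {
	unsigned char hash[20];
};

/*
 * Hypothetical cache-tree record for one directory.  `valid` is nonzero
 * only while no index entry under the directory has changed since
 * `tree` was recorded.
 */
struct dir_cache {
	int valid;
	struct oid tree;
};

static int oid_eq(const struct oid *a, const struct oid *b)
{
	return !memcmp(a->hash, b->hash, sizeof(a->hash));
}

/*
 * Return 1 if "read-tree -m O A B" could leave every index entry under
 * this directory alone: the three trees agree, and the index still
 * matches what was read from A.  `cached` may be NULL when the index
 * carries no cache-tree data for the directory.
 */
int can_skip_subtree(const struct oid *o, const struct oid *a,
		     const struct oid *b, const struct dir_cache *cached)
{
	assert(o && a && b);
	if (!oid_eq(o, a) || !oid_eq(a, b))
		return 0;  /* at least one side changed this directory */
	if (!cached || !cached->valid)
		return 0;  /* the index may have been modified in here */
	return oid_eq(&cached->tree, a);
}
```

Only when this returns 1 can the descent into the directory be skipped;
in every other case the merge has to look at the entries as it does now.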

Obviously, a single-tree form of read-tree should be able to
prime the cache-tree with fully valid data before writing the
index out.

Having said that, I am not touching read-tree myself for now; I
am lazy.


* Re: [PATCH] Add git-stash to stash the working tree to a new tagged name
From: Carl Worth @ 2006-04-24 20:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vfyk27kz4.fsf@assigned-by-dhcp.cox.net>


On Mon, 24 Apr 2006 13:54:39 -0700, Junio C Hamano wrote:
> If I were doing this today, I would probably do this:
...
>         git commit --amend

Oh, that's fantastic.

I hadn't picked up on this feature before, and it looks quite
useful. Thanks for pointing that out.

-Carl



* Re: [PATCH] Add git-stash to stash the working tree to a new tagged name
From: Junio C Hamano @ 2006-04-24 20:54 UTC (permalink / raw)
  To: Carl Worth; +Cc: git
In-Reply-To: <8764kyzrwq.wl%cworth@cworth.org>

Carl Worth <cworth@cworth.org> writes:

> Stashing (on branch 'feature'):
>
> 	git commit -a -m 'snapshot WIP'
>
> Recovering:
>
> 	git checkout feature
> 	git reset --soft HEAD^
> 	git reset

If I were doing this today, I would probably do this:

	git commit -a -m 'WIP'

        git checkout elsewhere ;# interrupted
        ... hack hack hack ...

        git checkout feature ;# come back
        ... hack hack hack ...
        git commit --amend

But I wonder why the originally suggested sequence is reset soft
to the state we want and then another reset.  Without
experimenting myself or thinking hard about it, I would expect
"git reset HEAD^" should do what we want, in which case:

Stashing (on branch 'feature'):

	git commit -a -m 'snapshot WIP'

Recovering:

	git checkout feature
	git reset HEAD^


* Re: RFC: New diff-delta.c implementation
From: Petr Baudis @ 2006-04-24 20:37 UTC (permalink / raw)
  To: Geert Bosch; +Cc: Nicolas Pitre, Rene Scharfe, Git Mailing List, Junio C Hamano
In-Reply-To: <20060424151901.GA2663@adacore.com>

Dear diary, on Mon, Apr 24, 2006 at 05:19:01PM CEST, I got a letter
where Geert Bosch <bosch@adacore.com> said that...
> > But here comes the sad part.  Even after simplifying the code as much as 
> > I could, performance is still significantly worse than the current 
> > diff-delta.c code.  Repacking again the same Linux kernel repository 
> > with the current code:
> That's unexpected, but I can see how this could be if most files have
> very few differences and are relatively small. For such cases, almost
> any hash will do, and the more complicated hashing will be more compute
> intensive.
> 
> 
> I have benchmarked my original diff code on a set of large files with
> lots of changes. These are hardest to get right, and hardest to get
> good performance with. Just try diffing any two large (uncompressed)
> tar files, and you'll see. On many of such large files, the new code
> is orders of magnitude faster. On these cases, the resulting deltas
> are also much smaller.
> 
> The comparison is a bit between a O(n^2) sort that is fast on small
> or mostly sorted inputs (but horrible on large ones) and a more
> complex O(nlogn) algorithm that is a bit slower for the simple
> cases, but far faster for more complex cases.

Can't you just switch between different delta algorithms based on some
heuristic like the blob size?
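
Something along these lines, as a rough sketch (C; the enum, the
function name and the cutoff parameter are all hypothetical, and a
sensible cutoff would have to come from benchmarking):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical labels for the two implementations under discussion. */
enum delta_algo {
	DELTA_CURRENT,  /* existing diff-delta.c: cheap on small, similar blobs */
	DELTA_RABIN     /* proposed Rabin-fingerprint code: better on large inputs */
};

/*
 * Pick an algorithm from the blob sizes alone.  `cutoff` is the size in
 * bytes at which the larger of the two blobs switches us to the Rabin code.
 */
enum delta_algo choose_delta_algo(size_t src_size, size_t dst_size,
				  size_t cutoff)
{
	size_t larger = src_size > dst_size ? src_size : dst_size;

	assert(cutoff > 0);  /* a zero cutoff would make the heuristic pointless */
	return larger >= cutoff ? DELTA_RABIN : DELTA_CURRENT;
}
```

Size alone says nothing about how similar the two blobs are, so this
would only be a first approximation.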

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Right now I am having amnesia and deja-vu at the same time.  I think
I have forgotten this before.


* Re: weird pull behavior as of late
From: David S. Miller @ 2006-04-24 19:46 UTC (permalink / raw)
  To: vsu; +Cc: git
In-Reply-To: <20060424132922.6e634188.vsu@altlinux.ru>

From: Sergey Vlasov <vsu@altlinux.ru>
Date: Mon, 24 Apr 2006 13:29:22 +0400

> On Sun, 23 Apr 2006 17:59:53 -0700 (PDT) David S. Miller wrote:
> 
> > Fast forward
> >  MAINTAINERS |    4 ++++
> >  1 files changed, 4 insertions(+), 0 deletions(-)
> > 
> > I got 446 objects and this amounted to just a 4 line change to the
> > MAINTAINERS file? :-)
> 
> I got the same problem recently and tracked it down to a stale diff.o
> object file inside libgit.a - apparently "ar rcs" does not recreate the
> archive from scratch.  After "make clean" the problem has vanished.

Thanks, I'll give that a try.


* [PATCH] Add git-stash to stash the working tree to a new tagged name
From: Carl Worth @ 2006-04-24 19:37 UTC (permalink / raw)
  To: git


This is a really simple script which is convenient for stashing
changes in the current working tree:

	git stash <tagname>

and recovering them later with something like:

	git cherry-pick -n <tagname>
	git tag -d <tagname>	# optional cleanup

---

This could also be implemented with a branch rather than a tag (which
would simply require removing the "git-tag" and "git-branch -D" lines
from the implementation), but I preferred to emphasize that a stashed
commit is not a natural basis for more development.

Rather than seriously proposing this for inclusion as is, I'm trying to
start some discussion around a workflow issue I keep running into. The
git-reset documentation describes an "interrupted workflow" case and
suggests something like the stash and recover operations I describe
above but with the following usage:

Stashing (on branch 'feature'):

	git commit -a -m 'snapshot WIP'

Recovering:

	git checkout feature
	git reset --soft HEAD^
	git reset

My git-stash approach isn't really a lot less work, (in fact, it
requires coming up with a new temporary name which makes it less
desirable), but it's at least easier for me to remember how to do
it. For me at least, it seems I'm often having to re-consult the
git-reset documentation to decide which variant I need to use in any
given situation---and I still find myself making frustrating mistakes
with git-reset every once in a while.

I think an improved interface for interrupted workflow would look more
like this:

Stashing (on branch 'feature'):

	git stash

Recovering:

	git checkout feature

That would be quite pleasant. And I may write my own git-checkout
wrapper to do this based on some convention for tag naming (such as
<branchname>-stash). A cleaner implementation might involve some
per-branch metadata (as has been recently proposed) for storing the
stashed information rather than a naming convention. [At that point,
I'm suggesting ideas beyond my intent to code things up.]

And if all that were in place, maybe it would even make sense to do
automatic stashing by default whenever switching away from a branch
with a "dirty" working tree, (and without the -m option asking to
merge those changes into the destination). Currently, git-checkout
simply errors out in this situation, so there's room to add new
functionality there. That would give the most ideal interface of all,
which is simply:

Stashing (on branch 'somewhere'):

	git checkout elsewhere

Recovering:

	git checkout somewhere

One thing that is missing from all the schemes discussed so far is
that they are lossy with respect to differences that originally exist
between the working tree and the index. If an automatically stashing
scheme were implemented via per-branch metadata, then it seems it
would be feasible to stash the working tree and the index separately
into the branch's metadata.

That would be quite convenient for me, since conceptually, the
working tree and the index are just as much a part of my "current
branch state" as the parent commit is.

I'd be interested in any feedback or other ideas.

-Carl

 .gitignore   |    1 +
 Makefile     |    2 +-
 git-stash.sh |   24 ++++++++++++++++++++++++
 3 files changed, 26 insertions(+), 1 deletions(-)
 create mode 100755 git-stash.sh

686ceaf3444d70914c251ad857fc424c9334141c
diff --git a/.gitignore b/.gitignore
index b5959d6..16c3149 100644
--- a/.gitignore
+++ b/.gitignore
@@ -103,6 +103,7 @@ git-ssh-fetch
 git-ssh-pull
 git-ssh-push
 git-ssh-upload
+git-stash
 git-status
 git-stripspace
 git-svnimport
diff --git a/Makefile b/Makefile
index 8aed3af..2a9906e 100644
--- a/Makefile
+++ b/Makefile
@@ -125,7 +125,7 @@ SCRIPT_SH = \
 	git-applymbox.sh git-applypatch.sh git-am.sh \
 	git-merge.sh git-merge-stupid.sh git-merge-octopus.sh \
 	git-merge-resolve.sh git-merge-ours.sh git-grep.sh \
-	git-lost-found.sh
+	git-lost-found.sh git-stash.sh

 SCRIPT_PERL = \
 	git-archimport.perl git-cvsimport.perl git-relink.perl \
diff --git a/git-stash.sh b/git-stash.sh
new file mode 100755
index 0000000..988ca51
--- /dev/null
+++ b/git-stash.sh
@@ -0,0 +1,24 @@
+#!/bin/sh
+#
+# Copyright (C) 2006 Carl D. Worth
+
+USAGE='<tagname>'
+SUBDIRECTORY_OK=Yes
+. git-sh-setup
+
+if test "$#" -ne 1
+then
+	usage
+fi
+
+tagname="$1"
+
+headref=$(git-symbolic-ref HEAD | sed -e 's|^refs/heads/||')
+
+# We use tagname for the temporary branch name as well.
+git-checkout -b "$tagname" || exit
+git commit -a -m "Stash commit of $headref to $tagname"
+git tag "$tagname"
+git-checkout "$headref"
+git-branch -D "$tagname"
+
-- 
1.3.0.g85e6-dirty

* Re: GIT URL of linux kernel tree
From: Rene Scharfe @ 2006-04-24 19:36 UTC (permalink / raw)
  To: Thomas Glanzmann; +Cc: GIT
In-Reply-To: <20060424192137.GK4000@cip.informatik.uni-erlangen.de>

Thomas Glanzmann wrote:
> Hello everyone,
> I currently use
> 
> rsync://rsync.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
> 
> to pull Linus' Linux tree, which is a bit suboptimal, I guess. So I am
> asking for the URL to use GIT's own protocol. And where could I have
> looked it up?

The top of http://www.kernel.org/git/ says you can use
git://git.kernel.org/pub/scm/...

René

* Re: RFC: New diff-delta.c implementation
From: Geert Bosch @ 2006-04-24 19:23 UTC (permalink / raw)
  To: Davide Libenzi; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0604241155000.18685@alien.or.mcafeemobile.com>


On Apr 24, 2006, at 15:10, Davide Libenzi wrote:
> Right, but you are looking at highest equal-probability  
> distribution over your hash buckets ;)
> Anyway, thanks for bringing Rabin's polynomial fingerprint up from  
> the forgotten lands. Performance and delta size are quite amazing,  
> and I decided to add Rabin's delta to libxdiff.
> I hacked some code (attached) to generate T/U tables. Since
> libxdiff must be portable everywhere, even on systems without 64-bit
> support, I use xrabin to create both 64-bit tables (poly degree
> 61) and 32-bit tables (poly degree 31), and store them in a .c
> file, letting the build environment pick the correct one for the
> platform.

It might actually make sense to use the 32-bit code for GIT
as well, since it turns out that on the typical small source files
with few differences, the full 64-bit Rabin is a problem for
performance.

When diffing large files (my main interest), this is more than
offset by the better hash quality. For tiny files with few changes
it appears to be overkill...

   -Geert
