Git development


* Re: gitview: Set the default width of graph cell
From: Aneesh Kumar @ 2006-03-01  7:15 UTC (permalink / raw)
  To: git, Junio C Hamano
In-Reply-To: <440460DC.7080307@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 305 bytes --]

On 2/28/06, Aneesh Kumar K.V <aneesh.kumar@gmail.com> wrote:
>
>
> Subject: gitview: Set the default width  of graph cell
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@gmail.com>
>
> ---
>

I think this version is better; please apply it. It goes on top of
the previous one.

-aneesh

[-- Attachment #2: git.diff --]
[-- Type: text/x-patch; name="git.diff", Size: 810 bytes --]

diff --git a/contrib/gitview/gitview b/contrib/gitview/gitview
index ea05cd4..de9f3f3 100755
--- a/contrib/gitview/gitview
+++ b/contrib/gitview/gitview
@@ -513,7 +513,7 @@ class GitView:
 
 
 		scrollwin = gtk.ScrolledWindow()
-		scrollwin.set_policy(gtk.POLICY_NEVER, gtk.POLICY_AUTOMATIC)
+		scrollwin.set_policy(gtk.POLICY_AUTOMATIC, gtk.POLICY_AUTOMATIC)
 		scrollwin.set_shadow_type(gtk.SHADOW_IN)
 		vbox.pack_start(scrollwin, expand=True, fill=True)
 		scrollwin.show()
@@ -526,9 +526,6 @@ class GitView:
 		self.treeview.show()
 
 		cell = CellRendererGraph()
-		#  Set the default width to 265
-		#  This make sure that we have nice display with large tag names
-		cell.set_property("width", 265)
 		column = gtk.TreeViewColumn()
 		column.set_resizable(True)
 		column.pack_start(cell, expand=True)


* Re: git-svn and huge data and modifying the git-svn-HEAD branch directly
From: Eric Wong @ 2006-03-01  6:51 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Martin Langhoff, git
In-Reply-To: <Pine.LNX.4.64.0602271634410.22647@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> wrote:
> 
> 
> On Tue, 28 Feb 2006, Martin Langhoff wrote:
> > 
> > git-svn-HEAD "moves" so it's really a bad idea to have it as a tag.
> > Nothing within core git prevents it from moving, but I think that
> > porcelains will start breaking. Tags and heads are the same thing,
> > except that heads are expected to change (specifically, to move
> > forward), and tags are expected to stand still.
> <snipped>
> Using a "refs/remotes" subdirectory makes tons of sense for something like 
> this. Or something even more specific, like "refs/svn-tracking/". Git 
> shouldn't care - all the tools _should_ work fine with any subdirectory 
> structure.

Git tools only work as long as the 'refs/{remotes,svn-tracking,...}/'
prefix is specified.  git-svn-HEAD (or any $GIT_SVN_ID-HEAD) does get
specified from the command-line quite often:
	
	git checkout -b mine git-svn-HEAD
	git-log git-svn-HEAD..head
	git-svn commit git-svn-HEAD..mine
	git-log mine..git-svn-HEAD

Should rev-parse be taught to be less strict and look for basenames
that can't be found in heads/ and tags/ in other directories?
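
To make the question concrete, here is a rough sketch of such a looser
lookup, written in Python. It is not how rev-parse is implemented:
`resolve_ref`, `make_demo_refs`, the search order, and the rule that an
ambiguous basename is an error are all assumptions made up for
illustration.

```python
import os
import tempfile


def resolve_ref(refs_dir, name):
    """Return the ref path (relative to refs_dir) that `name` resolves to.

    Hypothetical lookup order: heads/, then tags/, then any other
    subdirectory of refs/.  Returns None when nothing matches and raises
    ValueError when the fallback search finds more than one candidate.
    """
    # Strict lookup first: only heads/ and tags/ are consulted.
    for sub in ("heads", "tags"):
        if os.path.isfile(os.path.join(refs_dir, sub, name)):
            return sub + "/" + name

    # Looser fallback: look for the basename in the remaining directories.
    matches = []
    for sub in sorted(os.listdir(refs_dir)):
        if sub in ("heads", "tags"):
            continue
        if os.path.isfile(os.path.join(refs_dir, sub, name)):
            matches.append(sub + "/" + name)

    if len(matches) > 1:
        raise ValueError("ambiguous ref %r: %s" % (name, ", ".join(matches)))
    return matches[0] if matches else None


def make_demo_refs():
    """Build a throwaway refs/ tree with one head and one svn-tracking ref."""
    refs_dir = tempfile.mkdtemp()
    for rel in ("heads/mine", "svn-tracking/git-svn-HEAD"):
        path = os.path.join(refs_dir, rel)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as f:
            f.write("0" * 40 + "\n")
    return refs_dir
```

With a fallback like this, `git-svn-HEAD` could be typed without its
directory prefix, at the cost of having to decide what happens when two
directories hold a ref with the same basename.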

-- 
Eric Wong


* [PATCH] Teach git-checkout-index to use file suffixes.
From: Shawn Pearce @ 2006-03-01  4:41 UTC (permalink / raw)
  To: git

Sometimes it is useful to unpack the unmerged stage entries
to the same directory as the tracked file itself, but with
a suffix indicating which stage that version came from.
Many user-interface-level scripts do this by running
git-unpack-file, creating the necessary directory structure,
and then moving the file into that directory under the
requested name.  It is now possible to perform the same
action for a larger set of files directly through
git-checkout-index.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

---
 I think this completes the two features I've found missing from
 git-checkout-index:  --stdin and --suffix.  These two options
 should make writing a working directory based merge strategy
 a little easier.

 FYI: I built this on top of my immediately prior patch ('Teach
 git-checkout-index to read filenames from stdin.') so this one
 may not apply cleanly without that patch being applied first.
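
The naming rule described above (prefix, then the tracked name, then the
suffix, with no directory separator allowed in the suffix) can be
sketched in a few lines of Python. This is only an illustration of the
rule, not the C code in the patch; `checkout_path` and `stage_paths` are
hypothetical helper names.

```python
def checkout_path(base_dir, name, suffix=""):
    """Compose an output path as prefix + tracked name + suffix.

    Mirrors the documented restriction that the suffix must not
    contain a directory separator.
    """
    if "/" in suffix:
        raise ValueError("--suffix cannot contain /")
    return base_dir + name + suffix


def stage_paths(name, stages=(1, 2, 3)):
    """Paths produced by checking out each stage with a '#<stage>' suffix."""
    return [checkout_path("", name, "#%d" % stage) for stage in stages]
```

For a tracked file `src/Makefile`, the suffix form keeps every stage next
to the file itself, whereas a prefix such as `.stage1/` would place the
copy at `.stage1/src/Makefile`.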

 Documentation/git-checkout-index.txt |   29 ++++++++++++++++++++++++++++-
 apply.c                              |    2 ++
 cache.h                              |    2 ++
 checkout-index.c                     |   14 ++++++++++++--
 entry.c                              |   10 +++++++---
 read-tree.c                          |    1 +
 6 files changed, 52 insertions(+), 6 deletions(-)

base df23c1119d0af1fbac6b8afd296113e155d9a878
last e1674dc0b01de5c34fada13f7cf5fcbb3be82d09
diff --git a/Documentation/git-checkout-index.txt b/Documentation/git-checkout-index.txt
index b0b6588..f0be2a0 100644
--- a/Documentation/git-checkout-index.txt
+++ b/Documentation/git-checkout-index.txt
@@ -9,7 +9,8 @@ git-checkout-index - Copy files from the
 SYNOPSIS
 --------
 [verse]
-'git-checkout-index' [-u] [-q] [-a] [-f] [-n] [--prefix=<string>]
+'git-checkout-index' [-u] [-q] [-a] [-f] [-n]
+		   [--prefix=<string>] [--suffix=<string>]
 		   [--stage=<number>]
 		   [-z] [--stdin]
 		   [--] [<file>]\*
@@ -43,6 +44,10 @@ OPTIONS
 	When creating files, prepend <string> (usually a directory
 	including a trailing /)
 
+--suffix=<string>::
+	When creating files, append <string> to the name.  The value
+	of <string> must not contain a directory separator (/).
+
 --stage=<number>::
 	Instead of checking out unmerged entries, copy out the
 	files from named stage.  <number> must be between 1 and 3.
@@ -120,6 +125,28 @@ $ git-checkout-index --prefix=.merged- M
 This will check out the currently cached copy of `Makefile`
 into the file `.merged-Makefile`.
 
+Export files with a suffix::
++
+----------------
+$ git-checkout-index --suffix=\#2 --stage=2 Makefile
+----------------
++
+If `Makefile` is unmerged and has a stage 2 entry in the index
+this will check out that version into the file `Makefile#2`.
+
+A suffix may be preferred over a prefix when checking out all
+unmerged entries:
++
+----------------
+$ git-checkout-index --suffix=\#1 --stage=1 --all
+$ git-checkout-index --suffix=\#2 --stage=2 --all
+$ git-checkout-index --suffix=\#3 --stage=3 --all
+----------------
++
+would unpack all unmerged stages into the same directory as the
+tracked file.  (Compare with --prefix=.stage1/ which would have
+created a partial directory tree within `.stage1/`.)
+
 
 Author
 ------
diff --git a/apply.c b/apply.c
index 244718c..1ec8473 100644
--- a/apply.c
+++ b/apply.c
@@ -1307,6 +1307,8 @@ static int check_patch(struct patch *pat
 				/* checkout */
 				costate.base_dir = "";
 				costate.base_dir_len = 0;
+				costate.name_suffix = "";
+				costate.name_suffix_len = 0;
 				costate.force = 0;
 				costate.quiet = 0;
 				costate.not_new = 0;
diff --git a/cache.h b/cache.h
index 58eec00..055e213 100644
--- a/cache.h
+++ b/cache.h
@@ -254,6 +254,8 @@ extern const char *git_committer_info(in
 struct checkout {
 	const char *base_dir;
 	int base_dir_len;
+	const char *name_suffix;
+	int name_suffix_len;
 	unsigned force:1,
 		 quiet:1,
 		 not_new:1,
diff --git a/checkout-index.c b/checkout-index.c
index f54c606..af7b230 100644
--- a/checkout-index.c
+++ b/checkout-index.c
@@ -47,6 +47,8 @@ static int checkout_stage; /* default to
 static struct checkout state = {
 	.base_dir = "",
 	.base_dir_len = 0,
+	.name_suffix = "",
+	.name_suffix_len = 0,
 	.force = 0,
 	.quiet = 0,
 	.not_new = 0,
@@ -180,6 +182,14 @@ int main(int argc, char **argv)
 			state.base_dir_len = strlen(state.base_dir);
 			continue;
 		}
+		if (!strncmp(arg, "--suffix=", 9)) {
+			if (strchr(arg+9, '/')) {
+				die("--suffix cannot contain /");
+			}
+			state.name_suffix = arg+9;
+			state.name_suffix_len = strlen(state.name_suffix);
+			continue;
+		}
 		if (!strncmp(arg, "--stage=", 8)) {
 			int ch = arg[8];
 			if ('1' <= ch && ch <= '3')
@@ -193,8 +203,8 @@ int main(int argc, char **argv)
 		break;
 	}
 
-	if (state.base_dir_len) {
-		/* when --prefix is specified we do not
+	if (state.base_dir_len || state.name_suffix_len) {
+		/* when --prefix or --suffix is specified we do not
 		 * want to update cache.
 		 */
 		if (state.refresh_cache) {
diff --git a/entry.c b/entry.c
index 8fb99bc..dc35a07 100644
--- a/entry.c
+++ b/entry.c
@@ -117,10 +117,14 @@ int checkout_entry(struct cache_entry *c
 {
 	struct stat st;
 	static char path[MAXPATHLEN+1];
-	int len = state->base_dir_len;
+	int len1 = state->base_dir_len;
+	int len2 = strlen(ce->name);
+	int len3 = state->name_suffix_len;
+	char *path_len1 = path + len1;
 
-	memcpy(path, state->base_dir, len);
-	strcpy(path + len, ce->name);
+	memcpy(path, state->base_dir, len1);
+	memcpy(path_len1, ce->name, len2 + 1);
+	memcpy(path_len1 + len2, state->name_suffix, len3 + 1);
 
 	if (!lstat(path, &st)) {
 		unsigned changed = ce_match_stat(ce, &st, 1);
diff --git a/read-tree.c b/read-tree.c
index f39fe5c..f223a0d 100644
--- a/read-tree.c
+++ b/read-tree.c
@@ -281,6 +281,7 @@ static void check_updates(struct cache_e
 {
 	static struct checkout state = {
 		.base_dir = "",
+		.name_suffix = "",
 		.force = 1,
 		.quiet = 1,
 		.refresh_cache = 1,
-- 
1.2.3.gdf23c-dirty


* [PATCH] Teach git-checkout-index to read filenames from stdin.
From: Shawn Pearce @ 2006-03-01  2:43 UTC (permalink / raw)
  To: git

Since git-checkout-index is often used from scripts which
may have a stream of filenames they wish to check out, it is
more convenient to use --stdin than xargs.  On platforms
where fork performance is currently sub-optimal and
the length of a command line is limited (*cough* Cygwin
*cough*) running a single git-checkout-index process for
a large number of files beats spawning it multiple times
from xargs.

File names are still accepted on the command line if
--stdin is not supplied.  Nothing is done if no files
are supplied on the command line or by stdin.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

---
 I wonder if Linus will stop using

   find ... -print0 | xargs -0 git-checkout-index --

 with this option available.

 Eh, probably not as xargs requires less typing than -z --stdin.
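
For script authors, the record format that --stdin expects (one path per
LF, or per NUL with -z) can be sketched in Python as below. This is a
simplified illustration, not a port of the C code: `read_paths` is a
hypothetical name, the input is assumed to be UTF-8, and the C-style
unquoting that the patch applies in LF mode to paths starting with a
double quote is left out.

```python
import io


def read_paths(stream, nul_terminated=False):
    """Yield paths from a binary stream, one per LF (or NUL) terminated record."""
    terminator = b"\0" if nul_terminated else b"\n"
    data = stream.read()
    for record in data.split(terminator):
        # Skip empty records, including the piece after the final terminator.
        if record:
            yield record.decode("utf-8")
```

The NUL form is the safer choice when file names may contain spaces or
newlines, which is why it pairs naturally with `find ... -print0`.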

 Documentation/git-checkout-index.txt |   20 +++++++++++++++--
 checkout-index.c                     |   41 ++++++++++++++++++++++++++++++++++
 2 files changed, 59 insertions(+), 2 deletions(-)

base 84434f9549d56e522a2eb4de370100f0a6e5e041
last df23c1119d0af1fbac6b8afd296113e155d9a878
diff --git a/Documentation/git-checkout-index.txt b/Documentation/git-checkout-index.txt
index 2a1e526..b0b6588 100644
--- a/Documentation/git-checkout-index.txt
+++ b/Documentation/git-checkout-index.txt
@@ -10,7 +10,9 @@ SYNOPSIS
 --------
 [verse]
 'git-checkout-index' [-u] [-q] [-a] [-f] [-n] [--prefix=<string>]
-		   [--stage=<number>] [--] <file>...
+		   [--stage=<number>]
+		   [-z] [--stdin]
+		   [--] [<file>]\*
 
 DESCRIPTION
 -----------
@@ -45,6 +47,15 @@ OPTIONS
 	Instead of checking out unmerged entries, copy out the
 	files from named stage.  <number> must be between 1 and 3.
 
+--stdin::
+	Instead of taking list of paths from the command line,
+	read list of paths from the standard input.  Paths are
+	separated by LF (i.e. one path per line) by default.
+
+-z::
+	Only meaningful with `--stdin`; paths are separated with
+	NUL character instead of LF.
+
 --::
 	Do not interpret any more arguments as options.
 
@@ -64,7 +75,12 @@ $ find . -name '*.h' -print0 | xargs -0 
 
 which will force all existing `*.h` files to be replaced with their
 cached copies. If an empty command line implied "all", then this would
-force-refresh everything in the index, which was not the point.
+force-refresh everything in the index, which was not the point.  But
+since git-checkout-index accepts --stdin it would be faster to use:
+
+----------------
+$ find . -name '*.h' -print0 | git-checkout-index -f -z --stdin
+----------------
 
 The `--` is just a good idea when you know the rest will be filenames;
 it will prevent problems with a filename of, for example,  `-a`.
diff --git a/checkout-index.c b/checkout-index.c
index 957b4a8..f54c606 100644
--- a/checkout-index.c
+++ b/checkout-index.c
@@ -22,6 +22,10 @@
  *
  *	find . -name '*.h' -print0 | xargs -0 git-checkout-index -f --
  *
+ * or:
+ *
+ *	find . -name '*.h' -print0 | git-checkout-index -f -z --stdin
+ *
  * which will force all existing *.h files to be replaced with
  * their cached copies. If an empty command line implied "all",
  * then this would force-refresh everything in the cache, which
@@ -33,6 +37,8 @@
  * but get used to it in scripting!).
  */
 #include "cache.h"
+#include "strbuf.h"
+#include "quote.h"
 
 static const char *prefix;
 static int prefix_length;
@@ -114,6 +120,8 @@ int main(int argc, char **argv)
 	int i;
 	int newfd = -1;
 	int all = 0;
+	int read_from_stdin = 0;
+	int line_termination = '\n';
 
 	prefix = setup_git_directory();
 	git_config(git_default_config);
@@ -156,6 +164,17 @@ int main(int argc, char **argv)
 				die("cannot open index.lock file.");
 			continue;
 		}
+		if (!strcmp(arg, "-z")) {
+			line_termination = 0;
+			continue;
+		}
+		if (!strcmp(arg, "--stdin")) {
+			if (i != argc - 1)
+				die("--stdin must be at the end");
+			read_from_stdin = 1;
+			i++; /* do not consider arg as a file name */
+			break;
+		}
 		if (!strncmp(arg, "--prefix=", 9)) {
 			state.base_dir = arg+9;
 			state.base_dir_len = strlen(state.base_dir);
@@ -191,9 +210,31 @@ int main(int argc, char **argv)
 
 		if (all)
 			die("git-checkout-index: don't mix '--all' and explicit filenames");
+		if (read_from_stdin)
+			die("git-checkout-index: don't mix '--stdin' and explicit filenames");
 		checkout_file(prefix_path(prefix, prefix_length, arg));
 	}
 
+	if (read_from_stdin) {
+		struct strbuf buf;
+		if (all)
+			die("git-checkout-index: don't mix '--all' and '--stdin'");
+		strbuf_init(&buf);
+		while (1) {
+			char *path_name;
+			read_line(&buf, stdin, line_termination);
+			if (buf.eof)
+				break;
+			if (line_termination && buf.buf[0] == '"')
+				path_name = unquote_c_style(buf.buf, NULL);
+			else
+				path_name = buf.buf;
+			checkout_file(prefix_path(prefix, prefix_length, path_name));
+			if (path_name != buf.buf)
+				free(path_name);
+		}
+	}
+
 	if (all)
 		checkout_all();
 
-- 
1.2.3.gdf23c


* Re: Quick question: end of lines
From: Emmanuel Guerin @ 2006-03-01  0:12 UTC (permalink / raw)
  To: git
In-Reply-To: <46a038f90602281215n259066b1qe2e6421625b82e75@mail.gmail.com>

2006/3/1, Martin Langhoff <martin.langhoff@gmail.com>:
> Why is this important?
>
> (I am thinking: any reasonably good text editor will know how to deal
> with unix newlines, but you may have different reasons).

Actually, you have found the problem. In my setup, Visual Studio is
used on Windows. The editor handles Unix line endings fine, but tends
to insert Windows ones when a file is modified. This leads to files
with inconsistent line endings and to nightmarish merges. We use
Subversion for the moment, and we have to make sure that all text
files are declared properly in svn to avoid conflicts.

I am beginning to realize that the only option is probably a tool
that converts the modified files "on the fly" before commits. I just
wanted to make sure that nobody facing a similar problem had found
another solution.
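
As a rough idea of what such a conversion step could look like, here is
a minimal Python sketch. It is an assumption-laden illustration, not an
existing git feature: `normalize_eol` and `has_mixed_eol` are made-up
names, only CRLF is handled, and how the step would be hooked in before
a commit is left open.

```python
def normalize_eol(data):
    """Convert CRLF line endings in `data` (bytes) to LF."""
    return data.replace(b"\r\n", b"\n")


def has_mixed_eol(data):
    """True when `data` contains both CRLF and bare-LF line endings."""
    crlf = data.count(b"\r\n")
    bare_lf = data.count(b"\n") - crlf
    return crlf > 0 and bare_lf > 0
```

A wrapper script could run `has_mixed_eol` over the modified files to
flag the inconsistent ones and rewrite them with `normalize_eol` before
committing.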

Anyway, thanks for the answers,

Emmanuel


* Re: [PATCH 3/3] Tie it all together: "git log"
From: Martin Langhoff @ 2006-02-28 23:38 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Linus Torvalds
In-Reply-To: <7vr75nm8cl.fsf@assigned-by-dhcp.cox.net>

On 3/1/06, Junio C Hamano <junkio@cox.net> wrote:
> I would say we should just rip merge-order out.  Who uses it,
> and why does it not work with topo-order, again?

IIRC archimport uses it, but there's no reason why topo-order wouldn't work.

cheers,


martin


* Re: [PATCH 3/3] Tie it all together: "git log"
From: Linus Torvalds @ 2006-02-28 23:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vr75nm8cl.fsf@assigned-by-dhcp.cox.net>



On Tue, 28 Feb 2006, Junio C Hamano wrote:
> 
> I would say we should just rip merge-order out.  Who uses it,
> and why does it not work with topo-order, again?

Well, assuming breaking --merge-order is fine, here's a patch (on top of 
the other ones) that makes

	git log <filename>

actually work, as far as I can tell. 

I didn't add the logic for --before/--after flags, but that should be 
pretty trivial, and is independent of this anyway.

Perhaps more importantly, I didn't remove the tests that now start 
failing, nor did I remove the actual code to do --merge-order ;/

		Linus

----
diff --git a/rev-list.c b/rev-list.c
index 94f22dd..6993b1a 100644
--- a/rev-list.c
+++ b/rev-list.c
@@ -8,10 +8,9 @@
 #include "diff.h"
 #include "revision.h"
 
-/* bits #0-2 in revision.h */
+/* bits #0-3 in revision.h */
 
-#define COUNTED		(1u << 3)
-#define SHOWN		(1u << 4)
+#define COUNTED		(1u << 4)
 #define TMP_MARK	(1u << 5) /* for isolated cases; clean after use */
 
 static const char rev_list_usage[] =
@@ -25,7 +24,6 @@ static const char rev_list_usage[] =
 "    --remove-empty\n"
 "    --all\n"
 "  ordering output:\n"
-"    --merge-order [ --show-breaks ]\n"
 "    --topo-order\n"
 "    --date-order\n"
 "  formatting output:\n"
@@ -47,22 +45,9 @@ static int show_parents = 0;
 static int hdr_termination = 0;
 static const char *commit_prefix = "";
 static enum cmit_fmt commit_format = CMIT_FMT_RAW;
-static int merge_order = 0;
-static int show_breaks = 0;
-static int stop_traversal = 0;
-static int no_merges = 0;
 
 static void show_commit(struct commit *commit)
 {
-	commit->object.flags |= SHOWN;
-	if (show_breaks) {
-		commit_prefix = "| ";
-		if (commit->object.flags & DISCONTINUITY) {
-			commit_prefix = "^ ";     
-		} else if (commit->object.flags & BOUNDARY) {
-			commit_prefix = "= ";
-		} 
-        }        		
 	printf("%s%s", commit_prefix, sha1_to_hex(commit->object.sha1));
 	if (show_parents) {
 		struct commit_list *parents = commit->parents;
@@ -96,73 +81,6 @@ static void show_commit(struct commit *c
 	fflush(stdout);
 }
 
-static int rewrite_one(struct commit **pp)
-{
-	for (;;) {
-		struct commit *p = *pp;
-		if (p->object.flags & (TREECHANGE | UNINTERESTING))
-			return 0;
-		if (!p->parents)
-			return -1;
-		*pp = p->parents->item;
-	}
-}
-
-static void rewrite_parents(struct commit *commit)
-{
-	struct commit_list **pp = &commit->parents;
-	while (*pp) {
-		struct commit_list *parent = *pp;
-		if (rewrite_one(&parent->item) < 0) {
-			*pp = parent->next;
-			continue;
-		}
-		pp = &parent->next;
-	}
-}
-
-static int filter_commit(struct commit * commit)
-{
-	if (stop_traversal && (commit->object.flags & BOUNDARY))
-		return STOP;
-	if (commit->object.flags & (UNINTERESTING|SHOWN))
-		return CONTINUE;
-	if (revs.min_age != -1 && (commit->date > revs.min_age))
-		return CONTINUE;
-	if (revs.max_age != -1 && (commit->date < revs.max_age)) {
-		stop_traversal=1;
-		return CONTINUE;
-	}
-	if (no_merges && (commit->parents && commit->parents->next))
-		return CONTINUE;
-	if (revs.paths && revs.dense) {
-		if (!(commit->object.flags & TREECHANGE))
-			return CONTINUE;
-		rewrite_parents(commit);
-	}
-	return DO;
-}
-
-static int process_commit(struct commit * commit)
-{
-	int action=filter_commit(commit);
-
-	if (action == STOP) {
-		return STOP;
-	}
-
-	if (action == CONTINUE) {
-		return CONTINUE;
-	}
-
-	if (revs.max_count != -1 && !revs.max_count--)
-		return STOP;
-
-	show_commit(commit);
-
-	return CONTINUE;
-}
-
 static struct object_list **process_blob(struct blob *blob,
 					 struct object_list **p,
 					 struct name_path *path,
@@ -219,8 +137,7 @@ static void show_commit_list(struct rev_
 
 	while ((commit = get_revision(revs)) != NULL) {
 		p = process_tree(commit->tree, p, NULL, "");
-		if (process_commit(commit) == STOP)
-			break;
+		show_commit(commit);
 	}
 	for (pending = revs->pending_objects; pending; pending = pending->next) {
 		struct object *obj = pending->item;
@@ -416,10 +333,6 @@ int main(int argc, const char **argv)
 				commit_prefix = "commit ";
 			continue;
 		}
-		if (!strncmp(arg, "--no-merges", 11)) {
-			no_merges = 1;
-			continue;
-		}
 		if (!strcmp(arg, "--parents")) {
 			show_parents = 1;
 			continue;
@@ -428,14 +341,6 @@ int main(int argc, const char **argv)
 			bisect_list = 1;
 			continue;
 		}
-		if (!strcmp(arg, "--merge-order")) {
-		        merge_order = 1;
-			continue;
-		}
-		if (!strcmp(arg, "--show-breaks")) {
-			show_breaks = 1;
-			continue;
-		}
 		usage(rev_list_usage);
 
 	}
@@ -456,17 +361,7 @@ int main(int argc, const char **argv)
 	save_commit_buffer = verbose_header;
 	track_object_refs = 0;
 
-	if (!merge_order) {
-		show_commit_list(&revs);
-	} else {
-#ifndef NO_OPENSSL
-		if (sort_list_in_merge_order(list, &process_commit)) {
-			die("merge order sort failed\n");
-		}
-#else
-		die("merge order sort unsupported, OpenSSL not linked");
-#endif
-	}
+	show_commit_list(&revs);
 
 	return 0;
 }
diff --git a/revision.c b/revision.c
index fb728c1..f98fae9 100644
--- a/revision.c
+++ b/revision.c
@@ -381,6 +381,9 @@ static void limit_list(struct rev_info *
 	struct commit_list *newlist = NULL;
 	struct commit_list **p = &newlist;
 
+	if (revs->paths)
+		diff_tree_setup_paths(revs->paths);
+
 	while (list) {
 		struct commit_list *entry = list;
 		struct commit *commit = list->item;
@@ -436,12 +439,13 @@ static void handle_all(struct rev_info *
  * Parse revision information, filling in the "rev_info" structure,
  * and removing the used arguments from the argument list.
  *
- * Returns the number of arguments left ("new argc").
+ * Returns the number of arguments left that weren't recognized
+ * (which are also moved to the head of the argument list)
  */
 int setup_revisions(int argc, const char **argv, struct rev_info *revs, const char *def)
 {
 	int i, flags, seen_dashdash;
-	const char **unrecognized = argv+1;
+	const char **unrecognized = argv + 1;
 	int left = 1;
 
 	memset(revs, 0, sizeof(*revs));
@@ -525,6 +529,10 @@ int setup_revisions(int argc, const char
 				revs->remove_empty_trees = 1;
 				continue;
 			}
+			if (!strncmp(arg, "--no-merges", 11)) {
+				revs->no_merges = 1;
+				continue;
+			}
 			if (!strcmp(arg, "--objects")) {
 				revs->tag_objects = 1;
 				revs->tree_objects = 1;
@@ -601,14 +609,11 @@ int setup_revisions(int argc, const char
 	}
 	if (revs->paths)
 		revs->limited = 1;
-	*unrecognized = NULL;
 	return left;
 }
 
 void prepare_revision_walk(struct rev_info *revs)
 {
-	if (revs->paths)
-		diff_tree_setup_paths(revs->paths);
 	sort_by_date(&revs->commits);
 	if (revs->limited)
 		limit_list(revs);
@@ -616,11 +621,67 @@ void prepare_revision_walk(struct rev_in
 		sort_in_topological_order(&revs->commits, revs->lifo);
 }
 
+static int rewrite_one(struct commit **pp)
+{
+	for (;;) {
+		struct commit *p = *pp;
+		if (p->object.flags & (TREECHANGE | UNINTERESTING))
+			return 0;
+		if (!p->parents)
+			return -1;
+		*pp = p->parents->item;
+	}
+}
+
+static void rewrite_parents(struct commit *commit)
+{
+	struct commit_list **pp = &commit->parents;
+	while (*pp) {
+		struct commit_list *parent = *pp;
+		if (rewrite_one(&parent->item) < 0) {
+			*pp = parent->next;
+			continue;
+		}
+		pp = &parent->next;
+	}
+}
+
 struct commit *get_revision(struct rev_info *revs)
 {
-	if (!revs->commits)
+	struct commit_list *list = revs->commits;
+	struct commit *commit;
+
+	if (!list)
 		return NULL;
-	return pop_most_recent_commit(&revs->commits, SEEN);
-}
 
+	/* Check the max_count ... */
+	commit = list->item;
+	switch (revs->max_count) {
+	case -1:
+		break;
+	case 0:
+		return NULL;
+	default:
+		revs->max_count--;
+	}
 
+	do {
+		commit = pop_most_recent_commit(&revs->commits, SEEN);
+		if (commit->object.flags & (UNINTERESTING|SHOWN))
+			continue;
+		if (revs->min_age != -1 && (commit->date > revs->min_age))
+			continue;
+		if (revs->max_age != -1 && (commit->date < revs->max_age))
+			return NULL;
+		if (revs->no_merges && commit->parents && commit->parents->next)
+			continue;
+		if (revs->paths && revs->dense) {
+			if (!(commit->object.flags & TREECHANGE))
+				continue;
+			rewrite_parents(commit);
+		}
+		commit->object.flags |= SHOWN;
+		return commit;
+	} while (revs->commits);
+	return NULL;
+}
diff --git a/revision.h b/revision.h
index 0bed3c0..0043c16 100644
--- a/revision.h
+++ b/revision.h
@@ -4,6 +4,7 @@
 #define SEEN		(1u<<0)
 #define UNINTERESTING   (1u<<1)
 #define TREECHANGE	(1u<<2)
+#define SHOWN		(1u<<3)
 
 struct rev_info {
 	/* Starting list */
@@ -16,6 +17,7 @@ struct rev_info {
 
 	/* Traversal flags */
 	unsigned int	dense:1,
+			no_merges:1,
 			remove_empty_trees:1,
 			lifo:1,
 			topo_order:1,


* Re: bug?: stgit creates (unneccessary?) conflicts when pulling
From: Catalin Marinas @ 2006-02-28 22:45 UTC (permalink / raw)
  To: Karl Hasselström; +Cc: git
In-Reply-To: <44037A5C.6080409@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 888 bytes --]

On 27/02/06, Catalin Marinas <catalin.marinas@gmail.com> wrote:
> An idea (untested, I don't even know whether it's feasible) would be to
> check which patches were merged by reverse-applying them starting with
> the last. In this situation, all the merged patches should just revert
> their changes. You only need to do a git-diff between the bottom and the
> top of the patch and git-apply the output (maybe without even modifying
> the tree). If this operation succeeds, the patch was integrated and you
> don't even need to push it.

I attached another patch that should work properly. It also pushes
empty patches on the stack if they were merged upstream (a 'stg clean'
is required to remove them). This is useful for the push --undo
command if you are not happy with the result.
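
The idea of reverse-applying the patches starting with the last one can
be shown with a toy model in Python. Nothing here is StGIT code: trees
are plain dicts of filename to content, a patch is a dict of filename to
a (before, after) pair, and `reverse_apply` and `merged_patches` are
simplified stand-ins written only for this illustration.

```python
def reverse_apply(tree, patch):
    """Try to undo `patch` on `tree`; return the new tree or None on failure.

    A tree is a dict of filename -> content and a patch is a dict of
    filename -> (before, after).  Both are toy stand-ins for git trees
    and diffs.
    """
    # The patch can only be reverted if every file has its "after" content.
    for filename, (_before, after) in patch.items():
        if tree.get(filename) != after:
            return None
    result = dict(tree)
    for filename, (before, _after) in patch.items():
        result[filename] = before
    return result


def merged_patches(tree, patches):
    """Return the names of patches that reverse-apply cleanly.

    `patches` is a list of (name, patch) pairs in stack order.  The last
    patch is tried first, the input tree is left untouched, and the
    result is reported in stack order.
    """
    merged = []
    for name, patch in reversed(patches):
        undone = reverse_apply(tree, patch)
        if undone is not None:
            tree = undone
            merged.append(name)
    merged.reverse()
    return merged
```

With two patches that change the same line (`x = 1` to `x = 2`, then
`x = 2` to `x = 3`) and an upstream tree that already contains `x = 3`,
going through the stack in reverse detects both as merged, whereas
trying the first patch first would fail on the very first step.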

I'll try this patch for a bit more before pushing into the repository.

--
Catalin

[-- Attachment #2: merged-test.diff --]
[-- Type: text/x-patch, Size: 8706 bytes --]

Add a merged upstream test for pull and push

From: Catalin Marinas <catalin.marinas@gmail.com>

This patch adds the --merged option to both pull and push commands. With
this option, these commands will first try to check which patches were
merged upstream by reverse-applying them in reverse order. This should
solve the situation where several patches modify the same line in a file.

Signed-off-by: Catalin Marinas <catalin.marinas@gmail.com>
---

 stgit/commands/common.py |   41 +++++++++++++++++++++++++++++++++++++++++
 stgit/commands/pull.py   |   15 +++++----------
 stgit/commands/push.py   |   28 ++++++----------------------
 stgit/git.py             |   12 +++++++++---
 stgit/stack.py           |   34 +++++++++++++++++++++++++++++++---
 5 files changed, 92 insertions(+), 38 deletions(-)

diff --git a/stgit/commands/common.py b/stgit/commands/common.py
index 2e1ba7a..2985379 100644
--- a/stgit/commands/common.py
+++ b/stgit/commands/common.py
@@ -132,6 +132,47 @@ def resolved_all(reset = None):
             resolved(filename, reset)
         os.remove(os.path.join(git.get_base_dir(), 'conflicts'))
 
+def push_patches(patches, check_merged = False):
+    """Push multiple patches onto the stack. This function is shared
+    between the push and pull commands
+    """
+    forwarded = crt_series.forward_patches(patches)
+    if forwarded > 1:
+        print 'Fast-forwarded patches "%s" - "%s"' % (patches[0],
+                                                      patches[forwarded - 1])
+    elif forwarded == 1:
+        print 'Fast-forwarded patch "%s"' % patches[0]
+
+    names = patches[forwarded:]
+
+    # check for patches merged upstream
+    if check_merged:
+        print 'Checking for patches merged upstream...',
+        sys.stdout.flush()
+
+        merged = crt_series.merged_patches(names)
+
+        print 'done (%d found)' % len(merged)
+    else:
+        merged = []
+
+    for p in names:
+        print 'Pushing patch "%s"...' % p,
+        sys.stdout.flush()
+
+        if p in merged:
+            crt_series.push_patch(p, empty = True)
+            print 'done (merged upstream)'
+        else:
+            modified = crt_series.push_patch(p)
+
+            if crt_series.empty_patch(p):
+                print 'done (empty patch)'
+            elif modified:
+                print 'done (modified)'
+            else:
+                print 'done'
+
 def name_email(address):
     """Return a tuple consisting of the name and email parsed from a
     standard 'name <email>' string
diff --git a/stgit/commands/pull.py b/stgit/commands/pull.py
index 843b579..8f26f4d 100644
--- a/stgit/commands/pull.py
+++ b/stgit/commands/pull.py
@@ -39,6 +39,9 @@ format."""
 
 options = [make_option('-n', '--nopush',
                        help = 'do not push the patches back after pulling',
+                       action = 'store_true'),
+           make_option('-m', '--merged',
+                       help = 'check for patches merged upstream',
                        action = 'store_true')]
 
 def func(parser, options, args):
@@ -75,15 +78,7 @@ def func(parser, options, args):
     print 'done'
 
     # push the patches back
-    if options.nopush:
-        applied = []
-    for p in applied:
-        print 'Pushing patch "%s"...' % p,
-        sys.stdout.flush()
-        crt_series.push_patch(p)
-        if crt_series.empty_patch(p):
-            print 'done (empty patch)'
-        else:
-            print 'done'
+    if not options.nopush:
+        push_patches(applied, options.merged)
 
     print_crt_patch()
diff --git a/stgit/commands/push.py b/stgit/commands/push.py
index 9924a78..90777c1 100644
--- a/stgit/commands/push.py
+++ b/stgit/commands/push.py
@@ -49,6 +49,9 @@ options = [make_option('-a', '--all',
            make_option('--reverse',
                        help = 'push the patches in reverse order',
                        action = 'store_true'),
+           make_option('-m', '--merged',
+                       help = 'check for patches merged upstream',
+                       action = 'store_true'),
            make_option('--undo',
                        help = 'undo the last push operation',
                        action = 'store_true')]
@@ -58,9 +61,9 @@ def is_patch_appliable(p):
     """See if patch exists, or is already applied.
     """
     if p in applied:
-        raise CmdException, 'Patch "%s" is already applied.' % p
+        raise CmdException, 'Patch "%s" is already applied' % p
     if p not in unapplied:
-        raise CmdException, 'Patch "%s" does not exist.' % p
+        raise CmdException, 'Patch "%s" does not exist' % p
 
 def func(parser, options, args):
     """Pushes the given patch or all onto the series
@@ -127,25 +130,6 @@ def func(parser, options, args):
     if options.reverse:
         patches.reverse()
 
-    forwarded = crt_series.forward_patches(patches)
-    if forwarded > 1:
-        print 'Fast-forwarded patches "%s" - "%s"' % (patches[0],
-                                                      patches[forwarded - 1])
-    elif forwarded == 1:
-        print 'Fast-forwarded patch "%s"' % patches[0]
-
-    for p in patches[forwarded:]:
-        is_patch_appliable(p)
-
-        print 'Pushing patch "%s"...' % p,
-        sys.stdout.flush()
+    push_patches(patches, options.merged)
 
-        modified = crt_series.push_patch(p)
-
-        if crt_series.empty_patch(p):
-            print 'done (empty patch)'
-        elif modified:
-            print 'done (modified)'
-        else:
-            print 'done'
     print_crt_patch()
diff --git a/stgit/git.py b/stgit/git.py
index a3488ff..40d54ef 100644
--- a/stgit/git.py
+++ b/stgit/git.py
@@ -465,14 +465,20 @@ def commit(message, files = None, parent
 
     return commit_id
 
-def apply_diff(rev1, rev2):
+def apply_diff(rev1, rev2, check_index = True):
     """Apply the diff between rev1 and rev2 onto the current
     index. This function doesn't need to raise an exception since it
     is only used for fast-pushing a patch. If this operation fails,
     the pushing would fall back to the three-way merge.
     """
-    return os.system('git-diff-tree -p %s %s | git-apply --index 2> /dev/null'
-                     % (rev1, rev2)) == 0
+    if check_index:
+        index_opt = '--index'
+    else:
+        index_opt = ''
+    cmd = 'git-diff-tree -p %s %s | git-apply %s 2> /dev/null' \
+          % (rev1, rev2, index_opt)
+
+    return os.system(cmd) == 0
 
 def merge(base, head1, head2):
     """Perform a 3-way merge between base, head1 and head2 into the
diff --git a/stgit/stack.py b/stgit/stack.py
index e1c55f0..165b5a7 100644
--- a/stgit/stack.py
+++ b/stgit/stack.py
@@ -780,7 +780,27 @@ class Series:
 
         return forwarded
 
-    def push_patch(self, name):
+    def merged_patches(self, names):
+        """Test which patches were merged upstream by reverse-applying
+        them in reverse order. The function returns the list of
+        patches detected to have been applied. The state of the tree
+        is restored to the original one
+        """
+        patches = [Patch(name, self.__patch_dir, self.__refs_dir)
+                   for name in names]
+        patches.reverse()
+
+        merged = []
+        for p in patches:
+            if git.apply_diff(p.get_top(), p.get_bottom(), False):
+                merged.append(p.get_name())
+        merged.reverse()
+
+        git.reset()
+
+        return merged
+
+    def push_patch(self, name, empty = False):
         """Pushes a patch on the stack
         """
         unapplied = self.get_unapplied()
@@ -798,7 +818,15 @@ class Series:
         modified = False
 
         # top != bottom always since we have a commit for each patch
-        if head == bottom:
+        if empty:
+            # just make an empty patch (top = bottom = HEAD). This
+            # option is useful to allow undoing already merged
+            # patches. The top is updated by refresh_patch since we
+            # need an empty commit
+            patch.set_bottom(head, backup = True)
+            patch.set_top(head, backup = True)
+            modified = True
+        elif head == bottom:
             # reset the backup information
             patch.set_bottom(bottom, backup = True)
             patch.set_top(top, backup = True)
@@ -835,7 +863,7 @@ class Series:
         self.__set_current(name)
 
         # head == bottom case doesn't need to refresh the patch
-        if head != bottom:
+        if empty or head != bottom:
             if not ex:
                 # if the merge was OK and no conflicts, just refresh the patch
                 # The GIT cache was already updated by the merge operation

^ permalink raw reply related

* Re: [PATCH 3/3] Tie it all together: "git log"
From: Junio C Hamano @ 2006-02-28 22:22 UTC (permalink / raw)
  To: git; +Cc: Linus Torvalds
In-Reply-To: <Pine.LNX.4.64.0602281251390.22647@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> writes:

> Anyway, apart from that issue (which I think should be trivial to sort out 
> if we accept breaking --merge-order), the rest looks like it should just 
> get more testing and handling of the few missing flags from rev-parse in 
> revision.c, and it should be good.

I would say we should just rip merge-order out.  Who uses it,
and why does it not work with topo-order, again?

^ permalink raw reply

* [PATCH] Warn about invalid refs
From: Johannes Schindelin @ 2006-02-28 21:16 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7voe5a1yft.fsf@assigned-by-dhcp.cox.net>


Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>

---

On Thu, 27 Oct 2005, Junio C Hamano wrote:

	> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
	> 
	> > On Thu, 27 Oct 2005, Junio C Hamano wrote:
	> >
	> >> Not that the current loop is any better for that purpose.  
	> >> We silently ignore not just dangling ref and ref not storing
	> >> 40-byte hex, but files starting with a period '.', names 
	> >> longer than 255 bytes, and unreadable ones, all of which we 
	> >> would probably want to warn about in such a tool.
	> >
	> > Okay, how about 'fprintf(stderr, "Warning: ...\n"); continue;' 
	> > instead of 'die("...");' then?
	> 
	> Yup.  That sounds sensible.

	Sorry for taking so long...

 refs.c |    9 +++++++--
 1 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/refs.c b/refs.c
index 826ae7a..982ebf8 100644
--- a/refs.c
+++ b/refs.c
@@ -151,10 +151,15 @@ static int do_for_each_ref(const char *b
 					break;
 				continue;
 			}
-			if (read_ref(git_path("%s", path), sha1) < 0)
+			if (read_ref(git_path("%s", path), sha1) < 0) {
+				fprintf(stderr, "%s points nowhere!\n", path);
 				continue;
-			if (!has_sha1_file(sha1))
+			}
+			if (!has_sha1_file(sha1)) {
+				fprintf(stderr, "%s does not point to a valid "
+						"commit object!\n", path);
 				continue;
+			}
 			retval = fn(path, sha1);
 			if (retval)
 				break;

^ permalink raw reply related

* Re: [PATCH 3/3] Tie it all together: "git log"
From: Linus Torvalds @ 2006-02-28 20:59 UTC (permalink / raw)
  To: Junio C Hamano, Git Mailing List
In-Reply-To: <Pine.LNX.4.64.0602281126340.22647@g5.osdl.org>

On Tue, 28 Feb 2006, Linus Torvalds wrote:
> 
> Again, this may not do exactly what the current "git log" does. That's not 
> the point. The point is to introduce the fundamental functionality, so 
> that people can play with this and improve on it, and fix any of my stupid 
> bugs.

Btw, before anybody even pipes up: the missing piece here is the nasty 
"filter_commit()" that rev-list.c does, and that really should be moved 
into revision.c, and this is where I hit on the "--merge-order" issues.

So for example, if you do "git log -- <filename>" with the new git, it
won't properly limit the output to the commits that change the passed-in
<filename>, because the filtering code still exists only in git-rev-list
(even if revision.c now does the traversal).

Same goes for the max-count-based filtering, for the same reason.

So the "process_commit()" handling should be moved into "get_revision()", 
but since the merge-order code also hooks into it...

Anyway, apart from that issue (which I think should be trivial to sort out 
if we accept breaking --merge-order), the rest looks like it should just 
get more testing and handling of the few missing flags from rev-parse in 
revision.c, and it should be good.

		Linus

^ permalink raw reply

* Re: [ANNOUNCE] quilt2git v0.2
From: Sam Vilain @ 2006-02-28 20:55 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, git
In-Reply-To: <20060228111115.GA32276@htj.dyndns.org>

Tejun Heo wrote:
> Hello, v0.2 of quilt2git available.  New in v0.2.
> 
> * handles new git HEAD file format properly (regular file storing ref: ...)
> 
> * makes use of mail format header from quilt patch description.  From:
>   becomes the author, Subject: the subject of the patch.  All commit
>   information should be maintained through git2quilt -> quilt2git now.
> 
> * --signoff option added.  This option is simply passed to git-commit.
> 
> * little fixes
> 
> http://home-tj.org/wiki/index.php/Misc
> http://home-tj.org/files/misc/quilt2git-0.2
> http://home-tj.org/files/misc/git2quilt-0.1
> 
> Thanks.
> 

FWIW, I have a similar script to import a quilt export as an stgit patch 
series, it's really simple but quite useful:

   http://vserver.ustl.gen.nz/scripts/import-quilt

Sam.

^ permalink raw reply

* Re: Quick question: end of lines
From: Martin Langhoff @ 2006-02-28 20:15 UTC (permalink / raw)
  To: Emmanuel Guerin; +Cc: git
In-Reply-To: <f898cca90602281032n6603bf14q@mail.gmail.com>

On 3/1/06, Emmanuel Guerin <emmanuel@guerin.fr.eu.org> wrote:
> To be more precise, I need to be able to checkout files on Unix and
> Windows, and it is important that the end of lines are set
> accordingly.

Why is this important?

(I am thinking: any reasonably good text editor will know how to deal
with unix newlines, but you may have different reasons).


martin

^ permalink raw reply

* Re: Quick question: end of lines
From: Johannes Schindelin @ 2006-02-28 20:07 UTC (permalink / raw)
  To: Emmanuel Guerin; +Cc: git
In-Reply-To: <f898cca90602281032n6603bf14q@mail.gmail.com>

Hi,

On Tue, 28 Feb 2006, Emmanuel Guerin wrote:

> Is it possible to checkout sources out of the GIT repository with
> Windows style end of lines?

No.

As far as git is concerned, every versioned file is equal. IMHO this 
decision is good, since

- different handling is more complicated (you have to keep track of the 
file type), and
- it is not really worth doing.

Windows can handle Unix line endings quite properly (with the notable 
exception of notepad.exe), and even Apple has learnt that it might be a 
stupid idea to insist on being different when it's just not worth it.

The only reason I would accept: you have to work with MS-DOS tools. But 
even in this case, I'd rather write a wrapper which converts to DOS line 
endings, executes the tool, and converts back.
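
A minimal sketch of such a wrapper, in Python, assuming the tool takes the
file name as its last argument and that the file is small enough to read
into memory. The names to_dos, to_unix and run_with_dos_endings are made up
for illustration; nothing like this ships with git.

```python
import subprocess


def to_dos(data: bytes) -> bytes:
    """Return data with every line ending as CRLF (input may be mixed)."""
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


def to_unix(data: bytes) -> bytes:
    """Return data with CRLF line endings turned back into LF."""
    return data.replace(b"\r\n", b"\n")


def run_with_dos_endings(path, argv):
    """Convert path to CRLF, run argv + [path], then convert back to LF.

    The conversion back happens even if the tool fails, so the working
    tree is left with Unix line endings either way.
    """
    with open(path, "rb") as f:
        original = f.read()
    with open(path, "wb") as f:
        f.write(to_dos(original))
    try:
        return subprocess.call(list(argv) + [path])
    finally:
        with open(path, "rb") as f:
            result = f.read()
        with open(path, "wb") as f:
            f.write(to_unix(result))
```

Something like run_with_dos_endings("config.sys", ["edit.exe"]) would then
keep the repository contents in Unix format while the tool sees DOS line
endings (both arguments here are placeholders).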

Hth,
Dscho

^ permalink raw reply

* [PATCH 3/3] Tie it all together: "git log"
From: Linus Torvalds @ 2006-02-28 19:30 UTC (permalink / raw)
  To: Junio C Hamano, Git Mailing List
In-Reply-To: <Pine.LNX.4.64.0602281115110.22647@g5.osdl.org>


This is what the previous diffs all built up to.

We can do "git log" as a trivial small helper function inside git.c, 
because the infrastructure is all there for us to use as a library.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---

Again, this may not do exactly what the current "git log" does. That's not 
the point. The point is to introduce the fundamental functionality, so 
that people can play with this and improve on it, and fix any of my stupid 
bugs.

It should be pretty easy to change some of the other rev-list-walking 
functions to use the library interfaces too, instead of executing an 
external "git-rev-list" process. This was a perfect example of how to get 
something working, though.

		Linus

diff --git a/Makefile b/Makefile
index 0b1a998..ead13be 100644
--- a/Makefile
+++ b/Makefile
@@ -450,7 +450,7 @@ strip: $(PROGRAMS) git$X
 
 git$X: git.c $(LIB_FILE)
 	$(CC) -DGIT_VERSION='"$(GIT_VERSION)"' \
-		$(CFLAGS) $(COMPAT_CFLAGS) -o $@ $(filter %.c,$^) $(LIB_FILE)
+		$(ALL_CFLAGS) -o $@ $(filter %.c,$^) $(LIB_FILE) $(LIBS)
 
 $(patsubst %.sh,%,$(SCRIPT_SH)) : % : %.sh
 	rm -f $@
diff --git a/git.c b/git.c
index 993cd0d..b0da6b1 100644
--- a/git.c
+++ b/git.c
@@ -12,6 +12,10 @@
 #include "git-compat-util.h"
 #include "exec_cmd.h"
 
+#include "cache.h"
+#include "commit.h"
+#include "revision.h"
+
 #ifndef PATH_MAX
 # define PATH_MAX 4096
 #endif
@@ -245,6 +249,25 @@ static int cmd_help(int argc, char **arg
 	return 0;
 }
 
+#define LOGSIZE (65536)
+
+static int cmd_log(int argc, char **argv, char **envp)
+{
+	struct rev_info rev;
+	struct commit *commit;
+	char *buf = xmalloc(LOGSIZE);
+
+	argc = setup_revisions(argc, argv, &rev, "HEAD");
+	prepare_revision_walk(&rev);
+	setup_pager();
+	while ((commit = get_revision(&rev)) != NULL) {
+		pretty_print_commit(CMIT_FMT_DEFAULT, commit, ~0, buf, LOGSIZE, 18);
+		printf("%s\n", buf);
+	}
+	free(buf);
+	return 0;
+}
+
 #define ARRAY_SIZE(x) (sizeof(x)/sizeof(x[0]))
 
 static void handle_internal_command(int argc, char **argv, char **envp)
@@ -256,6 +279,7 @@ static void handle_internal_command(int 
 	} commands[] = {
 		{ "version", cmd_version },
 		{ "help", cmd_help },
+		{ "log", cmd_log },
 	};
 	int i;
 

^ permalink raw reply related

* [PATCH 2/3] Introduce trivial new pager.c helper infrastructure
From: Linus Torvalds @ 2006-02-28 19:26 UTC (permalink / raw)
  To: Junio C Hamano, Git Mailing List
In-Reply-To: <Pine.LNX.4.64.0602281115110.22647@g5.osdl.org>


This introduces the new function

	void setup_pager(void);

to set up output to be written through a pager application.

All in preparation for doing the simple scripts in C.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---

Ok, this should be pretty obvious, which is not to say that it shouldn't 
be improved (ie it only handles trivial definitions of PAGER). Any obvious 
improvements are left as an exercise for the reader, as not important for 
my current goal of actually having something working.
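
For readers who want to play with the idea outside of C, here is a rough
Python sketch of the same fork/pipe arrangement: the child keeps running as
the program with its stdout redirected into a pipe, and the original
process becomes the pager. The helper pager_settings is a made-up name for
illustration; the defaults ("less", LESS=-S unless already set) follow the
patch below.

```python
import os
import sys


def pager_settings(environ):
    """Pick the pager program and the environment to run it in.

    Mirrors the C helper: use $PAGER if set, else "less", and set
    LESS=-S only when LESS is not already in the environment.
    """
    prog = environ.get("PAGER", "less")
    env = dict(environ)
    env.setdefault("LESS", "-S")
    return prog, env


def setup_pager(environ=os.environ):
    """Send this process's future stdout through a pager.

    Does nothing unless stdout is a terminal. On success the call
    returns in the forked child; the original process execs the pager.
    """
    if not sys.stdout.isatty():
        return
    try:
        read_fd, write_fd = os.pipe()
    except OSError:
        return
    sys.stdout.flush()
    try:
        pid = os.fork()
    except OSError:
        os.close(read_fd)
        os.close(write_fd)
        return

    if pid == 0:
        # Child: keep running as the program, writing into the pipe.
        os.dup2(write_fd, 1)
        os.close(read_fd)
        os.close(write_fd)
        return

    # Original process: read from the pipe and turn into the pager.
    os.dup2(read_fd, 0)
    os.close(read_fd)
    os.close(write_fd)
    prog, env = pager_settings(environ)
    try:
        os.execvpe(prog, [prog], env)
    except OSError:
        os._exit(255)
```

Calling setup_pager() once before printing is enough. Like the C version,
this only copes with a PAGER that is a bare program name, not a command
with arguments.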


diff --git a/Makefile b/Makefile
index 3575489..0b1a998 100644
--- a/Makefile
+++ b/Makefile
@@ -205,7 +205,7 @@ LIB_OBJS = \
 	quote.o read-cache.o refs.o run-command.o \
 	server-info.o setup.o sha1_file.o sha1_name.o strbuf.o \
 	tag.o tree.o usage.o config.o environment.o ctype.o copy.o \
-	fetch-clone.o revision.o \
+	fetch-clone.o revision.o pager.o \
 	$(DIFF_OBJS)
 
 LIBS = $(LIB_FILE)
diff --git a/cache.h b/cache.h
index 58eec00..3af6b86 100644
--- a/cache.h
+++ b/cache.h
@@ -352,4 +352,7 @@ extern int copy_fd(int ifd, int ofd);
 extern int receive_unpack_pack(int fd[2], const char *me, int quiet);
 extern int receive_keep_pack(int fd[2], const char *me, int quiet);
 
+/* pager.c */
+extern void setup_pager(void);
+
 #endif /* CACHE_H */
diff --git a/pager.c b/pager.c
new file mode 100644
index 0000000..1364e15
--- /dev/null
+++ b/pager.c
@@ -0,0 +1,48 @@
+#include "cache.h"
+
+/*
+ * This is split up from the rest of git so that we might do
+ * something different on Windows, for example.
+ */
+
+static void run_pager(void)
+{
+	const char *prog = getenv("PAGER");
+	if (!prog)
+		prog = "less";
+	setenv("LESS", "-S", 0);
+	execlp(prog, prog, NULL);
+}
+
+void setup_pager(void)
+{
+	pid_t pid;
+	int fd[2];
+
+	if (!isatty(1))
+		return;
+	if (pipe(fd) < 0)
+		return;
+	pid = fork();
+	if (pid < 0) {
+		close(fd[0]);
+		close(fd[1]);
+		return;
+	}
+
+	/* return in the child */
+	if (!pid) {
+		dup2(fd[1], 1);
+		close(fd[0]);
+		close(fd[1]);
+		return;
+	}
+
+	/* The original process turns into the PAGER */
+	dup2(fd[0], 0);
+	close(fd[0]);
+	close(fd[1]);
+
+	run_pager();
+	exit(255);
+}

^ permalink raw reply related

* [PATCH 1/3] git-rev-list libification: rev-list walking
From: Linus Torvalds @ 2006-02-28 19:24 UTC (permalink / raw)
  To: Junio C Hamano, Git Mailing List
In-Reply-To: <Pine.LNX.4.64.0602281115110.22647@g5.osdl.org>


This actually moves the "meat" of the revision walking from rev-list.c
to the new library code in revision.c. It introduces the new functions

	void prepare_revision_walk(struct rev_info *revs);
	struct commit *get_revision(struct rev_info *revs);

to prepare and then walk the revisions that we have.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---

All the same old warnings apply!! This is a bit more intrusive than the 
previous patch series, since it actually changes how things work. It 
passes all the tests I threw at it (well, actually, I only tested the end 
result of the whole series, bad me), but I'd obviously like to remind 
everybody that this is some really core code, and mistakes are bad.

I didn't worry about cleaning code up. It probably could be cleaned up, 
but I worked at trying to just move as much of it as-is from rev-list.c as 
possible, while leaving any code that was really only relevant to rev-list 
itself alone.

diff --git a/rev-list.c b/rev-list.c
index 2e80930..94f22dd 100644
--- a/rev-list.c
+++ b/rev-list.c
@@ -8,11 +8,10 @@
 #include "diff.h"
 #include "revision.h"
 
-/* bits #0 and #1 in revision.h */
+/* bits #0-2 in revision.h */
 
-#define COUNTED		(1u << 2)
-#define SHOWN		(1u << 3)
-#define TREECHANGE	(1u << 4)
+#define COUNTED		(1u << 3)
+#define SHOWN		(1u << 4)
 #define TMP_MARK	(1u << 5) /* for isolated cases; clean after use */
 
 static const char rev_list_usage[] =
@@ -213,17 +212,17 @@ static struct object_list **process_tree
 	return p;
 }
 
-static void show_commit_list(struct commit_list *list)
+static void show_commit_list(struct rev_info *revs)
 {
+	struct commit *commit;
 	struct object_list *objects = NULL, **p = &objects, *pending;
-	while (list) {
-		struct commit *commit = pop_most_recent_commit(&list, SEEN);
 
+	while ((commit = get_revision(revs)) != NULL) {
 		p = process_tree(commit->tree, p, NULL, "");
 		if (process_commit(commit) == STOP)
 			break;
 	}
-	for (pending = revs.pending_objects; pending; pending = pending->next) {
+	for (pending = revs->pending_objects; pending; pending = pending->next) {
 		struct object *obj = pending->item;
 		const char *name = pending->name;
 		if (obj->flags & (UNINTERESTING | SEEN))
@@ -259,19 +258,6 @@ static void show_commit_list(struct comm
 	}
 }
 
-static int everybody_uninteresting(struct commit_list *orig)
-{
-	struct commit_list *list = orig;
-	while (list) {
-		struct commit *commit = list->item;
-		list = list->next;
-		if (commit->object.flags & UNINTERESTING)
-			continue;
-		return 0;
-	}
-	return 1;
-}
-
 /*
  * This is a truly stupid algorithm, but it's only
  * used for bisection, and we just don't care enough.
@@ -379,224 +365,12 @@ static void mark_edges_uninteresting(str
 	}
 }
 
-#define TREE_SAME	0
-#define TREE_NEW	1
-#define TREE_DIFFERENT	2
-static int tree_difference = TREE_SAME;
-
-static void file_add_remove(struct diff_options *options,
-		    int addremove, unsigned mode,
-		    const unsigned char *sha1,
-		    const char *base, const char *path)
-{
-	int diff = TREE_DIFFERENT;
-
-	/*
-	 * Is it an add of a new file? It means that
-	 * the old tree didn't have it at all, so we
-	 * will turn "TREE_SAME" -> "TREE_NEW", but
-	 * leave any "TREE_DIFFERENT" alone (and if
-	 * it already was "TREE_NEW", we'll keep it
-	 * "TREE_NEW" of course).
-	 */
-	if (addremove == '+') {
-		diff = tree_difference;
-		if (diff != TREE_SAME)
-			return;
-		diff = TREE_NEW;
-	}
-	tree_difference = diff;
-}
-
-static void file_change(struct diff_options *options,
-		 unsigned old_mode, unsigned new_mode,
-		 const unsigned char *old_sha1,
-		 const unsigned char *new_sha1,
-		 const char *base, const char *path)
-{
-	tree_difference = TREE_DIFFERENT;
-}
-
-static struct diff_options diff_opt = {
-	.recursive = 1,
-	.add_remove = file_add_remove,
-	.change = file_change,
-};
-
-static int compare_tree(struct tree *t1, struct tree *t2)
-{
-	if (!t1)
-		return TREE_NEW;
-	if (!t2)
-		return TREE_DIFFERENT;
-	tree_difference = TREE_SAME;
-	if (diff_tree_sha1(t1->object.sha1, t2->object.sha1, "", &diff_opt) < 0)
-		return TREE_DIFFERENT;
-	return tree_difference;
-}
-
-static int same_tree_as_empty(struct tree *t1)
-{
-	int retval;
-	void *tree;
-	struct tree_desc empty, real;
-
-	if (!t1)
-		return 0;
-
-	tree = read_object_with_reference(t1->object.sha1, "tree", &real.size, NULL);
-	if (!tree)
-		return 0;
-	real.buf = tree;
-
-	empty.buf = "";
-	empty.size = 0;
-
-	tree_difference = 0;
-	retval = diff_tree(&empty, &real, "", &diff_opt);
-	free(tree);
-
-	return retval >= 0 && !tree_difference;
-}
-
-static void try_to_simplify_commit(struct commit *commit)
-{
-	struct commit_list **pp, *parent;
-
-	if (!commit->tree)
-		return;
-
-	if (!commit->parents) {
-		if (!same_tree_as_empty(commit->tree))
-			commit->object.flags |= TREECHANGE;
-		return;
-	}
-
-	pp = &commit->parents;
-	while ((parent = *pp) != NULL) {
-		struct commit *p = parent->item;
-
-		if (p->object.flags & UNINTERESTING) {
-			pp = &parent->next;
-			continue;
-		}
-
-		parse_commit(p);
-		switch (compare_tree(p->tree, commit->tree)) {
-		case TREE_SAME:
-			parent->next = NULL;
-			commit->parents = parent;
-			return;
-
-		case TREE_NEW:
-			if (revs.remove_empty_trees && same_tree_as_empty(p->tree)) {
-				*pp = parent->next;
-				continue;
-			}
-		/* fallthrough */
-		case TREE_DIFFERENT:
-			pp = &parent->next;
-			continue;
-		}
-		die("bad tree compare for commit %s", sha1_to_hex(commit->object.sha1));
-	}
-	commit->object.flags |= TREECHANGE;
-}
-
-static void add_parents_to_list(struct commit *commit, struct commit_list **list)
-{
-	struct commit_list *parent = commit->parents;
-
-	/*
-	 * If the commit is uninteresting, don't try to
-	 * prune parents - we want the maximal uninteresting
-	 * set.
-	 *
-	 * Normally we haven't parsed the parent
-	 * yet, so we won't have a parent of a parent
-	 * here. However, it may turn out that we've
-	 * reached this commit some other way (where it
-	 * wasn't uninteresting), in which case we need
-	 * to mark its parents recursively too..
-	 */
-	if (commit->object.flags & UNINTERESTING) {
-		while (parent) {
-			struct commit *p = parent->item;
-			parent = parent->next;
-			parse_commit(p);
-			p->object.flags |= UNINTERESTING;
-			if (p->parents)
-				mark_parents_uninteresting(p);
-			if (p->object.flags & SEEN)
-				continue;
-			p->object.flags |= SEEN;
-			insert_by_date(p, list);
-		}
-		return;
-	}
-
-	/*
-	 * Ok, the commit wasn't uninteresting. Try to
-	 * simplify the commit history and find the parent
-	 * that has no differences in the path set if one exists.
-	 */
-	if (revs.paths)
-		try_to_simplify_commit(commit);
-
-	parent = commit->parents;
-	while (parent) {
-		struct commit *p = parent->item;
-
-		parent = parent->next;
-
-		parse_commit(p);
-		if (p->object.flags & SEEN)
-			continue;
-		p->object.flags |= SEEN;
-		insert_by_date(p, list);
-	}
-}
-
-static struct commit_list *limit_list(struct commit_list *list)
-{
-	struct commit_list *newlist = NULL;
-	struct commit_list **p = &newlist;
-	while (list) {
-		struct commit_list *entry = list;
-		struct commit *commit = list->item;
-		struct object *obj = &commit->object;
-
-		list = list->next;
-		free(entry);
-
-		if (revs.max_age != -1 && (commit->date < revs.max_age))
-			obj->flags |= UNINTERESTING;
-		if (revs.unpacked && has_sha1_pack(obj->sha1))
-			obj->flags |= UNINTERESTING;
-		add_parents_to_list(commit, &list);
-		if (obj->flags & UNINTERESTING) {
-			mark_parents_uninteresting(commit);
-			if (everybody_uninteresting(list))
-				break;
-			continue;
-		}
-		if (revs.min_age != -1 && (commit->date > revs.min_age))
-			continue;
-		p = &commit_list_insert(commit, p)->next;
-	}
-	if (revs.tree_objects)
-		mark_edges_uninteresting(newlist);
-	if (bisect_list)
-		newlist = find_bisection(newlist);
-	return newlist;
-}
-
 int main(int argc, const char **argv)
 {
 	struct commit_list *list;
 	int i;
 
-	argc = setup_revisions(argc, argv, &revs);
+	argc = setup_revisions(argc, argv, &revs, NULL);
 
 	for (i = 1 ; i < argc; i++) {
 		const char *arg = argv[i];
@@ -672,24 +446,18 @@ int main(int argc, const char **argv)
 	    (!(revs.tag_objects||revs.tree_objects||revs.blob_objects) && !revs.pending_objects))
 		usage(rev_list_usage);
 
-	if (revs.paths)
-		diff_tree_setup_paths(revs.paths);
+	prepare_revision_walk(&revs);
+	if (revs.tree_objects)
+		mark_edges_uninteresting(revs.commits);
+
+	if (bisect_list)
+		revs.commits = find_bisection(revs.commits);
 
 	save_commit_buffer = verbose_header;
 	track_object_refs = 0;
 
-	if (!merge_order) {		
-		sort_by_date(&list);
-		if (list && !revs.limited && revs.max_count == 1 &&
-		    !revs.tag_objects && !revs.tree_objects && !revs.blob_objects) {
-			show_commit(list->item);
-			return 0;
-		}
-	        if (revs.limited)
-			list = limit_list(list);
-		if (revs.topo_order)
-			sort_in_topological_order(&list, revs.lifo);
-		show_commit_list(list);
+	if (!merge_order) {
+		show_commit_list(&revs);
 	} else {
 #ifndef NO_OPENSSL
 		if (sort_list_in_merge_order(list, &process_commit)) {
diff --git a/revision.c b/revision.c
index 0422593..fb728c1 100644
--- a/revision.c
+++ b/revision.c
@@ -3,6 +3,7 @@
 #include "blob.h"
 #include "tree.h"
 #include "commit.h"
+#include "diff.h"
 #include "refs.h"
 #include "revision.h"
 
@@ -183,6 +184,229 @@ static struct commit *get_commit_referen
 	die("%s is unknown object", name);
 }
 
+static int everybody_uninteresting(struct commit_list *orig)
+{
+	struct commit_list *list = orig;
+	while (list) {
+		struct commit *commit = list->item;
+		list = list->next;
+		if (commit->object.flags & UNINTERESTING)
+			continue;
+		return 0;
+	}
+	return 1;
+}
+
+#define TREE_SAME	0
+#define TREE_NEW	1
+#define TREE_DIFFERENT	2
+static int tree_difference = TREE_SAME;
+
+static void file_add_remove(struct diff_options *options,
+		    int addremove, unsigned mode,
+		    const unsigned char *sha1,
+		    const char *base, const char *path)
+{
+	int diff = TREE_DIFFERENT;
+
+	/*
+	 * Is it an add of a new file? It means that
+	 * the old tree didn't have it at all, so we
+	 * will turn "TREE_SAME" -> "TREE_NEW", but
+	 * leave any "TREE_DIFFERENT" alone (and if
+	 * it already was "TREE_NEW", we'll keep it
+	 * "TREE_NEW" of course).
+	 */
+	if (addremove == '+') {
+		diff = tree_difference;
+		if (diff != TREE_SAME)
+			return;
+		diff = TREE_NEW;
+	}
+	tree_difference = diff;
+}
+
+static void file_change(struct diff_options *options,
+		 unsigned old_mode, unsigned new_mode,
+		 const unsigned char *old_sha1,
+		 const unsigned char *new_sha1,
+		 const char *base, const char *path)
+{
+	tree_difference = TREE_DIFFERENT;
+}
+
+static struct diff_options diff_opt = {
+	.recursive = 1,
+	.add_remove = file_add_remove,
+	.change = file_change,
+};
+
+static int compare_tree(struct tree *t1, struct tree *t2)
+{
+	if (!t1)
+		return TREE_NEW;
+	if (!t2)
+		return TREE_DIFFERENT;
+	tree_difference = TREE_SAME;
+	if (diff_tree_sha1(t1->object.sha1, t2->object.sha1, "", &diff_opt) < 0)
+		return TREE_DIFFERENT;
+	return tree_difference;
+}
+
+static int same_tree_as_empty(struct tree *t1)
+{
+	int retval;
+	void *tree;
+	struct tree_desc empty, real;
+
+	if (!t1)
+		return 0;
+
+	tree = read_object_with_reference(t1->object.sha1, "tree", &real.size, NULL);
+	if (!tree)
+		return 0;
+	real.buf = tree;
+
+	empty.buf = "";
+	empty.size = 0;
+
+	tree_difference = 0;
+	retval = diff_tree(&empty, &real, "", &diff_opt);
+	free(tree);
+
+	return retval >= 0 && !tree_difference;
+}
+
+static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
+{
+	struct commit_list **pp, *parent;
+
+	if (!commit->tree)
+		return;
+
+	if (!commit->parents) {
+		if (!same_tree_as_empty(commit->tree))
+			commit->object.flags |= TREECHANGE;
+		return;
+	}
+
+	pp = &commit->parents;
+	while ((parent = *pp) != NULL) {
+		struct commit *p = parent->item;
+
+		if (p->object.flags & UNINTERESTING) {
+			pp = &parent->next;
+			continue;
+		}
+
+		parse_commit(p);
+		switch (compare_tree(p->tree, commit->tree)) {
+		case TREE_SAME:
+			parent->next = NULL;
+			commit->parents = parent;
+			return;
+
+		case TREE_NEW:
+			if (revs->remove_empty_trees && same_tree_as_empty(p->tree)) {
+				*pp = parent->next;
+				continue;
+			}
+		/* fallthrough */
+		case TREE_DIFFERENT:
+			pp = &parent->next;
+			continue;
+		}
+		die("bad tree compare for commit %s", sha1_to_hex(commit->object.sha1));
+	}
+	commit->object.flags |= TREECHANGE;
+}
+
+static void add_parents_to_list(struct rev_info *revs, struct commit *commit, struct commit_list **list)
+{
+	struct commit_list *parent = commit->parents;
+
+	/*
+	 * If the commit is uninteresting, don't try to
+	 * prune parents - we want the maximal uninteresting
+	 * set.
+	 *
+	 * Normally we haven't parsed the parent
+	 * yet, so we won't have a parent of a parent
+	 * here. However, it may turn out that we've
+	 * reached this commit some other way (where it
+	 * wasn't uninteresting), in which case we need
+	 * to mark its parents recursively too..
+	 */
+	if (commit->object.flags & UNINTERESTING) {
+		while (parent) {
+			struct commit *p = parent->item;
+			parent = parent->next;
+			parse_commit(p);
+			p->object.flags |= UNINTERESTING;
+			if (p->parents)
+				mark_parents_uninteresting(p);
+			if (p->object.flags & SEEN)
+				continue;
+			p->object.flags |= SEEN;
+			insert_by_date(p, list);
+		}
+		return;
+	}
+
+	/*
+	 * Ok, the commit wasn't uninteresting. Try to
+	 * simplify the commit history and find the parent
+	 * that has no differences in the path set if one exists.
+	 */
+	if (revs->paths)
+		try_to_simplify_commit(revs, commit);
+
+	parent = commit->parents;
+	while (parent) {
+		struct commit *p = parent->item;
+
+		parent = parent->next;
+
+		parse_commit(p);
+		if (p->object.flags & SEEN)
+			continue;
+		p->object.flags |= SEEN;
+		insert_by_date(p, list);
+	}
+}
+
+static void limit_list(struct rev_info *revs)
+{
+	struct commit_list *list = revs->commits;
+	struct commit_list *newlist = NULL;
+	struct commit_list **p = &newlist;
+
+	while (list) {
+		struct commit_list *entry = list;
+		struct commit *commit = list->item;
+		struct object *obj = &commit->object;
+
+		list = list->next;
+		free(entry);
+
+		if (revs->max_age != -1 && (commit->date < revs->max_age))
+			obj->flags |= UNINTERESTING;
+		if (revs->unpacked && has_sha1_pack(obj->sha1))
+			obj->flags |= UNINTERESTING;
+		add_parents_to_list(revs, commit, &list);
+		if (obj->flags & UNINTERESTING) {
+			mark_parents_uninteresting(commit);
+			if (everybody_uninteresting(list))
+				break;
+			continue;
+		}
+		if (revs->min_age != -1 && (commit->date > revs->min_age))
+			continue;
+		p = &commit_list_insert(commit, p)->next;
+	}
+	revs->commits = newlist;
+}
+
 static void add_one_commit(struct commit *commit, struct rev_info *revs)
 {
 	if (!commit || (commit->object.flags & SEEN))
@@ -214,10 +438,9 @@ static void handle_all(struct rev_info *
  *
  * Returns the number of arguments left ("new argc").
  */
-int setup_revisions(int argc, const char **argv, struct rev_info *revs)
+int setup_revisions(int argc, const char **argv, struct rev_info *revs, const char *def)
 {
 	int i, flags, seen_dashdash;
-	const char *def = NULL;
 	const char **unrecognized = argv+1;
 	int left = 1;
 
@@ -381,3 +604,23 @@ int setup_revisions(int argc, const char
 	*unrecognized = NULL;
 	return left;
 }
+
+void prepare_revision_walk(struct rev_info *revs)
+{
+	if (revs->paths)
+		diff_tree_setup_paths(revs->paths);
+	sort_by_date(&revs->commits);
+	if (revs->limited)
+		limit_list(revs);
+	if (revs->topo_order)
+		sort_in_topological_order(&revs->commits, revs->lifo);
+}
+
+struct commit *get_revision(struct rev_info *revs)
+{
+	if (!revs->commits)
+		return NULL;
+	return pop_most_recent_commit(&revs->commits, SEEN);
+}
+
+
diff --git a/revision.h b/revision.h
index a22f198..0bed3c0 100644
--- a/revision.h
+++ b/revision.h
@@ -3,6 +3,7 @@
 
 #define SEEN		(1u<<0)
 #define UNINTERESTING   (1u<<1)
+#define TREECHANGE	(1u<<2)
 
 struct rev_info {
 	/* Starting list */
@@ -32,7 +33,10 @@ struct rev_info {
 };
 
 /* revision.c */
-extern int setup_revisions(int argc, const char **argv, struct rev_info *revs);
+extern int setup_revisions(int argc, const char **argv, struct rev_info *revs, const char *def);
+extern void prepare_revision_walk(struct rev_info *revs);
+extern struct commit *get_revision(struct rev_info *revs);
+
 extern void mark_parents_uninteresting(struct commit *commit);
 extern void mark_tree_uninteresting(struct tree *tree);
 

^ permalink raw reply related

* [PATCH 0/3] git-rev-list libification effort: the next stage
From: Linus Torvalds @ 2006-02-28 19:19 UTC (permalink / raw)
  To: Junio C Hamano, Git Mailing List

Ok, the following three patches that I'll send out are still pretty rough, 
but they actually get us to the first real point of this whole exercise: 
writing one of the trivial git helper scripts in C.

In particular, at the end, we have "git log" being implemented as this 
trivial C function:

	#define LOGSIZE (65536)

	static int cmd_log(int argc, char **argv, char **envp)
	{
		struct rev_info rev;
		struct commit *commit;
		char *buf = xmalloc(LOGSIZE);

		argc = setup_revisions(argc, argv, &rev, "HEAD");
		prepare_revision_walk(&rev);
		setup_pager();
		while ((commit = get_revision(&rev)) != NULL) {
			pretty_print_commit(CMIT_FMT_DEFAULT, commit, ~0, buf, LOGSIZE, 18);
			printf("%s\n", buf);
		}
		free(buf);
		return 0;
	}

which is actually a pretty good example of what I wanted to do.  It's
not perfect yet (it doesn't parse the "--pretty=xxx" option yet, nor the
"--since" and "--until" dates, for example), but I think this is all
going in the right direction. 
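
As a toy illustration of the calling pattern (set up, then pull commits one
at a time until NULL), here is a small Python model of the walk. It is only
a sketch: Commit and RevInfo are invented stand-ins, a heap replaces the
date-sorted commit list that git itself keeps, and none of the limiting or
path simplification is modelled.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class Commit:
    name: str
    date: int
    parents: list = field(default_factory=list)
    seen: bool = False  # plays the role of the SEEN flag


class RevInfo:
    """Holds the pending commits, newest first."""

    def __init__(self):
        self.commits = []  # heap of (-date, sequence number, commit)
        self.counter = 0


def add_commit(revs, commit):
    """Queue a commit once; commits already seen are ignored."""
    if commit.seen:
        return
    commit.seen = True
    heapq.heappush(revs.commits, (-commit.date, revs.counter, commit))
    revs.counter += 1


def get_revision(revs):
    """Return the most recent pending commit, or None when done.

    The parents of the returned commit are queued for later.
    """
    if not revs.commits:
        return None
    _, _, commit = heapq.heappop(revs.commits)
    for parent in commit.parents:
        add_commit(revs, parent)
    return commit


def walk(tips):
    """Yield commit names reachable from tips, newest first."""
    revs = RevInfo()
    for tip in tips:
        add_commit(revs, tip)
    while (commit := get_revision(revs)) is not None:
        yield commit.name
```

The loop in walk() has the same shape as the while loop in cmd_log above;
everything interesting happens behind get_revision().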

			Linus

^ permalink raw reply

* Re: bug?: stgit creates (unneccessary?) conflicts when pulling
From: Catalin Marinas @ 2006-02-28 18:53 UTC (permalink / raw)
  To: Karl Hasselström; +Cc: git
In-Reply-To: <b0943d9e0602280700p132c6da2v@mail.gmail.com>

On 28/02/06, Catalin Marinas <catalin.marinas@gmail.com> wrote:
> On 27/02/06, Catalin Marinas <catalin.marinas@gmail.com> wrote:
> > An idea (untested, I don't even know whether it's feasible) would be to
> > check which patches were merged by reverse-applying them starting with
> > the last. In this situation, all the merged patches should just revert
> > their changes. You only need to do a git-diff between the bottom and the
> > top of the patch and git-apply the output (maybe without even modifying
> > the tree). If this operation succeeds, the patch was integrated and you
> > don't even need to push it.
>
> I tried some simple tests with the idea above. I attached a patch if
> you'd like to try (I won't push it to the main StGIT repository yet.
> For safety reasons, it only skips the merged patches when pushing
> them. A future version could simply delete the merged patches.

Don't bother trying this patch. I just found a bug with git.reset()
and the caching of the git.__head variable. I'll post another patch in
a few hours.

--
Catalin

^ permalink raw reply

* Quick question: end of lines
From: Emmanuel Guerin @ 2006-02-28 18:32 UTC (permalink / raw)
  To: git

Hi all,

I recently began to use git, and there is one thing I still do not
know how to do.

Is it possible to checkout sources out of the GIT repository with
Windows style end of lines?
In a manner much like the one Subversion is using?

To be more precise, I need to be able to checkout files on Unix and
Windows, and it is important that the end of lines are set
accordingly.

Thanks for any hints or pointers,

Regards,

Manu

^ permalink raw reply

* Re: [PATCH] diff-delta: bound hash list length to avoid O(m*n) behavior
From: Nicolas Pitre @ 2006-02-28 17:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vbqwrq4yi.fsf@assigned-by-dhcp.cox.net>

On Mon, 27 Feb 2006, Junio C Hamano wrote:

> Although I do not mean to rain on your effort and substantial
> improvement you made with this patch, we need to admit that
> improving pathological corner case has quite a diminishing
> effect in the overall picture.

It does, unfortunately.

The problem is that the code in diff_delta() is exercised very heavily 
during pack generation.  Just the simple hashing change from this patch, 
which involves two shifts instead of only one, increased the CPU time to 
generate a pack quite appreciably.  This is just to say that this code 
is pretty sensitive to even small details.

> But this definitely is an improvement nevertheless.  I should
> try this on my wife's AMD64 (Cygwin).  The same datasets I used
> in my previous complaint still seem to take a couple of seconds
> (or more) each on my Duron 750 X-<.

There is no miracle.  To prevent the extreme cases additional tests have 
to be performed in all cases, and therefore performance for all cases is 
affected.

> A handful additional ideas.
> 
>  * Lower the hash limit a bit more (obvious).  That might have
>    detrimental effect in the general case.

It does.  The current parameters are what I think is the best compromise 
between cost and compression.  And still 99% of the cases don't need 
a lower hash limit since their hash lists are already way below the 
limit.  For the cases where it makes a difference, lowering the 
hash limit also increases the cost associated with reworking the 
hash list, so there comes a point where you actually increase the 
resulting delta size while the CPU time stays constant.
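
As a rough illustration of what bounding a hash list can mean, here is 
a minimal sketch that thins an over-long bucket chain by keeping 
evenly spaced entries.  The struct layout, the function names and the 
ceiling-based stride are assumptions made for this example, not the 
code from the patch; entries are assumed to live in a caller-owned 
array, so dropped entries need no freeing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bucket entry: one indexed block of the reference buffer. */
struct entry {
	size_t pos;		/* offset of the block in the reference */
	struct entry *next;	/* next entry in the same bucket */
};

/* Helper for the example: chain arr[0] -> arr[1] -> ... -> arr[n-1]. */
static void link_entries(struct entry *arr, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		arr[i].pos = i;
		arr[i].next = (i + 1 < n) ? &arr[i + 1] : NULL;
	}
}

static size_t chain_length(const struct entry *e)
{
	size_t n = 0;

	for (; e; e = e->next)
		n++;
	return n;
}

/*
 * Thin the chain so that at most 'limit' entries remain, keeping every
 * stride-th entry.  Returns the number of entries kept.
 */
static size_t cull_chain(struct entry *head, size_t limit)
{
	size_t count = chain_length(head);
	size_t stride, kept = 0;
	struct entry *keep = head;

	assert(limit > 0);
	if (count <= limit)
		return count;

	stride = (count + limit - 1) / limit;	/* ceil(count / limit) */
	while (keep) {
		struct entry *e = keep;
		size_t i;

		/* Step over the entries that are being dropped. */
		for (i = 0; i < stride && e; i++)
			e = e->next;
		keep->next = e;
		keep = e;
		kept++;
	}
	return kept;
}
```

The trade-off described above shows up directly: the counting pass and 
the relinking pass are extra work, and every dropped entry is a 
possible match start that the delta search will no longer see.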

>  * Study the hash bucket distribution for the pathological case
>    and see if we can cheaply detect a pattern.  I suspect these
>    cases have relatively few but heavily collided buckets with
>    mostly sparse other entries.  If there is such an easily
>    detectable pattern, maybe we can look for such a pattern at
>    runtime, and cull hash entries more aggressively in such a
>    case?

Same issue as above.  And "looking for such a pattern" really does 
increase the CPU time in _all_ cases.  So looking for pathological 
cases has to be as cheap as possible, which I think is the case right 
now (and it is still too costly for my taste already).  And yet I don't 
think there is really a need for further reduction of the hash list at 
this point since the pathological cases are handled gracefully 
with this patch, while still producing a nicely packed delta.

>  * Try checking the target buffer to see if it would have many
>    hits on the heavily collided hash entries from the source
>    (maybe hash_count the target buffer as well).

Again that'd add another significant CPU cost to all cases, even those 
that don't need it at all.  The problem is not about those pathological 
cases anymore since I think they are well under control now.  It is the 
overhead those few pathological cases impose on the other 180000 
well-behaved objects that is a problem.

>  * Have pack-object detect a pathological blob (the test patch I
>    sent you previously uses the eye-candy timer for this
>    purpose, but we could getrusage() if we want to be more
>    precise) by observing how much time is spent for a single
>    round, and mark the blob as such, so we do not delta against
>    it with other blobs in find_deltas, when we are packing many
>    objects.  It does not really matter in the big picture if we
>    choose not to delta the pathological ones tightly, as long as
>    they are relatively few.

That is one solution, but that doesn't handle the root of the problem 
which is the cost of detecting those cases in the first place.

>  * Also in pack-object, have an optional backing-store to write
>    out deltified representations for results that took more than
>    certain amount of time to produce in find_deltas(), and reuse
>    them in write_object().

The pack reusing code is pretty effective in doing so already, isn't it?  
Since using git-repack -f should not be the common case, those 
pathological cases (now taking one second instead of 60 or more) should 
be reused most of the time.

> I tried an experimental patch to cull collided hash buckets
> very aggressively.  I haven't applied your last "reuse index"
> patch, though -- I think that is orthogonal and I'd like to
> leave that to the next round.

It is indeed orthogonal and I think you could apply it to the next 
branch without the other patches (it should apply with few problems).  
This is an obvious and indisputable gain, even more so if pack-objects is 
reworked to reduce memory usage by keeping only one live index for 
multiple consecutive delta attempts.

> With the same dataset: resulting pack is 9651096 vs 9664546
> (your patch) final pack size, with wallclock 2m45s (user 2m31).
> Still not good enough, and at the same time I wonder why it gets
> _smaller_ results than yours.

It really becomes hard to find the best balance, especially when the 
resulting delta is then fed through zlib.  Sometimes a larger delta will 
compress better, sometimes not.  My test bench was the whole git 
repository and the kernel repository, and with such a large number of 
objects it seems that smaller deltas always translate into smaller 
packs.  But that might not always be the case.

> I'd appreciate it if you can test it on the 20MB blobs and see
> what happens if you have time.

Before your patch:
user 0m9.713s, delta size = 4910988 bytes, or 1744110 compressed.

With your patch:
user 0m3.948s, delta size = 6753517 bytes, or 1978803 once compressed.

BTW there is another potential for improvement in the delta code (but I 
have real work to do now so)...

Let's suppose the reference buffer has:

***********************************************************************/

This is common with some comment block styles.  Now that line will end 
up with multiple blocks that will hash to the same thing and linked 
successively in the same hash bucket except for the last ones where the 
'/' is involved in the hashing.

Now suppose the target buffer also contains a line similar to the above 
but one '*' character longer.  The first "***" will be hashed to x, then 
x will be looked up in the reference index, and the first entry, 
corresponding to the first character of the line above, will be returned.  
At that point a forward search is started to find out how many bytes 
match.  In this case it will match up to the '/' in the reference buffer 
while the target will have another '*'.  The length will be recorded as 
the best match, so far so good.

Now the next entry from the same hash bucket will be tested in case it 
might provide a starting point for a longer match.  But we know in this 
case that the location of the second '*' in the reference buffer will be 
returned.  And we obviously know that this cannot match more data than 
the previous attempt, in fact it'll match one byte less.  And the hash 
entry to follow will return the location of the third '*' for another 
byte less to match.  Then the fourth '*' for yet another shorter match. 
And so on repeating those useless string comparisons multiple times.
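
The shrinking matches can be reproduced with a tiny forward-compare 
helper.  This is a minimal sketch: match_len() and its signature are 
made up for this illustration and are not the comparison loop in 
diff-delta.c.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of bytes that match when comparing the reference starting at
 * ref_pos with the target starting at tgt_pos.
 */
static size_t match_len(const char *ref, size_t ref_len, size_t ref_pos,
			const char *tgt, size_t tgt_len, size_t tgt_pos)
{
	size_t n = 0;

	assert(ref_pos <= ref_len && tgt_pos <= tgt_len);
	while (ref_pos + n < ref_len && tgt_pos + n < tgt_len &&
	       ref[ref_pos + n] == tgt[tgt_pos + n])
		n++;
	return n;
}
```

With the reference "*****/" and the target "******/", starting the 
compare at reference offsets 0, 1 and 2 (always against target offset 
0) gives match lengths 5, 4 and 3: each later entry in the bucket can 
only do worse than the first one.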

One improvement might consist of counting the number of consecutive 
identical bytes when starting a compare, and manage to skip as many hash 
entries (minus the block size) before looping again with more entries in 
the same hash bucket.
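
One possible reading of that idea, as a sketch only: count the run of 
identical bytes at the start of the compare, and treat the entries 
whose blocks lie entirely inside the run as redundant.  Both function 
names are hypothetical, and the code assumes that those entries sit 
consecutively in the bucket, as described above.

```c
#include <assert.h>
#include <stddef.h>

/* Number of consecutive bytes equal to buf[pos], starting at pos. */
static size_t run_length(const char *buf, size_t len, size_t pos)
{
	size_t n = 1;

	assert(pos < len);
	while (pos + n < len && buf[pos + n] == buf[pos])
		n++;
	return n;
}

/*
 * A run of 'run' identical bytes holds run - block_size + 1 blocks with
 * the same content.  Once the first has been tried, the remaining
 * run - block_size entries cannot give a longer match when the target
 * run is at least as long as the reference run.
 */
static size_t skippable_entries(size_t run, size_t block_size)
{
	return run > block_size ? run - block_size : 0;
}
```

For the reference "*****/" and a block size of 3, the run is 5 bytes 
long and two entries (the blocks at offsets 1 and 2) could be skipped 
after the first compare.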

The other case to distinguish is when the target buffer has instead a 
_shorter_ line, say 5 fewer '*''s before the final '/'.  Here we would 
loop 4 times matching only a bunch of '*' before we finally match the 
'/' as well on the fifth attempt becoming the best match.  In this case 
we could test if the repeated byte we noticed from the start is present 
further away in the reference buffer, and if so simply skip hash entries 
while searching forward in the reference buffer.  When the reference 
buffer doesn't match the repeated character then the comparison is 
resumed with the target buffer, and in this case the '/' would match 
right away, avoiding those four extra loops recomparing all those '*' 
needlessly.

Such improvements added to the search algorithm might make it 
significantly faster, even in the presence of multiple hash entries all 
located in the same bucket.  Or maybe not if the data set has few 
repeated characters.  And this is probably worthwhile only on top of my 
small block patch. Only experimentation could tell.  Someone willing to 
try?

Nicolas

^ permalink raw reply

* Re: [PATCH] git pull cannot find remote refs.
From: Stefan-W. Hahn @ 2006-02-28 16:19 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vlkvwuvyl.fsf@assigned-by-dhcp.cox.net>

Also sprach Junio C Hamano am Mon, 27 Feb 2006 at 17:13:22 -0800:

> ls-remote shows "SHA1\tPATH".  The original says "hexadecimal
> followed by [either a single space or a single tab] followed by

> difference.  Puzzled...

Grmph... You are right.

> I've seen two servers DNS round-robin and one of them fail to
> respond.  The first "fetch" goes to the good one and the second
> ls-remote goes to the bad one, then you would see "Oops, we
> cannot peek tags".  But this patch does not have anything to do
> with that problem..

Trapped. I haven't seen this, but perhaps it was the problem. 
I'll watch for the next occurrence.

Sorry for the noise.

Stefan

-- 
Stefan-W. Hahn                          It is easy to make things.
/ mailto:stefan.hahn@s-hahn.de /        It is hard to make things simple.			

^ permalink raw reply

* Re: fatal: unexpected EOF
From: Tony Luck @ 2006-02-28 15:59 UTC (permalink / raw)
  To: Brian Gerst; +Cc: Linus Torvalds, Git Mailing List
In-Reply-To: <44046F94.3070806@didntduck.org>

> I doubt it is a problem with mirroring, since it affects all repos
> (kernel, git, cogito, etc.) at the same time.

Ditto.  Jes has been grumbling overnight that he can't get a reliable pull
from my kernel repo ... and that hasn't been updated in 10 days, so the
mirror code shouldn't be touching it.  His error was:

  fatal: read error (Connection reset by peer)
  Fetch failure: git://git.kernel.org/pub/...

He also reported that after a few retries it worked.

Does the git daemon log any errors to syslog on the server?  If so, can someone
with access go take a look?

-Tony

^ permalink raw reply

* Re: fatal: unexpected EOF
From: Brian Gerst @ 2006-02-28 15:43 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List
In-Reply-To: <Pine.LNX.4.64.0602280731210.22647@g5.osdl.org>

Linus Torvalds wrote:
> 
> On Tue, 28 Feb 2006, Brian Gerst wrote:
> 
>>Lately I've been receiving this error frequently from git.kernel.org:
>>
>>Fetching pack (head and objects)...
>>fatal: unexpected EOF
>>cg-fetch: fetching pack failed
>>
>>What is causing this?
> 
> 
> Almost any error will cause the pack sending to abort, and the git:// 
> protocol only opens a single socket for data, so there is no way for the 
> other end to say _what_ failed.
> 
> With git.kernel.org, I suspect the reason for the failure is almost always 
> the same, though: the mirroring is not complete, so it doesn't have all 
> object files. The mirroring from master.kernel.org to the actual public 
> machines is just a rsync script, so there's no atomicity guarantees.
> 
> That said, it might be a load issue too - I don't know what limits 
> Peter & co put on the git daemons, and it might also be that it's set up 
> to accept at most <n> connections and will close anything else.
> 
> 		Linus
> 
> 

I doubt it is a problem with mirroring, since it affects all repos 
(kernel, git, cogito, etc.) at the same time.

--
				Brian Gerst

^ permalink raw reply

* Re: fatal: unexpected EOF
From: Linus Torvalds @ 2006-02-28 15:34 UTC (permalink / raw)
  To: Brian Gerst; +Cc: Git Mailing List
In-Reply-To: <440449D7.3010508@didntduck.org>

On Tue, 28 Feb 2006, Brian Gerst wrote:
>
> Lately I've been receiving this error frequently from git.kernel.org:
> 
> Fetching pack (head and objects)...
> fatal: unexpected EOF
> cg-fetch: fetching pack failed
> 
> What is causing this?

Almost any error will cause the pack sending to abort, and the git:// 
protocol only opens a single socket for data, so there is no way for the 
other end to say _what_ failed.

With git.kernel.org, I suspect the reason for the failure is almost always 
the same, though: the mirroring is not complete, so it doesn't have all 
object files. The mirroring from master.kernel.org to the actual public 
machines is just a rsync script, so there's no atomicity guarantees.

That said, it might be a load issue too - I don't know what limits 
Peter & co put on the git daemons, and it might also be that it's set up 
to accept at most <n> connections and will close anything else.

		Linus

^ permalink raw reply
