Git development
 help / color / mirror / Atom feed
* Re: [PATCH] Add git-annotate, a tool for assigning blame.
From: Johannes Schindelin @ 2006-02-24  0:00 UTC (permalink / raw)
  To: Fredrik Kuivinen; +Cc: Ryan Anderson, Junio C Hamano, git
In-Reply-To: <20060223225547.GB8673@c165.ib.student.liu.se>

Hi,

On Thu, 23 Feb 2006, Fredrik Kuivinen wrote:

> On Thu, Feb 23, 2006 at 05:10:49PM -0500, Ryan Anderson wrote:
> > (Biased critique since I have the other tool in the tree, but still...)
> > 
> > > +    FILE* fout = fopen("/tmp/git-blame-tmp1", "w");
> > 
> > Probably should be using something like mkstemp (mkstmp?) here, so if
> > someone is runnign things as root or with a malicous user around, things
> > don't collide.
> > 
> > Hell, so on a multi-user machine this doesn't blow up on you.
> 
> Yep, I know. The code is mostly a proof of concept. I didn't submit it
> for inclusion.

Ha ha, famous last words!

> > But, read down for a related comment.
> > 
> > > +    if(!fout)
> > > +        die("fopen tmp1 failed: %s", strerror(errno));
> > > +
> > > +    if(fwrite(info_c->buf, info_c->size, 1, fout) != 1)
> > > +        die("fwrite 1 failed: %s", strerror(errno));
> > > +    fclose(fout);
> > > +
> > > +    fout = fopen("/tmp/git-blame-tmp2", "w");
> > > +    if(!fout)
> > > +        die("fopen tmp2 failed: %s", strerror(errno));
> > > +
> > > +    if(fwrite(info_o->buf, info_o->size, 1, fout) != 1)
> > > +        die("fwrite 2 failed: %s", strerror(errno));
> > > +    fclose(fout);
> > > +
> > > +    FILE* fin = popen("diff -u0 /tmp/git-blame-tmp1 /tmp/git-blame-tmp2", "r");
> > > +    if(!fin)
> > > +        die("popen failed: %s", strerror(errno));
> > 
> > Can't git-diff-tree do this sufficiently, anyway?  See my Perl script
> > for an example, you just need both commit IDs and both filenames and the
> > appropriate -M and you get the right results.
> > 
> > (It's possible that's part of where the performance differences are,
> > though, not really sure at the moment.)
> > 
> 
> Yeah.. maybe. My first thought was to avoid forking and execing diff
> and use some C library for doing the diffing instead (libxdiff). But
> then I just wanted to get some code working and the simplest solution
> I could think of was to fork and exec diff.

git-diff-tree fork()s a diff. So, by fork()ing git-diff-tree you get two 
fork()s (and no knife...).

> > I'm going to stop there for the moment, I'm not really confident in my
> > understanding of git-internals to say much more just yet.
> > 
> > This could probably benefit a *LOT* from the libification project, I
> > think, though.
> 
> Yes, perhaps. Some of the git-rev-list bits might simplify a couple of
> things.

The major problem is probably not solved: What Linus calls a "stream 
interface".

I.e. if you pipe the output of git-rev-list to another program, you 
*need* to execute the two semi-simultaneously. The "alternative" would be 
to use buffers, which can get huge (and are sometimes not needed: think 
git-whatchanged, which starts outputting before it's getting no more 
input).

> I have found some severe problems with the code I posted, in
> particular it doesn't handle parallel development tracks at all. I am
> working on fixing it, but it isn't finished yet.

Looking forward to them!

Ciao,
Dscho

^ permalink raw reply

* Re: [PATCH] Add git-annotate, a tool for assigning blame.
From: Junio C Hamano @ 2006-02-24  0:17 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Fredrik Kuivinen
In-Reply-To: <Pine.LNX.4.63.0602240055080.31816@wbgn013.biozentrum.uni-wuerzburg.de>

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> > This could probably benefit a *LOT* from the libification project, I
>> > think, though.
>> 
>> Yes, perhaps. Some of the git-rev-list bits might simplify a couple of
>> things.
>
> The major problem is probably not solved: What Linus calls a "stream 
> interface".
>
> I.e. if you pipe the output of git-rev-list to another program, you 
> *need* to execute the two semi-simultaneously. The "alternative" would be 
> to use buffers, which can get huge (and are sometimes not needed: think 
> git-whatchanged, which starts outputting before it's getting no more 
> input).

You need a limited coroutine support, something like generator
functions in Python ;-).  In C, traditional way of doing it is
to make your application specific function a callback of
rev-list or whatever generator is, which is very unpleasant to
code.

>> I have found some severe problems with the code I posted, in
>> particular it doesn't handle parallel development tracks at all. I am
>> working on fixing it, but it isn't finished yet.
>
> Looking forward to them!

Likewise.

^ permalink raw reply

* Re: [PATCH] New git-seek command with documentation and test.
From: J. Bruce Fields @ 2006-02-24  0:18 UTC (permalink / raw)
  To: Carl Worth; +Cc: Junio C Hamano, git, Linus Torvalds
In-Reply-To: <87zmkhrf4y.wl%cworth@cworth.org>

On Thu, Feb 23, 2006 at 12:31:25PM -0800, Carl Worth wrote:
> --- /dev/null
> +++ b/Documentation/git-seek.txt
> @@ -0,0 +1,44 @@
> +git-bisect(1)
> +=============

Oops.

> +When given a <revision>, git-seek updates the files in the working
> +tree to the state of the given revision. It will do this by performing
> +a checkout of <revision> to a new branch named "seek", or by resetting
> +the seek branch if it already exists.

I wonder if its a good idea to silently reset a branch named with a
short common word?

> +LONG_USAGE='git-seek provides a temporary excursion through the revision history.
> +
> +When given a <revision>, git-seek updates the files in the working
> +tree to the state of the given revision. It will do this by performing
> +a checkout of <revision> to a new branch named "seek", or by resetting
> +the seek branch if it already exists.

These long usage texts with language duplicated from the man pages seem
like they'd be asking for bit-rot, when an update happens in one place
but not the other.  I dunno.

--b.

^ permalink raw reply

* Re: [PATCH] New git-seek command with documentation and test.
From: linux @ 2006-02-24  0:29 UTC (permalink / raw)
  To: cworth; +Cc: git

The annoying thing about temporary branch names like "bisect" and "seek"
is that:
a) They clutter up the nae space available to the repository user.
   Users have to know that those are reserved names.
b) If a repository is cloned while they're in use, they might get
   into a "remotes" file, with even more confusing results.

This is somewhat heretical, but how about making a truly unnamed branch by
having .git/HEAD *not* be a symlink, but rather hold a commit ID directly?
It's already well established that files in the .git directory directly
are strictly local to this working directory, so it seems a much better
home for such temporary state.

Admittedly, this requires more invasive edits (particularly adding a third
legitimate case to validate_symref()), but it seems to make more sense.
And be ultimately simpler than workarounds for the above problems.

Just loosen the rules from ".git/HEAD must be a symlink" to
".git/HEAD must be a symlink before you can check in".


Yes, I know it's radical.  At least I'm not questioning the
power and efficacy of indulgences.  :-)

^ permalink raw reply

* Re: [PATCH] Add git-annotate, a tool for assigning blame.
From: Johannes Schindelin @ 2006-02-24  0:52 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Fredrik Kuivinen
In-Reply-To: <7vek1thaop.fsf@assigned-by-dhcp.cox.net>

Hi,

On Thu, 23 Feb 2006, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> >> > This could probably benefit a *LOT* from the libification project, I
> >> > think, though.
> >> 
> >> Yes, perhaps. Some of the git-rev-list bits might simplify a couple of
> >> things.
> >
> > The major problem is probably not solved: What Linus calls a "stream 
> > interface".
> >
> > I.e. if you pipe the output of git-rev-list to another program, you 
> > *need* to execute the two semi-simultaneously. The "alternative" would be 
> > to use buffers, which can get huge (and are sometimes not needed: think 
> > git-whatchanged, which starts outputting before it's getting no more 
> > input).
> 
> You need a limited coroutine support, something like generator
> functions in Python ;-).  In C, traditional way of doing it is
> to make your application specific function a callback of
> rev-list or whatever generator is, which is very unpleasant to
> code.

The most unpleasant aspect is that you usually need something like "this" 
in C++: a pointer to an object (which you have to pass around all the 
time). Without it you can not use the function in a nested way.

However, I can also see benefits of this when compared to the traditional 
UNIX approach. It should be faster, for one, since you don't need to pass 
data through pipes all the time. (This might be not as true for Linux as 
for other OSes.)

Ciao,
Dscho

^ permalink raw reply

* Re: [PATCH] New git-seek command with documentation and test.
From: Johannes Schindelin @ 2006-02-24  0:54 UTC (permalink / raw)
  To: linux; +Cc: cworth, git
In-Reply-To: <20060224002915.17331.qmail@science.horizon.com>

Hi,

On Fri, 23 Feb 2006, linux@horizon.com wrote:

> This is somewhat heretical, but how about making a truly unnamed branch by
> having .git/HEAD *not* be a symlink, but rather hold a commit ID directly?

Not heretical. How do you intend to switch branches now? And how do you 
intend to record the starting point of git-seek to which you want to 
return to? All leads back to .git/HEAD pointing to a branch (or whatever 
you want to call it). And BTW, .git/HEAD is no symlink these days, but a 
symref.

Hth,
Dscho

^ permalink raw reply

* [PATCH] git-seek: Eliminate spurious warning. Fix errant reference to git-bisect in docs.
From: Carl Worth @ 2006-02-24  1:01 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Junio C Hamano, git, Linus Torvalds
In-Reply-To: <20060224001848.GB21094@fieldses.org>

[-- Attachment #1: Type: text/plain, Size: 2198 bytes --]

This fixed a bug that would cause "git seek" to mistakenly try to
checkout the seek branch just after deleting it. Of course, that would
never work, but fixing the bug does squelch the annoying error caused
by the bug.

Also fix an errant title of "git-bisect" in the git-seek documentation.

---

On Thu, 23 Feb 2006 19:18:48 -0500, "J. Bruce Fields" wrote:
> On Thu, Feb 23, 2006 at 12:31:25PM -0800, Carl Worth wrote:
> > +git-bisect(1)
> > +=============
> 
> Oops.

Thanks.

> I wonder if its a good idea to silently reset a branch named with a
> short common word?

It at least takes some care not to leave commits dangling when doing
this, (the seek branch must at least be a subset of the current
HEAD). I was pretty much following the lead of git-bisect here,
(though "bisect" is definitely a touch longer and less common than
"seek").

If it would be preferred to hide such "internal" branch names behind
some unlikely symbol or such, that would obviously be easy to do.

As is, the seek branch is at least documented, and rather well
advertised in operation, (for example, returning with "git seek"
reported "Deleted branch seek.").

> These long usage texts with language duplicated from the man pages seem
> like they'd be asking for bit-rot, when an update happens in one place
> but not the other.  I dunno.

Yeah, I don't know. Again, I was just imitating things I'd seen
elsewhere.

 Documentation/git-seek.txt |    4 ++--
 git-seek.sh                |    2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

43f042982c26859b6b7f6055fc03dda8e89f4e70
diff --git a/Documentation/git-seek.txt b/Documentation/git-seek.txt
index cb5c13d..513dbc7 100644
--- a/Documentation/git-seek.txt
+++ b/Documentation/git-seek.txt
@@ -1,5 +1,5 @@
-git-bisect(1)
-=============
+git-seek(1)
+===========
 
 NAME
 ----
diff --git a/git-seek.sh b/git-seek.sh
index 26f0b76..921c014 100644
--- a/git-seek.sh
+++ b/git-seek.sh
@@ -65,7 +65,7 @@ seek_reset() {
 		source 
 	fi
 	git checkout "$source" &&
-	(git branch -d seek || err=$? ; git checkout seek ; exit $err) &&
+	(git branch -d seek || (err=$? ; git checkout seek ; exit $err)) &&
 	rm -f "$GIT_DIR/head-name"
 }
 
-- 
1.2.3.g2656-dirty


[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply related

* Re: [PATCH] New git-seek command with documentation and test.
From: linux @ 2006-02-24  1:23 UTC (permalink / raw)
  To: Johannes.Schindelin, linux; +Cc: cworth, git
In-Reply-To: <Pine.LNX.4.63.0602240152490.32472@wbgn013.biozentrum.uni-wuerzburg.de>

> Not heretical. How do you intend to switch branches now?

Point .git/HEAD at the branch like usual.
.git/HEAD would not be a symref *only* when on the unnamed temporary
branch (which doesn't accept commits).

> And how do you intend to record the starting point of git-seek to
> which you want to return to?

Te same way it's done now for git-seek or git-bisect: by copying the
old HEAD to a temporary location like .git/head-name.

> All leads back to .git/HEAD pointing to a branch (or whatever 
> you want to call it).

In the usual case, yes it should.  Any time you want to be able to
develop on a branch, you need .git/HEAD pointing to a branch.
You only make it point to a commit directly is when exploring the
history with no intention of developing from it.

(Note that you can easily change your mind with a simple
"git checkout -b <branch>".)

> And BTW, .git/HEAD is no symlink these days, but a symref.

Yes, I'm sorry; I was just being lazy with my terminology.

^ permalink raw reply

* Re: [PATCH] Convert open("-|") to qx{} calls
From: Rogan Dawes @ 2006-02-24  5:19 UTC (permalink / raw)
  To: Alex Riesen; +Cc: Johannes Schindelin, Junio C Hamano, git
In-Reply-To: <20060223211403.GB5827@steel.home>

Alex Riesen wrote:
> Randal L. Schwartz, Thu, Feb 23, 2006 21:41:44 +0100:
>> Johannes> Now that our local Perl guru joined the discussion, may I ask what
>> Johannes> is, and what is not quoted when put inside qx{}?
>>
>> Nothing is quoted.  Your string acts as if it was XXX in:
>>
>>         sh -c 'XXX'
>>
> 
> Not so for ActiveState. It'll just run the first non-whitespace word
> passing the rest of the line in its command-line.
> It's not even worse then to pass it all to cmd/command :)
> 

Not true.

 > type t
#!perl -w

print qx{echo joe & echo joe}."\n";
 > perl t
joe
joe

 >

If the shell was not interpreting the arguments, you would expect to get 
1 line with:

joe & echo joe

on it.

 > perl -v
This is perl, v5.8.7 built for MSWin32-x86-multi-thread
(with 7 registered patches, see perl -V for more detail)

Copyright 1987-2005, Larry Wall

Binary build 813 [148120] provided by ActiveState http://www.ActiveState.com
ActiveState is a division of Sophos.
Built Jun  6 2005 13:36:37

Perl may be copied only under the terms of either the Artistic License 
or the
GNU General Public License, which may be found in the Perl 5 source kit.

Complete documentation for Perl, including FAQ lists, should be found on
this system using `man perl' or `perldoc perl'.  If you have access to the
Internet, point your browser at http://www.perl.org/, the Perl Home Page.

Regards,

Rogan

^ permalink raw reply

* Re: FYI: git-am allows creation of empty commits.
From: Junio C Hamano @ 2006-02-24  5:56 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: git
In-Reply-To: <m1slqahyxt.fsf@ebiederm.dsl.xmission.com>

ebiederm@xmission.com (Eric W. Biederman) writes:

> After fixing it up and doing all of my edits I occasionally forget
> the git-update-index step, before calling git-am --resolved.  This
> proceeds along it's merry way and creates an empty commit.

Certainly a safty measure is missing here.  Thanks for
noticing.  How about something like this?

---
diff --git a/git-am.sh b/git-am.sh
index 85ecada..6730813 100755
--- a/git-am.sh
+++ b/git-am.sh
@@ -300,10 +300,16 @@ do
 	    } >"$dotest/final-commit"
 	    ;;
 	*)
-		case "$resolved,$interactive" in
-		tt)
-			# This is used only for interactive view option.
+		case "$resolved" in
+		t)
+			# This is used for interactive view option, but
+			# also we should see if the user really did
+			# something...
 			git-diff-index -p --cached HEAD >"$dotest/patch"
+			test -s "$dotest/patch" || {
+				echo "You said resolved, but there is no change in the index..."
+				stop_here $this
+			}
 			;;
 		esac
 	esac

^ permalink raw reply related

* Re: [PATCH] git-seek: Eliminate spurious warning. Fix errant reference to git-bisect in docs.
From: Junio C Hamano @ 2006-02-24  6:02 UTC (permalink / raw)
  To: Carl Worth; +Cc: git
In-Reply-To: <87vev5r2m4.wl%cworth@cworth.org>

Carl Worth <cworth@cworth.org> writes:

>> I wonder if its a good idea to silently reset a branch named with a
>> short common word?
>
> It at least takes some care not to leave commits dangling when doing
> this, (the seek branch must at least be a subset of the current
> HEAD). I was pretty much following the lead of git-bisect here,
> (though "bisect" is definitely a touch longer and less common than
> "seek").

IIUC Cogito seems to use cg-seek-point or something long and
unusual like that...

^ permalink raw reply

* Re: [PATCH] fmt-merge-msg: avoid open "-|" list form for Perl 5.6
From: Linus Torvalds @ 2006-02-24  6:43 UTC (permalink / raw)
  To: Sam Vilain; +Cc: Junio C Hamano, git
In-Reply-To: <43FE1B9D.10403@vilain.net>



On Fri, 24 Feb 2006, Sam Vilain wrote:
> 
> I like the term "Domain Specific Language" to refer to this sort of thing.  It
> even hints at using the right kind of tools to achieve it, too :)

Just for fun, I wrote a first cut at a script engine for passing pipes 
around.

It's designed so that the "fork+exec with a pipe" should be easily 
replaced by "spawn with a socket" if that's what the target wants, but 
it also has some rather strange syntax, so I'm in no way claiming that 
this is a sane approach.

It was fun to write, though. You can already do some strange things with 
it, like writing a script like this

	set @ --since=2.months.ago Makefile
	exec git-rev-parse --default HEAD $@
		stdout arguments
	exec git-rev-list $arguments
		stdout revlist
	exec git-diff-tree --pretty --stdin
		stdin revlist
		stdout diff-tree-output
	exec less -S
		stdin diff-tree-output

which kind of shows the idea (it sets the "@" variable by hand, because 
the silly "git-script" thing doesn't set it itself).

I'm not sure this is worth pursuing (it really is a very strange kind of 
script syntax), but it was amusing to do. 

No docs - if you want to know how it works, you'll just have to read the 
equally strange sources.

		Linus

----
diff-tree 3e7dbcaae63278ccd413d93ecf9cba65a0d07021 (from d27d5b3c5b97ca30dfc5c448dc8cdae914131051)
Author: Linus Torvalds <torvalds@osdl.org>
Date:   Thu Feb 23 22:06:12 2006 -0800

    Add really strange script engine

diff --git a/Makefile b/Makefile
index 0c04882..247030b 100644
--- a/Makefile
+++ b/Makefile
@@ -164,7 +164,7 @@ PROGRAMS = \
 	git-upload-pack$X git-verify-pack$X git-write-tree$X \
 	git-update-ref$X git-symbolic-ref$X git-check-ref-format$X \
 	git-name-rev$X git-pack-redundant$X git-repo-config$X git-var$X \
-	git-describe$X git-merge-tree$X
+	git-describe$X git-merge-tree$X git-script$X
 
 # what 'all' will build and 'install' will install, in gitexecdir
 ALL_PROGRAMS = $(PROGRAMS) $(SIMPLE_PROGRAMS) $(SCRIPTS)
@@ -204,7 +204,7 @@ LIB_OBJS = \
 	quote.o read-cache.o refs.o run-command.o \
 	server-info.o setup.o sha1_file.o sha1_name.o strbuf.o \
 	tag.o tree.o usage.o config.o environment.o ctype.o copy.o \
-	fetch-clone.o \
+	fetch-clone.o execute.o \
 	$(DIFF_OBJS)
 
 LIBS = $(LIB_FILE)
diff --git a/cache.h b/cache.h
index 5020f07..e4e66ce 100644
--- a/cache.h
+++ b/cache.h
@@ -352,4 +352,7 @@ extern int copy_fd(int ifd, int ofd);
 extern int receive_unpack_pack(int fd[2], const char *me, int quiet);
 extern int receive_keep_pack(int fd[2], const char *me, int quiet);
 
+/* script execution engine.. */
+extern int execute(const char *name, char *buf, unsigned int size);
+
 #endif /* CACHE_H */
diff --git a/execute.c b/execute.c
new file mode 100644
index 0000000..abb6801
--- /dev/null
+++ b/execute.c
@@ -0,0 +1,622 @@
+/*
+ * Stupid git script execution engine
+ *
+ * Copyrigt (C) 2006, Linus Torvalds
+ *
+ * There's one rule here: only ever expand a single level of variables.
+ * In particular - we never expand as a string, and keep everything as
+ * a list of entries. Always.
+ *
+ * This avoids all issues with quoting etc, since it's never an issue.
+ * When we execute a program, we have a list of arguments, no quoting
+ * or string parsing involved.
+ */
+#include "cache.h"
+#include <sys/wait.h>
+
+enum vartype {
+	var_none,
+	var_fd,
+	var_array
+};
+
+struct argument {
+	enum vartype type;
+	int fd, members, allocs, error;
+	const char **array;
+};
+
+struct variable {
+	const char *name;
+	struct variable *next;
+	struct argument value;
+};
+
+struct cmd_struct {
+	const char *line;
+	unsigned int len;
+	struct cmd_struct *subcmd;
+	struct cmd_struct *next;
+};
+
+struct parse_buf {
+	const char *name;
+	const char *error;
+	char *prog;
+	unsigned int size;
+	unsigned int offset;
+	unsigned int line;
+	unsigned int linestart;
+};
+
+static struct variable *vars = NULL;
+static void run_program(struct cmd_struct *cmd);
+
+static int countline(struct parse_buf *buf)
+{
+	int count = 0;
+	unsigned offset;
+
+	for (offset = buf->offset; offset < buf->size; offset++) {
+		unsigned char c = buf->prog[offset];
+		switch (c) {
+		case '\n':
+			buf->line++;
+		/* fallthrough */
+		case '\r':
+			count = 0;
+			buf->offset = offset + 1;
+			buf->prog[offset] = 0;
+			continue;
+		case ' ':
+			count++;
+			continue;
+		case '\t':
+			count = (count + 8) & ~7;
+			continue;
+		default:
+			buf->linestart = offset;
+			return count;
+		}
+	}
+	buf->offset = offset;
+	return -2;
+}
+
+/*
+ * When this is called, we've already done the indentation check,
+ * and "buf->linestart" points to the actual start of the command.
+ */
+static struct cmd_struct *parse_one_line(struct parse_buf *buf)
+{
+	unsigned int offset;
+	struct cmd_struct *cmd = xmalloc(sizeof(*cmd));
+	memset(cmd, 0, sizeof(*cmd));
+
+	offset = buf->linestart;
+	cmd->line = buf->prog + offset;
+	for ( ; offset < buf->size; offset++) {
+		unsigned char c = buf->prog[offset];
+		switch (c) {
+		case '\n':
+			buf->prog[offset++] = 0;
+			buf->line++;
+			break;
+		default:
+			continue;
+		}
+		break;
+	}
+	buf->offset = offset;
+	return cmd;
+}
+
+static struct cmd_struct *parse(struct parse_buf *buf, int indent)
+{
+	struct cmd_struct *first = NULL, *last = NULL;
+
+	for (;;) {
+		struct cmd_struct *now;
+		int newindent = countline(buf);
+
+		if (newindent < indent)
+			break;
+		if (!first)
+			indent = newindent;
+		if (newindent > indent) {
+			struct cmd_struct *subcmd;
+			if (last->subcmd) {
+				buf->error = "bad indentation";
+				return NULL;
+			}
+			subcmd = parse(buf, newindent);
+			if (!subcmd)
+				return NULL;
+			last->subcmd = subcmd;
+			continue;
+		}
+		now = parse_one_line(buf);
+		if (!now)
+			return NULL;
+		if (last)
+			last->next = now;
+		else
+			first = now;
+		last = now;
+	}
+	return first;
+}
+
+static struct cmd_struct *exec_bad(struct cmd_struct *cmd, struct argument *arg)
+{
+	printf("unrecognized command: '%s'\n", cmd->line);
+	return NULL;
+}
+
+static struct cmd_struct *exec_echo(struct cmd_struct *cmd, struct argument *arg)
+{
+	int i;
+	for (i = 0; i < arg->members; i++)
+		printf("%s%c", arg->array[i], i == arg->members-1 ? '\n': ' ');
+	return cmd->next;
+}
+
+static struct variable *find_variable(const char *name)
+{
+	struct variable *var = vars;
+	while (var) {
+		if (!strcmp(var->name, name))
+			return var;
+		var = var->next;
+	}
+	return NULL;
+}
+
+static struct variable *create_variable(const char *name)
+{
+	struct variable *var = find_variable(name);
+
+	if (!var) {
+		var = xmalloc(sizeof(*var));
+		memset(var, 0, sizeof(*var));
+		var->name = name;
+		var->next = vars;
+		vars = var;
+	}
+	return var;
+}
+
+static struct cmd_struct *exec_set(struct cmd_struct *cmd, struct argument *arg)
+{
+	int count = arg->members;
+	struct variable *var;
+	const char *name;
+	unsigned size;
+
+	if (!count)
+		return cmd->next;
+	name = arg->array[0];
+	var = create_variable(arg->array[0]);
+
+	var->value.members = count-1;
+	size = count * sizeof(var->value.array[0]);
+	var->value.array = xmalloc(size);
+	memcpy(var->value.array, arg->array+1, size);
+
+	return cmd->next;
+}
+
+static void free_arg_list(struct argument *arg)
+{
+	/*
+	 * We can't free the actual entries, since we re-use them
+	 * on expansion. Right or wrong, that's how it is...
+	 */
+	free(arg->array);
+}
+
+static void drop_variable(struct variable *var)
+{
+	free_arg_list(&var->value);
+	free(var);
+}
+
+static struct cmd_struct *exec_unset(struct cmd_struct *cmd, struct argument *arg)
+{
+	int i;
+
+	for (i = 0; i < arg->members; i++) {
+		const char *name = arg->array[i];
+		struct variable *var, **p = &vars;
+
+		while ((var = *p) != NULL) {
+			if (!strcmp(var->name, name)) {
+				*p = var->next;
+				drop_variable(var);
+				break;
+			}
+			p = &var->next;
+		}
+	}
+	return cmd->next;
+}
+
+static struct cmd_struct *exec_exit(struct cmd_struct *cmd, struct argument *arg)
+{
+	int value = 0;
+	if (arg->members)
+		value = atoi(arg->array[0]);
+	exit(value);
+}
+
+static struct cmd_struct *exec_else(struct cmd_struct *cmd, struct argument *arg)
+{
+	return cmd->next;
+}
+
+static struct cmd_struct *exec_if(struct cmd_struct *cmd, struct argument *arg)
+{
+	struct cmd_struct *pos, *neg;
+
+	pos = cmd->subcmd;
+	neg = cmd->next;
+	if (neg) {
+		if (!strncmp(neg->line, "else", 4))
+			neg = neg->subcmd;
+		else
+			neg = NULL;
+	}
+	if (!arg->members)
+		pos = neg;
+	run_program(pos);
+	return cmd->next;
+}
+
+static int match_cmd(const char *match, struct cmd_struct *cmd)
+{
+	int len = strlen(match), cmdlen = strlen(cmd->line);
+	if (cmdlen < len)
+		return 0;
+	if (cmdlen > len && !isspace(cmd->line[len]))
+		return 0;
+	return !memcmp(match, cmd->line, len);
+}
+
+static int set_input(int *redirect, const char *val)
+{
+	struct variable *var;
+
+	while (isspace(*val))
+		val++;
+	var = find_variable(val);
+	if (!var || var->value.type != var_fd)
+		die("bad 'fd' variable %s", val);
+
+	*redirect = var->value.fd;
+	var->value.fd = -1;
+	return 0;
+}
+
+static int set_output(int *redirect, const char *val)
+{
+	int fd[2];
+	struct variable *var;
+
+	while (isspace(*val))
+		val++;
+	var = create_variable(val);
+
+	if (pipe(fd) < 0)
+		die("unable to pipe");
+	var->value.type = var_fd;
+	var->value.fd = fd[0];
+	*redirect = fd[1];
+	return 0;
+}
+
+/*
+ * Only these routines should need to be ported to a "spawn()" interface
+ */
+static struct cmd_struct *exec_exec(struct cmd_struct *cmd, struct argument *arg)
+{
+	int redirect[3];
+	pid_t pid;
+	int nr = arg->members;
+	struct cmd_struct *io;
+
+	if (!nr) {
+		run_program(cmd->subcmd);
+		return cmd->next;
+	}
+
+	memset(redirect, 0, sizeof(redirect));
+	for (io = cmd->subcmd; io ; io = io->next) {
+		if (match_cmd("stdin", io)) {
+			set_input(redirect+0, io->line + 5);
+			continue;
+		}
+		if (match_cmd("stdout", io)) {
+			set_output(redirect+1, io->line + 6);
+			continue;
+		}
+		if (match_cmd("stderr", io)) {
+			set_output(redirect+2, io->line + 6);
+			continue;
+		}
+	}
+
+	/*
+	 * HERE! Use spawn if necessary - the fd redirect table has been set up
+	 */
+	pid = vfork();
+	if (pid < 0) {
+		error("vfork failed (%s)", strerror(errno));
+		return NULL;
+	}
+
+	if (!pid) {
+		int retval;
+		if (redirect[0]) {
+			dup2(redirect[0], 0);
+			close(redirect[0]);
+		}
+		if (redirect[1]) {
+			dup2(redirect[1], 1);
+			close(redirect[1]);
+		}
+		if (redirect[2]) {
+			dup2(redirect[2], 2);
+			close(redirect[2]);
+		}
+		retval = execvp(arg->array[0], (char *const*) arg->array);
+		exit(255);
+	}
+
+	if (redirect[0])
+		close(redirect[0]);
+	if (redirect[1])
+		close(redirect[1]);
+	if (redirect[2])
+		close(redirect[2]);
+
+	/*
+	 * If we don't have anybody waiting for output,
+	 * wait for it
+	 */
+	if (!redirect[1]) {
+		int status;
+		while (waitpid(pid, &status, 0) < 0) {
+			if (errno == EINTR)
+				continue;
+			error("unable to wait for child (%s)", strerror(errno));
+			return NULL;
+		}
+		/* FIXME! Put exit status in a variable! */
+	}
+	run_program(cmd->subcmd);
+	return cmd->next;
+}
+
+static struct cmd_struct *exec_nop(struct cmd_struct *cmd, struct argument *arg)
+{
+	return cmd->next;
+}
+
+static const struct cmd_def {
+	const char *n;
+	int len;
+	struct cmd_struct *(*exec)(struct cmd_struct *, struct argument *);
+} cmds[] = {
+	{ "bad", 0, exec_bad },
+	{ "set", 3, exec_set },
+	{ "unset", 5, exec_unset },
+	{ "echo", 4, exec_echo },
+	{ "exit", 4, exec_exit },
+	{ "if", 2, exec_if },
+	{ "else", 4, exec_else },
+	{ "exec", 4, exec_exec },
+	{ "stdin", 5, exec_nop },
+	{ "stdout", 6, exec_nop },
+	{ "stderr", 6, exec_nop },
+};
+
+static void add_argument(struct argument *arg, const char *n)
+{
+	int allocs = arg->allocs, members = arg->members;
+
+	if (members+1 >= allocs) {
+		allocs = (allocs * 3) / 2 + 32;
+		arg->array = xrealloc(arg->array, allocs*sizeof(arg->array[0]));
+		arg->allocs = allocs;
+	}
+	arg->array[members++] = n;
+	arg->array[members] = NULL;
+	arg->members = members;
+}
+
+static int get_word(const char *line, const char **res)
+{
+	int quoted = 0;
+	int offset = 0;
+	int stop = 0;
+	char *buf;
+
+	for (;;) {
+		unsigned char c = line[offset];
+		if (!c)
+			break;
+		offset++;
+		if (c == '\\') {
+			quoted ^= 1;
+			continue;
+		}
+		if (quoted) {
+			quoted = 0;
+			continue;
+		}
+		if (stop) {
+			if (c == stop)
+				stop = 0;
+			continue;
+		}
+		if (c == '\'' || c == '"') {
+			stop = c;
+			continue;
+		}
+		if (!isspace(c)) {
+			continue;
+		}
+		offset--;
+		break;
+	}
+	if (quoted || stop)
+		return -1;
+	buf = xmalloc(offset+1);
+	memcpy(buf, line, offset);
+	buf[offset] = 0;
+	*res = buf;
+	return offset;
+}
+
+static int expand_word(const char *line, struct argument *arg)
+{
+	const char *word;
+	int offset = get_word(line, &word);
+
+	if (offset > 0)
+		add_argument(arg, word);
+	return offset;
+}
+
+static void convert_fd_into_array(struct variable *var)
+{
+	int fd = var->value.fd;
+	char buffer[8192];
+	int len, offset, last;
+
+	var->value.fd = -1;
+	var->value.type = var_array;
+	len = 0;
+	for (;;) {
+		int ret = read(fd, buffer + len, sizeof(buffer) - len);
+		if (!ret)
+			break;
+		if (ret < 0) {
+			if (errno == EINTR)
+				continue;
+			break;
+		}
+		len += ret;
+		if (len >= sizeof(buffer))
+			break;
+	}
+
+	last = 0;
+	for (offset = 0; offset < len; offset++) {
+		unsigned char c = buffer[offset];
+		if (c == '\n') {
+			buffer[offset] = 0;
+			add_argument(&var->value, buffer+last);
+			last = offset+1;
+			continue;
+		}
+	}		
+}
+
+static int expand_variable(const char *line, struct argument *arg)
+{
+	const char *word;
+	int offset = get_word(line+1, &word);
+
+	if (offset > 0) {
+		struct variable *var = find_variable(word);
+		offset++;	/* The '$' character itself */
+		if (var) {
+			int i;
+			if (var->value.type == var_fd)
+				convert_fd_into_array(var);
+			for (i = 0; i < var->value.members; i++)
+				add_argument(arg, var->value.array[i]);
+		}
+	}
+	return offset;
+}
+
+static int expand_value(const char *line, struct argument *arg)
+{
+	unsigned char c = *line;
+
+	switch (c) {
+	case '$':
+		return expand_variable(line, arg);
+	default:
+		return expand_word(line, arg);
+	}
+}
+
+static struct argument *expand_line(const char *line)
+{
+	struct argument *arg;
+
+	arg = xmalloc(sizeof(*arg));
+	memset(arg, 0, sizeof(*arg));
+	arg->type = var_array;
+	for (;;) {
+		int n;
+		while (isspace(*line)) {
+			line++;
+		}
+		if (!*line)
+			break;
+		n = expand_value(line, arg);
+		if (n <= 0)
+			break;
+		line += n;
+	}
+	return arg;
+}
+
+static void run_program(struct cmd_struct *cmd)
+{
+	while (cmd) {
+		int i;
+		const struct cmd_def *run = cmds+0;
+		struct argument *arg = NULL;
+		int cmdlen = strlen(cmd->line);
+
+		for (i = 1; i < sizeof(cmds)/sizeof(cmds[0]); i++) {
+			const struct cmd_def *def = cmds + i;
+			int len = def->len;
+			if (len > cmdlen)
+				continue;
+			if (len < cmdlen && !isspace(cmd->line[len]))
+				continue;
+			if (memcmp(cmd->line, def->n, len))
+				continue;
+			run = def;
+			arg = expand_line(cmd->line + len);
+			break;
+		}
+		cmd = run->exec(cmd, arg);
+	}
+}
+
+int execute(const char *name, char *buf, unsigned int size)
+{
+	struct parse_buf p;
+	struct cmd_struct *program;
+
+	p.name = name;
+	p.prog = buf;
+	p.size = size;
+	p.offset = 0;
+	p.line = 1;
+	p.error = "empty program";
+
+	program = parse(&p, -1);
+	if (!program || p.offset != p.size)
+		die("parse error at %s:%d: %s", p.name, p.line, p.error);
+
+	run_program(program);
+	return 0;
+}
diff --git a/script.c b/script.c
new file mode 100644
index 0000000..ae85598
--- /dev/null
+++ b/script.c
@@ -0,0 +1,58 @@
+/*
+ * Silly git script language
+ *
+ * Copyright (C) 2006, Linus Torvalds
+ */
+#include "cache.h"
+
+static const char script_usage[] = "git-script <scriptfile>";
+
+int main(int argc, char **argv)
+{
+	int fd;
+	char *buf;
+	const char *filename;
+	unsigned int size, alloc;
+
+	fd = 0;
+	switch (argc) {
+	case 1:
+		filename = "stdin";
+		fd = dup(0);
+		close(0);
+		open("/dev/null", O_RDONLY);
+		break;
+	case 2:
+		filename = argv[1];
+		fd = open(filename, O_RDONLY);
+		if (fd < 0)
+			die("unable to open '%s': %s", filename, strerror(errno));
+		break;
+	default:
+		usage(script_usage);
+	}
+
+	buf = NULL;
+	alloc = 0;
+	size = 0;
+	for (;;) {
+		int nr;
+		if (size >= alloc) {
+			alloc = (alloc * 3) / 2 + 8192;
+			buf = xrealloc(buf, alloc);
+		}
+		nr = read(fd, buf + size, alloc - size);
+		if (!nr)
+			break;
+		if (nr < 0) {
+			if (errno == EAGAIN || errno == EINTR)
+				continue;
+			die("script read failed (%s)", strerror(errno));
+		}
+		size += nr;
+	}
+	close(fd);
+
+	execute(filename, buf, size);
+	return 0;
+}

^ permalink raw reply related

* Re: [PATCH] New git-seek command with documentation and test.
From: H. Peter Anvin @ 2006-02-24  6:53 UTC (permalink / raw)
  To: linux; +Cc: cworth, git
In-Reply-To: <20060224002915.17331.qmail@science.horizon.com>

linux@horizon.com wrote:
> The annoying thing about temporary branch names like "bisect" and "seek"
> is that:
> a) They clutter up the nae space available to the repository user.
>    Users have to know that those are reserved names.
> b) If a repository is cloned while they're in use, they might get
>    into a "remotes" file, with even more confusing results.
> 
> This is somewhat heretical, but how about making a truly unnamed branch by
> having .git/HEAD *not* be a symlink, but rather hold a commit ID directly?
> It's already well established that files in the .git directory directly
> are strictly local to this working directory, so it seems a much better
> home for such temporary state.
> 

It might be easier to just reserve part of the namespace, e.g. ".bisect" 
and ".seek" instead.

	-hpa

^ permalink raw reply

* gitview: Fix DeprecationWarning
From: Aneesh Kumar @ 2006-02-24  8:32 UTC (permalink / raw)
  To: Git Mailing List, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 2 bytes --]



[-- Attachment #2: 0002-gitview-Fix-DeprecationWarning.txt --]
[-- Type: text/plain, Size: 743 bytes --]

Subject: gitview: Fix  DeprecationWarning

DeprecationWarning: integer argument expected, got float

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@gmail.com>

---

 contrib/gitview/gitview |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

1d34275e275b7450721033bf261ea29219934982
diff --git a/contrib/gitview/gitview b/contrib/gitview/gitview
index b04df74..4a6b448 100755
--- a/contrib/gitview/gitview
+++ b/contrib/gitview/gitview
@@ -154,7 +154,7 @@ class CellRendererGraph(gtk.GenericCellR
 
 		cols = self.node[0]
 		for start, end, colour in self.in_lines + self.out_lines:
-			cols = max(cols, start, end)
+			cols = int(max(cols, start, end))
 
 		(column, colour, names) = self.node
 		names_len = 0
-- 
1.2.3.g2cf3-dirty


^ permalink raw reply related

* gitview: Bump the rev
From: Aneesh Kumar @ 2006-02-24  8:38 UTC (permalink / raw)
  To: Git Mailing List, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 2 bytes --]



[-- Attachment #2: 0003-gitview-Bump-the-rev.txt --]
[-- Type: text/plain, Size: 610 bytes --]

Subject: gitview: Bump the rev
Make the 0.7 release
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@gmail.com>

---

 contrib/gitview/gitview |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

32e92e4586a4b48c2d4776bcb7ba885f5c710935
diff --git a/contrib/gitview/gitview b/contrib/gitview/gitview
index 4a6b448..02e2445 100755
--- a/contrib/gitview/gitview
+++ b/contrib/gitview/gitview
@@ -422,7 +422,7 @@ class DiffWindow:
 class GitView:
 	""" This is the main class
 	"""
-	version = "0.6"
+	version = "0.7"
 
 	def __init__(self, with_diff=0):
 		self.with_diff = with_diff
-- 
1.2.3.g2cf3-dirty


^ permalink raw reply related

* Re: [PATCH] diff-delta: produce optimal pack data
From: Junio C Hamano @ 2006-02-24  8:49 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0602212043260.5606@localhost.localdomain>

Nicolas Pitre <nico@cam.org> writes:

> Indexing based on adler32 has a match precision based on the block size 
> (currently 16).  Lowering the block size would produce smaller deltas 
> but the indexing memory and computing cost increases significantly.

Indeed.

I had this patch in my personal tree for a while.  I was
wondring why sometimes progress indication during "Deltifying"
stage stops for literally several seconds, or more.

In Linux 2.6 repository, these object pairs take forever to
delta.

        blob 9af06ba723df75fed49f7ccae5b6c9c34bc5115f -> 
        blob dfc9cd58dc065d17030d875d3fea6e7862ede143
        size (491102 -> 496045)
        58 seconds

        blob 4917ec509720a42846d513addc11cbd25e0e3c4f -> 
        blob dfc9cd58dc065d17030d875d3fea6e7862ede143
        size (495831 -> 496045)
        64 seconds

Admittedly, these are *BAD* input samples (a binary firmware
blob with many similar looking ", 0x" sequences).  I can see
that trying to reuse source materials really hard would take
significant computation.

However, this is simply unacceptable.

The new algoritm takes 58 seconds to produce 136000 bytes of
delta, while the old takes 0.25 seconds to produce 248899 (I am
using the test-delta program in git.git distribution).  The
compression ratio is significantly better, but this is unusable
even for offline archival use (remember, pack delta selection
needs to do window=10 such deltification trials to come up with
the best delta, so you are spending 10 minutes to save 100k from
one oddball blob), let alone on-the-fly pack generation for
network transfer.

Maybe we would want two implementation next to each other, and
internally see if it is taking too much cycles compared to the
input size then switch to cheaper version?

^ permalink raw reply

* Re: [PATCH] New git-seek command with documentation and test.
From: Andreas Ericsson @ 2006-02-24 10:00 UTC (permalink / raw)
  To: Carl Worth; +Cc: Junio C Hamano, git, Linus Torvalds
In-Reply-To: <87zmkhrf4y.wl%cworth@cworth.org>

Carl Worth wrote:
> Add git-seek which allows for temporary excursions through the
> revision history. With "git seek <revision>" one gets a working tree
> corresponding to <revision>. When done with the excursion "git seek"
> returns back to the original branch from where the first seek began.
> 

I've said it before, and I'll say it again. This tool provides less 
flexibility and much less power than "git checkout -b branch 
<commit-ish>" (although it would be nice to have '-o' for 'overwrite 
existing branch' as an argument to git checkout)

> Signed-off-by: Carl Worth <cworth@cworth.org>
> 
> ---
>  
>  I had planned to just let this drop as my original need was some
>  historical exploration that I've already finished. But now I've found
>  a common use case in my everyday workflow that could benefit from
>  git-seek. Here it is:
>  
>  I receive a bug-fix patch that updates a test case to demonstrate the
>  bug. I can apply both the fix and the test case and see it succeed.
>  But what I really want to do is first commit the test case, see it
>  fail, and only then commit the fix and see the test now succeed.  I'd
>  also like the history to reflect that order. So what I do is:
>  
>  	$ git-am
>  	$ git update-index test.c ; git commit -m "Update test"
>  	$ git update-index buggy.c ; git commit -m "Fix bug"
>  
>  At that point, without git-seek I can get by with:
>  
>  	$ git checkout -b tmp HEAD^
>  	$ make check # to see failure
>  	$ git checkout <branch_I_was_on_to_begin_with>
>  	$ git branch -d tmp # easy to forget, but breaks the next time otherwise
>  	$ make check # to see success
>  
>  But what I'd really like to do, (and can with the attached patch), is:
>  
>  	$ git seek HEAD^
>  	$ make check # to see failure
>  	$ git seek
>  	$ make check # to see success
>  
>  This avoids me having to:
> 1) invent a throwaway name,

All programmers have at least five throwaway names that are only ever 
used as such (mine are, in order of precedence, foo, bar, tmp, fnurg, 
sdf and asd).

> 2) remember the branch I started on,

With topic branches, you need to pick more careful topic names. Without 
topic branches you're always on "master". Surely you know what the 
patches touch, so you know what branch they should be in.

> 3) remember to actually throwaway the temporary branch.
> 

This isn't always a bad thing, since you after applying some patch or 
other decide you want to go back to this point in history, or want to 
keep the point so you can show the author some problem or other with the 
patch. With git-seek you'll then have to remember the hard-to-learn 
SHA1, or how far below HEAD or some other easily remembered point in 
history it is. In that case, you need to remember to add the 
branch/tag/whatever to where you seeked rather than just go on with the 
work. Removing a branch later is simple. Finding the right spot to 
create it later can be trouble-some.

If I had a vote, I'd say no to this patch, and to this tool entirely.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply

* Re: [PATCH] New git-seek command with documentation and test.
From: Andreas Ericsson @ 2006-02-24 10:11 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: linux, cworth, git
In-Reply-To: <43FEAD62.6050302@zytor.com>

H. Peter Anvin wrote:
> linux@horizon.com wrote:
> 
>> The annoying thing about temporary branch names like "bisect" and "seek"
>> is that:
>> a) They clutter up the nae space available to the repository user.
>>    Users have to know that those are reserved names.
>> b) If a repository is cloned while they're in use, they might get
>>    into a "remotes" file, with even more confusing results.
>>
>> This is somewhat heretical, but how about making a truly unnamed 
>> branch by
>> having .git/HEAD *not* be a symlink, but rather hold a commit ID 
>> directly?
>> It's already well established that files in the .git directory directly
>> are strictly local to this working directory, so it seems a much better
>> home for such temporary state.
>>
> 
> It might be easier to just reserve part of the namespace, e.g. ".bisect" 
> and ".seek" instead.
> 

Ach, no. Not specific names. ^\.-.* would be acceptable, but I sometimes 
use '.name' or '-name' to mark a temporary branch. Making .- or some 
such reserved would perhaps make sense, but not with specific names.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply

* Some more pack-objects tweaks
From: Junio C Hamano @ 2006-02-24 10:38 UTC (permalink / raw)
  To: git

I've been working more pack-objects improvements.  There will be
two tweaks in the "next" branch I'll be pushing out tonight.

 * rev-list reports full pathnames not just basenames for
   contained trees and blobs.  pack-objects hashes the incoming
   names (and names obtained from "negative" trees when
   --objects-edge aka "thin pack" is used) taking into account
   the dirname and basename part.

   Earlier, I had a patch that hashes the whole pathname, and
   found it perform worse than the original "hash just the
   basename" approach, so I never published it.  The idea in
   this round is to give "Makefile" and "t/Makefile" a different
   but close hash values.  Type-size sort groups "Makefile"s
   from different revs together, and another group of bunch of
   "t/Makefile"s are found close by.

 * when creating "thin" pack, disable the code to avoid too
   long a delta chain to be made due to reused delta (see
   15b4d57 and ab7cd7b commit log for details).

   This is because limiting delta chain is more costly than let
   it grow by using preexisting delta, and "thin" pack is usable
   by first exploding it, so at that point delta depth does not
   matter.

In Linux 2.6 repository, I've created a thin pack between
v2.6.14..v2.6.15-rc1 (36k objects).  Here are the results:

    [without either patch]
    15463034 bytes
    Total 36248, written 36248 (delta 29046), reused 28306 (delta 22512)
    real    1m38.157s       user    1m32.520s       sys     0m5.440s

    [with full names]
    11138621 bytes
    Total 36248, written 36248 (delta 30368), reused 27918 (delta 22512)
    real    1m36.254s       user    1m28.650s       sys     0m5.470s

    [with full names, and allowing deeper delta]
    9971223 bytes
    Total 36248, written 36248 (delta 30868), reused 27429 (delta 22512)
    real    1m36.923s       user    1m29.770s       sys     0m5.470s

All of these tests were done with the last patch in Nico's delta
enhancement series reverted, because the dataset used in this
test triggers a corner case performance disaster in it (I've
sent a message separately).

^ permalink raw reply

* Re: gitview: Use monospace font to draw the branch and tag name
From: Junio C Hamano @ 2006-02-24 11:17 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: git
In-Reply-To: <43FC8BF2.60205@gmail.com>

"Aneesh Kumar K.V" <aneesh.kumar@gmail.com> writes:

> This patch address the below:
> Use monospace font to draw branch and tag name
> set the font size to 13.

I have an impression that hardcoding UI policy like this is
generally frowned upon.

^ permalink raw reply

* Re: FYI: git-am allows creation of empty commits.
From: Eric W. Biederman @ 2006-02-24 11:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7v1wxtgv02.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano <junkio@cox.net> writes:

> ebiederm@xmission.com (Eric W. Biederman) writes:
>
>> After fixing it up and doing all of my edits I occasionally forget
>> the git-update-index step, before calling git-am --resolved.  This
>> proceeds along it's merry way and creates an empty commit.
>
> Certainly a safty measure is missing here.  Thanks for
> noticing.  How about something like this?

Just tested it the patch works, at least in my simple contrived
example :)

Is this something that we always want to test for in the porcelain
or do we want to move a check into git-commit-tree?

For getting a reasonable error message where you have the test
seems to be the only sane place, but having the check deeper
down would be more likely to catch this kind of thing.

Eric

^ permalink raw reply

* Re: [PATCH] New git-seek command with documentation and test.
From: Junio C Hamano @ 2006-02-24 11:38 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Carl Worth, git, Linus Torvalds
In-Reply-To: <43FED93D.1000601@op5.se>

Andreas Ericsson <ae@op5.se> writes:

> I've said it before, and I'll say it again. This tool provides less
> flexibility and much less power than "git checkout -b branch
> <commit-ish>" (although it would be nice to have '-o' for 'overwrite
> existing branch' as an argument to git checkout)

True, but assembly provides more flexibility than higher level
languages and you need to strike a balance between power and
usability.

The real question is if the structure the tool enforces to your
workflow is simply being a straight-jacket or helping an average
user to avoid common mistakes.

One occasion I've felt the need for "seek" like feature was when
starting to bisect.  You usually notice breakage, so you can
start with "git bisect bad HEAD", but then what next?  You
usually are not absolutely sure which one _was_ working the last
time.

If I had a seek, then I could go back to some randomly chosen
version to try it out, going back until I find a good one.

Maybe "git bisect try $committish" would be a good addition.  We
could live without it (we can just say "git reset --hard
$committish"), but it can be a bit more than just that.  If
given committish is known to be good or bad, we could remind the
user what she said the last time, and offer a chance to take it
back.  That is, (1) if the given $committish is an ancestor of
existing good one, list those good ones and ask "do you mean you
are not sure if they are good anymore, and retry the
bisection?"  If yes, delete those good-* refs; (2) if the given
$committish is a descendant of a bad one, show it and ask "do
you mean you are not sure if they are good anymore, and retry
the bisection?"  If yes, remove the existing bad ref.  In any
case, "reset --hard" to it after user responds.

Other than that, I haven't felt a need for seek-like feature;
instead, I make liberal use of throw-away branches.

^ permalink raw reply

* Re: [PATCH] fmt-merge-msg: avoid open "-|" list form for Perl 5.6
From: Eric Wong @ 2006-02-24 12:02 UTC (permalink / raw)
  To: Alex Riesen; +Cc: Sam Vilain, Junio C Hamano, Johannes Schindelin, git
In-Reply-To: <81b0412b0602220835p4c4243edm145ee827eb706121@mail.gmail.com>

Alex Riesen <raa.lkml@gmail.com> wrote:
> On 2/21/06, Sam Vilain <sam@vilain.net> wrote:
> > Alex Riesen wrote:
> > >>>Does not work here (ActiveState Build 811, Perl 5.8.6):
> > >>>$ perl -e 'open(F, "-|")'
> > >>>'-' is not recognized as an internal or external command,
> > >>>operable program or batch file.
> > >>Portability, Ease of Coding, Few CPAN Module Dependencies.  Pick any two.
> > > Sometimes an upgrade is just out of question. Besides, that'd mean an
> > > upgrade to another operating system, because very important scripts
> > > over here a just not portable to anything else but
> > >     "ActiveState Perl on Windows (TM)"
> > > I just have no choice.
> >
> > Sure, but perhaps IPC::Open2 or some other CPAN module has solved this
> > problem already.
> 
> IPC::Open2 works! Well "kind of": there are still strange segfaults regarding
> stack sometimes. And I don't know yet whether and how the arguments are escaped
> (Windows has no argument array. It has that bloody stupid one-line command line)

It seems that ActiveState has more problems with pipes than it does with fork.
If it supports redirects reasonably well, this avoids pipes entirely and
may be more stable as a result (but possibly slower):

# IO::File is standard in Perl 5.x, new_tmpfile
# returns an open filehandle to an already unlinked file

use IO::File;
my $out = IO::File->new_tmpfile;
file
my $pid = fork;
defined $pid or die $!;
if (!$pid) {
	# redirects STDOUT to $out file
	open STDOUT, '>&', $out or die $!;
	exec('foo','bar');
}
waitpid $pid, 0;
seek $out, 0, 0;
while (<$out>) {
	...
}

Writing and reading from a tempfile are very fast for me in Linux, and probably
not much slower than pipes.  Of course I'm still assuming file descriptors stay
shared after a 'fork', which may be asking too much on Windows.  Using something
from File::Temp to get a temp filename would still work.

-- 
Eric Wong

^ permalink raw reply

* Re: FYI: git-am allows creation of empty commits.
From: Junio C Hamano @ 2006-02-24 12:04 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: git
In-Reply-To: <m18xs1dmp3.fsf@ebiederm.dsl.xmission.com>

ebiederm@xmission.com (Eric W. Biederman) writes:

> Is this something that we always want to test for in the porcelain
> or do we want to move a check into git-commit-tree?
>
> For getting a reasonable error message where you have the test
> seems to be the only sane place, but having the check deeper
> down would be more likely to catch this kind of thing.

I think 99.9% of the time it is a mistake if a single-parented
commit has the same tree as its parent commit has, so having a
check in commit-tree may not be a bad idea.

If you want to do it, however, you need to be a bit careful
about merges.  In a multi-parented commit, it is legitimate if
the merge result exactly match one of the parents (in fact
"git-merge -s ours" can be used to create a merge that matches
its first parent).

The commit-tree program is one of the oldest part of the system,
and as a mechanism, does not set nor enforce policies like this.
It has been more liberal than the best current practice, such as
(1) don't do single-parent empty commit; (2) don't list the same
parent twice.  The second one was introduced in mid-June last
year, which is "long after it was invented" in git timescale.

On the other hand, it is more anal than it could be.  It does
not take a tag that points to a commit to its -p parameter.
That's because we did not have tag objects in the beginning, and
by the time tags were introduced, nobody ran commit-tree by hand
anymore.

-- >8 --
[PATCH] commit-tree: disallow an empty single-parent commit.

Also allow "git-commit-tree -p v2.6.16-rc2", instead of having
to say "git-commit-tree -p v2.6.16-rc2^0".

Signed-off-by: Junio C Hamano <junkio@cox.net>
---
diff --git a/commit-tree.c b/commit-tree.c
index 88871b0..14a7552 100644
--- a/commit-tree.c
+++ b/commit-tree.c
@@ -45,12 +45,14 @@ static void check_valid(unsigned char *s
 {
 	void *buf;
 	char type[20];
+	unsigned char real_sha1[20];
 	unsigned long size;
 
-	buf = read_sha1_file(sha1, type, &size);
-	if (!buf || strcmp(type, expect))
+	buf = read_object_with_reference(sha1, expect, &size, real_sha1);
+	if (!buf)
 		die("%s is not a valid '%s' object", sha1_to_hex(sha1), expect);
 	free(buf);
+	memcpy(sha1, real_sha1, 20);
 }
 
 /*
@@ -75,6 +77,19 @@ static int new_parent(int idx)
 	return 1;
 }
 
+static int check_empty_commit(const unsigned char *tree_sha1,
+			      const unsigned char *parent_sha1)
+{
+	unsigned char sha1[20];
+	char refit[50];
+	sprintf(refit, "%s^{tree}", sha1_to_hex(parent_sha1));
+	if (get_sha1(refit, sha1))
+		die("cannot determine tree in parent commit.");
+	if (!memcmp(sha1, tree_sha1, 20))
+		return error ("empty commit?  aborting.");
+	return 0;
+}
+
 int main(int argc, char **argv)
 {
 	int i;
@@ -105,6 +120,9 @@ int main(int argc, char **argv)
 	}
 	if (!parents)
 		fprintf(stderr, "Committing initial tree %s\n", argv[1]);
+	if (parents == 1)
+		if (check_empty_commit(tree_sha1, parent_sha1[0]))
+			exit(1);
 
 	init_buffer(&buffer, &size);
 	add_buffer(&buffer, &size, "tree %s\n", sha1_to_hex(tree_sha1));

^ permalink raw reply related

* [PATCH] commit-tree: disallow an empty single-parent commit.
From: Junio C Hamano @ 2006-02-24 12:20 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: git
In-Reply-To: <7vy8019d44.fsf@assigned-by-dhcp.cox.net>

Allow "git-commit-tree v2.6.15^{tree} -p HEAD", instead of
requiring "git-commit-tree `git rev-parse v2.6.15^{tree}` -p
HEAD".  The parent parameter that follows -p uses get_sha1() to
understand the extended notation, and there is little reason not
to allow it for the tree object name parameter.

Also make the check_valid() function simpler.  This function
which predates sha1_object_info() so it had to do things by hand
but there is no reason to read the data just to discard anymore.

Signed-off-by: Junio C Hamano <junkio@cox.net>

---

 * Replacement patch.  Very lightly tested.

 commit-tree.c |   27 ++++++++++++++++++++-------
 1 files changed, 20 insertions(+), 7 deletions(-)

8f97300b2580b43905ce11eef21e77c5e8e809a6
diff --git a/commit-tree.c b/commit-tree.c
index 88871b0..b1d816c 100644
--- a/commit-tree.c
+++ b/commit-tree.c
@@ -43,14 +43,11 @@ static void add_buffer(char **bufp, unsi
 
 static void check_valid(unsigned char *sha1, const char *expect)
 {
-	void *buf;
 	char type[20];
-	unsigned long size;
 
-	buf = read_sha1_file(sha1, type, &size);
-	if (!buf || strcmp(type, expect))
-		die("%s is not a valid '%s' object", sha1_to_hex(sha1), expect);
-	free(buf);
+	if (sha1_object_info(sha1, type, NULL) || strcmp(type, expect))
+		die("%s is not a valid '%s' object",
+		    sha1_to_hex(sha1), expect);
 }
 
 /*
@@ -75,6 +72,19 @@ static int new_parent(int idx)
 	return 1;
 }
 
+static int check_empty_commit(const unsigned char *tree_sha1,
+			      const unsigned char *parent_sha1)
+{
+	unsigned char sha1[20];
+	char refit[50];
+	sprintf(refit, "%s^{tree}", sha1_to_hex(parent_sha1));
+	if (get_sha1(refit, sha1))
+		die("cannot determine tree in parent commit.");
+	if (!memcmp(sha1, tree_sha1, 20))
+		return error ("empty commit?  aborting.");
+	return 0;
+}
+
 int main(int argc, char **argv)
 {
 	int i;
@@ -90,7 +100,7 @@ int main(int argc, char **argv)
 
 	git_config(git_default_config);
 
-	if (argc < 2 || get_sha1_hex(argv[1], tree_sha1) < 0)
+	if (argc < 2 || get_sha1(argv[1], tree_sha1) < 0)
 		usage(commit_tree_usage);
 
 	check_valid(tree_sha1, "tree");
@@ -105,6 +115,9 @@ int main(int argc, char **argv)
 	}
 	if (!parents)
 		fprintf(stderr, "Committing initial tree %s\n", argv[1]);
+	if (parents == 1)
+		if (check_empty_commit(tree_sha1, parent_sha1[0]))
+			exit(1);
 
 	init_buffer(&buffer, &size);
 	add_buffer(&buffer, &size, "tree %s\n", sha1_to_hex(tree_sha1));
-- 
1.2.3.g7465

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox