Git development

* Re: [RFC] Another way to provide help details. (was Re: [PATCH] Add help details to git help command.)
From: Steven Cole @ 2005-04-19 16:03 UTC (permalink / raw)
  To: David Greaves; +Cc: Petr Baudis, git
In-Reply-To: <4265189E.6090801@dgreaves.com>

David Greaves wrote:
> Petr Baudis wrote:
> 
>> Dear diary, on Tue, Apr 19, 2005 at 03:40:54AM CEST, I got a letter
>> where Steven Cole <elenstev@mesatop.com> told me that...
>>
>>> Here is perhaps a better way to provide detailed help for each
>>> git command.  A command.help file for each command can be
>>> written in the style of a man page.
>>
>>
>>
>> I don't like it. I think the 'help' command should serve primarily as a
>> quick reference, which does not blend so well with a manual page - it's
>> too long and too convoluted by repeated output.
>>
>> I'd just print the top comment from each file. :-)
>>
> 
> On the other hand, having more complete docs seems like an excellent 
> idea (and other threads support that)
> I'd certainly like to see more specification oriented documentation...
> (even if it turns out to be disposable)
> 
> Steven, if you carry on sending more verbose docs I'll certainly read 
> and work with you on editing them...

I only did those first two as a straw man.  Doing the others is a couple
hours (or less) work, but I don't want to do it if folks don't want it.

Having the help files separate has advantages/disadvantages.

> 
> Nb kernel-doc doesn't seem appropriate for user level docs.
> maybe, whilst there's so much flux, have:
>   git man command
> that just outputs text
> 
> If Petr wants the top comment to be extracted by help then maybe a 
> bottom comment block could contain the more complete text?
> I *really* think that the user docs should live in the source for now 
> (hence I think that git man is better than going straight to man/docbook).
> 
> I wasn't sure whether to perlise the code or do a shell-lib - but 
> looking at the algorithms needed in things like git status I reckon the 
> shell will end up becoming a hackish mess of awk/sed/tr/sort/uniq/pipe 
> (ie perl) anyway.
> 
> So I'm going to have a go at that - Petr, if you have a minute could you 
> send me, off list, a bit of perl code that epitomises the style you like?
> 
> David
> 

Funny you should mention Perl.  Here is a small bit of code:

[steven@spc0 git-pasky-testing]$ cat print_help_header.pl
#!/usr/bin/perl
# reads from stdin   writes to stdout  no error checking
<STDIN>;<STDIN>;
while (substr( $line=<STDIN>, 0, 1) eq "#") {
                  print $line;
}

[steven@spc0 git-pasky-testing]$ ./print_help_header.pl <gitdiff.sh | grep ^# | grep -v "(c)" | cut -c 3-
Make a diff between two GIT trees.

By default compares the current working tree to the state at the
last commit. You can specify -r rev1:rev2 or -r rev1 -r rev2 to
tell it to make a diff between the specified revisions. If you
do not specify a revision, the current working tree is implied
(note that no revision is different from empty revision - -r rev:
compares between rev and HEAD, while -r rev compares between rev
and working tree).

-p instead of one ID denotes a parent commit to the specified ID
(which must not be a tree, obviously).

Outputs a diff converting the first tree to the second one.
-------end of output from perl plus grep and cut.

Without the perl, extra comments came out (plus the dreaded first blank line).

[steven@spc0 git-pasky-testing]$ cat gitdiff.sh | grep -v "/bin" | grep ^# | grep -v "(c)" | cut -c 3-

Make a diff between two GIT trees.

By default compares the current working tree to the state at the
last commit. You can specify -r rev1:rev2 or -r rev1 -r rev2 to
tell it to make a diff between the specified revisions. If you
do not specify a revision, the current working tree is implied
(note that no revision is different from empty revision - -r rev:
compares between rev and HEAD, while -r rev compares between rev
and working tree).

-p instead of one ID denotes a parent commit to the specified ID
(which must not be a tree, obviously).

Outputs a diff converting the first tree to the second one.
FIXME: The commandline parsing is awful.
-------end of output from grep and cut.
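
A minimal sh/awk sketch of the same header extraction, for comparison with
the Perl-plus-grep pipeline above. The function name print_help_header and
the rule "stop at the first non-comment line" are assumptions made for
illustration; they are not part of the posted code.

```shell
# print_help_header: print the leading comment block of a script read on stdin.
# Assumption: the help text is the run of "#" lines at the top of the file.
print_help_header() {
    awk '
        NR == 1 && /^#!/ { next }      # skip the interpreter line
        !/^#/            { exit }      # the header ends at the first non-comment line
        /\(c\)/          { next }      # drop copyright lines
        { sub(/^# ?/, ""); lines[++n] = $0 }
        END {
            first = 1
            while (first <= n && lines[first] == "") first++   # trim leading blank lines
            for (i = first; i <= n; i++) print lines[i]
        }
    '
}
```

Something like "print_help_header <gitdiff.sh" should then give the help text
without the leading blank line, and without any comment that follows the
first non-comment line.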

David,

I'm a bit pressed for time, so if you or anyone else would like to
use this code to fix up my earlier patch, you're welcome to it.
Otherwise, it will be later this evening or tomorrow before I can
do any more with this.

Steven

* Re: [darcs-devel] Darcs and git: plan of action
From: Tupshin Harper @ 2005-04-19 16:33 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: David Roundy, darcs-devel, Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0504190749030.19286@ppc970.osdl.org>

Linus Torvalds wrote:

>(In other words: if it looks like something a careful human _could_ have
>written, it's certainly ok. But if it looks like something a careful human
>would have used a script to generate 40 entries of, it's bad).
>
>		Linus
>  
>
This is the way that darcs would currently represent a "darcs replace 
foo bar" on 15 files, which is obviously exactly what you are objecting to:
[global foo to bar
tupshin@tupshin.com**20050419155539] {
replace ./dir1/file1 [A-Za-z_0-9] foo bar
replace ./dir1/file2 [A-Za-z_0-9] foo bar
replace ./dir1/file3 [A-Za-z_0-9] foo bar
replace ./dir1/file4 [A-Za-z_0-9] foo bar
replace ./dir1/file5 [A-Za-z_0-9] foo bar
replace ./dir2/file1 [A-Za-z_0-9] foo bar
replace ./dir2/file2 [A-Za-z_0-9] foo bar
replace ./dir2/file3 [A-Za-z_0-9] foo bar
replace ./dir2/file4 [A-Za-z_0-9] foo bar
replace ./dir2/file5 [A-Za-z_0-9] foo bar
replace ./dir3/file1 [A-Za-z_0-9] foo bar
replace ./dir3/file2 [A-Za-z_0-9] foo bar
replace ./dir3/file3 [A-Za-z_0-9] foo bar
replace ./dir3/file4 [A-Za-z_0-9] foo bar
replace ./dir3/file5 [A-Za-z_0-9] foo bar
}

I see two possible complementary ways to address this:
1) allow something akin to the above form in git free-form comments as a 
*technical* solution, while leaving it up to the individual repository 
owner whether to accept such patches on aesthetic grounds.
2) explore adding a different format to darcs that would allow the files 
affected to be represented more compactly.

I suspect that any use of wildcards in a new format would be impossible 
for darcs since it wouldn't allow darcs to construct dependencies, 
though I'll leave it to David to respond to that.

At a minimum, something like:
replace ./dir1/[file1|file2|file3|file4|file5] [A-Za-z_0-9] foo bar
replace ./dir2/[file1|file2|file3|file4|file5] [A-Za-z_0-9] foo bar
replace ./dir3/[file1|file2|file3|file4|file5] [A-Za-z_0-9] foo bar
should be pretty feasible.
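
To make the compact form concrete, here is a rough awk sketch that folds
per-file replace lines into one line per directory. The bracketed [a|b]
syntax is only the hypothetical format proposed above, not something darcs
is known to accept, and the function name compact_replaces is made up.
Paths containing spaces are not handled.

```shell
# compact_replaces: read "replace PATH CLASS OLD NEW" lines on stdin and print
# one line per (directory, CLASS, OLD, NEW) group in the proposed compact form.
compact_replaces() {
    awk '
        $1 == "replace" && NF == 5 {
            file = $2
            sub(/.*\//, "", file)                   # name after the last "/"
            dir = substr($2, 1, length($2) - length(file))
            key = dir SUBSEP $3 " " $4 " " $5
            if (key in names) {
                names[key] = names[key] "|" file
            } else {
                names[key] = file
                order[++n] = key                    # remember first-seen order
            }
        }
        END {
            for (i = 1; i <= n; i++) {
                split(order[i], part, SUBSEP)
                print "replace " part[1] "[" names[order[i]] "] " part[2]
            }
        }
    '
}
```

Lines that are not five-field replace lines (the patch header and the braces)
are ignored by this sketch, so it would have to be fed only the patch body.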

I don't believe, however, that it would ever be 100% reliable to try to 
look at a one line replace description and combine it with the actual 
changes and end up with a correct darcs replace patch.

-Tupshin

* Re: [darcs-devel] Darcs and git: plan of action
From: Linus Torvalds @ 2005-04-19 16:49 UTC (permalink / raw)
  To: Tupshin Harper; +Cc: David Roundy, darcs-devel, Git Mailing List
In-Reply-To: <426532D5.3040306@tupshin.com>

On Tue, 19 Apr 2005, Tupshin Harper wrote:
> 
> I suspect that any use of wildcards in a new format would be impossible 
> for darcs since it wouldn't allow darcs to construct dependencies, 
> though I'll leave it to David to respond to that.

Note that git _does_ very efficiently (and I mean _very_) expose the 
changed files.

So if this kind of darcs patch is always the same pattern just repeated
over <n> files, then you really don't need to even list the files at all.  
Git gives you a very efficient file listing by just doing a "diff-tree"  
(which does not diff the _contents_ - it really just gives you a pretty
much zero-cost "which files changed" listing).

So that combination would be 100% reliable _if_ you always split up darcs 
patches to "common elements". 

And note that there does not have to be a 1:1 relationship between a git
commit and a darcs patch. For example, say that you have a darcs patch
that does a combination of "change token x to token y in 100 files" and
"rename file a into b". I don't know if you do those kind of "combination 
patches" at all, but if you do, why not just split them up into two? That 
way the list of files changed _does_ 100% determine the list of files for 
the token exchange.
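
As a rough illustration of that split: if a commit carries only the one-line
description of the token change, the per-file replace lines can be
regenerated from the list of changed files. The helper name expand_replace
is hypothetical, and the file list is assumed to arrive on stdin, one path
per line. With current git, "git diff-tree -r --name-only --no-commit-id
<commit>" produces such a list; the 2005 diff-tree output has a different
format and would need a filter first.

```shell
# expand_replace CLASS OLD NEW: read changed paths on stdin and emit one
# darcs-style "replace" line per path.
expand_replace() {
    while IFS= read -r path; do
        [ -n "$path" ] || continue               # ignore empty lines
        printf 'replace ./%s %s %s %s\n' "$path" "$1" "$2" "$3"
    done
}
```

This only works when every changed file belongs to the token exchange, which
is exactly why a combined rename-plus-replace patch would have to be split
into two commits first.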

		Linus

* [PATCH] write-tree performance problems
From: Chris Mason @ 2005-04-19 16:50 UTC (permalink / raw)
  To: Linus Torvalds, git

[-- Attachment #1: Type: text/plain, Size: 1654 bytes --]

Hello everyone,

I did a quick experiment with applying/committing 100 patches from the suse kernel 
into a kernel git tree, which quilt can do in 2 seconds.  git needs 1m5s.

The primary performance problem during each commit is write-tree recalculating 
the hash of each directory, even though the contents of most directories are 
not changing.  I've attached a very quick and dirty mod of write-tree.c; it 
takes an optional tree id (sha1) and a list of directories.  The hashes of any 
directories not in the list are read in from existing files instead of being 
recalculated.

You have to pass each sub dir with a modified file.  So, if you change 
fs/super.c and fs/ext3/super.c, you would call "write-tree sha1 fs fs/ext3".
With this patch, the time to apply 100 commits goes down to 22 seconds.  It 
could be faster (and easier to use) if the index stored the hash of trees 
instead of just blobs, but that would be a larger change.
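
A sketch of how a wrapper might build that directory list from the changed
paths. The helper name dirs_for_write_tree is hypothetical; it assumes one
path per line on stdin, and that files in the top-level directory need no
entry because the root tree is always rewritten (an assumption drawn from
reading the attached patch, not something the patch states).

```shell
# dirs_for_write_tree: read changed file paths on stdin, print the unique
# directories that contain them, one per line.
dirs_for_write_tree() {
    while IFS= read -r path; do
        case $path in
            */*) printf '%s\n' "${path%/*}" ;;   # keep everything before the last "/"
        esac                                     # top-level files add no entry
    done | sort -u
}

# Possible use with the patched tool (old_tree and changed-files are placeholders):
#   write-tree "$old_tree" $(dirs_for_write_tree <changed-files)
```

The patched write-tree insists on at least one directory after the sha1, so a
wrapper would have to fall back to plain write-tree when the list comes out
empty.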

I was able to get the commit time down to 13 seconds by changing read-tree.c, 
update-cache.c and read-cache.c to store/read the index in tmpfs instead of 
on the local filesystem.  I haven't attached the patch for that, but it seems 
easiest to move .git/index into .git/index_dir/index, and let the user decide 
where to put index_dir.

Quick speed summary, apply/commit 100 patches:
quilt push -a                 :   2s
git (unmodified)              : 1m5s
git (tree hash reduction)     :  22s
git (tree hash, tmpfs index)  :  13s

This patch is against pasky's tree from this morning, but also applies to 
linus' tree.  It's nasty stuff, but will hopefully get some discussion 
started on speeding things up.

-chris

[-- Attachment #2: fast-dirs.diff --]
[-- Type: text/x-diff, Size: 4526 bytes --]

--- a/write-tree.c
+++ b/write-tree.c
@@ -4,6 +4,8 @@
  * Copyright (C) Linus Torvalds, 2005
  */
 #include "cache.h"
+static char **dirs;
+static int num_dirs = 0;
 
 static int check_valid_sha1(unsigned char *sha1)
 {
@@ -27,15 +29,47 @@ static int prepend_integer(char *buffer,
 	return i;
 }
 
+static int find_sha(char *buffer, int size, const char *base, int baselen, char *returnsha1)
+{
+	while(size) {
+		int len = strlen(buffer)+1;
+		unsigned char *sha1 = buffer + len;
+		char *path = strchr(buffer, ' ')+1;
+		unsigned int mode;
+
+		if (size < len + 20 || sscanf(buffer, "%o", &mode) != 1)
+			die("corrupt 'tree' file");
+		buffer = sha1 + 20;
+		size -= len + 20;
+		if (strncmp(path, base, baselen) == 0 &&
+		    strlen(path) == baselen) {
+			memcpy(returnsha1, sha1, 20);
+			return 0;
+		}
+	}
+	return -1;
+}
+
 #define ORIG_OFFSET (40)	/* Enough space to add the header of "tree <size>\0" */
 
-static int write_tree(struct cache_entry **cachep, int maxentries, const char *base, int baselen, unsigned char *returnsha1)
+static int write_tree(struct cache_entry **cachep, int maxentries, const char *base, int baselen, unsigned char *returnsha1, char *treesha)
 {
 	unsigned char subdir_sha1[20];
 	unsigned long size, offset;
 	char *buffer;
 	int i, nr;
-
+	char *tree = NULL;
+	unsigned long tree_size;
+	char type[20];
+	if (treesha) {
+		tree = read_sha1_file(treesha, type, &tree_size);
+		if (!tree) {
+			die("unable to read sha1 file");
+		} else {
+		}
+		if (strcmp(type, "tree"))
+			die("expected a tree node");
+	}
 	/* Guess at some random initial size */
 	size = 8192;
 	buffer = malloc(size);
@@ -55,15 +89,60 @@ static int write_tree(struct cache_entry
 
 		sha1 = ce->sha1;
 		mode = ntohl(ce->ce_mode);
-
 		/* Do we have _further_ subdirectories? */
 		filename = pathname + baselen;
 		dirname = strchr(filename, '/');
 		if (dirname) {
 			int subdir_written;
-
-			subdir_written = write_tree(cachep + nr, maxentries - nr, pathname, dirname-pathname+1, subdir_sha1);
-			nr += subdir_written;
+			int dirlen = dirname - pathname;
+			int dirmatch = 1;
+			if (tree && num_dirs > 0) {
+				dirmatch = 0;
+				for(i = 0 ; i < num_dirs; i++) {
+					int len = strlen(dirs[i]);
+					if (len <= baselen)
+						continue;
+					if (memcmp(dirs[i], pathname, dirlen)==0 &&
+					    pathname[dirlen] == '/') {
+						dirmatch = 1;
+						break;
+					}
+				}
+				if (!dirmatch && find_sha(tree, tree_size, 
+							 filename, 
+							 dirname-filename, 
+							 subdir_sha1)) {
+					dirmatch = 1;
+				}
+			}
+			if (!dirmatch) {
+				/* eat all the entries in this dir */
+				while(++nr < maxentries) {
+					char *p;
+					ce = cachep[nr];
+					p = strchr(ce->name + baselen, '/');
+					if (!p)
+						break;
+					if (p - ce->name != dirname-pathname)
+						break;
+					if (memcmp(pathname, ce->name, p-ce->name))
+						break;
+				}
+			} else {
+				unsigned char thisdir_sha1[20];
+				char *p = thisdir_sha1;
+				if (num_dirs && tree) {
+				    if (find_sha(tree, tree_size, filename, 
+				                 dirname-filename, p)) {
+				    	num_dirs = 0;
+					p = NULL;
+				    }
+				} else {
+					p = NULL;
+				}
+				subdir_written = write_tree(cachep + nr, maxentries - nr, pathname, dirname-pathname+1, subdir_sha1, p);
+				nr += subdir_written;
+			}
 
 			/* Now we need to write out the directory entry into this tree.. */
 			mode = S_IFDIR;
@@ -92,9 +172,10 @@ static int write_tree(struct cache_entry
 	i = prepend_integer(buffer, offset - ORIG_OFFSET, ORIG_OFFSET);
 	i -= 5;
 	memcpy(buffer+i, "tree ", 5);
-
 	write_sha1_file(buffer + i, offset - i, returnsha1);
 	free(buffer);
+	if (tree)
+		free(tree);
 	return nr;
 }
 
@@ -103,7 +184,19 @@ int main(int argc, char **argv)
 	int i, unmerged;
 	int entries = read_cache();
 	unsigned char sha1[20];
+	unsigned char treesha1[20];
+	char *p = NULL;
 
+	if (argc > 1) {
+		if (argc < 3)
+			die("usage: write-tree [sha1 dir1 dir2 ...]");
+		num_dirs = argc - 2;
+		dirs = argv + 2;
+		if (get_sha1_hex(argv[1], treesha1) < 0)
+			die("bad sha1 given");
+		p = treesha1;
+
+	}
 	if (entries <= 0)
 		die("write-tree: no cache contents to write");
 
@@ -123,7 +216,7 @@ int main(int argc, char **argv)
 		die("write-tree: not able to write tree");
 
 	/* Ok, write it out */
-	if (write_tree(active_cache, entries, "", 0, sha1) != entries)
+	if (write_tree(active_cache, entries, "", 0, sha1, p) != entries)
 		die("write-tree: internal error");
 	printf("%s\n", sha1_to_hex(sha1));
 	return 0;

* Re: GIT Web Interface
From: Greg KH @ 2005-04-19 16:52 UTC (permalink / raw)
  To: Kay Sievers; +Cc: Petr Baudis, git
In-Reply-To: <1113926385.29953.7.camel@localhost.localdomain>

On Tue, Apr 19, 2005 at 05:59:45PM +0200, Kay Sievers wrote:
> On Tue, 2005-04-19 at 02:52 +0200, Petr Baudis wrote:
> > Dear diary, on Tue, Apr 19, 2005 at 02:44:15AM CEST, I got a letter
> > where Kay Sievers <kay.sievers@vrfy.org> told me that...
> > > I'm hacking on a simple web interface, cause I missed the bkweb too much.
> > > It can't do much more than browse through the source tree and show the
> > > log now, but that should change... :)
> > >   http://ehlo.org/~kay/gitweb.pl?project=linux-2.6
> > 
> > Hmm, looks nice for a start. (But you have obsolete git-pasky tree there! ;-)
> 
> Yeah, it's fresh now. :)
> 
> > > How can I get the files touched with a changeset and the corresponding
> > > diffs belonging to it?
> > 
> > diff-tree to get the list of files, you can do the corresponding diffs
> > e.g. by doing git diff -r tree1:tree2. Preferably make a patch for it
> > first to make it possible to diff individual files this way.
> 
> Ah, nice! Got it working.

Looks good, care to post the updated version?

thanks,

greg k-h

* Re: missing: git api, reference, user manual and mission statement
From: Petr Baudis @ 2005-04-19 16:58 UTC (permalink / raw)
  To: Klaus Robert Suetterlin; +Cc: git
In-Reply-To: <20050419123631.GD3739@xdt04.mpe-garching.mpg.de>

Dear diary, on Tue, Apr 19, 2005 at 02:36:32PM CEST, I got a letter
where Klaus Robert Suetterlin <robert@mpe.mpg.de> told me that...
> 1) There is no clear (e.g. by name) distinction between ``git as done
> by Linus'', which is a kind of content addressable database with added
> semantics, and ``git as done by the rest of You'', which is a kind of
> SCM on top of Linus's stuff.

There is git and git-pasky (git-pasky is superset; therefore various
patches floating around either get to git-pasky or to both). I'm not
sure what else you mean.

> 2) For Linus's stuff I dare to say that it is an evil hack from
> hell.  A prototype come alive.  This is not meant as an insult;  I
> guess Linus agrees.

I don't think it's evil at all. Why should it be?

> I do think there should be a well defined API or UI so that the
> backend could be replaced / changed / improved as need dictates.

It's stabilizing. Mind you, it's 2 weeks old.

> 3) As of the gitSCM stuff, I really miss any kind of description
> how it works.  That is it completely lacks any concept, except for
> ``we will use gitLinus as backend''.

Have you read the README? If you have any questions, go ahead and ask.
_Write_ the description if you miss it.

> 4) Concerning usability on systems other than Linux...  I guess
> this one can be ignored by most.
> 
> The source still uses st->st_mtim.tv_nsec which should be ->st_mtimensec, I guess.

Patches welcome.

> git is implemented as mostly sh shell scripts.
> gitdiff-do and gitlog.sh rely on bash, more precisely on /bin/bash.
> git pull uses rsync
> ...
> 
> The list of dependencies is long and growing.  So if the intent of
> doing gitSCM with shell scripts was to make it portable: that goal was missed.

I think the way to go now that it's working and we are to add some sweet
cream on it is to rewrite it in Perl. I have some parts in progress
already.

> 5) gitLinus as library.
> 
> First I have to say that between what I saw in git-0.04 and the
> current stuff from git-pasky there has been quite a lot of work to
> get further away from the evil prototype.
> 
> Still gitLinus lacks a clear definition of its interface, so I
> guess no one will be able to tell if it works correctly.  How could You
> do a test case without knowing
> a) what the software should do and
> b) how You should tell it?

You couldn't. UTSL now and write the docs and the testcases, or wait a
while.

> And of course there are still memory leaks.  The obvious
> --- i.e. malloc and (missing) free in the same function --- I found
> while reading the git-0.04 source yesterday are gone.  Still I found
> one of the ``malloc in called function no free in caller'' leaks
> in git-pasky as pulled NOW.  And all I did was `grep malloc *'.
> Someone should sit down and read all the source top to bottom.  And
> the software should either check its resource usage or someone
> should use a good tool on it.

"Someone"?

Again, patches welcome. The leaks are likely no big deal for now,
though. I'm by all means for fixing them, especially once git starts
to head towards libgit.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor

* Re: [script] ge: export commits as patches
From: Petr Baudis @ 2005-04-19 17:03 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: git
In-Reply-To: <20050419134843.GA19146@elte.hu>

Dear diary, on Tue, Apr 19, 2005 at 03:48:43PM CEST, I got a letter
where Ingo Molnar <mingo@elte.hu> told me that...
> is there any 'export commit as patch' support in git-pasky? I didn't find 
> any such command (maybe it got added meanwhile), so i'm using the 'ge' 
> hack below.
> 
> e.g. i typically look at commits via 'git log', and then when i see 
> something interesting, i look at the commit via the 'ge' script. E.g.  
> "ge 834f6209b22af2941a8640f1e32b0f123c833061" done in the kernel tree 
> will output a particular commit's header and the patch.

Nice idea. I will add it, probably as 'git patch'.

> TREE1=$(cat-file commit 2>/dev/null $1 | head -4 | grep ^tree | cut -d' ' -f2)
> if [ "$TREE1" = "" ]; then echo 'ge <commit-ID>'; exit -1; fi
> PARENT=$(cat-file commit 2>/dev/null $1 | head -4 | grep ^parent | cut -d' ' -f2)
> if [ "$PARENT" = "" ]; then echo 'ge <commit-ID>'; exit -1; fi
> TREE2=$(cat-file commit 2>/dev/null $PARENT | head -4 | grep ^tree | cut -d' ' -f2)
> if [ "$TREE2" = "" ]; then echo 'ge <commit-ID>'; exit -1; fi

commit-id and parent-id tools might be useful. ;-)
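
A minimal sketch of what such a helper could look like. The name
commit_field is hypothetical (it is not the commit-id or parent-id tool),
and it assumes the commit object text arrives on stdin, as produced by
"cat-file commit <id>" in the script above.

```shell
# commit_field FIELD: print the value of each FIELD header ("tree", "parent")
# of a commit object read on stdin. Headers end at the first blank line.
commit_field() {
    awk -v field="$1" '
        /^$/        { exit }          # end of the header block
        $1 == field { print $2 }      # a merge prints one line per parent
    '
}

# Possible use, mirroring the quoted script:
#   TREE1=$(cat-file commit "$1" | commit_field tree)
```

Stopping at the blank line rather than using "head -4" keeps commits with
several parents working and avoids matching text in the commit message.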

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor

* Re: naive question
From: Petr Baudis @ 2005-04-19 17:15 UTC (permalink / raw)
  To: torvalds, Paul Mackerras; +Cc: git
In-Reply-To: <16997.222.917219.386956@cargo.ozlabs.ibm.com>

Dear diary, on Tue, Apr 19, 2005 at 03:00:14PM CEST, I got a letter
where Paul Mackerras <paulus@samba.org> told me that...
> Is there a way to check out a tree without changing the mtime of any
> files that you have already checked out and which are the same as the
> version you are checking out?  It seems that checkout-cache -a doesn't
> overwrite any existing files, and checkout-cache -f -a overwrites all
> files and gives them the current mtime.  This is a pain if you are
> using make and your tree is large (like, for instance, the linux
> kernel :), because it means that after a checkout-cache -f -a you get
> to recompile everything.

Actually, to then get sensible show-diff output, you need to also
update-cache --refresh to compensate for the changes. I personally
really hate update-cache --refresh; sure, 0.1s with hot cache, but
easily eats 5 minutes (!) with cold cache.

I'd actually prefer, if:

(i) checkout-cache simply wouldn't touch files whose stat matches with
what is in the cache; it updates the cache with the stat information
of touched files

(ii) read-tree would take over the stat information from the matching
files in previous cache.

This way, doing update-cache --refresh would become a rather rare event.
Stuff would become swifter, faster, less I/O bound and you would get rid
of problems like the one described above.
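
The effect on make can be sketched in a few lines of sh. Note that this
stand-in compares file contents with cmp instead of comparing stat data
against the cache as proposed in (i), and the helper name checkout_file is
hypothetical; it only shows the "leave identical files alone" behaviour.

```shell
# checkout_file SRC DST: copy SRC over DST only when the contents differ, so
# the mtime of an unchanged DST is preserved and make does not rebuild it.
checkout_file() {
    src=$1
    dst=$2
    if [ -f "$dst" ] && cmp -s "$src" "$dst"; then
        echo kept                         # identical: DST is not touched
    else
        cp "$src" "$dst" && echo updated  # new or changed: rewrite DST
    fi
}
```

A real implementation would avoid reading both files and rely on the cached
stat information, which is the whole point of the proposal.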

What do you think?

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor

* [PATCH] Makefile uses installed commit-id
From: Pavel Roskin @ 2005-04-19 17:28 UTC (permalink / raw)
  To: git

Hello!

Current git-pasky cannot be compiled properly unless it's already
installed.  The rule for creating gitversion.sh requires commit-id in
the PATH, which won't be there until "make install" is run.

Also, commit-id runs gitXnormid.sh, which in turn runs cat-file.  All of
them should be from the current directory to avoid the requirement that
git is installed before gitversion.sh is generated.

This patch adds "." to PATH when running commit-id.  The patch is
against current git-pasky.

Signed-off-by: Pavel Roskin <proski@gnu.org>

--- a/Makefile
+++ b/Makefile
@@ -46,11 +46,11 @@ $(LIB_FILE): $(LIB_OBJS)

 %.o: $(LIB_H)

-gitversion.sh: $(VERSION)
+gitversion.sh: $(VERSION) commit-id
 	@echo Generating gitversion.sh...
 	@rm -f $@
 	@echo "#!/bin/sh" > $@
-	@echo "echo \"$(shell cat $(VERSION)) ($(shell commit-id))\"" >> $@
+	@echo "echo \"$(shell cat $(VERSION)) ($(shell PATH=.:$$PATH ./commit-id))\"" >> $@
 	@chmod +x $@

 install: $(PROG) $(GEN_SCRIPT)

-- 
Regards,
Pavel Roskin

* Re: [RFC] Another way to provide help details. (was Re: [PATCH] Add help details to git help command.)
From: Petr Baudis @ 2005-04-19 17:32 UTC (permalink / raw)
  To: David Greaves; +Cc: Steven Cole, git
In-Reply-To: <4265189E.6090801@dgreaves.com>

Dear diary, on Tue, Apr 19, 2005 at 04:41:34PM CEST, I got a letter
where David Greaves <david@dgreaves.com> told me that...
> If Petr wants the top comment to be extracted by help then maybe a 
> bottom comment block could contain the more complete text?
> I *really* think that the user docs should live in the source for now 
> (hence I think that git man is better than going straight to man/docbook).

I'd stay with the top comment for now, and go for perldoc in the Perl
scripts. It's cool, nice, and you can export it to anything and it still
looks mostly cute.

> I wasn't sure whether to perlise the code or do a shell-lib - but 
> looking at the algorithms needed in things like git status I reckon the 
> shell will end up becoming a hackish mess of awk/sed/tr/sort/uniq/pipe 
> (ie perl) anyway.

I'm all for slow migration from sh to Perl so that we can add more cream
on the stuff. Shell was great for the first phase of rapid development
(where basically the most important thing was to be able to call the
core git easily, and just wrap it around) but now if you want to do the
fancier stuff (like git status, or even git log), it's getting more of a
nuisance.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor

* Re: GIT Web Interface
From: Kay Sievers @ 2005-04-19 17:32 UTC (permalink / raw)
  To: Greg KH; +Cc: Petr Baudis, git
In-Reply-To: <20050419165247.GB32259@kroah.com>

On Tue, Apr 19, 2005 at 09:52:48AM -0700, Greg KH wrote:
> On Tue, Apr 19, 2005 at 05:59:45PM +0200, Kay Sievers wrote:
> > On Tue, 2005-04-19 at 02:52 +0200, Petr Baudis wrote:
> > > Dear diary, on Tue, Apr 19, 2005 at 02:44:15AM CEST, I got a letter
> > > where Kay Sievers <kay.sievers@vrfy.org> told me that...
> > > > I'm hacking on a simple web interface, cause I missed the bkweb too much.
> > > > It can't do much more than browse through the source tree and show the
> > > > log now, but that should change... :)
> > > >   http://ehlo.org/~kay/gitweb.pl?project=linux-2.6
> > > 
> > > Hmm, looks nice for a start. (But you have obsolete git-pasky tree there! ;-)
> > 
> > Yeah, it's fresh now. :)
> > 
> > > > How can I get the files touched with a changeset and the corresponding
> > > > diffs belonging to it?
> > > 
> > > diff-tree to get the list of files, you can do the corresponding diffs
> > > e.g. by doing git diff -r tree1:tree2. Preferably make a patch for it
> > > first to make it possible to diff individual files this way.
> > 
> > Ah, nice! Got it working.
> 
> Looks good, care to post the updated version?

Sure, but expect it to change dramatically tonight. :)

Thanks,
Kay

-- 
#!/usr/bin/perl

# This file is licensed under the GPL v2, or a later version
# (C) 2005, Kay Sievers <kay.sievers@vrfy.org>


use strict;
use warnings;
use CGI;
use CGI::Carp qw(fatalsToBrowser);

my $query = new CGI;
my $myself = "gitweb.pl";
my $gitbin = "/home/kay/bin";
my $gitroot = "/home/kay/public_html";
my $gittmp = "/tmp";

my $project = $query->param("project") || "";
my $action = $query->param("action") || "";
my $hash = $query->param("hash") || "";
my $parent = $query->param("parent") || "";
my $projectroot = "$gitroot/$project";
$ENV{'SHA1_FILE_DIRECTORY'} = "$projectroot/.git/objects";

$hash =~ s/[^0-9a-fA-F]//g;
$parent =~ s/[^0-9a-fA-F]//g;

sub header {
	print $query->header();
	print $query->start_html("gitweb");
	if ($project ne "") {
		print $query->h1($project);
		print $query->a({-href => "$myself?project=$project&action=show_tree"}, "Browse Project") . "<br/>\n";
		print $query->a({-href => "$myself?project=$project&action=show_log"}, "Show Log") . "<br/>\n";
		print "<br/><br/>\n";
	}
}

sub footer {
	print $query->end_html();
}

if ($project eq "") {
	open my $fd, "-|", "ls", "-1", $gitroot;
	my (@path) = map { chomp; $_ } <$fd>;
	close $fd;
	header();
	print "Projects:<br/><br/>\n";
	foreach my $line (@path) {
		if (-e "$gitroot/$line/.git/HEAD") {
			print $query->a({-href => "$myself?project=$line"}, $line) . "<br/>\n";
		}
	}
	footer();
	exit;
}

if ($action eq "") {
	print $query->redirect("$myself?project=$project&action=show_log");
	exit;
}

if ($action eq "show_file") {
	header();
	print "<pre>\n";
	open my $fd, "-|", "$gitbin/cat-file", "blob", $hash;
	my $nr;
	while (my $line = <$fd>) {
		$nr++;
		print "$nr\t$line";
	}
	close $fd;
	print "</pre>\n";
	footer();
} elsif ($action eq "show_tree") {
	if ($hash eq "") {
		open my $fd, "$projectroot/.git/HEAD";
		my $head = <$fd>;
		chomp $head;
		close $fd;

		open $fd, "-|", "$gitbin/cat-file", "commit", $head;
		my $tree = <$fd>;
		chomp $tree;
		$tree =~ s/tree //;
		close $fd;
		$hash = $tree;
	}
	open my $fd, "-|", "$gitbin/ls-tree", $hash;
	my (@entries) = map { chomp; $_ } <$fd>;
	close $fd;
	header();
	print "<pre>\n";
	foreach my $line (@entries) {
		$line =~ m/^([0-9]+)\t(.*)\t(.*)\t(.*)$/;
		my $t_type = $2;
		my $t_hash = $3;
		my $t_name = $4;
		if ($t_type eq "blob") {
			print "FILE\t" . $query->a({-href => "$myself?project=$project&action=show_file&hash=$3"}, $4) . "\n";
		} elsif ($t_type eq "tree") {
			print "DIR\t" . $query->a({-href => "$myself?project=$project&action=show_tree&hash=$3"}, $4) . "\n";
		}
	}
	print "</pre>\n";
	footer();
} elsif ($action eq "show_log") {
	open my $fd, "$projectroot/.git/HEAD";
	my $head = <$fd>;
	chomp $head;
	close $fd;

	open $fd, "-|", "$gitbin/rev-tree", $head;
	my (@revtree) = map { chomp; $_ } <$fd>;
	close $fd;

	header();
	foreach my $rev (reverse sort @revtree) {
		$rev =~ m/^([0-9]+) ([0-9a-fA-F]+).* ([0-9a-fA-F]+)/;
		my $time = $1;
		my $commit = $2;
		my $parent = $3;

		open my $fd, "-|", "$gitbin/cat-file", "commit", $commit;
		print "commit $commit " . $query->a({-href => "$myself?project=$project&action=show_cset&hash=$commit&parent=$parent"}, "(show cset)") . "<br/>\n";
		while (my $line = <$fd>) {
			if ($line =~ m/^tree (.*)$/) {
				print "$line " . $query->a({-href => "$myself?project=$project&action=show_tree&hash=$1"}, "(show tree)") . "<br/>\n";
			} elsif ($line =~ m/^(author|committer) (.*)/) {
				$line =~ m/^(.*) (.*>) ([0-9]+) (.*)$/;
				my $type = $1;
				my $name = $2;
				my $time = $3;
				my $timezone = $4;
				$name =~ s/&/&amp;/g;
				$name =~ s/</&lt;/g;
				$name =~ s/>/&gt;/g;
				$time = gmtime($time);
				print "$type $name $time $timezone<br/>\n";
			} else {
				$line =~ s/&/&amp;/g;
				$line =~ s/</&lt;/g;
				$line =~ s/>/&gt;/g;
				print "$line<br/>\n";
			}
		}
		close $fd;
		print "====================================<br/><br/>\n";
	}
	footer();
} elsif ($action eq "show_cset") {
	open my $fd, "-|", "$gitbin/cat-file", "commit", $hash;
	my $tree = <$fd>;
	chomp $tree;
	$tree =~ s/tree //;
	close $fd;

	open $fd, "-|", "$gitbin/cat-file", "commit", $parent;
	my $parent_tree = <$fd>;
	chomp $parent_tree;
	$parent_tree =~ s/tree //;
	close $fd;

	open $fd, "-|", "$gitbin/diff-tree", "-r", $parent_tree, $tree;
	my (@difftree) = map { chomp; $_ } <$fd>;
	close $fd;

	header();
	print "<pre>\n";
	foreach my $line (@difftree) {
		$line =~ m/^(.)(.*)\t(.*)\t(.*)\t(.*)$/;
		my $op = $1;
		my $mode = $2;
		my $type = $3;
		my $id = $4;
		my $file = $5;
		if ($type eq "blob") {
			if ($op eq "+") {
				print "NEW\t" . $query->a({-href => "$myself?project=$project&action=show_file&hash=$id"}, $file) . "\n";
			} elsif ($op eq "-") {
				print "DEL\t" . $query->a({-href => "$myself?project=$project&action=show_file&hash=$id"}, $file) . "\n";
			} elsif ($op eq "*") {
				$id =~ m/([0-9a-fA-F]+)->([0-9a-fA-F]+)/;
				my $old = $1;
				my $new = $2;
				print "DIFF\t" . $query->a({-href => "$myself?project=$project&action=show_diff&hash=$old&parent=$new"}, $file) . "\n";
			}
		}
	}
	print "</pre>\n";
	footer();
} elsif ($action eq "show_diff") {
	open my $fd2, "> $gittmp/$hash";
	open my $fd, "-|", "$gitbin/cat-file", "blob", $hash;
	while (my $line = <$fd>) {
		print $fd2 $line;
	}
	close $fd2;
	close $fd;

	open my $fd2, "> $gittmp/$parent";
	open my $fd, "-|", "$gitbin/cat-file", "blob", $parent;
	while (my $line = <$fd>) {
		print $fd2 $line;
	}
	close $fd2;
	close $fd;

	header();
	print "<pre>\n";
	open my $fd, "-|", "/usr/bin/diff", "-L", "$hash", "-L", "$parent", "-u", "-p", "$gittmp/$hash", "$gittmp/$parent";
	while (my $line = <$fd>) {
		$line =~ s/</&lt;/g;
		$line =~ s/>/&gt;/g;
		print $line;
	}
	close $fd;
	unlink("$gittmp/$hash");
	unlink("$gittmp/$parent");
	print "</pre>\n";
	footer();
} else {
	header();
	print "unknown action\n";
	footer();
}


^ permalink raw reply

* Re: [PATCH] write-tree performance problems
From: Linus Torvalds @ 2005-04-19 17:36 UTC (permalink / raw)
  To: Chris Mason; +Cc: git
In-Reply-To: <200504191250.10286.mason@suse.com>

On Tue, 19 Apr 2005, Chris Mason wrote:
> 
> I did a quick experiment with applying/commit 100 patches from the suse kernel 
> into a kernel git tree, which quilt can do in 2 seconds.  git needs 1m5s.

Note that I don't think you want to replace quilt with git. The approaches 
are totally different, and git does _not_ obviate the need for the quilt 
kind of "patch testing".

In fact, git has all the same issues that BK had, and for the same 
fundamental reason: if you do distributed work, you have to always 
"append" stuff, and that means that you can never re-order anything after 
the fact.

So git really is _not_ very good at all at doing what quilt does. Also, 
there's an inevitable cost of being careful, and as you note, the sha1 
calculation is expensive (*).

However, I hate your modification. Yeah, I know, performance is important 
to me, but even more than performance is that I can trust the end results, 
and that means that we calculate the hashes instead of just taking them 
from somewhere else..

What I _would_ like is the ability to re-use an old tree, though. What you 
really want to do is not pass in a set of directory names and just trust 
that they are correct, but just pass in a directory to compare with, and 
if the contents match, you don't need to write out a new one.

I'll try to whip up something that does what you want done, but doesn't
need (or take) any untrusted information from the user in the form "trust
me, it hasn't changed".

		Linus

(*) Actually, I think it's the compression that ends up being the most
expensive part.

^ permalink raw reply

* Re: [PATCH] Add help details to git help command. (This time with Perl)
From: Steven Cole @ 2005-04-19 17:35 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git, David Greaves
In-Reply-To: <20050418102412.GJ1461@pasky.ji.cz>

Petr Baudis wrote:
> Dear diary, on Mon, Apr 18, 2005 at 06:42:26AM CEST, I got a letter
> where Steven Cole <elenstev@mesatop.com> told me that...

[snippage]

> 
>>This patch will provide the comment lines in the shell script associated
>>with the command, cleaned up a bit for presentation.
>>
>>BUGS: This will also print any comments in the entire file, which may
>>not be desired.  If a command name and shell script filename
>>do not follow the usual convention, this won't work, e.g. ci for commit.
> 
> 
> Hey, those BUGS are the only slightly non-trivial thing on the whole
> thing! I could do this patch myself... ;-) Also, you don't want to print
> the first newline and the Copyright notices.
> 

OK, here is a patch which _may_ do what you want.

This one no longer prints comments embedded later in the file,
and doesn't print the first newline and disposes of the (c) notices
as requested.  And does the right thing for git help ci.
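
The same extraction could probably be done without a helper script.  The
following is only a sketch, assuming the layout the patch relies on (a
"#!" line, one bare "#" line, then the comment block); the function name
print_help is made up for this illustration and is not part of git-pasky.

```shell
#!/bin/sh
# print_help FILE: print the leading comment block of FILE as help text.
# Mirrors the pipeline in the patch: skip two lines, stop at the first
# non-comment line, drop "(c)" lines, strip the leading "# ".
print_help () {
	awk '
		NR <= 2 { next }        # skip the interpreter line and the bare "#"
		!/^#/   { exit }        # the comment block ends here
		/\(c\)/ { next }        # drop copyright notices
		        { print substr($0, 3) }
	' "$1"
}
```

It would be called as: print_help gitdiff.sh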

Example:

[steven@spc0 git-pasky-testing]$ ./git help diff
Make a diff between two GIT trees.

By default compares the current working tree to the state at the
last commit. You can specify -r rev1:rev2 or -r rev1 -r rev2 to
tell it to make a diff between the specified revisions. If you
do not specify a revision, the current working tree is implied
(note that no revision is different from empty revision - -r rev:
compares between rev and HEAD, while -r rev compares between rev
and working tree).

-p instead of one ID denotes a parent commit to the specified ID
(which must not be a tree, obviously).

Outputs a diff converting the first tree to the second one.

---------
Speaking of 'git diff', I ran that before applying the following patch,
and got a diff starting thusly:

  --- /dev/null
  +++ b/gitmerge-file.sh

I had earlier done a 'git pull pasky', which was 'Up to date'.

So, the following patch is a conventional diff.

If the Perl filename or code  is too hideous, you're more than
welcome to change it.

Steven
---------
This patch will provide the comment lines in the shell script associated
with the command, cleaned up a bit for presentation.

Thanks to Bob Newell for some Perl help.

Signed-off-by: Steven Cole <elenstev@mesatop.com>

diff -urN git-pasky-orig/git git-pasky-testing/git
--- git-pasky-orig/git	2005-04-19 10:27:54.000000000 -0600
+++ git-pasky-testing/git	2005-04-19 10:19:08.000000000 -0600
@@ -19,6 +19,11 @@


  help () {
+
+command=$1
+scriptfile=git$command.sh
+
+if [ ! $command ]; then
  	cat <<__END__
  The GIT scripted toolkit  $(gitversion.sh)

@@ -48,6 +53,8 @@
  	track		[RNAME]
  	version

+Additional help is available with: git help COMMAND
+
  Note that these expressions can be used interchangably as "ID"s:
  	empty string (current HEAD)
  	remote name (as registered with git addremote)
@@ -56,6 +63,13 @@
  	commit object hash (as returned by commit-id)
  	tree object hash (accepted only by some commands)
  __END__
+fi
+if [ $scriptfile = "gitci.sh" ]; then
+	scriptfile="gitcommit.sh"
+fi
+if [ ! $scriptfile = "git.sh" ]; then
+	print_help_header.pl <$scriptfile  | grep -v "(c)" | cut -c 3-
+fi
  }


diff -urN git-pasky-orig/Makefile git-pasky-testing/Makefile
--- git-pasky-orig/Makefile	2005-04-19 10:27:54.000000000 -0600
+++ git-pasky-testing/Makefile	2005-04-19 10:32:50.000000000 -0600
@@ -21,7 +21,7 @@
  	gitcommit.sh gitdiff-do gitdiff.sh gitlog.sh gitls.sh gitlsobj.sh \
  	gitmerge.sh gitpull.sh gitrm.sh gittag.sh gittrack.sh gitexport.sh \
  	gitapply.sh gitcancel.sh gitXlntree.sh commit-id gitlsremote.sh \
-	gitfork.sh gitinit.sh gitseek.sh gitstatus.sh
+	gitfork.sh gitinit.sh gitseek.sh gitstatus.sh print_help_header.pl

  COMMON=	read-cache.o

diff -urN git-pasky-orig/print_help_header.pl git-pasky-testing/print_help_header.pl
--- git-pasky-orig/print_help_header.pl	1969-12-31 17:00:00.000000000 -0700
+++ git-pasky-testing/print_help_header.pl	2005-04-19 10:24:34.000000000 -0600
@@ -0,0 +1,10 @@
+#!/usr/bin/perl
+#
+# Prints the block of text preceded by #
+# Copyright (c) Steven Cole, 2005
+#
+# reads from stdin   writes to stdout  no error checking
+<STDIN>;<STDIN>;
+while (substr( $line=<STDIN>, 0, 1) eq "#") {
+                 print $line;
+}

^ permalink raw reply

* Re: GIT Web Interface
From: Stéphane Fillod @ 2005-04-19 17:19 UTC (permalink / raw)
  To: git
In-Reply-To: <20050419165247.GB32259@kroah.com>

Greg KH <greg <at> kroah.com> writes:
[...]
> Looks good, care to post the updated version?

  http://ehlo.org/~kay/

What about a git repo of gitweb?

gitweb2.pl is nice with the browse function. BTW, there's a '1' artefact
right after the browse link in action=show_tree :-)

Kay, your script is really nice, good job!

Here are some random ideas:
* make *any* hash clickable instead of the (show xx) links.
  Applicable in show_log, show_diff
* in show_diff, keep a back link to cset
* provide a download link in show_file (as well as show_cset/show_diff ?)
* obfuscate the mail addresses in show_log against spam?
* use of colors in show_log (committer, author, ..)
* perhaps borrow some ideas from other SCM web interfaces besides BK
* kindly ask kernel.org to host your script one day?

All the best,
-- 
Stephane

^ permalink raw reply

* Re: naive question
From: Linus Torvalds @ 2005-04-19 17:41 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Paul Mackerras, git
In-Reply-To: <20050419171534.GH12757@pasky.ji.cz>

On Tue, 19 Apr 2005, Petr Baudis wrote:
> 
> I'd actually prefer, if:
> 
> (i) checkout-cache simply wouldn't touch files whose stat matches with
> what is in the cache; it updates the cache with the stat information
> of touched files

Run "update-cache --refresh" _before_ doing the "checkout-cache", and that 
is exactly what will happen.

But yes, if you want to make checkout-cache update the stat info (Ingo 
wanted to do that too), it should be possible. The end result is a 
combination of "update-cache" and "checkout-cache", though: you'll 
effectively need to do both (just in one pass).

With the current setup, you have to do

	update-cache --refresh
	checkout-cache -f -a
	update-cache --refresh

which is admittedly fairly inefficient.

The real expense right now of a merge is that we always forget all the
stat information when we do a merge (since it does a read-tree). I have a
cunning way to fix that, though, which is to make "read-tree -m" read in
the old index state like it used to, and then at the end just throw it
away except for the stat information.

		Linus

^ permalink raw reply

* Re: GIT Web Interface
From: Greg KH @ 2005-04-19 17:41 UTC (permalink / raw)
  To: Kay Sievers; +Cc: Petr Baudis, git
In-Reply-To: <20050419173242.GA32478@vrfy.org>

On Tue, Apr 19, 2005 at 07:32:42PM +0200, Kay Sievers wrote:
> On Tue, Apr 19, 2005 at 09:52:48AM -0700, Greg KH wrote:
> > On Tue, Apr 19, 2005 at 05:59:45PM +0200, Kay Sievers wrote:
> > > On Tue, 2005-04-19 at 02:52 +0200, Petr Baudis wrote:
> > > > Dear diary, on Tue, Apr 19, 2005 at 02:44:15AM CEST, I got a letter
> > > > where Kay Sievers <kay.sievers@vrfy.org> told me that...
> > > > > I'm hacking on a simple web interface, cause I missed the bkweb too much.
> > > > > It can't do much more than browse through the source tree and show the
> > > > > log now, but that should change... :)
> > > > >   http://ehlo.org/~kay/gitweb.pl?project=linux-2.6
> > > > 
> > > > Hmm, looks nice for a start. (But you have obsolete git-pasky tree there! ;-)
> > > 
> > > Yeah, it's fresh now. :)
> > > 
> > > > > How can I get the files touched with a changeset and the corresponding
> > > > > diffs belonging to it?
> > > > 
> > > > diff-tree to get the list of files, you can do the corresponding diffs
> > > > e.g. by doing git diff -r tree1:tree2. Preferably make a patch for it
> > > > first to make it possible to diff individual files this way.
> > > 
> > > Ah, nice! Got it working.
> > 
> > Looks good, care to post the updated version?
> 
> Sure, but expect it to change dramatically tonight. :)

Ok, how about putting a link to it somewhere then, so you don't have to
be bothered with people like me asking for the latest copy? :)

thanks,

greg k-h

^ permalink raw reply

* [PATCH] Better options in Makefile
From: Pavel Roskin @ 2005-04-19 17:48 UTC (permalink / raw)
  To: git

Hello!

This patch makes it possible to enable the collision check and subsecond
time resolution by uncommenting one line.  Also, the corresponding comments
have been improved.

Signed-off-by: Pavel Roskin <proski@gnu.org>

--- a/Makefile
+++ b/Makefile
@@ -1,13 +1,18 @@
-# -DCOLLISION_CHECK if you believe that SHA1's
+# Define COLLISION_CHECK if you believe that SHA1's
 # 1461501637330902918203684832716283019655932542976 hashes do not give you
-# enough guarantees about no collisions between objects ever hapenning.
-#
-# -DNSEC if you want git to care about sub-second file mtimes and ctimes.
-# Note that you need some new glibc (at least >2.2.4) for this, and it will
+# sufficient guarantee that no collisions between objects will ever happen.
+
+# DEFINES += -DCOLLISION_CHECK
+
+# Define NSEC if you want git to care about sub-second file mtimes and ctimes.
+# Note that you will need recent glibc (at least 2.2.4) for this, and it will
 # BREAK YOUR LOCAL DIFFS! show-diff and anything using it will likely randomly
 # break unless your underlying filesystem supports those sub-second times
 # (my ext3 doesn't).
-CFLAGS=-g -O3 -Wall
+
+# DEFINES += -DNSEC
+
+CFLAGS=-g -O3 -Wall $(DEFINES)
 
 CC=gcc
 AR=ar


-- 
Regards,
Pavel Roskin



^ permalink raw reply

* Re: [PATCH] Add help details to git help command. (This time with Perl)
From: Petr Baudis @ 2005-04-19 17:50 UTC (permalink / raw)
  To: Steven Cole; +Cc: git, David Greaves
In-Reply-To: <42654153.8080307@mesatop.com>

Dear diary, on Tue, Apr 19, 2005 at 07:35:15PM CEST, I got a letter
where Steven Cole <elenstev@mesatop.com> told me that...
> Example:
..snip a perfect-looking example..
> ---------
> Speaking of 'git diff', I ran that before applying the following patch,
> and got a diff starting thusly:
> 
>  --- /dev/null
>  +++ b/gitmerge-file.sh
> 
> I had earlier done a 'git pull pasky', which was 'Up to date'.

Check/prune your add/rm-queue.

> So, the following patch is a conventional diff.
> 
> If the Perl filename or code  is too hideous, you're more than
> welcome to change it.

I'd actually prefer to just throw the whole help command handling
to githelp.pl. I dislike helper scripts if I can avoid them. ;-)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor

^ permalink raw reply

* Re: [PATCH] write-tree performance problems
From: Chris Mason @ 2005-04-19 18:11 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.58.0504191017300.19286@ppc970.osdl.org>

On Tuesday 19 April 2005 13:36, Linus Torvalds wrote:
> On Tue, 19 Apr 2005, Chris Mason wrote:
> > I did a quick experiment with applying/commit 100 patches from the suse
> > kernel into a kernel git tree, which quilt can do in 2 seconds.  git
> > needs 1m5s.
>
> Note that I don't think you want to replace quilt with git. The approaches
> are totally different, and git does _not_ obviate the need for the quilt
> kind of "patch testing".
>
> In fact, git has all the same issues that BK had, and for the same
> fundamental reason: if you do distributed work, you have to always
> "append" stuff, and that means that you can never re-order anything after
> the fact.

Very true, you can't replace quilt with git without ruining both of them.  But 
it would be nice to take a quilt tree and turn it into a git tree for merging 
purposes, or to make use of whatever visualization tools might exist someday.  

> What I _would_ like is the ability to re-use an old tree, though. What you
> really want to do is not pass in a set of directory names and just trust
> that they are correct, but just pass in a directory to compare with, and
> if the contents match, you don't need to write out a new one.
>
> I'll try to whip up something that does what you want done, but doesn't
> need (or take) any untrusted information from the user in the form "trust
> me, it hasn't changed".

We already have a "trust me, it hasn't changed" via update-cache.  If it gets 
called wrong the tree won't reflect reality.  The patch doesn't change the 
write-tree default, but does enable you to give write-tree better information 
about the parts of the tree you want written back to git.

With that said, I hate the patch too.  I didn't see how to compare against the 
old tree without reading each tree object from the old tree, and that should 
be slower than what write-tree does now.  So I wimped out and made the quick 
patch that demonstrates the cause of the performance hit.

The "move .git/index to a tmpfs file" change should be easier though, and has 
a real benefit.  How do you feel about s|.git/index|.git/index_dir/index| in 
the sources?  This gives us the flexibility to link it wherever is needed.

-chris

^ permalink raw reply

* Re: naive question
From: Linus Torvalds @ 2005-04-19 18:27 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Paul Mackerras, git
In-Reply-To: <Pine.LNX.4.58.0504191036560.19286@ppc970.osdl.org>

On Tue, 19 Apr 2005, Linus Torvalds wrote:
> 
> The real expense right now of a merge is that we always forget all the
> stat information when we do a merge (since it does a read-tree). I have a
> cunning way to fix that, though, which is to make "read-tree -m" read in
> the old index state like it used to, and then at the end just throw it
> away except for the stat information.

Ok, done. That was really the plan all along, it just got dropped in the 
excitement of trying to get the dang thing to _work_ in the first place ;)

The current version only does

	read-tree -m <orig> <branch1> <branch2>

which now reads the old stat cache information, and then applies that to 
the end result of any trivial merges in case the merge result matches the 
old file stats. It really boils down to this little gem:

            /*
             * See if we can re-use the old CE directly?
             * That way we get the uptodate stat info.
             */
            if (path_matches(result, old) && same(result, old))
                    *result = *old;

and it seems to work fine.
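
The re-use rule can also be sketched outside of C.  This is a minimal
illustration, not the read-tree code: it assumes two tab-separated
listings invented for the example, an old cache dump with
SHA1<TAB>PATH<TAB>MTIME lines and a merge result with SHA1<TAB>PATH
lines, and it carries the old stat field over only when both the path
and the hash match.

```shell
#!/bin/sh
# merge_stat OLD RESULT
#   OLD:    "SHA1<TAB>PATH<TAB>MTIME" lines (hypothetical dump of the old cache)
#   RESULT: "SHA1<TAB>PATH" lines (hypothetical dump of the merge result)
# Prints RESULT with a third column: the old MTIME when path and hash both
# match, or "-" when the entry still has to be refreshed.
# Note: the NR == FNR idiom assumes OLD is not empty.
merge_stat () {
	awk -F '\t' '
		NR == FNR { stat[$1 FS $2] = $3; next }
		{
			key = $1 FS $2
			print $0 "\t" ((key in stat) ? stat[key] : "-")
		}' "$1" "$2"
}
```

Entries that come out with "-" are the ones a later
"update-cache --refresh" would still have to look at.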

HOWEVER, I'll also make it do the same for a "single-tree merge":

	read-tree -m <newtree>

so that you can basically say "read a new tree, and merge the stat 
information from the current cache".  That means that if you do a
"read-tree -m <newtree>" followed by a "checkout-cache -f -a", the 
checkout-cache only checks out the stuff that really changed.

You'll still need to do an "update-cache --refresh" for the actual new
stuff. We could make "checkout-cache" update the cache too, but I really
do prefer a "checkout-cache only reads the index, never changes it"  
world-view. It's nice to be able to have a read-only git tree.

Final note: just doing a plain "read-tree <newtree>" will still throw all
the stat info away, and you'll have to refresh it all...

		Linus

^ permalink raw reply

* Re: Change "pull" to _only_ download, and "git update"=pull+merge?
From: Daniel Barkalow @ 2005-04-19 18:28 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Martin Schlemmer, David Greaves, dwheeler, git
In-Reply-To: <20050419105008.GB12757@pasky.ji.cz>

On Tue, 19 Apr 2005, Petr Baudis wrote:

> I disagree. This already forces you to have two branches (one to pull
> from to get the data, mirroring the remote branch, one for your real
> work) uselessly and needlessly.

If you pull in a non-tracked tree, it certainly won't apply the
changes, so you can just have your local tree and pull other people's
trees as desired.

> I think there is just no good name for what pull is doing now, and
> update seems like a great name for what pull-and-merge really is. Pull
> really is pull - it _pulls_ the data, while update also updates the
> given tree. No surprises.

I'm actually getting suspicious that the right thing is to hide "pull" in
the id scheme. That is, instead of saying "linus" to refer to the
"linus" head that you currently have, you say "+linus" to refer to the
head Linus has on his server currently, and this will cause you to
download anything necessary to perform the operation with the resulting
value.

See, I don't think you ever want to just pull. You want to
pull-and-do-something, but the something could be any operation that uses
a commit, not necessarily update. So you could do "git diff -r +linus" to
compare your head against current linus. You'd want "git update" to take a
working directory from "linus" to "+linus" (just because you know Linus's
more recent head doesn't mean you're automatically using it). You could
just "git merge +linus" in your working directory to sync with Linus. Even
"git log +linus" to see his recent changes.

I think the only reason not to just make any reference to a head pull it
is performance on looking up the head; you don't really want to hammer the
server getting these 40-byte files constantly or wait for a connection
every time (not to mention the possibility of not being able to
connect). But there's no reason to want to not have the latest data, since
the older data doesn't go away.
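
A rough sketch of how such an ID could be resolved follows.  Everything
in it is an assumption made for illustration: the GIT_HEADS directory
layout, the REMOTE_DIR stand-in for the server, and the function names
fetch_remote_head and resolve_id are not existing git-pasky interfaces.

```shell
#!/bin/sh
# Hypothetical layout: $GIT_HEADS/NAME holds the head last pulled from NAME.
GIT_HEADS=${GIT_HEADS:-.git/heads}
# Stand-in for the remote side; a real version would talk to a server.
REMOTE_DIR=${REMOTE_DIR:-/tmp/remote-heads}

# fetch_remote_head NAME: print the head the remote has right now.
# A real version would also download any objects that are missing locally.
fetch_remote_head () {
	cat "$REMOTE_DIR/$1"
}

# resolve_id ID: "+name" refreshes the local copy of the head first,
# plain "name" uses whatever was pulled last.
resolve_id () {
	case "$1" in
	+*)
		name=${1#+}
		new=$(fetch_remote_head "$name") || return 1
		printf '%s\n' "$new" > "$GIT_HEADS/$name"
		;;
	*)
		name=$1
		;;
	esac
	cat "$GIT_HEADS/$name"
}
```

With this shape, commands would only ever call resolve_id, and the
decision to go to the network stays with whoever typed the "+".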

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply

* Re: [PATCH] write-tree performance problems
From: Olivier Galibert @ 2005-04-19 18:51 UTC (permalink / raw)
  To: git
In-Reply-To: <Pine.LNX.4.58.0504191017300.19286@ppc970.osdl.org>

On Tue, Apr 19, 2005 at 10:36:06AM -0700, Linus Torvalds wrote:
> In fact, git has all the same issues that BK had, and for the same 
> fundamental reason: if you do distributed work, you have to always 
> "append" stuff, and that means that you can never re-order anything after 
> the fact.

You can, moving a patch around is just a chain of merges.

[Warning, ascii "art" ahead]

A merge is traditionally seen as:

1- Start with (A, B, C... are nodes/trees..., Pn are patches/changesets):

     /--P1->B
    /
   A
    \
     \--P2->C

2- End with:

     /--P1->B
    /
   A----(P1+P2)->D
    \
     \--P2->C

   where D is the merge between B and C with A as common ancestor.

But you can also see the result as:

     /--P1->B--P2--\
    /               \
   A                 D
    \               /
     \--P2->C--P1--/

i.e. you have two patch chains, one being A-P1->B-P2->D and the other
A-P2->C-P1->D.  I.e. you have the two patches P1 and P2 in two
possible patching orders.  But you can do something even more amusing.  Start
with a patch chain:

   E--P3-->F--P4-->G

and merge E and G with F as common ancestor.  You'll then get H where
E--P4-->H--P3-->G.  I.e. you inverted two patches in your patch chain.
Or, if you keep H instead of G as your head, you removed P3 from your
patch chain.

Of course you can permute blocks of patches that way by having E, F and
G further away from each other.  You just increase the merge conflict
probability.
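
The E/F/G case can be tried on a single file with diff3 from GNU
diffutils.  This is only a toy sketch: the file contents are invented,
and one file stands in for a whole tree.

```shell
#!/bin/sh
# Three revisions of one file: F = E + P3, G = F + P4.
dir=$(mktemp -d)
printf '%s\n' one two three four five > "$dir/E"
printf '%s\n' ONE two three four five > "$dir/F"   # P3 changes line 1
printf '%s\n' ONE two three four FIVE > "$dir/G"   # P4 changes line 5

# Merge E and G with F as the common ancestor:
# the result is E plus the F->G change, i.e. E with only P4 applied.
diff3 -m "$dir/E" "$dir/F" "$dir/G" > "$dir/H"
cat "$dir/H"

# What is left between H and G is P3 again.
# diff exits non-zero when the files differ, so ignore its status.
diff "$dir/H" "$dir/G" || true
```

The two changes are far enough apart that the merge is clean; patches
touching neighbouring lines would give a conflict instead.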

That is, I think, the way to do quilt/arch patch handling with safe
distribution and safe backtracing procedures.

  OG.

^ permalink raw reply

* Re: [script] ge: export commits as patches
From: Ingo Molnar @ 2005-04-19 18:56 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git
In-Reply-To: <20050419170320.GG12757@pasky.ji.cz>


* Petr Baudis <pasky@ucw.cz> wrote:

> Dear diary, on Tue, Apr 19, 2005 at 03:48:43PM CEST, I got a letter
> where Ingo Molnar <mingo@elte.hu> told me that...
> > is there any 'export commit as patch' support in git-pasky? I didn't find 
> > any such command (maybe it got added meanwhile), so i'm using the 'ge' 
> > hack below.
> > 
> > e.g. i typically look at commits via 'git log', and then when i see 
> > something interesting, i look at the commit via the 'ge' script. E.g.  
> > "ge 834f6209b22af2941a8640f1e32b0f123c833061" done in the kernel tree 
> > will output a particular commit's header and the patch.
> 
> Nice idea. I will add it, probably as 'git patch'.
> 
> > TREE1=$(cat-file commit 2>/dev/null $1 | head -4 | grep ^tree | cut -d' ' -f2)
> > if [ "$TREE1" = "" ]; then echo 'ge <commit-ID>'; exit -1; fi
> > PARENT=$(cat-file commit 2>/dev/null $1 | head -4 | grep ^parent | cut -d' ' -f2)
> > if [ "$PARENT" = "" ]; then echo 'ge <commit-ID>'; exit -1; fi
> > TREE2=$(cat-file commit 2>/dev/null $PARENT | head -4 | grep ^tree | cut -d' ' -f2)
> > if [ "$TREE2" = "" ]; then echo 'ge <commit-ID>'; exit -1; fi
> 
> commit-id and parent-id tools might be useful. ;-)

find a cleaned up 'ge' script below.

and please fix gitXnormid.sh to simply echo nothing and return with a -1 
exit value when a nonsensical ID is passed to it. Right now the output 
is quite ugly if you do 'ge blah'.

	Ingo

#!/bin/bash

usage ()
{
 echo 'usage: ge <commit-ID>'
 exit -1
}

if [ $# != 1 ]; then
 usage
fi

ME=$(commit-id "$1");         [ "$ME"     = "" ] && usage
PARENT=$(parent-id "$ME");    [ "$PARENT" = "" ] && usage
TREE1=$(tree-id "$ME");       [ "$TREE1"  = "" ] && usage
TREE2=$(tree-id "$PARENT");   [ "$TREE2"  = "" ] && usage

cat-file commit $ME
echo
git diff -r $TREE2:$TREE1


^ permalink raw reply

* Re: [GIT PATCH] I2C and W1 bugfixes for 2.6.12-rc2
From: Greg KH @ 2005-04-19 18:58 UTC (permalink / raw)
  To: Greg KH; +Cc: Linus Torvalds, Git Mailing List, linux-kernel, sensors
In-Reply-To: <20050419043938.GA23724@kroah.com>

On Mon, Apr 18, 2005 at 09:39:38PM -0700, Greg KH wrote:
> Alright, let's try some small i2c and w1 patches...
> 
> Could you merge with:
> 	kernel.org/pub/scm/linux/kernel/git/gregkh/i2c-2.6.git/

Nice, it looks like the merge of this tree, and my usb tree worked just
fine.

So, what does this now mean?  Is your kernel.org git tree now going to
be the "real" kernel tree that you will be working off of now?  Should
we crank up the nightly snapshots and emails to the -commits list?

Can I rely on the fact that these patches are now in your tree and I can
forget about them? :)

Just wondering how comfortable you feel with your git tree so far.

thanks,

greg k-h

^ permalink raw reply

* Re: [PATCH] write-tree performance problems
From: Linus Torvalds @ 2005-04-19 19:03 UTC (permalink / raw)
  To: Chris Mason; +Cc: git
In-Reply-To: <200504191412.00227.mason@suse.com>

On Tue, 19 Apr 2005, Chris Mason wrote:
> 
> Very true, you can't replace quilt with git without ruining both of them.  But 
> it would be nice to take a quilt tree and turn it into a git tree for merging 
> purposes, or to make use of whatever visualization tools might exist someday.  

Fair enough. The thing is, going from quilt->git really is a pretty "big
decision", since it's the decision that says "I will now really commit all
these quilt changes forever and ever".

Which is also why I think it's actually ok to take a minute to do 100
quilt patches. This is not something you do on a whim. It's something
you'd better think about. It's turning a very fluid environment into an
unchangeable, final thing.

That said, I agree that "write-tree" is expensive. It tends to be by far
the most expensive op you normally do. I'll make sure it goes faster.

> We already have a "trust me, it hasn't changed" via update-cache.

Heh. I see "update-cache" not as a "it hasn't changed", but a "it _has_ 
changed, and now I want you to reflect that fact". In other words, 
update-cache is an active statement: it says that you're ready to commit 
your changes.

In contrast, to me your "write-tree" thing in many ways is the reverse of 
that: it's saying "don't look here, there's nothing interesting there".

Which to me smells like trying to hide problems rather than being positive 
about them.

Which it is, of course. It's trying to hide the fact that writing a tree 
is not instantaneous.

> With that said, I hate the patch too.  I didn't see how to compare against the 
> old tree without reading each tree object from the old tree, and that should 
> be slower than what write-tree does now.

Reading a tree is faster, simply because you uncompress instead of
compress. So I can read a tree in 0.28 seconds, but it takes me 0.34
seconds to write one. That said, reading the trees has disk seek issues if
it's not in the cache.

What I'd actually prefer to do is to just handle tree caching the same way
we handle file caching - in the index.

Ie we could have the index file track "what subtree is this directory
associated with", and have a "update-cache --refresh-dir" thing that
updates it (and any entry update in that directory obviously removes the
dir-cache entry).

Normally we'd not bother and it would never trigger, but it would be
useful for your scripted setup: it would end up caching all the tree
information in a very efficient manner. Totally transparently, apart from
the one "--refresh-dir" at the beginning. That one would be slightly
expensive (ie would do all the stuff that "write-tree" does, but it would
be done just once).
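
The bookkeeping could look roughly like the sketch below.  It is not a
proposal for the index format: the flat DIR<TAB>TREE-ID file and the
function names cache_dir, lookup_dir and invalidate are assumptions made
up for this example, and paths are taken to be relative to the tree root.

```shell
#!/bin/sh
# Assumed format: one "DIR<TAB>TREE-ID" line per cached directory,
# kept in the file named by $DIRCACHE.
DIRCACHE=${DIRCACHE:-$(mktemp)}

# cache_dir DIR TREE-ID: remember the tree object that DIR hashed to.
cache_dir () {
	printf '%s\t%s\n' "$1" "$2" >> "$DIRCACHE"
}

# lookup_dir DIR: print the cached tree ID; fails if there is none.
lookup_dir () {
	awk -F '\t' -v d="$1" '
		$1 == d { print $2; found = 1; exit }
		END     { exit !found }' "$DIRCACHE"
}

# invalidate PATH: an entry at PATH changed, so drop the cached tree of
# every directory on the way up from PATH to the root (".").
invalidate () {
	p=$1
	while [ "$p" != "." ] && [ "$p" != "/" ]; do
		p=$(dirname "$p")
		awk -F '\t' -v d="$p" '$1 != d' "$DIRCACHE" > "$DIRCACHE.tmp"
		mv "$DIRCACHE.tmp" "$DIRCACHE"
	done
}
```

A tree writer built on this would call lookup_dir for each directory and
only hash and compress the ones for which it fails.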

(We could also just make "write-tree" do it _totally_ transparently, but
then we're back to having write-tree both read _and_ write the index file,
which is a situation that I've been trying to avoid. It's so much easier 
to verify the correctness of an operation if it is purely "one-way").

I'll think about it. I'd love to speed up write-tree, and keeping track of 
it in the index is a nice little trick, but it's not quite high enough up 
on my worries for me to act on it right now.

But if you want to try to see how nasty it would be to add tree index
entries to the index file at "write-tree" time automatically, hey...

		Linus

^ permalink raw reply
