Git development
* Re: [idea] Converting sha1 evaluator into parser/interpreter
From: Jakub Narebski @ 2006-05-21  8:06 UTC (permalink / raw)
  To: git

Shawn Pearce wrote:

> There was just a short conversation on #git about converting
> the sha1 expression evaluator into a split parser/interpreter
> model.  The idea here would be to convert an expression such as
> 
>   HEAD@{yesterday}~3^{tree}
> 
> into an expression tree such as (in LISP style):
> 
>   (peel-onion (walk-back 3 (date-spec yesterday (ref HEAD))))

Rather

    (peel-onion 'tree (walk-back 3 (date-spec yesterday (ref HEAD))))
 
> With such a tree it is relatively easy to evaluate the expression,
> but it's also easy to determine if a ref name is valid.  Just pass
> it through the parser and see if you get back anything more complex
> than '(ref <input>)'.

Didn't you mean to check that we get a correct tree (not a forest),
and that the root of said tree is '(ref <commit-ish>)' [*1*]?

Interpreting said parse tree and checking whether it folds to a correct
object reference is the task of the interpreter, not the parser...


[*1*] If I understand correctly, <commit-ish> means a direct sha1,
a shortened sha1, or a ref (head or tag)? commit-ish is not in the
git glossary...
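
To make the parser half concrete, here is a minimal C sketch. It is an
illustration only, not git's evaluator: sha1_expr_to_sexp() is a made-up
name, the fixed 256-byte buffers are an arbitrary assumption, and only the
suffix operators "@{...}", "~N" and "^{type}" are handled (a bare "^" or
"^N" is rejected). It just rewrites the expression into the LISP-style
form used above.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical sketch, not git code: rewrite a sha1 expression such as
 * "HEAD@{yesterday}~3^{tree}" as a LISP-style string.
 * Returns 0 on success, -1 on unsupported syntax or a too-small buffer.
 */
int sha1_expr_to_sexp(const char *in, char *out, size_t outlen)
{
	char cur[256], next[256];
	const char *p = in;

	/* The base ref runs up to the first "@{", '~' or '^'. */
	while (*p && *p != '~' && *p != '^' && !(p[0] == '@' && p[1] == '{'))
		p++;
	if (p == in)
		return -1;
	if (snprintf(cur, sizeof(cur), "(ref %.*s)", (int)(p - in), in)
	    >= (int)sizeof(cur))
		return -1;

	/* Each suffix operator wraps the expression built so far. */
	while (*p) {
		int len;

		if ((p[0] == '@' || p[0] == '^') && p[1] == '{') {
			const char *end = strchr(p + 2, '}');
			if (!end)
				return -1;
			len = snprintf(next, sizeof(next),
				       p[0] == '@' ? "(date-spec %.*s %s)"
						   : "(peel-onion '%.*s %s)",
				       (int)(end - p - 2), p + 2, cur);
			p = end + 1;
		} else if (p[0] == '~') {
			char *end = (char *)p + 1;
			long count = 1;	/* a bare "~" means one step back */

			if (isdigit((unsigned char)p[1]))
				count = strtol(p + 1, &end, 10);
			len = snprintf(next, sizeof(next),
				       "(walk-back %ld %s)", count, cur);
			p = end;
		} else {
			return -1;	/* operator not covered by this sketch */
		}
		if (len < 0 || (size_t)len >= sizeof(next))
			return -1;
		strcpy(cur, next);
	}
	if (strlen(cur) >= outlen)
		return -1;
	strcpy(out, cur);
	return 0;
}
```

Under these assumptions, a plain name such as "master" comes back as just
(ref master), while anything carrying an operator comes back wrapped.
Deciding whether the result folds to a real object stays with the
interpreter.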

-- 
Jakub Narebski
Warsaw, Poland


* Re: [PATCH] Remove possible segfault in http-fetch.
From: Junio C Hamano @ 2006-05-21  7:49 UTC (permalink / raw)
  To: Sean; +Cc: git, Nick Hengeveld
In-Reply-To: <BAYC1-PASMTP082397700A9527CC2F3786AEA40@CEZ.ICE>

Sean <seanlkml@sympatico.ca> writes:

> Free the curl string lists after running http_cleanup to
> avoid an occasional segfault in the curl library.  Seems
> to only occur if the website returns a 405 error.
>...
> It comes with a big disclaimer because I don't really know the
> code in here all that well.  However gdb reports the segfault
> happens in a strncasecmp call, and seeing as we've released a
> bunch of strings prior to the call....
>
>  http-fetch.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/http-fetch.c b/http-fetch.c
> index 861644b..178f1ee 100644
> --- a/http-fetch.c
> +++ b/http-fetch.c
> @@ -1269,10 +1269,10 @@ int main(int argc, char **argv)
>  	if (pull(commit_id))
>  		rc = 1;
>  
> -	curl_slist_free_all(no_pragma_header);
> -
>  	http_cleanup();
>  
> +	curl_slist_free_all(no_pragma_header);
> +
>  	if (corrupt_object_found) {
>  		fprintf(stderr,
>  "Some loose object were found to be corrupt, but they might be just\n"

curl_easy_cleanup() which is called from http_cleanup() says it
is safe to remove the strings _after_ you call that function, so
I think the change makes sense -- it was apparently unsafe to
free them before calling cleanup.

Knowing nothing about quirks in curl libraries, one thing that
is a mystery to me is that we slist_append() to two other lists
(pragma_header and range_header) but never seem to free
them.  Another slist, dav_headers, is allocated and then freed
inside a function, so that call pattern seems well-formed.

Nick, care to help us out?


* Re: [RFC] send-pack: allow skipping delta when sending pack
From: Junio C Hamano @ 2006-05-21  6:17 UTC (permalink / raw)
  To: Jeff King; +Cc: git
In-Reply-To: <20060521054827.GA18530@coredump.intra.peff.net>

Jeff King <peff@peff.net> writes:

> The result is much better performance in my case. However, the method
> seems quite hack-ish, so I wanted to get comments on how this should be
> done. Possibilities I considered:

>   1. A command line option to git-send-pack. The problem with this is
>      that support is required from git-push and cg-push to pass the
>      option through.

When you pull from such a repository you would also need to be
able to control this.  The repository owner knows a lot more about
what's in the repository than the downloader does, so some repository
configuration that tells upload-pack to use such-and-such delta
window is also needed.  But as you say below:

>   3. Ideally, we could do some heuristic to see if deltification will
>      yield helpful results. In particular, we may already have a pack
>      with these commits in it (especially if we just repack before a
>      push). If we can re-use this information, it at least saves
>      deltifying twice (once to pack, once to push). In theory, I would
>      think the fact that we don't pass --no-reuse-delta to pack-objects
>      means that this would happen automatically, but it clearly doesn't.

The lack of --no-reuse-delta just means "if the object we are
going to send is a delta in the source, and its delta base is
also something we are going to send, then pretend that it is the
base delta for that object to skip computation".  What you want
here is "if the object we are going to send is not a delta in
the source, and there are a sufficient number of other objects the
object could have been deltified against, then it is very likely
that it was not worth deltifying when it was packed; so it is
probably not worth deltifying it now".
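
The two rules above can be put side by side in a small C sketch. It is
hypothetical throughout: struct src_object, plan_delta() and the "enough"
threshold are names invented for illustration and do not mirror
pack-objects' real data structures.

```c
/*
 * Hypothetical per-object summary; these fields are assumptions made
 * for illustration only.
 */
struct src_object {
	int stored_as_delta;	  /* object is a delta in the source pack */
	int base_is_sent;	  /* its delta base is also being sent */
	unsigned long candidates; /* objects it could have been deltified
				     against when it was packed */
};

enum delta_plan { REUSE_DELTA, SKIP_DELTA, TRY_DELTA };

enum delta_plan plan_delta(const struct src_object *o, unsigned long enough)
{
	/* What the lack of --no-reuse-delta already gives us. */
	if (o->stored_as_delta && o->base_is_sent)
		return REUSE_DELTA;

	/*
	 * The wanted heuristic: it had plenty of candidates when packed
	 * and still was not deltified, so do not bother now.
	 */
	if (!o->stored_as_delta && o->candidates >= enough)
		return SKIP_DELTA;

	/* Otherwise fall back to searching for a delta as usual. */
	return TRY_DELTA;
}
```

How large "enough" should be is left open here; the thread does not
suggest a number.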


* [RFC] send-pack: allow skipping delta when sending pack
From: Jeff King @ 2006-05-21  5:48 UTC (permalink / raw)
  To: git

I have a git repo where I keep relatively large files (digital photos).
I have a local repo and a "master" repo on a server which I access over
ssh.  Deltifying a bunch of large images takes a relatively long time. I
can live with this while packing (though it is slightly annoying to have
to pack separately on both repos, I understand why it might be hard to
reuse the deltification).

However, it is extremely annoying to add a large set of images and then
push them to the server. The server is on a 100Mbit LAN, so the
deltification part of the process takes up most of the time (and
typically ends up making no deltas, since the files are unrelated
images). The patch below makes send-pack pass --depth=0 to pack-objects
when the GIT_NODELTA environment variable is set to 1, preventing
deltification.

The result is much better performance in my case. However, the method
seems quite hack-ish, so I wanted to get comments on how this should be
done. Possibilities I considered:
  1. A command line option to git-send-pack. The problem with this is
     that support is required from git-push and cg-push to pass the
     option through.
  2. A repo config variable that says not to deltify on sending (or
     potentially, not to deltify at all, which makes packing in general
     much nicer -- however, I don't think this is a good idea, as I do
     still occasionally want deltification; it's just that it will
     mostly fail). This should probably be per-remote for the obvious reason
     that one might push to local and remote repos. One drawback is that
     sometimes deltification may be a win; it's just that I sometimes
     know that it won't be (because I added a bunch of unrelated large
     files). It's nice to selectively turn this option on for a given
     push.
  3. Ideally, we could do some heuristic to see if deltification will
     yield helpful results. In particular, we may already have a pack
     with these commits in it (especially if we just repack before a
     push). If we can re-use this information, it at least saves
     deltifying twice (once to pack, once to push). In theory, I would
     think the fact that we don't pass --no-reuse-delta to pack-objects
     means that this would happen automatically, but it clearly doesn't.

Comments?

---

f1cf653120dd492d1c86ee2a92a9c8221023cef1
 send-pack.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

f1cf653120dd492d1c86ee2a92a9c8221023cef1
diff --git a/send-pack.c b/send-pack.c
index 409f188..4ad6489 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -30,8 +30,14 @@ static void exec_pack_objects(void)
 	static const char *args[] = {
 		"pack-objects",
 		"--stdout",
+		NULL,
 		NULL
 	};
+	const char *nodelta;
+
+	nodelta = getenv("GIT_NODELTA");
+	if(nodelta && !strcmp(nodelta, "1"))
+		args[2] = "--depth=0";
 	execv_git_cmd(args);
 	die("git-pack-objects exec failed (%s)", strerror(errno));
 }
-- 
1.3.3.g288c-dirty


* Re: [PATCH 0/5] More ref logging
From: Sean @ 2006-05-21  5:09 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: junkio, git
In-Reply-To: <20060521045146.GA8269@spearce.org>

On Sun, 21 May 2006 00:51:46 -0400
Shawn Pearce <spearce@spearce.org> wrote:

> Any chance we could get some details on why so many tags were useful?
> You have a good argument here but I'm not sure how a better tag
> store could be structured.  :-)

It is a conversion from another SCM.  So every one of the ~16K commits
was tagged with the reference number taken from the original SCM.  This
has some very nice benefits in that you can refer to every commit
in git by the original changeset #.  For example, if someone reports a
bug mentioning the original scm's reference id, you can say something
like: "git show p4/1234" without having to go back to the old scm.

Also, qgit, gitk and gitweb display them nicely which can be helpful
during the conversion.  And if/when they're not needed any longer,
you just delete them without having to rewrite the history etc..   
  
> Yea - despite being the author of ref log I'm still slightly unhappy
> with the fact that it doesn't make reuse of existing GIT plumbing.
> But I'm sort of OK with that right now as you can't map two indexes
> into memory at once currently, nor is there a way to easily update
> multiple refs at once if the ref log must serialize access to create
> a string of trees and commits.

Well it's not the end of the world either way, and sometimes it's just
better to implement a workable solution rather than wait for one that's
theoretically cleaner.  It just seemed like it was worth mentioning in
case you saw a way to make it happen without a lot of grief.

Sean


* Re: [PATCH 0/5] More ref logging
From: Shawn Pearce @ 2006-05-21  4:51 UTC (permalink / raw)
  To: Sean; +Cc: junkio, git
In-Reply-To: <20060520224344.7ebca48b.seanlkml@sympatico.ca>

Sean <seanlkml@sympatico.ca> wrote:
> On Sat, 20 May 2006 20:50:09 -0400
> Shawn Pearce <spearce@spearce.org> wrote:
> 
> > It sort of is a new way of tagging commits with extra data.  But its
> > also sort of a way of versioning your ref `database'.  Using tags
> > to save the points in time might be useful but it would generate
> > a lot of temporary files.  A commit every 5 minutes for a typical
> > working week would generate 480 tags per week.  That's just too much.
> 
> But isn't that just an implementation detail?  I've actually run
> into another situation where tags would be perfect if only they weren't
> so expensive (ie. entire repo was in a 50Mb pack including tag objects,
> but the .git/refs/tags directory was over 100Mb).

Any chance we could get some details on why so many tags were useful?
You have a good argument here but I'm not sure how a better tag
store could be structured.  :-)
 
> So, if we found a way to store tags more efficiently your 480 tags per
> week shouldn't be a problem at all.   The main point being to extend
> and optimize the existing infrastructure rather than bolting on a new
> class of objects (ie. ref log) which only serves a narrow (albeit
> important) purpose.

Yea - despite being the author of ref log I'm still slightly unhappy
with the fact that it doesn't reuse existing GIT plumbing.
But I'm sort of OK with that right now as you can't map two indexes
into memory at once currently, nor is there a way to easily update
multiple refs at once if the ref log must serialize access to create
a string of trees and commits.

-- 
Shawn.


* [PATCH] git-svn: ignore expansion of svn:keywords
From: Eric Wong @ 2006-05-21  3:03 UTC (permalink / raw)
  To: Tommi Virtanen; +Cc: git
In-Reply-To: <446A0CCF.2060903@inoi.fi>

Can you try this patch?

This weakens an integrity check in order to work on certain
repositories (see comments).  It's probably safe to use,
though...  More testing and feedback would be nice.

I've split out the test for this feature to make things easier
to manage and test, too.

Also changed assert_svn_wc_clean() to only work on svn, and
require a separate call to assert_tree() to check wc integrity
against git in preparation for another change I'm planning.

Signed-off-by: Eric Wong <normalperson@yhbt.net>

---

 @Junio: please don't apply this to git.git just yet, thanks.

 contrib/git-svn/Makefile                         |    1 
 contrib/git-svn/git-svn.perl                     |   79 +++++++++++++++++-----
 contrib/git-svn/t/lib-git-svn.sh                 |   39 +++++++++++
 contrib/git-svn/t/t0000-contrib-git-svn.sh       |   43 +-----------
 contrib/git-svn/t/t0001-contrib-git-svn-props.sh |   51 ++++++++++++++
 5 files changed, 153 insertions(+), 60 deletions(-)
 create mode 100644 contrib/git-svn/t/lib-git-svn.sh
 create mode 100644 contrib/git-svn/t/t0001-contrib-git-svn-props.sh

eb8f17257c7d15ec6815baf18208af06f72b3cdb
diff --git a/contrib/git-svn/Makefile b/contrib/git-svn/Makefile
index acedf73..48f60b3 100644
--- a/contrib/git-svn/Makefile
+++ b/contrib/git-svn/Makefile
@@ -30,6 +30,7 @@ git-svn.html : git-svn.txt
 		-f ../../Documentation/asciidoc.conf $<
 test: git-svn
 	cd t && $(SHELL) ./t0000-contrib-git-svn.sh
+	cd t && $(SHELL) ./t0001-contrib-git-svn-props.sh
 
 clean:
 	rm -f git-svn *.xml *.html *.1
diff --git a/contrib/git-svn/git-svn.perl b/contrib/git-svn/git-svn.perl
index de13a96..86b687a 100755
--- a/contrib/git-svn/git-svn.perl
+++ b/contrib/git-svn/git-svn.perl
@@ -26,6 +26,7 @@ use Carp qw/croak/;
 use IO::File qw//;
 use File::Basename qw/dirname basename/;
 use File::Path qw/mkpath/;
+use File::Copy qw/cp/;
 use Getopt::Long qw/:config gnu_getopt no_ignore_case auto_abbrev/;
 use File::Spec qw//;
 use POSIX qw/strftime/;
@@ -207,7 +208,7 @@ sub rebuild {
 		push @svn_up, '--ignore-externals' unless $_no_ignore_ext;
 		sys(@svn_up,"-r$newest_rev");
 		$ENV{GIT_INDEX_FILE} = $GIT_SVN_INDEX;
-		git_addremove();
+		index_changes();
 		exec('git-write-tree');
 	}
 	waitpid $pid, 0;
@@ -249,7 +250,7 @@ sub fetch {
 		chdir $SVN_WC or croak $!;
 		read_uuid();
 		$last_commit = git_commit($base, @parents);
-		assert_svn_wc_clean($base->{revision}, $last_commit);
+		assert_tree($last_commit);
 	} else {
 		chdir $SVN_WC or croak $!;
 		read_uuid();
@@ -259,7 +260,11 @@ sub fetch {
 	push @svn_up, '--ignore-externals' unless $_no_ignore_ext;
 	my $last = $base;
 	while (my $log_msg = next_log_entry($svn_log)) {
-		assert_svn_wc_clean($last->{revision}, $last_commit);
+		# this assertion is commented out because it breaks keywords
+		# on https://svn.musicpd.org/Jamming/trunk (r166:167), but
+		# I can't seem to reproduce something like that on a test...
+		# assert_svn_wc_clean($last->{revision});
+		assert_tree($last_commit);
 		if ($last->{revision} >= $log_msg->{revision}) {
 			croak "Out of order: last >= current: ",
 				"$last->{revision} >= $log_msg->{revision}\n";
@@ -268,7 +273,8 @@ sub fetch {
 		$last_commit = git_commit($log_msg, $last_commit, @parents);
 		$last = $log_msg;
 	}
-	assert_svn_wc_clean($last->{revision}, $last_commit);
+	assert_svn_wc_clean($last->{revision});
+	assert_tree($last_commit);
 	unless (-e "$GIT_DIR/refs/heads/master") {
 		sys(qw(git-update-ref refs/heads/master),$last_commit);
 	}
@@ -314,7 +320,6 @@ sub commit {
 		$svn_current_rev = svn_commit_tree($svn_current_rev, $c);
 	}
 	print "Done committing ",scalar @revs," revisions to SVN\n";
-
 }
 
 sub show_ignore {
@@ -367,13 +372,11 @@ sub setup_git_svn {
 }
 
 sub assert_svn_wc_clean {
-	my ($svn_rev, $treeish) = @_;
+	my ($svn_rev) = @_;
 	croak "$svn_rev is not an integer!\n" unless ($svn_rev =~ /^\d+$/);
-	croak "$treeish is not a sha1!\n" unless ($treeish =~ /^$sha1$/o);
 	my $lcr = svn_info('.')->{'Last Changed Rev'};
 	if ($svn_rev != $lcr) {
 		print STDERR "Checking for copy-tree ... ";
-		# use
 		my @diff = grep(/^Index: /,(safe_qx(qw(svn diff),
 						"-r$lcr:$svn_rev")));
 		if (@diff) {
@@ -389,7 +392,6 @@ sub assert_svn_wc_clean {
 		print STDERR $_ foreach @status;
 		croak;
 	}
-	assert_tree($treeish);
 }
 
 sub assert_tree {
@@ -416,7 +418,7 @@ sub assert_tree {
 		unlink $tmpindex or croak $!;
 	}
 	$ENV{GIT_INDEX_FILE} = $tmpindex;
-	git_addremove();
+	index_changes(1);
 	chomp(my $tree = `git-write-tree`);
 	if ($old_index) {
 		$ENV{GIT_INDEX_FILE} = $old_index;
@@ -426,6 +428,7 @@ sub assert_tree {
 	if ($tree ne $expected) {
 		croak "Tree mismatch, Got: $tree, Expected: $expected\n";
 	}
+	unlink $tmpindex;
 }
 
 sub parse_diff_tree {
@@ -562,7 +565,8 @@ sub precommit_check {
 sub svn_checkout_tree {
 	my ($svn_rev, $treeish) = @_;
 	my $from = file_to_s("$REV_DIR/$svn_rev");
-	assert_svn_wc_clean($svn_rev,$from);
+	assert_svn_wc_clean($svn_rev);
+	assert_tree($from);
 	print "diff-tree $from $treeish\n";
 	my $pid = open my $diff_fh, '-|';
 	defined $pid or croak $!;
@@ -852,13 +856,50 @@ sub svn_info {
 
 sub sys { system(@_) == 0 or croak $? }
 
-sub git_addremove {
-	system( "git-diff-files --name-only -z ".
-				" | git-update-index --remove -z --stdin && ".
-		"git-ls-files -z --others ".
-			"'--exclude-from=$GIT_DIR/$GIT_SVN/info/exclude'".
-				" | git-update-index --add -z --stdin"
-		) == 0 or croak $?
+sub do_update_index {
+	my ($z_cmd, $cmd, $no_text_base) = @_;
+
+	my $z = open my $p, '-|';
+	defined $z or croak $!;
+	unless ($z) { exec @$z_cmd or croak $! }
+
+	my $pid = open my $ui, '|-';
+	defined $pid or croak $!;
+	unless ($pid) {
+		exec('git-update-index',"--$cmd",'-z','--stdin') or croak $!;
+	}
+	local $/ = "\0";
+	while (my $x = <$p>) {
+		chomp $x;
+		if (!$no_text_base && lstat $x && ! -l _) {
+			my $mode = -x _ ? 0755 : 0644;
+			my ($v,$d,$f) = File::Spec->splitpath($x);
+			my $tb = File::Spec->catfile($d, '.svn', 'tmp',
+						'text-base',"$f.svn-base");
+			$tb =~ s#^/##;
+			unless (-f $tb) {
+				$tb = File::Spec->catfile($d, '.svn',
+						'text-base',"$f.svn-base");
+				$tb =~ s#^/##;
+			}
+			unlink $x or croak $!;
+			cp($tb, $x) or croak $!;
+			chmod(($mode &~ umask), $x) or croak $!;
+		}
+		print $ui $x,"\0";
+	}
+	close $ui or croak $!;
+}
+
+sub index_changes {
+	my $no_text_base = shift;
+	do_update_index([qw/git-diff-files --name-only -z/],
+			'remove',
+			$no_text_base);
+	do_update_index([qw/git-ls-files -z --others/,
+			      "--exclude-from=$GIT_DIR/$GIT_SVN/info/exclude"],
+			'add',
+			$no_text_base);
 }
 
 sub s_to_file {
@@ -936,7 +977,7 @@ sub git_commit {
 	defined $pid or croak $!;
 	if ($pid == 0) {
 		$ENV{GIT_INDEX_FILE} = $GIT_SVN_INDEX;
-		git_addremove();
+		index_changes();
 		chomp(my $tree = `git-write-tree`);
 		croak if $?;
 		if (exists $tree_map{$tree}) {
diff --git a/contrib/git-svn/t/lib-git-svn.sh b/contrib/git-svn/t/lib-git-svn.sh
new file mode 100644
index 0000000..a98e9d1
--- /dev/null
+++ b/contrib/git-svn/t/lib-git-svn.sh
@@ -0,0 +1,39 @@
+PATH=$PWD/../:$PATH
+if test -d ../../../t
+then
+    cd ../../../t
+else
+    echo "Must be run in contrib/git-svn/t" >&2
+    exit 1
+fi
+
+. ./test-lib.sh
+
+GIT_DIR=$PWD/.git
+GIT_SVN_DIR=$GIT_DIR/git-svn
+SVN_TREE=$GIT_SVN_DIR/tree
+
+svnadmin >/dev/null 2>&1
+if test $? != 1
+then
+    test_expect_success 'skipping contrib/git-svn test' :
+    test_done
+    exit
+fi
+
+svn >/dev/null 2>&1
+if test $? != 1
+then
+    test_expect_success 'skipping contrib/git-svn test' :
+    test_done
+    exit
+fi
+
+svnrepo=$PWD/svnrepo
+
+set -e
+
+svnadmin create $svnrepo
+svnrepo="file://$svnrepo/test-git-svn"
+
+
diff --git a/contrib/git-svn/t/t0000-contrib-git-svn.sh b/contrib/git-svn/t/t0000-contrib-git-svn.sh
index 80ad357..8b3a0d9 100644
--- a/contrib/git-svn/t/t0000-contrib-git-svn.sh
+++ b/contrib/git-svn/t/t0000-contrib-git-svn.sh
@@ -3,48 +3,10 @@ #
 # Copyright (c) 2006 Eric Wong
 #
 
-
-PATH=$PWD/../:$PATH
 test_description='git-svn tests'
-if test -d ../../../t
-then
-    cd ../../../t
-else
-    echo "Must be run in contrib/git-svn/t" >&2
-    exit 1
-fi
-
-. ./test-lib.sh
-
-GIT_DIR=$PWD/.git
-GIT_SVN_DIR=$GIT_DIR/git-svn
-SVN_TREE=$GIT_SVN_DIR/tree
-
-svnadmin >/dev/null 2>&1
-if test $? != 1
-then
-    test_expect_success 'skipping contrib/git-svn test' :
-    test_done
-    exit
-fi
-
-svn >/dev/null 2>&1
-if test $? != 1
-then
-    test_expect_success 'skipping contrib/git-svn test' :
-    test_done
-    exit
-fi
-
-svnrepo=$PWD/svnrepo
-
-set -e
-
-svnadmin create $svnrepo
-svnrepo="file://$svnrepo/test-git-svn"
+. ./lib-git-svn.sh
 
 mkdir import
-
 cd import
 
 echo foo > foo
@@ -55,10 +17,9 @@ mkdir -p bar
 echo 'zzz' > bar/zzz
 echo '#!/bin/sh' > exec.sh
 chmod +x exec.sh
-svn import -m 'import for git-svn' . $svnrepo >/dev/null
+svn import -m 'import for git-svn' . "$svnrepo" >/dev/null
 
 cd ..
-
 rm -rf import
 
 test_expect_success \
diff --git a/contrib/git-svn/t/t0001-contrib-git-svn-props.sh b/contrib/git-svn/t/t0001-contrib-git-svn-props.sh
new file mode 100644
index 0000000..20c5c4e
--- /dev/null
+++ b/contrib/git-svn/t/t0001-contrib-git-svn-props.sh
@@ -0,0 +1,51 @@
+#!/bin/sh
+#
+# Copyright (c) 2006 Eric Wong
+#
+
+test_description='git-svn property tests'
+. ./lib-git-svn.sh
+
+mkdir import
+
+cd import
+	cat >> kw.c <<''
+/* Make it look like somebody copied a file from CVS into SVN: */
+/* $Id: kw.c,v 1.1.1.1 1994/03/06 00:00:00 eric Exp $ */
+
+	svn import -m 'import for git-svn' . "$svnrepo" >/dev/null
+cd ..
+
+rm -rf import
+svn co "$svnrepo" test_wc
+
+cd test_wc
+	echo 'Greetings' >> kw.c
+	svn commit -m 'Not yet an $Id$'
+	svn up
+
+	echo 'Hello world' >> kw.c
+	svn commit -m 'Modified file, but still not yet an $Id$'
+	svn up
+
+	svn propset svn:keywords Id kw.c
+	svn commit -m 'Propset $Id$'
+	svn up
+cd ..
+
+git-svn init "$svnrepo"
+git-svn fetch
+
+git checkout -b mybranch remotes/git-svn
+echo 'Hi again' >> kw.c
+name='test svn:keywords ignoring'
+
+git commit -a -m "$name"
+git-svn commit remotes/git-svn..mybranch
+git pull . remotes/git-svn
+
+expect='/* $Id$ */'
+got="`sed -ne 2p kw.c`"
+test_expect_success 'raw $Id$ found in kw.c' "test '$expect' = '$got'"
+
+test_done
-- 
1.3.2.g7d11


* Re: [PATCH 0/5] More ref logging
From: Sean @ 2006-05-21  2:43 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: junkio, git
In-Reply-To: <20060521005009.GA7179@spearce.org>

On Sat, 20 May 2006 20:50:09 -0400
Shawn Pearce <spearce@spearce.org> wrote:

> It sort of is a new way of tagging commits with extra data.  But its
> also sort of a way of versioning your ref `database'.  Using tags
> to save the points in time might be useful but it would generate
> a lot of temporary files.  A commit every 5 minutes for a typical
> working week would generate 480 tags per week.  That's just too much.

But isn't that just an implementation detail?  I've actually run
into another situation where tags would be perfect if only they weren't
so expensive (ie. entire repo was in a 50Mb pack including tag objects,
but the .git/refs/tags directory was over 100Mb).

So, if we found a way to store tags more efficiently your 480 tags per
week shouldn't be a problem at all.   The main point being to extend
and optimize the existing infrastructure rather than bolting on a new
class of objects (ie. ref log) which only serves a narrow (albeit
important) purpose.
 
> I was actually thinking this morning that another way to do this
> is to keep a metadata branch within the repository which records
> all of the refs in tree objects, then save the root commit under
> the special ref `LOG` in GIT_DIR.  Every update to a logged ref
> would cause the tree to be updated and a new commit to be built.
> The branch would be a relatively simple string of pearls as its
> doubtful you would branch it.
>
> There are a number of downsides to this, not the least of which is
> I'd like to put a commit or tag SHA1 into the tree object rather than
> writing each ref as a blob (saves space).  Currently commits and tags
> aren't permitted in a tree object so that would require some effort.
> But on the other hand you could pull (and track!) someone elses
> ref log through the standard GIT protocol.
 
Yes, Linus proposed something similar earlier to hold meta data.
But I've come to see tags as a place to store any arbitrary meta
data associated with a commit.   If their implementation were more
efficient you could use them for your project and they could be used
for any number of other purposes as well.

> But this is starting to head down into the `bind commit` discussion;
> how do we record a number of commits as being related and tie them
> up into a single super commit?

Well, a tag that allowed the listing of multiple heads....

Sean


* Re: [PATCH] Make '@' not valid in a ref name.
From: Shawn Pearce @ 2006-05-21  2:19 UTC (permalink / raw)
  To: Eric Wong; +Cc: Junio Hamano, git, Martin Langhoff
In-Reply-To: <20060521020038.GA22926@hand.yhbt.net>

Eric Wong <normalperson@hand.yhbt.net> wrote:
> Shawn Pearce <spearce@spearce.org> wrote:
> > Now that the sha1 expression syntax supports looking up a ref's
> > value at a prior point in time through the '@' operator the '@'
> > operator should not be permitted in a ref name.
> 
> This would break git-archimport (where email addresses are the first
> part of the branch/tag names).

OK, so this patch is quite unpopular and should never make it
into GIT.  I'm glad we have many eyes on this mailing list!


There was just a short conversation on #git about converting
the sha1 expression evaluator into a split parser/interpreter
model.  The idea here would be to convert an expression such as

  HEAD@{yesterday}~3^{tree}

into an expression tree such as (in LISP style):

  (peel-onion (walk-back 3 (date-spec yesterday (ref HEAD))))

With such a tree it is relatively easy to evaluate the expression,
but it's also easy to determine if a ref name is valid.  Just pass
it through the parser and see if you get back anything more complex
than '(ref <input>)'.

Comments?

-- 
Shawn.


* [PATCH] Elaborate on why ':' is a bad idea in a ref name.
From: Shawn Pearce @ 2006-05-21  2:03 UTC (permalink / raw)
  To: junio; +Cc: git

With the new cat-file syntax of 'v1.3.3:refs.c' we should mention
it as part of the reason why ':' is not permitted in a ref name.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

---

0b552e48ed1d1ce01e0c2850e90caad8150c199c
 Documentation/git-check-ref-format.txt |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

0b552e48ed1d1ce01e0c2850e90caad8150c199c
diff --git a/Documentation/git-check-ref-format.txt b/Documentation/git-check-ref-format.txt
index 7dc1bdb..3ea720d 100644
--- a/Documentation/git-check-ref-format.txt
+++ b/Documentation/git-check-ref-format.txt
@@ -45,6 +45,8 @@ refname expressions (see gitlink:git-rev
 
 . colon `:` is used as in `srcref:dstref` to mean "use srcref\'s
   value and store it in dstref" in fetch and push operations.
+  It may also be used to select a specific object such as with
+  gitlink:git-cat-file[1] "git-cat-file blob v1.3.3:refs.c".
 
 
 GIT
-- 
1.3.3.gfad60


* Re: [PATCH] Make '@' not valid in a ref name.
From: Eric Wong @ 2006-05-21  2:00 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Junio Hamano, git, Martin Langhoff
In-Reply-To: <20060521013751.GA7516@spearce.org>

Shawn Pearce <spearce@spearce.org> wrote:
> Now that the sha1 expression syntax supports looking up a ref's
> value at a prior point in time through the '@' operator the '@'
> operator should not be permitted in a ref name.

This would break git-archimport (where email addresses are the first
part of the branch/tag names).

-- 
Eric Wong


* Re: [PATCH] Make '@' not valid in a ref name.
From: Shawn Pearce @ 2006-05-21  1:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vodxsywzk.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano <junkio@cox.net> wrote:
> I am not a fan of retroactively disallowing what we used to
> allow.  Is this unavoidable?
> 

We're talking about it on #git right now.  Someone actually uses
refs like 'user@host/foo' and thus doesn't like this patch either.

We were talking about disallowing '@{' instead.  Really it's just
'@{<some run that smells like a date>}' at the end of the ref which
would want to be disallowed; similar to how ~ and ^ really only
need to be disallowed near the end.

The date parser grabs '@{' not '@' so 'user@host/foo@{yesterday}'
makes sense to it.  But 'user@{host}/foo@{yesterday}' is going
to cause problems as the date parser will attempt to evaluate
'host}/foo@{yesterday'.  :-(
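
The failure mode can be reproduced with a short C sketch. This is not the
actual date parser: it assumes, for illustration only, that the spec is
whatever sits between the first '@{' and a trailing '}'. Both function
names are hypothetical, and the second one (scanning for the last '@{')
is only one possible way around the problem, not something settled in
this thread.

```c
#include <string.h>

/*
 * Hypothetical: take everything between the FIRST "@{" and a trailing
 * '}' as the date spec.  Returns the spec length, or -1 if the name has
 * no "@{...}" suffix; *spec points into name.
 */
int first_at_brace_spec(const char *name, const char **spec)
{
	size_t len = strlen(name);
	const char *at = strstr(name, "@{");

	if (!at || name[len - 1] != '}')
		return -1;
	*spec = at + 2;
	return (int)(name + len - 1 - *spec);
}

/* Hypothetical variant: use the LAST "@{" before the trailing '}'. */
int last_at_brace_spec(const char *name, const char **spec)
{
	size_t len = strlen(name);
	const char *at = NULL, *p = name;

	if (len == 0 || name[len - 1] != '}')
		return -1;
	while ((p = strstr(p, "@{")) != NULL) {
		at = p;
		p += 2;
	}
	if (!at)
		return -1;
	*spec = at + 2;
	return (int)(name + len - 1 - *spec);
}
```

With 'user@host/foo@{yesterday}' both functions pick out 'yesterday'.
With 'user@{host}/foo@{yesterday}' the first one hands back
'host}/foo@{yesterday', which is the problem described above, while the
second still picks out 'yesterday'.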

-- 
Shawn.


* [PATCH] Reference git-check-ref-format in git-branch.
From: Shawn Pearce @ 2006-05-21  1:54 UTC (permalink / raw)
  To: Junio Hamano; +Cc: git

It's nice to have git-check-ref-format actually get mentioned in
git-branch's documentation, as the syntax of a ref name must conform
to what is described in git-check-ref-format.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

---
 Sorry about this patch being built on pu.  It clearly has no
 relationship to current pu, but the new -l appears in the hunk
 below...

1e2080dcf2f8e76e0fcf48684e5c6b182f695e0a
 Documentation/git-branch.txt   |    3 +++
 Documentation/git-checkout.txt |    5 ++++-
 2 files changed, 7 insertions(+), 1 deletions(-)

1e2080dcf2f8e76e0fcf48684e5c6b182f695e0a
diff --git a/Documentation/git-branch.txt b/Documentation/git-branch.txt
index a7bec3c..d43ef1d 100644
--- a/Documentation/git-branch.txt
+++ b/Documentation/git-branch.txt
@@ -49,6 +49,9 @@ OPTIONS
 
 <branchname>::
 	The name of the branch to create or delete.
+	The new branch name must pass all checks defined by
+	gitlink:git-check-ref-format[1].  Some of these checks
+	may restrict the characters allowed in a branch name.
 
 <start-point>::
 	The new branch will be created with a HEAD equal to this.  It may
diff --git a/Documentation/git-checkout.txt b/Documentation/git-checkout.txt
index 0643943..fbdbadc 100644
--- a/Documentation/git-checkout.txt
+++ b/Documentation/git-checkout.txt
@@ -35,7 +35,10 @@ OPTIONS
 	Force a re-read of everything.
 
 -b::
-	Create a new branch and start it at <branch>.
+	Create a new branch named <new_branch> and start it at
+	<branch>.  The new branch name must pass all checks defined
+	by gitlink:git-check-ref-format[1].  Some of these checks
+	may restrict the characters allowed in a branch name.
 
 -l::
 	Create the new branch's ref log.  This activates recording of
-- 
1.3.3.gfad60


* Re: [PATCH] Make '@' not valid in a ref name.
From: Junio C Hamano @ 2006-05-21  1:42 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: git
In-Reply-To: <20060521013751.GA7516@spearce.org>

I am not a fan of retroactively disallowing what we used to
allow.  Is this unavoidable?


* [PATCH] Make '@' not valid in a ref name.
From: Shawn Pearce @ 2006-05-21  1:37 UTC (permalink / raw)
  To: Junio Hamano; +Cc: git

Now that the sha1 expression syntax supports looking up a ref's
value at a prior point in time through the '@' operator the '@'
operator should not be permitted in a ref name.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

---

90d3212d5351d2f6c6ad33578c9f9df2e07af12e
 refs.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

90d3212d5351d2f6c6ad33578c9f9df2e07af12e
diff --git a/refs.c b/refs.c
index eeb1196..2530c99 100644
--- a/refs.c
+++ b/refs.c
@@ -213,14 +213,14 @@ int get_ref_sha1(const char *ref, unsign
  *
  * - any path component of it begins with ".", or
  * - it has double dots "..", or
- * - it has ASCII control character, "~", "^", ":" or SP, anywhere, or
+ * - it has ASCII control character, "@", "~", "^", ":" or SP,
  * - it ends with a "/".
  */
 
 static inline int bad_ref_char(int ch)
 {
 	return (((unsigned) ch) <= ' ' ||
-		ch == '~' || ch == '^' || ch == ':' ||
+		ch == '@' || ch == '~' || ch == '^' || ch == ':' ||
 		/* 2.13 Pattern Matching Notation */
 		ch == '?' || ch == '*' || ch == '[');
 }
-- 
1.3.3.gfad60
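
The rules listed in the comment above bad_ref_char() can be sketched as
a single pass over the name.  This is only an illustration under stated
assumptions, not the code in refs.c: sketch_check_ref_name() is a
hypothetical helper, and it covers only the four rules quoted in the
patch (leading ".", "..", a bad character, trailing "/").

```c
/* Mirrors the patched bad_ref_char() from the diff above. */
static int bad_ref_char(int ch)
{
	return (((unsigned) ch) <= ' ' ||
		ch == '@' || ch == '~' || ch == '^' || ch == ':' ||
		/* glob characters */
		ch == '?' || ch == '*' || ch == '[');
}

/*
 * Hypothetical helper (not part of git): returns 0 if the name passes
 * the rules quoted in the comment, -1 otherwise.
 */
static int sketch_check_ref_name(const char *name)
{
	const char *p;
	int at_component_start = 1;

	if (!*name)
		return -1;		/* an empty name is never valid */
	for (p = name; *p; p++) {
		if (bad_ref_char((unsigned char) *p))
			return -1;	/* control char, SP, "@", "~", ... */
		if (at_component_start && *p == '.')
			return -1;	/* a path component begins with "." */
		if (p[0] == '.' && p[1] == '.')
			return -1;	/* double dots ".." */
		at_component_start = (*p == '/');
	}
	if (p[-1] == '/')
		return -1;		/* name ends with "/" */
	return 0;
}
```

With the '@' check in place, a name such as "master@old" is rejected,
while "refs/heads/master" still passes.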

^ permalink raw reply related

* Re: [PATCH] Implement git-quiltimport (take 2)
From: Eric W. Biederman @ 2006-05-21  1:16 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7v4pzk196p.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano <junkio@cox.net> writes:

> ebiederm@xmission.com (Eric W. Biederman) writes:
>
>> Where git-mailinfo is likely to fall down is more in the quilt
>> patches from Andi Kleen. If you look at my quoted patch header below
>> you will see the subject is a plain line, followed by a space followed
>> by a from.  On this example git-mailinfo works (except for picking up
>> the subject) but it appears to be a fluke.
>>
>> From x86_64-mm-add-abilty-to-enable-disable-nmi-watchdog-from-sysfs.patch:
>>
>
> Yeah, that's right, but in a real mailbox wouldn't that line be
> prefixed with a '>' ;-)?

That last from line was my attribution.  The first quoted line
was the first line of the patch.  In this context that was probably
a little confusing.

Eric

^ permalink raw reply

* Re: irc usage..
From: Donnie Berkholz @ 2006-05-21  1:14 UTC (permalink / raw)
  To: Donnie Berkholz; +Cc: Yann Dirson, Linus Torvalds, Git Mailing List
In-Reply-To: <446F95A2.6040909@gentoo.org>

Donnie Berkholz wrote:
> Somebody else tried importing it with git-cvsimport, but he said he hit
> some kind of problem and recalled that it was a cvsps segfault. Sounds
> about right, since I've never gotten cvsps to run successfully on the
> whole repo either.

Much to my surprise, a cvsps run I started earlier has just finished
without segfaulting. But attempts to actually run cvsps (e.g., cvsps -a
spyderous) spit thousands of warnings of "WARNING: revision 1.1.1.1 of
file $FILENAME on unnamed branch".

Thanks,
Donnie


^ permalink raw reply

* Re: git-svn vs. $Id$
From: Eric Wong @ 2006-05-21  1:12 UTC (permalink / raw)
  To: Tommi Virtanen; +Cc: Linus Torvalds, git
In-Reply-To: <446A15F8.3040106@inoi.fi>

Tommi Virtanen <tv@inoi.fi> wrote:
> Linus Torvalds wrote:
> > Isn't there some flag to svn to avoid keyword expansion, like "-ko" to 
> > CVS?
> > 
> > Any import script definitely should avoid keyword expansion (and that's 
> > true whether you end up wanting to use keywords or not).
> 
> Well, yes, I agree. But, at least git-svn.txt says this:
> 
> BUGS
> ----
> ...
> svn:keywords can't be ignored in Subversion (at least I don't know of
> a way to ignore them).
> 
> I guess one might be able to reach that information through the svn API.
> 
> Or just propget svn:keywords and sed s/\$Id\(:[^$]*\)\$/$Id$/ all files
> with keywords, for all relevant keywords. Eww.

I'm working on a solution to this (using files in .svn/text-base).

Keyword expansion behavior seems inconsistent on some SVN repos, and I
can't reproduce it on my local repositories, so I think I will have to
weaken some integrity checks[1] in git-svn to work around it...

1 - I don't think these integrity checks were ever tripped in the first
place.

-- 
Eric Wong

^ permalink raw reply

* Re: [PATCH] Implement git-quiltimport (take 2)
From: Junio C Hamano @ 2006-05-21  1:02 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: git
In-Reply-To: <m13bf4qjjv.fsf@ebiederm.dsl.xmission.com>

ebiederm@xmission.com (Eric W. Biederman) writes:

> Where git-mailinfo is likely to fall down is more in the quilt
> patches from Andi Kleen. If you look at my quoted patch header below
> you will see the subject is a plain line, followed by a space followed
> by a from.  On this example git-mailinfo works (except for picking up
> the subject) but it appears to be a fluke.
>
> From x86_64-mm-add-abilty-to-enable-disable-nmi-watchdog-from-sysfs.patch:
>

Yeah, that's right, but in a real mailbox wouldn't that line be
prefixed with a '>' ;-)?

^ permalink raw reply

* Re: [RFD] Git glossary: 'branch' and 'head' description
From: Shawn Pearce @ 2006-05-21  1:01 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Daniel Barkalow, Junio C Hamano, git
In-Reply-To: <Pine.LNX.4.64.0605191853570.10823@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> wrote:
> 
> On Fri, 19 May 2006, Daniel Barkalow wrote:
> > 
> > I guess I did forget that it sticks around. So you have to be doing 
> > something somewhat more complicated, like fetching the latest versions of 
> > multiple topic branches.
> 
> I actually don't think it's at all unlikely that I'd start using this.

Hey, if it's useful.  :-) If it's not then I'm doing something
wrong here...
 
[snip]
> I'm not entirely sure about the syntax, though. It ends up being pretty 
> command-line-unfriendly. The "gitk ORIG_HEAD.." thing is fairly easy to 
> type, but typing
> 
> 	gitk 'master@{2 hours ago}'..
> 
> on a Finnish keyboard (yeah, that's what I still use) is "interesting", 
> since all of '@', '{' and '}' are complex characters (AltGr + '2', AltGr + 
> '7' and AltGr + '0' respectively), and you have to remember the quoting.

Wow.  So what you are saying is writing any sort of C code must be
rather painful.  :-)

I received a suggestion of using ' (single quote) rather than {
as the quoting character.  I didn't make the quoting character
optional as I realized users were likely to forget they needed it
on date specs which contain ':', so I just made it required to
keep things consistent at all times.  Further, {} won out over ''
as {} is also used with the ^ operator (e.g. v1.3.3^{tree}).

> Not that I see any obvious better syntax. Although allowing a shorthand 
> like "@2.hours.ago" for "current branch, at given date" might help a 
> bit, at least that wouldn't need quoting:
> 
> 	gitk @2.hours.ago..

The empty prefix for `HEAD` is simple.  The '.' part would need to
be fixed in approxidate() (and thus --since would also benefit).
Omitting the {} might be OK but see above...
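
The required-braces form discussed above can be illustrated with a small
splitter.  This is a sketch under stated assumptions, not the parser in
git: split_at_spec() is a hypothetical name, it only accepts a single
"ref@{spec}" with nothing after the closing brace, and it treats an
empty prefix as HEAD as suggested in the quoted text.

```c
#include <string.h>

/*
 * Hypothetical helper (not part of git).  Splits "ref@{spec}" into its
 * ref name and the text between the braces.
 * Returns 1 if a spec was found, 0 if expr has no "@{", and -1 if the
 * braces are malformed or an output buffer is too small.
 */
static int split_at_spec(const char *expr,
			 char *ref, size_t ref_size,
			 char *spec, size_t spec_size)
{
	const char *at = strstr(expr, "@{");
	const char *name = expr;
	size_t name_len, spec_len;

	if (!at) {
		/* no time spec: the whole expression is the ref name */
		name_len = strlen(expr);
		spec_len = 0;
	} else {
		const char *close = strchr(at + 2, '}');

		if (!close || close[1] != '\0')
			return -1;	/* missing "}" or trailing text */
		name_len = (size_t)(at - expr);
		spec_len = (size_t)(close - (at + 2));
		if (name_len == 0) {
			/* empty prefix means the current branch */
			name = "HEAD";
			name_len = 4;
		}
	}
	if (name_len >= ref_size || spec_len >= spec_size)
		return -1;
	memcpy(ref, name, name_len);
	ref[name_len] = '\0';
	if (at)
		memcpy(spec, at + 2, spec_len);
	spec[spec_len] = '\0';
	return at ? 1 : 0;
}
```

Because the braces delimit the spec, a date containing ':' or spaces
needs no extra rule here; the caller would hand the spec string to the
date parser as-is.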

-- 
Shawn.

^ permalink raw reply

* Re: [PATCH] Implement git-quiltimport (take 2)
From: Eric W. Biederman @ 2006-05-21  0:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7v8xow1a6r.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano <junkio@cox.net> writes:

> ebiederm@xmission.com (Eric W. Biederman) writes:
>
>> Currently git-mailinfo distinguishes headers and non-headers by the
>> presence of the first blank line.  So it seems to work just fine on
>> mbox format patches.
>
> The program was designed to be fed one e-mail at a time (the
> intended way for it to work is that a wrapper script uses
> git-mailsplit to break mbox up and call git-mailinfo one by
> one).

In this case what is meant is a leading "From " header (no colon)
at the start of the patch.

Where git-mailinfo is likely to fall down is more in the quilt
patches from Andi Kleen. If you look at my quoted patch header below
you will see the subject is a plain line, followed by a space followed
by a from.  On this example git-mailinfo works (except for picking up
the subject) but it appears to be a fluke.

>From x86_64-mm-add-abilty-to-enable-disable-nmi-watchdog-from-sysfs.patch:

> Add abilty to enable/disable nmi watchdog with sysctl
> 
> From: dzickus <dzickus@redhat.com>
> 
> Adds a new /proc/sys/kernel/nmi call that will enable/disable the nmi
> watchdog.
> 
> Signed-off-by:  Don Zickus <dzickus@redhat.com>
> Signed-off-by: Andi Kleen <ak@suse.de>
> 
> ---
>  arch/i386/kernel/nmi.c   |   52 +++++++++++++++++++++++++++++++++++++++++++++++
>  arch/x86_64/kernel/nmi.c |   48 +++++++++++++++++++++++++++++++++++++++++++
>  include/asm-i386/nmi.h   |    1
>  include/asm-x86_64/nmi.h |    1
>  include/linux/sysctl.h   |    1
>  kernel/sysctl.c          |   11 +++++++++
>  6 files changed, 114 insertions(+)
> 
> Index: linux/arch/i386/kernel/nmi.c


Eric

^ permalink raw reply

* Re: gitk highlight feature
From: Paul Mackerras @ 2006-05-21  0:40 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0605200934240.10823@g5.osdl.org>

Linus Torvalds writes:

> I think the "Find" field should highlight things too. Right now there's no 
> way to get highlighting for somebody having signed-off on a patch, for 
> example, even though you can _search_ for it.

Yes, I think the way to go is to extend the highlight feature to be
able to do everything that the "Find" function can do, and then I
think the "Find" function won't be needed any more.

> Also, right now it says "Author/committer", but it actually only triggers 
> on author. Which may be the right thing to do (it's certainly what I'd 
> normally want to see), but it doesn't match the description. 

If the author matches, it highlights both the headline and the
author.  If the committer matches, it highlights only the headline (as
it does for any other highlighting).  Try it with "torvalds" in the
author/committer field on commit a54c9d30 (compared to cb46c370,
say).  If that's confusing I can change it of course.

Paul.

^ permalink raw reply

* Re: [PATCH 0/5] More ref logging
From: Shawn Pearce @ 2006-05-21  0:50 UTC (permalink / raw)
  To: Sean; +Cc: junkio, git
In-Reply-To: <20060519071603.11d3be5d.seanlkml@sympatico.ca>

Sean <seanlkml@sympatico.ca> wrote:
> On Fri, 19 May 2006 05:14:56 -0400
> Shawn Pearce <spearce@spearce.org> wrote:
> 
> > * [PATCH 5/5] Enable ref log creation in git checkout -b.
> > 
> > 	Fix git checkout -b to behave like git branch.
> 
> It seems that the ref log is a new way of tagging commits with some
> extra meta data.  Conceptually this seems very close to what git tags 
> already do.  So... what about using regular git tags rather than
> creating a ref log?  All the regular git-rev-list tools could be
> used to query the tags and prune would delete them automatically etc.

It sort of is a new way of tagging commits with extra data.  But it's
also sort of a way of versioning your ref `database'.  Using tags
to save the points in time might be useful but it would generate
a lot of temporary files.  A commit every 5 minutes for a typical
working week (12 per hour over 40 hours) would generate 480 tags per
week.  That's just too much.

I was actually thinking this morning that another way to do this
is to keep a metadata branch within the repository which records
all of the refs in tree objects, then save the root commit under
the special ref `LOG` in GIT_DIR.  Every update to a logged ref
would cause the tree to be updated and a new commit to be built.
The branch would be a relatively simple string of pearls as it's
doubtful you would branch it.

There are a number of downsides to this, not the least of which is
I'd like to put a commit or tag SHA1 into the tree object rather than
writing each ref as a blob (saves space).  Currently commits and tags
aren't permitted in a tree object so that would require some effort.
But on the other hand you could pull (and track!) someone else's
ref log through the standard GIT protocol.

But this is starting to head down into the `bind commit` discussion;
how do we record a number of commits as being related and tie them
up into a single super commit?

-- 
Shawn.

^ permalink raw reply

* Re: [PATCH] Implement git-quiltimport (take 2)
From: Junio C Hamano @ 2006-05-21  0:41 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: git
In-Reply-To: <m1fyj4qkm2.fsf@ebiederm.dsl.xmission.com>

ebiederm@xmission.com (Eric W. Biederman) writes:

> Currently git-mailinfo distinguishes headers and non-headers by the
> presence of the first blank line.  So it seems to work just fine on
> mbox format patches.

The program was designed to be fed one e-mail at a time (the
intended way for it to work is that a wrapper script uses
git-mailsplit to break mbox up and call git-mailinfo one by
one).

^ permalink raw reply

* Re: [PATCH] Implement git-quiltimport (take 2)
From: Eric W. Biederman @ 2006-05-21  0:36 UTC (permalink / raw)
  To: Greg KH; +Cc: Junio C Hamano, git
In-Reply-To: <20060520213257.GH24672@kroah.com>

Greg KH <greg@kroah.com> writes:

> On Fri, May 19, 2006 at 08:42:38PM -0600, Eric W. Biederman wrote:
>
>> If it is one patch per file but with mbox headers, it is relatively
>> simple to teach git-mailinfo to parse things in a slightly more intelligent
>> way.  I played with that but I didn't have any patches that helped with.
>
> Hm, I'll try playing with that.
>
> If you want, just grab my quilt trees from kernel.org and play with
> them, they should all be in mbox format for the individual patches (with
> some exceptions as noted above, just kick me about them to get me to fix
> them...)

So I just grabbed the gregkh-2.6 set of patches and with an unmodified
git-mailinfo I only have problems with the following patches:
	gregkh/gkh-version.patch
	gregkh/sysfs-test.patch
	gregkh/gregkh-usb-minors.patch
	gregkh/gregkh-debugfs_example.patch
	gregkh/gpl_future-test.patch
	usb/usb-gotemp.patch

None of which actually have From headers.

Currently git-mailinfo distinguishes headers and non-headers by the
presence of the first blank line.  So it seems to work just fine on
mbox format patches.
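
The blank-line rule described above can be sketched as follows.  This is
only an illustration, not git-mailinfo's code: find_body() is a
hypothetical helper, and it assumes LF line endings and a
NUL-terminated message held in memory.

```c
#include <string.h>

/*
 * Hypothetical helper (not part of git).  Returns a pointer to the
 * first character after the first empty line, i.e. the start of the
 * body, or NULL if the message contains no empty line at all.
 */
static const char *find_body(const char *msg)
{
	const char *line = msg;

	while (*line) {
		const char *eol = strchr(line, '\n');

		if (!eol)
			return NULL;	/* unterminated last line, no blank seen */
		if (eol == line)
			return eol + 1;	/* empty line: body starts after it */
		line = eol + 1;
	}
	return NULL;
}
```

Under this rule everything before the first empty line is treated as
header text, whether or not it looks like a header, which would be
consistent with patches lacking a From line being misread.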

Eric

^ permalink raw reply

