Git development
 help / color / mirror / Atom feed
* [PATCH] builtin-grep: wildcard pathspec fixes
From: Junio C Hamano @ 2006-05-01 19:30 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0605010842130.21189@g5.osdl.org>

This tweaks the pathspec wildcard used in builtin-grep to match
that of ls-files.  With this:

	git grep -e DEBUG -- '*/Kconfig*'

would work like the shell script version, and you could even do:

	git grep -e DEBUG --cached -- '*/Kconfig*' ;# from index
	git grep -e DEBUG v2.6.12 -- '*/Kconfig*' ;# from rev

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

 * Still haven't improved the "-e" issue (and to a lesser extent
   I think requiring -- is not right in this context either), but

 builtin-grep.c |   85 +++++++++++++++++++++++++++++++++++++++++---------------
 1 files changed, 62 insertions(+), 23 deletions(-)

diff --git a/builtin-grep.c b/builtin-grep.c
index 36150bf..653b65e 100644
--- a/builtin-grep.c
+++ b/builtin-grep.c
@@ -12,33 +12,66 @@ #include "diff.h"
 #include "revision.h"
 #include "builtin.h"
 #include <regex.h>
+#include <fnmatch.h>
 
+/*
+ * git grep pathspecs are somewhat different from diff-tree pathspecs;
+ * pathname wildcards are allowed.
+ */
 static int pathspec_matches(struct diff_options *opt, const char *name)
 {
-	int i, j;
-	int namelen;
+	int namelen, i;
 	if (!opt->nr_paths)
 		return 1;
 	namelen = strlen(name);
 	for (i = 0; i < opt->nr_paths; i++) {
 		const char *match = opt->paths[i];
 		int matchlen = opt->pathlens[i];
-		if (matchlen <= namelen) {
-			if (!strncmp(name, match, matchlen))
-				return 1;
+		const char *slash, *cp;
+
+		if ((matchlen <= namelen) &&
+		    !strncmp(name, match, matchlen) &&
+		    (match[matchlen-1] == '/' ||
+		     name[matchlen] == '\0' || name[matchlen] == '/'))
+			return 1;
+		if (!fnmatch(match, name, 0))
+			return 1;
+		if (name[namelen-1] != '/')
 			continue;
-		}
-		/* If name is "Documentation" and pathspec is
-		 * "Documentation/", they should match.  Maybe
-		 * we would want to strip it in get_pathspec()???
+
+		/* We are being asked if the name directory is worth
+		 * descending into.
+		 *
+		 * Find the longest leading directory name that does
+		 * not have metacharacter in the pathspec; the name
+		 * we are looking at must overlap with that directory.
 		 */
-		if (strncmp(name, match, namelen))
-			continue;
-		for (j = namelen; j < matchlen; j++)
-			if (match[j] != '/')
+		for (cp = match, slash = NULL; cp - match < matchlen; cp++) {
+			char ch = *cp;
+			if (ch == '/')
+				slash = cp;
+			if (ch == '*' || ch == '[')
 				break;
-		if (matchlen <= j)
-			return 1;
+		}
+		if (!slash)
+			slash = match; /* toplevel */
+		else
+			slash++;
+		if (namelen <= slash - match) {
+			/* Looking at "Documentation/" and
+			 * the pattern says "Documentation/howto/", or
+			 * "Documentation/diff*.txt".
+			 */
+			if (!memcmp(match, name, namelen))
+				return 1;
+		}
+		else {
+			/* Looking at "Documentation/howto/" and
+			 * the pattern says "Documentation/h*".
+			 */
+			if (!memcmp(match, name, slash - match))
+				return 1;
+		}
 	}
 	return 0;
 }
@@ -232,17 +265,17 @@ static int grep_tree(struct grep_opt *op
 	int hit = 0;
 	const char *path;
 	const unsigned char *sha1;
-	char *down_base;
+	char *down;
 	char *path_buf = xmalloc(PATH_MAX + strlen(tree_name) + 100);
 
 	if (tree_name[0]) {
 		int offset = sprintf(path_buf, "%s:", tree_name);
-		down_base = path_buf + offset;
-		strcat(down_base, base);
+		down = path_buf + offset;
+		strcat(down, base);
 	}
 	else {
-		down_base = path_buf;
-		strcpy(down_base, base);
+		down = path_buf;
+		strcpy(down, base);
 	}
 	len = strlen(path_buf);
 
@@ -252,7 +285,14 @@ static int grep_tree(struct grep_opt *op
 		pathlen = strlen(path);
 		strcpy(path_buf + len, path);
 
-		if (!pathspec_matches(&revs->diffopt, down_base))
+		if (S_ISDIR(mode))
+			/* Match "abc/" against pathspec to
+			 * decide if we want to descend into "abc"
+			 * directory.
+			 */
+			strcpy(path_buf + len + pathlen, "/");
+
+		if (!pathspec_matches(&revs->diffopt, down))
 			;
 		else if (S_ISREG(mode))
 			hit |= grep_sha1(opt, sha1, path_buf);
@@ -264,9 +304,8 @@ static int grep_tree(struct grep_opt *op
 			if (!data)
 				die("unable to read tree (%s)",
 				    sha1_to_hex(sha1));
-			strcpy(path_buf + len + pathlen, "/");
 			sub.buf = data;
-			hit = grep_tree(opt, revs, &sub, tree_name, down_base);
+			hit |= grep_tree(opt, revs, &sub, tree_name, down);
 			free(data);
 		}
 		update_tree_entry(tree);
-- 
1.3.1.gd233

^ permalink raw reply related

* git-bisect broken in 1.2.4
From: Olaf Hering @ 2006-05-01 18:10 UTC (permalink / raw)
  To: git


Did SuSE just pick up a bad version of git, 1.2.4?
git-bisect doesnt work correctly in the kernel sources, .git/HEAD doesnt
contain the commit id anymore, but 'ref: refs/heads/bisect'

CONFIG_LOCALVERSION_AUTO depends on the id.

^ permalink raw reply

* Re: cvsserver problem with eclipse?
From: Bill Burdick @ 2006-05-01 13:06 UTC (permalink / raw)
  To: git
In-Reply-To: <4455B863.8040808@mobilereasoning.com>

[-- Attachment #1: Type: text/plain, Size: 3527 bytes --]

FWIW, I'm using Eclipse 3.2RC1a.

Bill


Bill Burdick wrote:

> [taking this conversation to the list]
>
> OK, I changed the way I was testing this to match your usage plan and 
> I still got the same results.  It works just fine for vanilla CVS; cvs 
> status shows the re revision in the repo and cvs update brings it in.  
> Eclipse has the same funky behavior: Compare with latest at the 
> project level shows no differences, but compare with latest on the 
> changed file actually does an update instead of popping up the Eclipse 
> diff viewer.
>
> By the way, I had trouble at first accessing the repo with SSH because 
> of permissions on the sqlite db.  I'm not totally sure about the 
> implications for multiple users, but maybe just using a common group 
> will work fine?
>
> I'm really happy with git and git-cvsserver!  I'm hoping to be able to 
> standardize on it for our Eclipse work.  It seems like it should be 
> possible for us to continue to do integrations in Eclipse if we manage 
> git properly.  I've been waiting on arch, baz, and bzr for so long and 
> then suddenly, Linus pulls this out of his butt!
>
>
> Bill Burdick
>
>
>
> ---------- Forwarded message ----------
> From: *Martin Langhoff (CatalystIT)* < martin@catalyst.net.nz 
> <mailto:martin@catalyst.net.nz>>
> Date: May 1, 2006 6:41 AM
> Subject: Re: cvsserver problem with eclipse?
> To: Bill Burdick <bill.burdick@gmail.com <mailto:bill.burdick@gmail.com>>
> Cc: Martyn Smith < martyn@catalyst.net.nz 
> <mailto:martyn@catalyst.net.nz>>
>
> Hi Bill,
>
> the git repo you are using, did you set it up to just be a "naked" or
> "bare" repo, and enabled cvsserver alright? What version of git are you
> using?
>
> My usage plan would be to
>
> 1 - Have a git test project
> 2 - Publish it to a naked/bare repository where git-cvsserver is enabled
> 3 - Get the cvs checkout (try commandline cvs to make sure things work)
> 4 - Add commits in the git test project and push them to the repo
> 5 - Run cvs update
>
> If it's running ok with old school CVS client, then try with Eclipse. We
> have tested the update scenario quite a bit, and it works for us, so we
> need a bit more info.
>
> BTW, can I ask you to bounce these questions off the 
> git@vger.kernel.org <mailto:git@vger.kernel.org>
> mailing list? We will be happier answering these questions in a way that
> gets archived publicly. (Unless you are a direct client of Catalyst IT,
> that is! Are you at OpenUniversity perhaps?)
>
> cheers,
>
>
> martin
>
> Bill Burdick wrote:
>
>>  I made a test git repo with a branch called WillyTest and slurped it
>>  into Eclipse.  Then I checked out WillyTest into the git directory and
>>  committed a change.  A synchronize in Eclipse did not detect the 
>> change,
>>  nor did comparing with the latest on the head.  Showing history on the
>>  changed file did, however, list the new revision and I could access the
>>  contents from that view.  Compare with latest on the changed file
>>  actually replaces the current contents with the repository version.
>>
>>  Bill Burdick
>>
>
>
> -- 
> -----------------------------------------------------------------------
> Martin @ Catalyst .Net .NZ  Ltd, PO Box 11-053, Manners St,  Wellington
> WEB: http://catalyst.net.nz/           PHYS: Level 2, 150-154 Willis St
> OFFICE: +64(4)916-7224                              MOB: +64(21)364-017
>       Make things as simple as possible, but no simpler - Einstein
> -----------------------------------------------------------------------



[-- Attachment #2: bill.vcf --]
[-- Type: text/x-vcard, Size: 192 bytes --]

begin:vcard
fn:Bill Burdick
n:Burdick;Bill
org:Mobile Reasoning
email;internet:bill@mobilereasoning.com
title:Chief Scientist
tel;work:913-484-0172
x-mozilla-html:FALSE
version:2.1
end:vcard


^ permalink raw reply

* cvsserver problem with eclipse?
From: Bill Burdick @ 2006-05-01  7:27 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 3317 bytes --]

[taking this conversation to the list]

OK, I changed the way I was testing this to match your usage plan and I 
still got the same results.  It works just fine for vanilla CVS; cvs 
status shows the re revision in the repo and cvs update brings it in.  
Eclipse has the same funky behavior: Compare with latest at the project 
level shows no differences, but compare with latest on the changed file 
actually does an update instead of popping up the Eclipse diff viewer.

By the way, I had trouble at first accessing the repo with SSH because 
of permissions on the sqlite db.  I'm not totally sure about the 
implications for multiple users, but maybe just using a common group 
will work fine?

I'm really happy with git and git-cvsserver!  I'm hoping to be able to 
standardize on it for our Eclipse work.  It seems like it should be 
possible for us to continue to do integrations in Eclipse if we manage 
git properly.  I've been waiting on arch, baz, and bzr for so long and 
then suddenly, Linus pulls this out of his butt!


Bill Burdick



---------- Forwarded message ----------
From: *Martin Langhoff (CatalystIT)* < martin@catalyst.net.nz 
<mailto:martin@catalyst.net.nz>>
Date: May 1, 2006 6:41 AM
Subject: Re: cvsserver problem with eclipse?
To: Bill Burdick <bill.burdick@gmail.com <mailto:bill.burdick@gmail.com>>
Cc: Martyn Smith < martyn@catalyst.net.nz <mailto:martyn@catalyst.net.nz>>

Hi Bill,

the git repo you are using, did you set it up to just be a "naked" or
"bare" repo, and enabled cvsserver alright? What version of git are you
using?

My usage plan would be to

1 - Have a git test project
2 - Publish it to a naked/bare repository where git-cvsserver is enabled
3 - Get the cvs checkout (try commandline cvs to make sure things work)
4 - Add commits in the git test project and push them to the repo
5 - Run cvs update

If it's running ok with old school CVS client, then try with Eclipse. We
have tested the update scenario quite a bit, and it works for us, so we
need a bit more info.

BTW, can I ask you to bounce these questions off the git@vger.kernel.org 
<mailto:git@vger.kernel.org>
mailing list? We will be happier answering these questions in a way that
gets archived publicly. (Unless you are a direct client of Catalyst IT,
that is! Are you at OpenUniversity perhaps?)

cheers,


martin

Bill Burdick wrote:
>  I made a test git repo with a branch called WillyTest and slurped it
>  into Eclipse.  Then I checked out WillyTest into the git directory and
>  committed a change.  A synchronize in Eclipse did not detect the change,
>  nor did comparing with the latest on the head.  Showing history on the
>  changed file did, however, list the new revision and I could access the
>  contents from that view.  Compare with latest on the changed file
>  actually replaces the current contents with the repository version.
>
>  Bill Burdick
>


--
-----------------------------------------------------------------------
Martin @ Catalyst .Net .NZ  Ltd, PO Box 11-053, Manners St,  Wellington
WEB: http://catalyst.net.nz/           PHYS: Level 2, 150-154 Willis St
OFFICE: +64(4)916-7224                              MOB: +64(21)364-017
       Make things as simple as possible, but no simpler - Einstein
-----------------------------------------------------------------------

[-- Attachment #2: bill.vcf --]
[-- Type: text/x-vcard, Size: 192 bytes --]

begin:vcard
fn:Bill Burdick
n:Burdick;Bill
org:Mobile Reasoning
email;internet:bill@mobilereasoning.com
title:Chief Scientist
tel;work:913-484-0172
x-mozilla-html:FALSE
version:2.1
end:vcard


^ permalink raw reply

* Re: [PATCH] built-in "git grep" (git grip).
From: Linus Torvalds @ 2006-05-01 15:48 UTC (permalink / raw)
  To: Sam Ravnborg; +Cc: Junio C Hamano, git
In-Reply-To: <20060501145328.GA14856@mars.ravnborg.org>



On Mon, 1 May 2006, Sam Ravnborg wrote:
> 
> Seems I have confused myself.
>
>	git grep DEBUG '*/Kconfig*'
>
> does indeed work today.

Indeed. I was a bit confused about your report, since not only does it 
work today, that's how it has always worked, and it was very much designed 
that way. I use it all the time.

It takes the git-ls-files pathname syntax, which is a bit _different_ from 
the normal "limit to these paths" syntax, in that it honors '*'. And it 
honors that a bit differently than normal shell pathname expansion, 
because for git-ls-files a '*' pattern will match '/' as well.

So

	git grep pattern 'net/*.c'

will match every single C file found _recursively_ inside the "net/" 
subdirectory, not just in that single directory itself.

So "*" for git grep is a bit more like a "**" pattern in some shells.

		Linus

^ permalink raw reply

* Re: [PATCH] built-in "git grep" (git grip).
From: Sam Ravnborg @ 2006-05-01 14:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <20060501140704.GA6096@mars.ravnborg.org>

On Mon, May 01, 2006 at 04:07:04PM +0200, Sam Ravnborg wrote:
> > 
> > I usually do something stupid like:
> > git ls-files | grep Kconfig | xargs grep DEBUG
> 
> Which is indeed studip. I just learned I could say:
> git ls-files '*/Kconfig*' | xargs grep DEBUG

Seems I have confused myself.
git grep DEBUG '*/Kconfig*'
does indeed work today. And browsing the git grip code that
will also support it.

Sorry for the noise.

	Sam

^ permalink raw reply

* Re: [PATCH] built-in "git grep" (git grip).
From: Sam Ravnborg @ 2006-05-01 14:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <20060501140410.GA3505@mars.ravnborg.org>

> 
> I usually do something stupid like:
> git ls-files | grep Kconfig | xargs grep DEBUG

Which is indeed studip. I just learned I could say:
git ls-files '*/Kconfig*' | xargs grep DEBUG

	Sam

^ permalink raw reply

* Re: [PATCH] built-in "git grep" (git grip).
From: Sam Ravnborg @ 2006-05-01 14:04 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7v1wvetfuj.fsf@assigned-by-dhcp.cox.net>

On Sun, Apr 30, 2006 at 11:32:36PM -0700, Junio C Hamano wrote:
> This attempts to set up built-in "git grep" to further reduce
> our dependence on the shell, while at the same time optionally
> allowing to run grep against object database.  You could do
> funky things like these:
> 
> 	git grep --cached -e pattern	;# grep from index
> 	git grep -e pattern master	;# or in a rev
> 	git grep -e pattern master next ;# or in multiple revs
> 	git grep -e pattern pu^@	;# even like this with an
> 					;# extension from another topic ;-)
> 	git grep -e pattern master..next ;# or even from rev ranges
> 	git grep -e pattern master~20:Documentation
> 					;# or an arbitrary tree
> 	git grep -e pattern next:git-commit.sh
>         				;# or an arbitrary blob
> 

A feature I have been missing often has been the possibility to limit
grep (and ls-files) to certain filenames.
Say:
git grip -e DEBUG 'Kconfig*'

I usually do something stupid like:
git ls-files | grep Kconfig | xargs grep DEBUG

Thought it may be trivial to extend git grip while you are there anyway.
But obviously only if this is useful for more than just me.

	Sam

^ permalink raw reply

* [PATCH] repack: honor -d even when no new pack was created
From: Martin Waitz @ 2006-05-01 10:57 UTC (permalink / raw)
  To: git

If all objects are reachable via an alternate object store then we
still have to remove all obsolete local packs.

Signed-off-by: Martin Waitz <tali@admingilde.org>

---

Without this patch I got the following behaviour:

".git/objects/info/alternates" [Neu] 1L, 38C geschrieben
admingilde:~/src/linux-2.6 > git count-objects
107852 objects, 665596 kilobytes
admingilde:~/src/linux-2.6 > git repack -a -l -d
Generating pack...
Done counting 0 objects.
Nothing new to pack.
admingilde:~/src/linux-2.6 > git prune
admingilde:~/src/linux-2.6 > git count-objects
0 objects, 0 kilobytes
admingilde:~/src/linux-2.6 > ls .git/objects/pack
pack-b3c6fbdfa36a326815de6358885c7a570a986b1b.idx
pack-b3c6fbdfa36a326815de6358885c7a570a986b1b.pack
pack-e0d76ffe354ef5665028a6cb4506ea902f72e1d0.idx
pack-e0d76ffe354ef5665028a6cb4506ea902f72e1d0.pack

After changing git-repack, objects/pack was empty, as expected.

 git-repack.sh |   14 +++++++-------
 1 files changed, 7 insertions(+), 7 deletions(-)

c58513fa76e10007fdf15b49a593a2b9c6a080be
diff --git a/git-repack.sh b/git-repack.sh
index e0c9f32..4fb3f26 100755
--- a/git-repack.sh
+++ b/git-repack.sh
@@ -48,15 +48,15 @@ name=$(git-rev-list --objects --all $rev
 	exit 1
 if [ -z "$name" ]; then
 	echo Nothing new to pack.
-	exit 0
-fi
-echo "Pack pack-$name created."
+else
+	echo "Pack pack-$name created."
 
-mkdir -p "$PACKDIR" || exit
+	mkdir -p "$PACKDIR" || exit
 
-mv .tmp-pack-$name.pack "$PACKDIR/pack-$name.pack" &&
-mv .tmp-pack-$name.idx  "$PACKDIR/pack-$name.idx" ||
-exit
+	mv .tmp-pack-$name.pack "$PACKDIR/pack-$name.pack" &&
+	mv .tmp-pack-$name.idx  "$PACKDIR/pack-$name.idx" ||
+	exit
+fi
 
 if test "$remove_redundant" = t
 then
-- 
1.3.1.g8971-dirty


-- 
Martin Waitz

^ permalink raw reply related

* [PATCH] built-in "git grep" (git grip) - quickfix
From: Junio C Hamano @ 2006-05-01  7:30 UTC (permalink / raw)
  To: git
In-Reply-To: <7v1wvetfuj.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano <junkio@cox.net> writes:

> Right now, it does not understand and/or obey many options grep
> should accept, and the pattern matcher using POSIX.2 regex seems
> to be excruciatingly slow...

I forgot to say that unlike the shell script version you need to
give -e in front of the pattern with this version because of the
way the option parser is structured.  Obviously this needs to be
fixed for usability's sake.

But I seem to have managed to fix the "excruciatingly slow" part
trivially.  regexec() is not re.match() but re.search(), and
there is no point looking at each character on the line.  Here
is a patch.

-- >8 --
diff --git a/builtin-grep.c b/builtin-grep.c
index adcdbaa..6230f44 100644
--- a/builtin-grep.c
+++ b/builtin-grep.c
@@ -42,26 +42,18 @@ static int grep_buffer(struct grep_opt *
 
 	while (left) {
 		regmatch_t pmatch[10];
-		int flags = 0;
-		char *eol, *cp, ch;
+		char *eol, ch;
 		eol = end_of_line(bol, &left);
 		ch = *eol;
 		*eol = 0;
-		for (cp = bol; cp < eol; cp++) { 
-			int status = regexec(&opt->regexp, cp,
-					     ARRAY_SIZE(pmatch), pmatch,
-					     flags);
-			if (status == REG_NOMATCH)
-				flags |= REG_NOTBOL;
-			else if (status == 0) {
-				/* Hit at this line */
-				printf("%s:", name);
-				if (opt->linenum)
-					printf("%d:", lno);
-				printf("%.*s\n", eol-bol, bol);
-				hit = 1;
-				break;
-			}
+		if (!regexec(&opt->regexp, bol,
+			     ARRAY_SIZE(pmatch), pmatch, 0)) {
+			/* Hit at this line */
+			printf("%s:", name);
+			if (opt->linenum)
+				printf("%d:", lno);
+			printf("%.*s\n", eol-bol, bol);
+			hit = 1;
 		}
 		*eol = ch;
 		bol = eol + 1;

^ permalink raw reply related

* Re: [PATCH] built-in "git grep" (git grip).
From: Jakub Narebski @ 2006-05-01  7:12 UTC (permalink / raw)
  To: git
In-Reply-To: <7vhd4as00i.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano wrote:

Jakub Narebski <jnareb@gmail.com> writes:

> Wouldn't "git ggrep" (from git-aware grep) or "git bgrep" (from built-in
> grep), similar to the egrep and fgrep from the grep package?

Yes, I understand, but I just don't like using 'grip'. And it would be nice
to have some convention for further not-ready-yet built-in replacements for
script versions of commands, for example adding letter 'b' as 'built-in' at
the beginning of command name: 'bgrep', 'bdiff'. Or use postfix 'n' or
'-ng' to denote transitionary not-ready-yet new version of command:
'grepn', 'diffn' or 'grep-ng', 'diff-ng'.


By the way, [my] grep is linked against libpcre - would it mean that git
would also need to use pcre library, or at least have an option to use it?

I also wonder if anyone would be interested to _force_ using external grep
(probably enhanced)... just a thought.

-- 
Jakub Narebski
Warsaw, Poland

^ permalink raw reply

* Re: [PATCH] built-in "git grep" (git grip).
From: Junio C Hamano @ 2006-05-01  6:59 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git
In-Reply-To: <e34bdf$ho4$1@sea.gmane.org>

Jakub Narebski <jnareb@gmail.com> writes:

> Wouldn't "git ggrep" (from git-aware grep) or "git bgrep" (from built-in
> grep), similar to the egrep and fgrep from the grep package?

The eventual goal is to rename it to "git grep" and remove the
shell based one, so the interim name does not matter.

^ permalink raw reply

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and
From: Junio C Hamano @ 2006-05-01  6:58 UTC (permalink / raw)
  To: Sam Vilain; +Cc: git, Jakub Narebski
In-Reply-To: <4455638A.3070802@vilain.net>

Sam Vilain <sam@vilain.net> writes:

> Junio C Hamano wrote:
>
>>To David, the commits he has in the chain between 6b426e to
>>18118c obviously suited the purpose of his tree better, and that
>>was why these commits were made.  And the fact Linus fast
>>forwarded to the tip of David is an implicit statement by Linus
>>that that results suits the purpose of Linus tree better as well
>>compared to his old tip, presumably 6b426e.
>
> Aha, now I see reason in the madness. So, the "prior" head is not stored
> in the trees, and tracking the progress of actual head transitions is
> loosely defined / a research topic. But demonstrably derivable. That
> works for me.

I do not think there is any madness involved here, but I should
point out that the above example happens to work only because
Linus and David are two different people.  If Linus did the
David's work in a separate repository, or even in the same
repository but on a separate branch, people following the Linus
tip might still want to know about the fast-forward, but that is
something you cannot truly tell by the digging like what I did
in the previous message.

That is why I earlier said this:

    *1* IOW, we _are_ losing some information by not recording the
    fact that fast-forward was done while doing so.  

    That record should _not_ be in the commit chain.  At the
    mechanical level, recording that in the commit chain means two
    criss-crossing branches never converge at the commit chain
    level, which is already bad.  At the philosophical level, the
    commit chain is a mesh of many possible "global" histories, and
    the record that somebody (a particular branch in a particular
    repository) was at what point in the mesh at given time does not
    belong there.

    But from the repository-owner's point of view, that _might_ be a
    useful information to keep.  I am just saying this preemptively
    so that if somebody wants to record it, that should not be
    recorded in the commit object.

I do not think the commit object is the place to record it, even
with a purely-comment field like "note prior".  The commit
ancestry DAG is global in nature, and the information under
discussion, "before pointing at this commit, the branch that
made this commit happened to point at this other commit", is
not.  That information describes only one-branch's view of the
world, and would not work in the fast-forward case because no
new commit is created.  An important property of a fast-forward
is that we do not create an extra commit object that makes it
impossible for two criss-crossing branches to ever converge.

On the other hand, a "note" field that records on which branch
of which repository each commit was made (you need to give each
repository-branch an UUID) when you do create a new commit would
be a sensible thing to have if somebody cares deeply enough.  It
is an information that is global in nature, and with that, you
could do the digging like I did without relying on the committer
identity, but instead using the branch identity.

^ permalink raw reply

* Re: [PATCH] built-in "git grep" (git grip).
From: Jakub Narebski @ 2006-05-01  6:56 UTC (permalink / raw)
  To: git
In-Reply-To: <7v1wvetfuj.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano wrote:

> This attempts to set up built-in "git grep" to further reduce
> our dependence on the shell, while at the same time optionally
> allowing to run grep against object database.
[...]
> In order to stay out of the way of real work people want to get
> done with the real "git grep", for now this implementation is
> called "git grip".

Wouldn't "git ggrep" (from git-aware grep) or "git bgrep" (from built-in
grep), similar to the egrep and fgrep from the grep package?

-- 
Jakub Narebski
Warsaw, Poland

^ permalink raw reply

* [PATCH] built-in "git grep" (git grip).
From: Junio C Hamano @ 2006-05-01  6:32 UTC (permalink / raw)
  To: git

This attempts to set up built-in "git grep" to further reduce
our dependence on the shell, while at the same time optionally
allowing to run grep against object database.  You could do
funky things like these:

	git grep --cached -e pattern	;# grep from index
	git grep -e pattern master	;# or in a rev
	git grep -e pattern master next ;# or in multiple revs
	git grep -e pattern pu^@	;# even like this with an
					;# extension from another topic ;-)
	git grep -e pattern master..next ;# or even from rev ranges
	git grep -e pattern master~20:Documentation
					;# or an arbitrary tree
	git grep -e pattern next:git-commit.sh
        				;# or an arbitrary blob

Right now, it does not understand and/or obey many options grep
should accept, and the pattern matcher using POSIX.2 regex seems
to be excruciatingly slow (I lifted it from Pasky's regexp
pickaxe code almost verbatim without thinking -- I was too
tired).  Help to improve things in the grep_buffer() function is
very much appreciated.

But this is going in the right direction.  The shell script
version is one of the worst Portability offender in the git
barebone Porcelainish; it uses xargs -0 to pass paths around and
shell arrays to sift flags and parameters.

In order to stay out of the way of real work people want to get
done with the real "git grep", for now this implementation is
called "git grip".

Signed-off-by: Junio C Hamano <junkio@cox.net>
---
 Makefile       |    2 
 builtin-grep.c |  377 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 builtin.h      |    1 
 git.c          |    1 
 4 files changed, 380 insertions(+), 1 deletions(-)

diff --git a/Makefile b/Makefile
index 8ce27a6..8d5122b 100644
--- a/Makefile
+++ b/Makefile
@@ -214,7 +214,7 @@ LIB_OBJS = \
 	$(DIFF_OBJS)
 
 BUILTIN_OBJS = \
-	builtin-log.o builtin-help.o
+	builtin-log.o builtin-help.o builtin-grep.o
 
 GITLIBS = $(LIB_FILE) $(XDIFF_LIB)
 LIBS = $(GITLIBS) -lz
diff --git a/builtin-grep.c b/builtin-grep.c
new file mode 100644
index 0000000..adcdbaa
--- /dev/null
+++ b/builtin-grep.c
@@ -0,0 +1,377 @@
+/*
+ * Builtin "git grep"
+ *
+ * Copyright (c) 2006 Junio C Hamano
+ */
+#include "cache.h"
+#include "blob.h"
+#include "tree.h"
+#include "commit.h"
+#include "tag.h"
+#include "diff.h"
+#include "revision.h"
+#include "builtin.h"
+#include <regex.h>
+
+struct grep_opt {
+	const char *pattern;
+	regex_t regexp;
+	unsigned linenum:1;
+	unsigned pre_context;
+	unsigned post_context;
+};
+
+static char *end_of_line(char *cp, unsigned long *left)
+{
+	unsigned long l = *left;
+	while (l && *cp != '\n') {
+		l--;
+		cp++;
+	}
+	*left = l;
+	return cp;
+}
+
+static int grep_buffer(struct grep_opt *opt, const char *name,
+		       char *buf, unsigned long size)
+{
+	char *bol = buf;
+	unsigned long left = size;
+	unsigned lno = 1;
+	int hit = 0;
+
+	while (left) {
+		regmatch_t pmatch[10];
+		int flags = 0;
+		char *eol, *cp, ch;
+		eol = end_of_line(bol, &left);
+		ch = *eol;
+		*eol = 0;
+		for (cp = bol; cp < eol; cp++) { 
+			int status = regexec(&opt->regexp, cp,
+					     ARRAY_SIZE(pmatch), pmatch,
+					     flags);
+			if (status == REG_NOMATCH)
+				flags |= REG_NOTBOL;
+			else if (status == 0) {
+				/* Hit at this line */
+				printf("%s:", name);
+				if (opt->linenum)
+					printf("%d:", lno);
+				printf("%.*s\n", eol-bol, bol);
+				hit = 1;
+				break;
+			}
+		}
+		*eol = ch;
+		bol = eol + 1;
+		left--;
+		lno++;
+	}
+	return hit;
+}
+
+static int grep_sha1(struct grep_opt *opt, const unsigned char *sha1, const char *name)
+{
+	unsigned long size;
+	char *data;
+	char type[20];
+	int hit;
+	data = read_sha1_file(sha1, type, &size);
+	if (!data) {
+		error("'%s': unable to read %s", name, sha1_to_hex(sha1));
+		return 0;
+	}
+	hit = grep_buffer(opt, name, data, size);
+	free(data);
+	return hit;
+}
+
+static int grep_file(struct grep_opt *opt, const char *filename)
+{
+	struct stat st;
+	int i;
+	char *data;
+	if (lstat(filename, &st) < 0) {
+	err_ret:
+		if (errno != ENOENT)
+			error("'%s': %s", filename, strerror(errno));
+		return 0;
+	}
+	if (!st.st_size)
+		return 0; /* empty file -- no grep hit */
+	if (!S_ISREG(st.st_mode))
+		return 0;
+	i = open(filename, O_RDONLY);
+	if (i < 0)
+		goto err_ret;
+	data = xmalloc(st.st_size + 1);
+	if (st.st_size != xread(i, data, st.st_size)) {
+		error("'%s': short read %s", filename, strerror(errno));
+		close(i);
+		free(data);
+		return 0;
+	}
+	close(i);
+	i = grep_buffer(opt, filename, data, st.st_size);
+	free(data);
+	return i;
+}
+
+static int grep_cache(struct grep_opt *opt, struct rev_info *revs, int cached)
+{
+	int hit = 0;
+	int nr;
+	read_cache();
+
+	for (nr = 0; nr < active_nr; nr++) {
+		struct cache_entry *ce = active_cache[nr];
+		if (ce_stage(ce) || !S_ISREG(ntohl(ce->ce_mode)))
+			continue;
+		if (revs->diffopt.nr_paths) {
+			int i;
+			int namelen = ce_namelen(ce);
+			const char *name = ce->name;
+			for (i = 0; i < revs->diffopt.nr_paths; i++) {
+				const char *match = revs->diffopt.paths[i];
+				int matchlen = revs->diffopt.pathlens[i];
+				if (matchlen <= namelen)
+					if (!strncmp(name, match, matchlen))
+						break;
+			}
+			if (revs->diffopt.nr_paths <= i)
+				continue;
+		}
+		if (cached)
+			hit |= grep_sha1(opt, ce->sha1, ce->name);
+		else
+			hit |= grep_file(opt, ce->name);
+	}
+	return hit;
+}
+
+static int pathspec_matches(struct diff_options *opt, const char *name)
+{
+	int i;
+	int namelen;
+	if (!opt->nr_paths)
+		return 1;
+	namelen = strlen(name);
+	for (i = 0; i < opt->nr_paths; i++) {
+		const char *match = opt->paths[i];
+		int matchlen = opt->pathlens[i];
+		if (matchlen <= namelen)
+			if (!strncmp(name, match, matchlen))
+				return 1;
+	}
+	return 0;
+}
+
+static int grep_tree(struct grep_opt *opt, struct rev_info *revs,
+		     struct tree_desc *tree,
+		     const char *tree_name, const char *base)
+{
+	unsigned mode;
+	int len;
+	int hit = 0;
+	const char *path;
+	const unsigned char *sha1;
+	char *down_base;
+	char *path_buf = xmalloc(PATH_MAX + strlen(tree_name) + 100);
+
+	if (tree_name[0]) {
+		int offset = sprintf(path_buf, "%s:", tree_name);
+		down_base = path_buf + offset;
+		strcat(down_base, base);
+	}
+	else {
+		down_base = path_buf;
+		strcpy(down_base, base);
+	}
+	len = strlen(path_buf);
+
+	while (tree->size) {
+		int pathlen;
+		sha1 = tree_entry_extract(tree, &path, &mode);
+		pathlen = strlen(path);
+		strcpy(path_buf + len, path);
+
+		if (!pathspec_matches(&revs->diffopt, down_base))
+			;
+		else if (S_ISREG(mode))
+			hit |= grep_sha1(opt, sha1, path_buf);
+		else if (S_ISDIR(mode)) {
+			char type[20];
+			struct tree_desc sub;
+			void *data;
+			data = read_sha1_file(sha1, type, &sub.size);
+			if (!data)
+				die("unable to read tree (%s)",
+				    sha1_to_hex(sha1));
+			strcpy(path_buf + len + pathlen, "/");
+			sub.buf = data;
+			hit = grep_tree(opt, revs, &sub, tree_name, down_base);
+			free(data);
+		}
+		update_tree_entry(tree);
+	}
+	return hit;
+}
+
+static int grep_object(struct grep_opt *opt, struct rev_info *revs,
+		       struct object *obj, const char *name)
+{
+	if (!strcmp(obj->type, blob_type))
+		return grep_sha1(opt, obj->sha1, name);
+	if (!strcmp(obj->type, commit_type) ||
+	    !strcmp(obj->type, tree_type)) {
+		struct tree_desc tree;
+		void *data;
+		int hit;
+		data = read_object_with_reference(obj->sha1, tree_type,
+						  &tree.size, NULL);
+		if (!data)
+			die("unable to read tree (%s)", sha1_to_hex(obj->sha1));
+		tree.buf = data;
+		hit = grep_tree(opt, revs, &tree, name, "");
+		free(data);
+		return hit;
+	}
+	die("unable to grep from object of type %s", obj->type);
+}
+
+static const char builtin_grep_usage[] =
+"git-grep <option>* <rev>* [-e] <pattern> [<path>...]";
+
+int cmd_grep(int argc, const char **argv, char **envp)
+{
+	struct rev_info rev;
+	const char **dst, **src;
+	int err;
+	int hit = 0;
+	int no_more_arg = 0;
+	int seen_range = 0;
+	int seen_noncommit = 0;
+	int cached = 0;
+	struct grep_opt opt;
+	struct object_list *list;
+
+	memset(&opt, 0, sizeof(opt));
+
+	/*
+	 * Interpret and remove the grep options upfront.  Sigh...
+	 */
+	for (dst = src = &argv[1]; src < argc + argv; ) {
+		const char *arg = *src++;
+		if (!no_more_arg) {
+			if (!strcmp("--", arg)) {
+				no_more_arg = 1;
+				*dst++ = arg;
+				continue;
+			}
+			if (!strcmp("--cached", arg)) {
+				cached = 1;
+				continue;
+			}
+			if (!strcmp("-e", arg)) {
+				if (src < argc + argv) {
+					opt.pattern = *src++;
+					continue;
+				}
+				usage(builtin_grep_usage);
+			}
+			if (!strcmp("-n", arg)) {
+				opt.linenum = 1;
+				continue;
+			}
+			if (!strcmp("-H", arg)) {
+				/* We always show the pathname, so this
+				 * is a noop.
+				 */
+				continue;
+			}
+			if (!strcmp("-A", arg) ||
+			    !strcmp("-B", arg) ||
+			    !strcmp("-C", arg)) {
+				unsigned num;
+				if ((argc + argv <= src) ||
+				    sscanf(*src++, "%u", &num) != 1)
+					usage(builtin_grep_usage);
+				switch (arg[1]) {
+				case 'A':
+					opt.post_context = num;
+					break;
+				case 'C':
+					opt.post_context = num;
+				case 'B':
+					opt.pre_context = num;
+					break;
+				}
+				continue;
+			}
+		}
+		*dst++ = arg;
+	}
+	if (!opt.pattern)
+		die("no pattern given.");
+
+	err = regcomp(&opt.regexp, opt.pattern, REG_NEWLINE);
+	if (err) {
+		char errbuf[1024];
+		regerror(err, &opt.regexp, errbuf, 1024);
+		regfree(&opt.regexp);
+		die("'%s': %s", opt.pattern, errbuf);
+	}
+
+	init_revisions(&rev);
+	*dst = NULL;
+	argc = setup_revisions(dst - argv, argv, &rev, NULL);
+
+	/*
+	 * Do not walk "grep -e foo master next pu -- Documentation/"
+	 * but do walk "grep -e foo master..next -- Documentation/".
+	 * Ranged request mixed with a blob or tree object, like
+	 * "grep -e foo v1.0.0:Documentation/ master..next"
+	 * so detect that and complain.
+	 */
+	for (list = rev.pending_objects; list; list = list->next) {
+		struct object *real_obj;
+		if (list->item->flags & UNINTERESTING)
+			seen_range = 1;
+		real_obj = deref_tag(list->item, NULL, 0);
+		if (strcmp(real_obj->type, commit_type))
+			seen_noncommit = 1;
+	}
+	if (!rev.pending_objects)
+		return !grep_cache(&opt, &rev, cached);
+	if (cached)
+		die("both --cached and revisions given.");
+
+	if (seen_range && seen_noncommit)
+		die("both A..B and non commit are given.");
+	if (seen_range) {
+		struct commit *commit;
+		prepare_revision_walk(&rev);
+		while ((commit = get_revision(&rev)) != NULL) {
+			unsigned char *sha1 = commit->object.sha1;
+			char *n = find_unique_abbrev(sha1, DEFAULT_ABBREV);
+			char rev_name[41];
+			strcpy(rev_name, n);
+			if (grep_object(&opt, &rev, &commit->object, rev_name))
+				hit = 1;
+			commit->buffer = NULL;
+		}
+		return !hit;
+	}
+
+	/* all of them are non-commit; do not walk, and
+	 * do not lose their names.
+	 */
+	for (list = rev.pending_objects; list; list = list->next) {
+		struct object *real_obj;
+		real_obj = deref_tag(list->item, NULL, 0);
+		if (grep_object(&opt, &rev, real_obj, list->name))
+			hit = 1;
+	}
+	return !hit;
+}
diff --git a/builtin.h b/builtin.h
index 47408a0..cf5de3b 100644
--- a/builtin.h
+++ b/builtin.h
@@ -19,5 +19,6 @@ extern int cmd_version(int argc, const c
 extern int cmd_whatchanged(int argc, const char **argv, char **envp);
 extern int cmd_show(int argc, const char **argv, char **envp);
 extern int cmd_log(int argc, const char **argv, char **envp);
+extern int cmd_grep(int argc, const char **argv, char **envp);
 
 #endif
diff --git a/git.c b/git.c
index 01b7e28..18e857d 100644
--- a/git.c
+++ b/git.c
@@ -46,6 +46,7 @@ static void handle_internal_command(int 
 		{ "log", cmd_log },
 		{ "whatchanged", cmd_whatchanged },
 		{ "show", cmd_show },
+		{ "grip", cmd_grep },
 	};
 	int i;
 
-- 
1.3.1.gd233

^ permalink raw reply related

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and
From: Jakub Narebski @ 2006-05-01  4:44 UTC (permalink / raw)
  To: git
In-Reply-To: <4455638A.3070802@vilain.net>

Take a look at complexity of that explanation. And the need for additional
commit. That balanced against all the headaches of having connectivity
header other than "parent".

Perhaps it would be better (and easier) just to say

   note prior parent^1

or

   note prior <sha1>

repeating <sha1> found in parent.


Just a thought.

-- 
Jakub Narebski
Warsaw, Poland

^ permalink raw reply

* Re: More qgit defects
From: Pavel Roskin @ 2006-05-01  2:20 UTC (permalink / raw)
  To: Marco Costalba; +Cc: ydirson, git
In-Reply-To: <e5bfff550604300313n5ebe84f7nf42c88789efe1@mail.gmail.com>

Hello, Marco!

On Sun, 2006-04-30 at 12:13 +0200, Marco Costalba wrote:
> Throwing in the tabs is a *very* big change, but, just to discuss....I
> agree on the note that in qgit we have three different approaches:
> fixed frames (revisions, file tree, affected files), detachable frames
> (patch) and separate windows (annotations).
> 
> This is a bit strange and could give an odd GUI feeling.

Agreed.

> I like the tab idea because it's clear and simple and fixes the 'many
> approaches' problem.

I'm glad you liked my idea!  And thank you for copying to the list.
qgit is meant for most git users, and they should have their voices
heard.

> What I would suggest is, at least at first step,
> do not change the main view and have only three tabs:
> 
> Tab1: Main view with revisions, file tree (hide able), affected files.
> Tab2: Patch view with patch stat and diffs
> Tab3: File history + file content/annotation view

Absolutely.  It's easier to make changes incrementally than to rewrite
everything and hunt bugs for months.  This change alone would make it
easier to work with qgit.

Once qgit can deal with more than one patch view and more than one file
view, this would provide the fix for qgit's "jumpiness".  Mere selection
of objects in listboxes shouldn't affect other tabs.

> In other words just put the frames/windows as are now in browse able
> tabs. In this way main view still gives a good amount of information
> without requiring changing the tab and the tabs are reserved for 'big
> space' needed infos only.

That would be great.  I'm eagerly waiting for new commits to test :-)

-- 
Regards,
Pavel Roskin

^ permalink raw reply

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and
From: Sam Vilain @ 2006-05-01  1:25 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jakub Narebski, git
In-Reply-To: <7v8xpmva9x.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano wrote:

>>We can fast-forward if (1) you pulled from "pu" the last time,
>>and (2) you haven't added anything on top of it on your own, and
>>(3) you pull from "pu" again, if the previous "pu" (i.e. your
>>"pu") is a parent of the updated "pu".  We do not need "prior"
>>for that.  The old "pu" being _one_ _of_ the parents, not even
>>necessarily be the first one, would do just fine.
>>    
>>
>
>This part may want a bit more elaboration.  
>
>Often, we see in the Linus kernel tree a fast forward of his tip
>from a recent commit Linus made to bunch of networking commits
>made by David S Miller.  For example, Linus fast forwarded to
>18118c from David's tree before making this commit:
> [...]
>To David, the commits he has in the chain between 6b426e to
>18118c obviously suited the purpose of his tree better, and that
>was why these commits were made.  And the fact Linus fast
>forwarded to the tip of David is an implicit statement by Linus
>that that results suits the purpose of Linus tree better as well
>compared to his old tip, presumably 6b426e.
>  
>

Aha, now I see reason in the madness. So, the "prior" head is not stored
in the trees, and tracking the progress of actual head transitions is
loosely defined / a research topic. But demonstrably derivable. That
works for me.

Sam.

^ permalink raw reply

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and
From: Junio C Hamano @ 2006-05-01  0:50 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git
In-Reply-To: <7vfyjuwt0v.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano <junkio@cox.net> writes:

> We can fast-forward if (1) you pulled from "pu" the last time,
> and (2) you haven't added anything on top of it on your own, and
> (3) you pull from "pu" again, if the previous "pu" (i.e. your
> "pu") is a parent of the updated "pu".  We do not need "prior"
> for that.  The old "pu" being _one_ _of_ the parents, not even
> necessarily be the first one, would do just fine.

This part may want a bit more elaboration.  

Often, we see in the Linus kernel tree a fast forward of his tip
from a recent commit Linus made to bunch of networking commits
made by David S Miller.  For example, Linus fast forwarded to
18118c from David's tree before making this commit:

    commit 454ac778459bc70f0a9818a6a8fd974ced11de66
    Merge: 18118cd... 301dc3e...
    Author:     Linus Torvalds <torvalds@g5.osdl.org>
    AuthorDate: Mon Apr 24 20:08:08 2006 -0700
    Commit:     Linus Torvalds <torvalds@g5.osdl.org>
    CommitDate: Mon Apr 24 20:08:08 2006 -0700

The first parent of this commit is one not made by Linus; that
is how we can tell he fast forwarded.  We cannot easily tell
where the tip of Linus tree was before he made this fast forward
(it is not recorded anywhere), but if we look at 18118c commit:

    commit 18118cdbfd1f855e09ee511d764d6c9df3d4f952
    Author:     Patrick McHardy <kaber@trash.net>
    AuthorDate: Mon Apr 24 17:18:59 2006 -0700
    Commit:     David S. Miller <davem@sunset.davemloft.net>
    CommitDate: Mon Apr 24 17:27:34 2006 -0700

        [NETFILTER]: ipt action: use xt_check_target for basic verification

we could sort-of make a guess, by looking at merge-base of
18118c and 301dc3.  By looking at

	gitk 6b426e..18118c 454ac7

we can tell that David "forked" from Linus at 6b426e commit.

What does it mean for Linus to fast-forward to the tip of David?
Earlier I said that each branch has a purpose, and replacing the
current tip commit of the branch with another commit is a
statement by the repository owner that the new commit suits the
purpose of the branch better.

To David, the commits he has in the chain between 6b426e to
18118c obviously suited the purpose of his tree better, and that
was why these commits were made.  And the fact Linus fast
forwarded to the tip of David is an implicit statement by Linus
that that results suits the purpose of Linus tree better as well
compared to his old tip, presumably 6b426e.

Earlier I suggested (or at least may have sounded as if I was
suggesting) that not recording that statement in fast-forward
situation was a bad thing, but that is not necessarily so.
Having 18118c commit as part of the history that leads to the
tip is enough as such a statement by Linus.

Now, David's tree has a tendency to be extra clean (no merges
but straight commits on top of then-current tip of Linus), but
if he had his own merge from Linus's tree, such a commit would
have had a commit from Linus tree as its second parent.  If
Linus tip remained at that "second parent" commit until David is
done and asked Linus to pull, it would result in a fast forward
via non-first-parent ancestry.  But even if that happened, the
above discussion still applies.

^ permalink raw reply

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and
From: Sam Vilain @ 2006-05-01  0:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jakub Narebski, git
In-Reply-To: <7viros1585.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano wrote:

>> * "prior" - heads that represent topic branch merges
>>    
>>
>
>This is not any different from usual "parent" at all (but you
>have to think about it a bit to realize it).
> [...]
>Once you start reading the commit parent to mean " considering
>what all of these commits have, what this new commit has suits
>my purpose better", it becomes clear that the "previous" pointer
>for a branch like my "pu" is just another "parent".
>  
>

How can you look back at the merge history and determine which of these
scenarios is the case?

It still looks like to me that you are recording two distinct types of
parent using the same type of link.  You're now just expanding the
definition of parent so they look to be the same.

Actually it might be alright if you have an extra merge commit object. 
ie, make a complete merge of the new tips, then make a second merge that
merges the two heads.  It's still a little bit of a research topic to
look at that mess and figure out which type of relationship each parent
actually is, but if you really want to decide that is that and done is
done then I guess we'll all just have to live with it or fork.

Sam.

^ permalink raw reply

* Re: [PATCH] builtin-push: resurrect parsing of Push: lines
From: Johannes Schindelin @ 2006-04-30 23:32 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vvesqwtza.fsf@assigned-by-dhcp.cox.net>

Hi,

On Sun, 30 Apr 2006, Junio C Hamano wrote:

> So how about this as a replacement?

Looks good. Will try tomorrow.

Ciao,
Dscho

^ permalink raw reply

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and
From: Junio C Hamano @ 2006-04-30 23:19 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git
In-Reply-To: <e32kkf$amc$1@sea.gmane.org>

Jakub Narebski <jnareb@gmail.com> writes:

>>> This is not any different from usual "parent" at all (but you
>>> have to think about it a bit to realize it).
>> 
>> I would say that "prior" is not THAT different from usual "parent",
>> rather than it is not ANY different.
>> 
>> My doubts about recording previous head of a "union" (pu-like) branch
>> is that for merge (e.g. 'pu' to 'next', cherrypick to/from 'pu', 'pu'
>> rebase) is that for merge algorithm all parents are equivalent, with
>> eventual exception of first which can be treated special ('ours').
>
> Additionally with "prior" (or at least some convention on which of parents
> is to prior head of "union (pu-like) branch) I think we could fast-forward
> such branches...

This is why I said you have to think about it a bit to realize
that the "prior" is not _ANY_ different from the ordinary parent
for something like "pu".

We can fast-forward if (1) you pulled from "pu" the last time,
and (2) you haven't added anything on top of it on your own, and
(3) you pull from "pu" again, if the previous "pu" (i.e. your
"pu") is a parent of the updated "pu".  We do not need "prior"
for that.  The old "pu" being _one_ _of_ the parents, not even
necessarily be the first one, would do just fine.

If you have built on top of the last "pu", obviously we do not
want to fast-forward with or without "prior".

Your doubts about the merge is also unfounded.  The current "pu"
head is (against my own recommendation not to do so) a hydra
cap.  It is a direct child of the previous "pu" that merges all
the leftover bits along with what was in 'next' when the commit
was made, so you could do something like this to experiment:

	git branch test-1 pu^1
	echo >>Makefile '# End of Makefile'
        git commit -m 'build on top of previous "pu"' Makefile
        git pull . pu ;# Merge whatever happened in "pu"
        

^ permalink raw reply

* Re: [PATCH] builtin-push: resurrect parsing of Push: lines
From: Junio C Hamano @ 2006-04-30 22:59 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git
In-Reply-To: <7viroqyb69.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano <junkio@cox.net> writes:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
>> The C'ification of push left these behind.
>
>> +#define MAX_REFSPECS 10
>> +static int current_refspec = 0;
>> +static char *refspecs_[MAX_REFSPECS];
>
> While this fix makes perfect sense, I think MAX_URI set to 16 is
> reasonable hard limit, but I am not happy about giving hard
> limit to MAX_REFSPECS -- that's a regression from the shell
> script one.

So how about this as a replacement?

 - You are going to discard refspecs_[] if there were any
   command line stuff anyway, so do not even bother storing them
   in a separate array.  Just do add_refspec() on them if you
   are going to use it after you are done.

 - The function get_uri() is not about URI anymore, so rename it
   appropriately.

 - Fix the double call to get_uri() Sean noticed.

-- >8 --
diff --git a/builtin-push.c b/builtin-push.c
index 4e659f0..9a861b5 100644
--- a/builtin-push.c
+++ b/builtin-push.c
@@ -68,14 +68,11 @@ static void set_refspecs(const char **re
 	expand_refspecs();
 }
 
-#define MAX_REFSPECS 10
-static int current_refspec = 0;
-static char *refspecs_[MAX_REFSPECS];
-
 static int get_remotes_uri(const char *repo, const char *uri[MAX_URI])
 {
 	int n = 0;
 	FILE *f = fopen(git_path("remotes/%s", repo), "r");
+	int has_explicit_refspec = refspec_nr;
 
 	if (!f)
 		return -1;
@@ -103,10 +100,14 @@ static int get_remotes_uri(const char *r
 		while (isspace(p[-1]))
 			*--p = 0;
 
-		if (!is_refspec && n < MAX_URI)
-			uri[n++] = strdup(s);
-		else if (is_refspec && current_refspec < MAX_REFSPECS)
-			refspecs_[current_refspec++] = strdup(s);
+		if (!is_refspec) {
+			if (n < MAX_URI)
+				uri[n++] = strdup(s);
+			else
+				error("more than %d URL's specified, ignoreing the rest", MAX_URI);
+		}
+		else if (is_refspec && !has_explicit_refspec)
+			add_refspec(strdup(s));
 	}
 	fclose(f);
 	if (!n)
@@ -146,13 +147,17 @@ static int get_branches_uri(const char *
 	return 1;
 }
 
-static int get_uri(const char *repo, const char *uri[MAX_URI])
+/*
+ * Read remotes and branches file, fill the push target URI
+ * list.  If there is no command line refspecs, read Push: lines
+ * to set up the *refspec list as well.
+ * return the number of push target URIs
+ */
+static int read_config(const char *repo, const char *uri[MAX_URI])
 {
 	int n;
 
 	if (*repo != '/') {
-		current_refspec = 0;
-
 		n = get_remotes_uri(repo, uri);
 		if (n > 0)
 			return n;
@@ -169,18 +174,15 @@ static int get_uri(const char *repo, con
 static int do_push(const char *repo)
 {
 	const char *uri[MAX_URI];
-	int i, n = get_uri(repo, uri);
+	int i, n;
 	int remote;
 	const char **argv;
 	int argc;
 
-	n = get_uri(repo, uri);
+	n = read_config(repo, uri);
 	if (n <= 0)
 		die("bad repository '%s'", repo);
 
-	if (refspec_nr == 0)
-		set_refspecs((const char**)refspecs_, current_refspec);
-
 	argv = xmalloc((refspec_nr + 10) * sizeof(char *));
 	argv[0] = "dummy-send-pack";
 	argc = 1;

^ permalink raw reply related

* Re: [PATCH 3/3] fetch: optionally store the current remote information in the config
From: Junio C Hamano @ 2006-04-30 22:21 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git
In-Reply-To: <Pine.LNX.4.63.0604301743370.3641@wbgn013.biozentrum.uni-wuerzburg.de>

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> On Sun, 30 Apr 2006, sean wrote:
>
>> Well I agree with you that doing something like this is important.  We
>> should take this moment of moving things to the config file to correct
>> the terminology and help make things clear.  We're not storing "Pull:"
>> information, we're storing config/remote.$NICK.fetch data.  It's really
>> used just by fetch, pull just happens to call fetch.
>
> I have no strong feelings either way.

I have a strong feeling that naming them "Pull: " was a mistake
to begin with and they should have been called "Fetch: " ;-).

>> Along that same line of reasoning, it seems more appropriate to use 
>> git fetch --store ...  rather than git pull --store ... to set this
>> information.
>
> Both works.

I'd rather not see --store added to either fetch nor pull.

While I would agree --store would appear to be a convenient
short-hand for first timers, I strongly suspect this will cause
more confusion and complexity down the line.  If a topic that
interests you appears today at the remote, and you start
following it by adding --store when you fetch from it, and later
when the topic disappears at the remote in two weeks because it
is fully cooked and merged into somewhere else, you would need
to have a way to --unstore it somehow, and at that time the
first timer needs to learn repo-config to deal with that failure
anyway.  Or you need to teach git-fetch that fetch failure of a
branch is actually OK as long as all of the following holds
true:

 - the branch was added with --store in the past; the user is
   not actively asking for it in the current request, saying "I
   want to fetch from there THIS TIME".

 - the fetch failed only because the remote repository droped
   the branch (IOW you need to tell connection and protocol
   failure from "branch disappeared"case),

 - the fetch is not being done to merge it into the current branch,

or something complicated like that, and unstore it automatically
for the user.

Other than that, I do not have strong feelings against using the
standard .git/config to store what we store in .git/remotes/*
currently, and I also suspect doing so would be a prerequiste
first step to do "per local branch configuration" (e.g. "when on
this branch, merge from this branch by default") some people
seem to want.

^ permalink raw reply

* Re: [PATCH] builtin-push: resurrect parsing of Push: lines
From: Junio C Hamano @ 2006-04-30 22:02 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git
In-Reply-To: <Pine.LNX.4.63.0604301405150.2026@wbgn013.biozentrum.uni-wuerzburg.de>

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> The C'ification of push left these behind.

> +#define MAX_REFSPECS 10
> +static int current_refspec = 0;
> +static char *refspecs_[MAX_REFSPECS];

While this fix makes perfect sense, I think MAX_URI set to 16 is
reasonable hard limit, but I am not happy about giving hard
limit to MAX_REFSPECS -- that's a regression from the shell
script one.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox