git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* git diff: support "-U" and "--unified" options properly
@ 2006-05-13 20:23 Linus Torvalds
  2006-05-13 20:30 ` Junio C Hamano
  2006-05-13 20:59 ` Junio C Hamano
  0 siblings, 2 replies; 21+ messages in thread
From: Linus Torvalds @ 2006-05-13 20:23 UTC (permalink / raw)
  To: Junio C Hamano, Git Mailing List


We used to parse "-U" and "--unified" as part of the GIT_DIFF_OPTS 
environment variable, but strangely enough we would _not_ parse them as 
part of the normal diff command line (where we only accepted "-u").

This adds parsing of -U and --unified, both with an optional numeric 
argument. So now you can just say

	git diff --unified=5

to get a unified diff with a five-line context, instead of having to do 
something silly like

	GIT_DIFF_OPTS="--unified=5" git diff -u

(that silly format does continue to still work, of course).

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---

[ Maybe this can still hit 1.3.3? ]

diff --git a/combine-diff.c b/combine-diff.c
index 8a8fe38..64b20cc 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -608,6 +608,7 @@ static int show_patch_diff(struct combin
 	int abbrev = opt->full_index ? 40 : DEFAULT_ABBREV;
 	mmfile_t result_file;
 
+	context = opt->context;
 	/* Read the result of merge first */
 	if (!working_tree_file)
 		result = grab_blob(elem->sha1, &result_size);
diff --git a/diff.c b/diff.c
index 7a7b839..be925a3 100644
--- a/diff.c
+++ b/diff.c
@@ -558,7 +558,7 @@ static void builtin_diff(const char *nam
 
 		ecbdata.label_path = lbl;
 		xpp.flags = XDF_NEED_MINIMAL;
-		xecfg.ctxlen = 3;
+		xecfg.ctxlen = o->context;
 		xecfg.flags = XDL_EMIT_FUNCNAMES;
 		if (!diffopts)
 			;
@@ -1182,6 +1182,7 @@ void diff_setup(struct diff_options *opt
 	options->line_termination = '\n';
 	options->break_opt = -1;
 	options->rename_limit = -1;
+	options->context = 3;
 
 	options->change = diff_change;
 	options->add_remove = diff_addremove;
@@ -1222,11 +1223,60 @@ int diff_setup_done(struct diff_options 
 	return 0;
 }
 
+int opt_arg(const char *arg, int arg_short, const char *arg_long, int *val)
+{
+	char c, *eq;
+	int len;
+
+	if (*arg != '-')
+		return 0;
+	c = *++arg;
+	if (!c)
+		return 0;
+	if (c == arg_short) {
+		c = *++arg;
+		if (!c)
+			return 1;
+		if (val && isdigit(c)) {
+			char *end;
+			int n = strtoul(arg, &end, 10);
+			if (*end)
+				return 0;
+			*val = n;
+			return 1;
+		}
+		return 0;
+	}
+	if (c != '-')
+		return 0;
+	arg++;
+	eq = strchr(arg, '=');
+	if (eq)
+		len = eq - arg;
+	else
+		len = strlen(arg);
+	if (!len || strncmp(arg, arg_long, len))
+		return 0;
+	if (eq) {
+		int n;
+		char *end;
+		if (!isdigit(*++eq))
+			return 0;
+		n = strtoul(eq, &end, 10);
+		if (*end)
+			return 0;
+		*val = n;
+	}
+	return 1;
+}
+
 int diff_opt_parse(struct diff_options *options, const char **av, int ac)
 {
 	const char *arg = av[0];
 	if (!strcmp(arg, "-p") || !strcmp(arg, "-u"))
 		options->output_format = DIFF_FORMAT_PATCH;
+	else if (opt_arg(arg, 'U', "unified", &options->context))
+		options->output_format = DIFF_FORMAT_PATCH;
 	else if (!strcmp(arg, "--patch-with-raw")) {
 		options->output_format = DIFF_FORMAT_PATCH;
 		options->with_raw = 1;
diff --git a/diff.h b/diff.h
index d052608..bef586d 100644
--- a/diff.h
+++ b/diff.h
@@ -32,6 +32,7 @@ struct diff_options {
 		 full_index:1,
 		 silent_on_remove:1,
 		 find_copies_harder:1;
+	int context;
 	int break_opt;
 	int detect_rename;
 	int line_termination;

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: git diff: support "-U" and "--unified" options properly
  2006-05-13 20:23 git diff: support "-U" and "--unified" options properly Linus Torvalds
@ 2006-05-13 20:30 ` Junio C Hamano
  2006-05-13 20:59 ` Junio C Hamano
  1 sibling, 0 replies; 21+ messages in thread
From: Junio C Hamano @ 2006-05-13 20:30 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List

Linus Torvalds <torvalds@osdl.org> writes:

> We used to parse "-U" and "--unified" as part of the GIT_DIFF_OPTS 
> environment variable, but strangely enough we would _not_ parse them as 
> part of the normal diff command line (where we only accepted "-u").
>
> This adds parsing of -U and --unified, both with an optional numeric 
> argument. So now you can just say
>
> 	git diff --unified=5
>
> [ Maybe this can still hit 1.3.3? ]

I think so.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: git diff: support "-U" and "--unified" options properly
  2006-05-13 20:23 git diff: support "-U" and "--unified" options properly Linus Torvalds
  2006-05-13 20:30 ` Junio C Hamano
@ 2006-05-13 20:59 ` Junio C Hamano
  2006-05-13 21:05   ` Linus Torvalds
  1 sibling, 1 reply; 21+ messages in thread
From: Junio C Hamano @ 2006-05-13 20:59 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List

Linus Torvalds <torvalds@osdl.org> writes:

> [ Maybe this can still hit 1.3.3? ]

Ah, we did not pass the diffopt to the function builtin_diff() in
1.3.X series, so not really.  We could forward port but I do not
know if it is worth the effort of backporting while stripping
the updates of the whole chain of diff generation from the post
1.3.0 "master" work.  I have to think a bit.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: git diff: support "-U" and "--unified" options properly
  2006-05-13 20:59 ` Junio C Hamano
@ 2006-05-13 21:05   ` Linus Torvalds
  2006-05-13 21:22     ` Junio C Hamano
  0 siblings, 1 reply; 21+ messages in thread
From: Linus Torvalds @ 2006-05-13 21:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List



On Sat, 13 May 2006, Junio C Hamano wrote:
>
> Linus Torvalds <torvalds@osdl.org> writes:
> 
> > [ Maybe this can still hit 1.3.3? ]
> 
> Ah, we did not pass the diffopt to the function builtin_diff() in
> 1.3.X series, so not really.

Ahh, ok. Never mind. It's not like people have been clamoring for it, it 
just seemed to be such a _silly_ thing.

Might as well just go into the curren development tree, and then we'll 
have it fixed eventually (1.4.0?)

		Linus

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: git diff: support "-U" and "--unified" options properly
  2006-05-13 21:05   ` Linus Torvalds
@ 2006-05-13 21:22     ` Junio C Hamano
  2006-05-13 22:39       ` Linus Torvalds
                         ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Junio C Hamano @ 2006-05-13 21:22 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> On Sat, 13 May 2006, Junio C Hamano wrote:
>>
>> Linus Torvalds <torvalds@osdl.org> writes:
>> 
>> > [ Maybe this can still hit 1.3.3? ]
>> 
>> Ah, we did not pass the diffopt to the function builtin_diff() in
>> 1.3.X series, so not really.
>
> Ahh, ok. Never mind. It's not like people have been clamoring for it, it 
> just seemed to be such a _silly_ thing.
>
> Might as well just go into the curren development tree, and then we'll 
> have it fixed eventually (1.4.0?)

No question about that part.  I've been meaning to start drawing
the line of what to have in 1.4 and what to leave out, but the
last week was shot so I haven't got around to.

A rough outline.

I'd like to have the following topics from "next":

 * cvsserver and cvsexportcommit updates (ml/cvs)

   Ready.

 * config syntax (lt/config)

   Ready.

 * built-in grep (jc/grep)

   Ready.

 * built-in format-patch (js/fmt-patch)

   Some features are still missing compared to the script
   version.

 * remotes/ information from .git/config (js/fetchconfig)

   This by itself is more or less ready, but I would like to
   further adjust it to the "per branch configuration"
   discussion before pushing it out.

   I'd like to eventually arrange things like this.

   [branch "master"]
	remote = "ko-private"
	; prevent "reset --hard" from rewinding past this.
        rewind-barrier = refs/heads/ko-master

   ; my private build areas on the kernel.org machines
   [remote "ko-private"]
   	url = "x86-64-build.kernel.org:git"
   	url = "i386-build.kernel.org:git"
        push = master:origin
        push = next:next
        push = +pu:pu
        push = maint:maint

   ; for publishing and keeping track of what I pushed there last time
   [remote "ko"]
   	url = "kernel.org:/pub/scm/git/git.git"
        push = master:master
        push = next:next
        push = +pu:pu
        push = maint:maint
	fetch = master:ko-master
	fetch = next:ko-next
        fetch = +pu:ko-pu
        fetch = maint:ko-maint

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: git diff: support "-U" and "--unified" options properly
  2006-05-13 21:22     ` Junio C Hamano
@ 2006-05-13 22:39       ` Linus Torvalds
  2006-05-15  0:39         ` Junio C Hamano
  2006-05-14  7:00       ` Martin Langhoff
  2006-05-14 12:57       ` Josef Weidendorfer
  2 siblings, 1 reply; 21+ messages in thread
From: Linus Torvalds @ 2006-05-13 22:39 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git



On Sat, 13 May 2006, Junio C Hamano wrote:
> 
>  * built-in grep (jc/grep)
> 
>    Ready.

I'm not entirely convinced.

For the kernel, I currently can do a 

	git grep some-random-string

in about half a second.

The new built-in grep is about ten times slower.

Before:

   [torvalds@g5 linux]$ /usr/bin/time git grep some-random-string
   Command exited with non-zero status 123
   0.29user 0.30system 0:00.52elapsed 113%CPU (0avgtext+0avgdata 0maxresident)k
   0inputs+0outputs (0major+3206minor)pagefaults 0swaps

After:

   [torvalds@g5 linux]$ /usr/bin/time git grep some-random-string
   Command exited with non-zero status 1
   4.60user 0.33system 0:04.98elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
   0inputs+0outputs (0major+7369minor)pagefaults 0swaps

and that "half a second" vs "five seconds" difference is really 
noticeable.

Right now I do "git grep" as a random "ok, where was it", and it works 
very well, because it's basically instantaneous. And the difference 
between "instantaneous" and "5 seconds" is very big (the five seconds is 
also long enough that the CPU fan comes on on my G5, which is my sign of 
"too much work for the CPU".

I haven't looked at _why_ the builtin grep is ten times slower. I suspect 
it's just the regexp library being a lot slower than the external 
optimized grep, but it may also be just overhead (it looks, for example, 
like the builtin grep does all matches just one line at a time. And it 
actually reads the file in, when mmap might be more efficient, I dunno). 

Regardless, it's a huge downer.

		Linus

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: git diff: support "-U" and "--unified" options properly
  2006-05-13 21:22     ` Junio C Hamano
  2006-05-13 22:39       ` Linus Torvalds
@ 2006-05-14  7:00       ` Martin Langhoff
  2006-05-14 12:57       ` Josef Weidendorfer
  2 siblings, 0 replies; 21+ messages in thread
From: Martin Langhoff @ 2006-05-14  7:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, git

On 5/14/06, Junio C Hamano <junkio@cox.net> wrote:
> I'd like to have the following topics from "next":
>
>  * cvsserver and cvsexportcommit updates (ml/cvs)
>
>    Ready.

Yeah! What's the timeline for 1.4.0?

cheers,



martin

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: git diff: support "-U" and "--unified" options properly
  2006-05-13 21:22     ` Junio C Hamano
  2006-05-13 22:39       ` Linus Torvalds
  2006-05-14  7:00       ` Martin Langhoff
@ 2006-05-14 12:57       ` Josef Weidendorfer
  2006-05-14 17:36         ` Junio C Hamano
  2 siblings, 1 reply; 21+ messages in thread
From: Josef Weidendorfer @ 2006-05-14 12:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Saturday 13 May 2006 23:22, you wrote:
>  * remotes/ information from .git/config (js/fetchconfig)
> ...
>    [branch "master"]
> 	remote = "ko-private"

Why is this line needed? In this example, what is the relationship
of local "master" with the remote? I think it is enough to specify
the local upstream branch:

 [branch "master"]
    origin = "ko-master"

For the default pull action when on "master", we would have to look
up for remotes with a fetch line fetching into "ko-master", which
could be cumbersome. Besides, the fetch lines specify default actions
when fetching from a given remote, and there is no garantuee that we
want fetching into "ko-master" as any default action.

So we need

 [branch "ko-master"]
    tracksremote = "master of ko-private"

This also would specify that we are not allowed to commit on "ko-master".

If we do not want to have a local tracking branch at all, we would have

 [branch "master"]
    origin = "master of ko-private"

or

 [branch "master"]
    origin = "master of kernel.org:/pub/scm/git/git.git"


For a default push action when on master, I would add

 [branch "master"]
    push = "master of ko-private"

or alternatively

 [branch "master"]
    push = "master of kernel.org:/pub/scm/git/git.git"

> ...
>    [remote "ko"]
>    	url = "kernel.org:/pub/scm/git/git.git"
>       push = master:master
> ...
> 	fetch = master:ko-master

These specifications more or less are independent from the above,
as it specifies the defaults when fetching/pushing to the specified remote.

Josef

PS: Patch pending...

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: git diff: support "-U" and "--unified" options properly
  2006-05-14 12:57       ` Josef Weidendorfer
@ 2006-05-14 17:36         ` Junio C Hamano
  2006-05-14 20:49           ` Branch relationships (was:Re: git diff: support "-U" and "--unified" options properly) Josef Weidendorfer
  0 siblings, 1 reply; 21+ messages in thread
From: Junio C Hamano @ 2006-05-14 17:36 UTC (permalink / raw)
  To: Josef Weidendorfer; +Cc: git

Josef Weidendorfer <Josef.Weidendorfer@gmx.de> writes:

> On Saturday 13 May 2006 23:22, you wrote:
>>  * remotes/ information from .git/config (js/fetchconfig)
>> ...
>>    [branch "master"]
>> 	remote = "ko-private"
>
> Why is this line needed? In this example, what is the relationship
> of local "master" with the remote? I think it is enough to specify
> the local upstream branch:
>
>  [branch "master"]
>     origin = "ko-master"

I confused you by not describing the flow. 

There are four repositories involved:

 a. siamese:/git (my local development repo)
 b. x86-64.kernel.org:~/git (my kernel.org build repo)
 c. i386.kernel.org:~/git   (ditto but for i386)
 d. kernel.org:/pub/scm/git/git.git (public distribution point)

The workflow is:

 1. maint/master/next/pu are prepared on a.

 2. maint/master/next/pu are pushed to b. and c. as
    maint/origin/next/pu.  b. and c. always have their "master"
    checked out.

 3. I go to b. and c., and "pull . origin" to start my build
    there on all four branches.

 4. When things go well, I come back to a.

 5. Then maint/master/next/pu are pushed to d. without renaming.

 6. To keep track of what have been pushed, maint/master/next/pu
    from d. are fetched to a., with ko- prefixed.  There is no
    "when on this branch" involved in this step.  Regardless of
    which branch I currently on, the fetch goes fine.  I never
    check out ko-* branches on a, nor merge from ko-* branches.

 7. Go back to step 1 and start the next cycle.  ko-maint, ko-master
    and ko-next reminds me not to rewind maint/master/next while I
    shuffle the changes to discard botched commits beyond them.

> So we need
>
>  [branch "ko-master"]
>     tracksremote = "master of ko-private"
>
> This also would specify that we are not allowed to commit on "ko-master".

For my workflow, it is "master of ko"; your notation expresses
the same constraints more explicitly by being more special
purpose: "This tracks that one so never touch it any way other
than fetching into it" (we may not even allow checking out
"ko-master" -- I dunno).  

One issue you might want to think about is it is far more
efficient to fetch multiple branches from the same git://... URL
is than fetching them one by one.  The push has exactly the same
property.

Another thing is the above talks only about constraints, and the
user has to go all over the config file to find "xxx to
ko-private" in order to figure out what happens when he says
"pull ko-private".

>> ...
>>    [remote "ko"]
>>    	url = "kernel.org:/pub/scm/git/git.git"
>>       push = master:master
>> ...

>> 	fetch = master:ko-master
>

> These specifications more or less are independent from the above,
> as it specifies the defaults when fetching/pushing to the specified remote.

Not really; and what you introduced can conflict with [remote]
specification.  If you have [branch "ko-master"] that says it
tracks remote "master" from "ko" repository, your [remote "ko"]
should have not say "fetch = foobla:ko-master", so in that sense
it is redundant and inviting later inconsistency.  The only
information you added with "tracksremote" is the branch is used
to track what is on remote (implying we'd better not touch it
ourselves), so I'd say this would make sense

	[branch "ko-master"]
        	tracksremote ; bool!
	[remote "ko"]
        	url = git://git.kernel.org/pub/scm/git/git.git
        	fetch = master:ko-master

or alternatively this would:

	[branch "ko-master"]
		tracksremote = "master of ko"
	[remote "ko"]
        	url = git://git.kernel.org/pub/scm/git/git.git

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Branch relationships (was:Re: git diff: support "-U" and "--unified" options properly)
  2006-05-14 17:36         ` Junio C Hamano
@ 2006-05-14 20:49           ` Josef Weidendorfer
  2006-05-14 21:20             ` Branch relationships Junio C Hamano
  0 siblings, 1 reply; 21+ messages in thread
From: Josef Weidendorfer @ 2006-05-14 20:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Sunday 14 May 2006 19:36, Junio C Hamano wrote:
> Josef Weidendorfer <Josef.Weidendorfer@gmx.de> writes:
> 
> > On Saturday 13 May 2006 23:22, you wrote:
> >>  * remotes/ information from .git/config (js/fetchconfig)
> >> ...
> >>    [branch "master"]
> >> 	remote = "ko-private"

I still do not understand the semantic of this line.
Is this supposed to do "git pull ko-private" as default pull
action and "git push ko-private" as default push?

So using

>   ; my private build areas on the kernel.org machines
>    [remote "ko-private"]
>         url = "x86-64-build.kernel.org:git"
>         url = "i386-build.kernel.org:git"
>         push = master:origin
> ...

specifies that "git push" should push to both URLs?
This is really confusing: Is the remote "ko-private" now at
"x86-64-build.kernel.org:git" or at "i386-build.kernel.org:git" ?

Probably its better to have
  [branch "master"]
    push-remote = "ko-private-x86-64 ko-private-i386"
and entries for 2 remotes.

Neverless, I missed the info "Which branch should be merged in a default
pull after fetching the given branches from remote". I understand that
this is not needed in your workflow, as you have no upstream.

> >  [branch "ko-master"]
> >     tracksremote = "master of ko-private"
> >
> > This also would specify that we are not allowed to commit on "ko-master".
> 
> For my workflow, it is "master of ko"; your notation expresses
> the same constraints more explicitly by being more special
> purpose

Why is this more special purpose?
IMHO the only difference is that your proposal puts this information into
the remote, and I am putting it to the branch as attribute.

I chose it this way because I always thought that the Pull-lines in .git/remotes
where only about giving a default fetch action, ie. they are optionally. But
even if I do not want to have a remote branch fetched on default, I want to put
the information "XXX of remote RRR is the upstream of local YYY" into some place
to be able to do a pull when on that branch. 

> : "This tracks that one so never touch it any way other 
> than fetching into it" (we may not even allow checking out
> "ko-master" -- I dunno).  
> 
> One issue you might want to think about is it is far more
> efficient to fetch multiple branches from the same git://... URL
> is than fetching them one by one.  The push has exactly the same
> property.

Perhaps an option "when pulling into this branch from a remote, also fetch
all branches tracked from this remote", or another one "when fetching/pulling
into this branch, update all other branches, too".

> Another thing is the above talks only about constraints, and the
> user has to go all over the config file to find "xxx to
> ko-private" in order to figure out what happens when he says
> "pull ko-private".

I wasn't intending to change the current semantics of the remote section,
which specifies what to do when pulling/fetching/pushing from/into this
remote.

But you are right; this could get inconsistent. Better find a way to
not be able to specify an inconsistent config in the first place.

However, such an inconsistent config is already possible today:

 [remote "r1"]
    fetch = master:master
 [remote "r2"]
    fetch = master:master

This is not really better than

 [branch "master"]
    tracksremote "master of r1"
 [remote "r1"]
    fetch = "notmaster:master"

We already print out a warning when updating a branch tracking a remote
is not a fast-forward.
So I would leave it as it is, and allow both ways.

This all would be mood if we use refs/remotes consequently: existance of
a branch "refs/remotes/r1/master" implicitly specifies that this is
a branch tracking a remote, where the remote is named "r1" and the
remote branch is "master". 

With

 [core]
   useSeparateRemote ; bool

 [r1]
   url = ...
   fetch = master branch2 branch3

a "git fetch r1" would track given remote branches in "remotes/r1/<branch>",
and a

 [branch "master"]
   origin = "remotes/r1/master"

would work even without the "fetch" line in the remote section.
Here, we can allow both ways without getting inconsistent, as we implicitly
have

 [branch "remotes/r1/master"]
   tracksremote = "master of r1"

Regarding syntax: Instead of the "... of ..." values I would prefer two keys
"remotebranch"/"remote".

Josef

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Branch relationships
  2006-05-14 20:49           ` Branch relationships (was:Re: git diff: support "-U" and "--unified" options properly) Josef Weidendorfer
@ 2006-05-14 21:20             ` Junio C Hamano
  2006-05-14 22:01               ` Josef Weidendorfer
  0 siblings, 1 reply; 21+ messages in thread
From: Junio C Hamano @ 2006-05-14 21:20 UTC (permalink / raw)
  To: Josef Weidendorfer; +Cc: git

Josef Weidendorfer <Josef.Weidendorfer@gmx.de> writes:

> On Sunday 14 May 2006 19:36, Junio C Hamano wrote:
>> Josef Weidendorfer <Josef.Weidendorfer@gmx.de> writes:
>> 
>> > On Saturday 13 May 2006 23:22, you wrote:
>> >>  * remotes/ information from .git/config (js/fetchconfig)
>> >> ...
>> >>    [branch "master"]
>> >> 	remote = "ko-private"
>
> I still do not understand the semantic of this line.
> Is this supposed to do "git pull ko-private" as default pull
> action and "git push ko-private" as default push?
>
> So using
>
>>   ; my private build areas on the kernel.org machines
>>    [remote "ko-private"]
>>         url = "x86-64-build.kernel.org:git"
>>         url = "i386-build.kernel.org:git"
>>         push = master:origin
>> ...
>
> specifies that "git push" should push to both URLs?

Exactly.  I would _want_ to push to both with single action when
I say "git push ko-private".  Actually I have _never_ felt need
to, but Linus wanted it first and I think it makes sense.

> This is really confusing: Is the remote "ko-private" now at
> "x86-64-build.kernel.org:git" or at "i386-build.kernel.org:git" ?

Neither.

Perhaps reading my comment on #git log from yesterday or so,
where I talk about ".git/branches is about configuration
per-local branches, and .git/remotes is about giving a
short-hand to remote repositories that are used often",
might be helpful.  

> Neverless, I missed the info "Which branch should be merged in a default
> pull after fetching the given branches from remote". I understand that
> this is not needed in your workflow, as you have no upstream.

Not with git but in other projects I do.

That is what "branch.foo.remote = this-remote" is about.  When
working on foo branch, use what is described in
remote."this-remote" section for "git pull".
remote."this-remote" would have url and one or more fetch lines,
and as usual the first fetch line would say "merge this thing".
This gives the continuity in semantics while migrating from
.git/remotes/foo to [remote "foo"] section of the config file.

>> >  [branch "ko-master"]
>> >     tracksremote = "master of ko-private"
>> >
>> > This also would specify that we are not allowed to commit on "ko-master".
>> 
>> For my workflow, it is "master of ko"; your notation expresses
>> the same constraints more explicitly by being more special
>> purpose
>
> Why is this more special purpose?

It is more special purpose than just saying "do not touch this
myself".  You are saying "do not touch this myself because it
tracks a remote".  I do not think it is necessary to state the
reason.

> Perhaps an option "when pulling into this branch from a remote, also fetch
> all branches tracked from this remote", or another one "when fetching/pulling
> into this branch, update all other branches, too".

You already have "when fetching from this remote, fetch all
these branches and store them in the tracking branches", and
that is what .git/remotes/ is about.  The [remote] section in
the configuration file has the same semantics.  What problem are
you trying to fix by doing things differently?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Branch relationships
  2006-05-14 21:20             ` Branch relationships Junio C Hamano
@ 2006-05-14 22:01               ` Josef Weidendorfer
  2006-05-14 22:19                 ` Junio C Hamano
  0 siblings, 1 reply; 21+ messages in thread
From: Josef Weidendorfer @ 2006-05-14 22:01 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Sunday 14 May 2006 23:20, Junio C Hamano wrote:
> Josef Weidendorfer <Josef.Weidendorfer@gmx.de> writes:
> >>   ; my private build areas on the kernel.org machines
> >>    [remote "ko-private"]
> >>         url = "x86-64-build.kernel.org:git"
> >>         url = "i386-build.kernel.org:git"
> >>         push = master:origin
> >> ...
> >
> > specifies that "git push" should push to both URLs?
> 
> Exactly.  I would _want_ to push to both with single action when
> I say "git push ko-private".  Actually I have _never_ felt need
> to, but Linus wanted it first and I think it makes sense.

Hmmm. Isn't this a solution for a very special use-case?
You even can not specify different push lines for the 2 URLs.
I think you want an alias name for a group of remotes here?
Why not

 [remotealias "ko-private"]
   remote = "ko-private-x86-64"
   remote = "ko-private-i386"

Neverless, this wasn't the thing I was after.

> That is what "branch.foo.remote = this-remote" is about.

OK. Sorry, I somehow missed this.

> When 
> working on foo branch, use what is described in
> remote."this-remote" section for "git pull".
> remote."this-remote" would have url and one or more fetch lines,
> and as usual the first fetch line would say "merge this thing".
> This gives the continuity in semantics while migrating from
> .git/remotes/foo to [remote "foo"] section of the config file.

The only thing I wanted to discuss in this thread is exactly the
limitation of "as usual the first fetch line..." by the branch
attribute "origin", which lead me to the alternative "tracksremote"
attribute. Sorry about any confusion.

I suppose "branch.<branch name>.origin" is still the way to go for
specifying the upstream?

Josef

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Branch relationships
  2006-05-14 22:01               ` Josef Weidendorfer
@ 2006-05-14 22:19                 ` Junio C Hamano
  2006-05-14 23:04                   ` Josef Weidendorfer
  0 siblings, 1 reply; 21+ messages in thread
From: Junio C Hamano @ 2006-05-14 22:19 UTC (permalink / raw)
  To: Josef Weidendorfer; +Cc: git

Josef Weidendorfer <Josef.Weidendorfer@gmx.de> writes:

>> Exactly.  I would _want_ to push to both with single action when
>> I say "git push ko-private".  Actually I have _never_ felt need
>> to, but Linus wanted it first and I think it makes sense.
>
> Hmmm. Isn't this a solution for a very special use-case?
> You even can not specify different push lines for the 2 URLs.
> I think you want an alias name for a group of remotes here?

Perhaps.  

The "push to multiple places" is mostly for Linus backing things
up, and I tend to agree that your "alias" notation makes things
appear to be more general.  However, I do not think you would
want to push to two different places with different branch
mappings, so I suspect that generality is not buying you much
while making things more easily misconfigured.

> I suppose "branch.<branch name>.origin" is still the way to go for
> specifying the upstream?

Probably "origin" is a better name for it; I was assuming
"branch.<branch name>.remote = foo" refers to a [remote "foo"]
section and means "when on this branch, pull from foo and merge
from it".

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Branch relationships
  2006-05-14 22:19                 ` Junio C Hamano
@ 2006-05-14 23:04                   ` Josef Weidendorfer
  2006-05-14 23:55                     ` Junio C Hamano
  0 siblings, 1 reply; 21+ messages in thread
From: Josef Weidendorfer @ 2006-05-14 23:04 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Monday 15 May 2006 00:19, you wrote:
> > I suppose "branch.<branch name>.origin" is still the way to go for
> > specifying the upstream?
> 
> Probably "origin" is a better name for it; I was assuming
> "branch.<branch name>.remote = foo" refers to a [remote "foo"]
> section and means "when on this branch, pull from foo and merge
> from it".

Maybe.

But there is a misunderstanding. I wanted the branch attribute "origin"
to specify the upstream _branch_, not a remote.
After a "git clone", we would have

 [remote "origin"]
   url = ...
   fetch = master:origin

 [branch "master"]
   origin = "origin" ; upstream of master is local branch "origin"

 [branch "origin"]
   tracksremote ; bool

Now adding a further development branch for remote branch "topic", we would add

 [remote "origin"]
   ...
   fetch = topic:tracking-topic

 [branch "local-topic"]
   origin = "tracking-topic"

 [branch "tracking-topic"]
   tracksremote

Now, a "git pull" on branch "local-topic" does the right thing: it fetches
from remote "origin", as "tracking-topic" is given in a refspec there, and merges
"tracking-topic" to the current branch "local-topic", as given by the origin
attribute.

This also extends to local upstreams: a "git checkout -b topic2 master" would
append

 [branch "topic2"]
   origin = "master"

and a "git pull" on topic2 would merge the upstream "master".

Josef

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Branch relationships
  2006-05-14 23:04                   ` Josef Weidendorfer
@ 2006-05-14 23:55                     ` Junio C Hamano
  2006-05-15  1:48                       ` Josef Weidendorfer
  0 siblings, 1 reply; 21+ messages in thread
From: Junio C Hamano @ 2006-05-14 23:55 UTC (permalink / raw)
  To: Josef Weidendorfer; +Cc: git

Josef Weidendorfer <Josef.Weidendorfer@gmx.de> writes:

> On Monday 15 May 2006 00:19, you wrote:
>> > I suppose "branch.<branch name>.origin" is still the way to go for
>> > specifying the upstream?
>> 
>> Probably "origin" is a better name for it; I was assuming
>> "branch.<branch name>.remote = foo" refers to a [remote "foo"]
>> section and means "when on this branch, pull from foo and merge
>> from it".
>
> Maybe.
>
> But there is a misunderstanding. I wanted the branch attribute "origin"
> to specify the upstream _branch_, not a remote.
>
> After a "git clone", we would have
>
>  [remote "origin"]
>    url = ...
>    fetch = master:origin
>
>  [branch "master"]
>    origin = "origin" ; upstream of master is local branch "origin"

Doesn't that arrangement force people to use tracking branches?
IOW, "git pull somewhere that-head" fetches that-head, without
storing it anywhere in the local repository as a tracking
branch, and immediately merges it to the current branch.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: git diff: support "-U" and "--unified" options properly
  2006-05-13 22:39       ` Linus Torvalds
@ 2006-05-15  0:39         ` Junio C Hamano
  2006-05-15  0:58           ` Linus Torvalds
  0 siblings, 1 reply; 21+ messages in thread
From: Junio C Hamano @ 2006-05-15  0:39 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> On Sat, 13 May 2006, Junio C Hamano wrote:
>> 
>>  * built-in grep (jc/grep)
>> 
>>    Ready.
>
> I'm not entirely convinced.

I am not either after seeing your numbers and trying them
myself.  I was going to look at it myself over the weekend, but
I ended up spending most of the time migrating my environment,
so no progress on that front yet.  Sorry.



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: git diff: support "-U" and "--unified" options properly
  2006-05-15  0:39         ` Junio C Hamano
@ 2006-05-15  0:58           ` Linus Torvalds
  2006-05-15  1:51             ` Junio C Hamano
  2006-05-15  3:49             ` Linus Torvalds
  0 siblings, 2 replies; 21+ messages in thread
From: Linus Torvalds @ 2006-05-15  0:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git



On Sun, 14 May 2006, Junio C Hamano wrote:
> 
> I am not either after seeing your numbers and trying them
> myself.  I was going to look at it myself over the weekend, but
> I ended up spending most of the time migrating my environment,
> so no progress on that front yet.  Sorry.

I was looking at it, and I just suspect that "grep" is very optimized.

At the same time, the built-in one had many good features, notably that 
you could grep the cached copy and from a specific version. So I think it 
makes sense.

I have this suspicion that the best solution is to just handle the "grep 
against current tree" separately, and even make that something that only 
gets enabled on UNIX - I assume that trying to execve() grep externally 
would suck on windows again.

So I would actually assume that the solution is to simply make 
grep_cache() have a simple

	#ifdef __unix__
		if (!cached) {
			hit = external_grep(opt, paths, cached);
			if (hit >= 0)
				return hit;
		}
	#endif

at the top, so that we'd have the best of both worlds.

The "external_grep()" should look something like this:

	#define MAXARGS 1000

	static int external_grep(struct grep_opt *opt, const char **paths, int cached)
	{
		int nr;
		char *argv[MAXARGS];

		read_cache();
		argv[0] = "grep";
		argv[1] = "-e";
		argv[2] = opt->pattern;	/* whatever */
		argv[3] = "--";
		argc = 4;

		for (nr = 0; nr < active_nr; nr++) {
			struct cache_entry *ce = active_cache[nr];
			if (ce_stage(ce) || !S_ISREG(ntohl(ce->ce_mode)))
				continue;
			if (!pathspec_matches(paths, ce->name))
				continue;
			argv[argc++] = ce->name;
			if (argc < MAXARGS)
				continue;
			hit += exec_grep(argv, argc);
			argc = 4;
		}
		if (argc > 4)
			hit += exec_grep(argv, argc);
		return hit;
	}

and you're all done. Except for testing, and fixing my stupid bugs, which 
I'm too lazy to do.

The whole "exec_grep()" should basically be the same as "spawn_prog()". 
You get the idea.

Anybody?

		Linus

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Branch relationships
  2006-05-14 23:55                     ` Junio C Hamano
@ 2006-05-15  1:48                       ` Josef Weidendorfer
  2006-05-15  2:11                         ` Junio C Hamano
  0 siblings, 1 reply; 21+ messages in thread
From: Josef Weidendorfer @ 2006-05-15  1:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Monday 15 May 2006 01:55, you wrote:
> > After a "git clone", we would have
> >
> >  [remote "origin"]
> >    url = ...
> >    fetch = master:origin
> >
> >  [branch "master"]
> >    origin = "origin" ; upstream of master is local branch "origin"
> 
> Doesn't that arrangement force people to use tracking branches?

I think we already had the same discussion some time ago ;-)
This asks for support of a way to specify a remote origin branch
without any local tracking, like
  origin = "<remotebranch> of <remote>"

But not supporting this in a first step should be no problem, as
you always can do "git pull somewhere that-head" directly.

A side-effect of the origin-arrangement would be that branching
off a tracking branch would automatically update it on pull from
the new branch. I would see this as improvement of usability as
"generating a local development branch from a remote one" does
not need any special command.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: git diff: support "-U" and "--unified" options properly
  2006-05-15  0:58           ` Linus Torvalds
@ 2006-05-15  1:51             ` Junio C Hamano
  2006-05-15  3:49             ` Linus Torvalds
  1 sibling, 0 replies; 21+ messages in thread
From: Junio C Hamano @ 2006-05-15  1:51 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> On Sun, 14 May 2006, Junio C Hamano wrote:
>> 
>> I am not either after seeing your numbers and trying them
>> myself.  I was going to look at it myself over the weekend, but
>> I ended up spending most of the time migrating my environment,
>> so no progress on that front yet.  Sorry.
>
> I was looking at it, and I just suspect that "grep" is very optimized.
>
> At the same time, the built-in one had many good features, notably that 
> you could grep the cached copy and from a specific version. So I think it 
> makes sense.
> 
> ... Except for testing, and fixing my stupid bugs, which 
> I'm too lazy to do.

The command line flags needs to be
unparsed before feeding the external grep.

> The whole "exec_grep()" should basically be the same as "spawn_prog()". 
> You get the idea.
>
> Anybody?

I think somebody wanted to have his own funky grep spawned off
from his PATH, so I'd leave that to him (or her I do not
remember which) ;-).

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Branch relationships
  2006-05-15  1:48                       ` Josef Weidendorfer
@ 2006-05-15  2:11                         ` Junio C Hamano
  0 siblings, 0 replies; 21+ messages in thread
From: Junio C Hamano @ 2006-05-15  2:11 UTC (permalink / raw)
  To: Josef Weidendorfer; +Cc: git

Josef Weidendorfer <Josef.Weidendorfer@gmx.de> writes:

> But not supporting this in a first step should be no problem, as
> you always can do "git pull somewhere that-head" directly.

That defeats the point of .git/remotes/ setup, and I suspect is
unacceptable to people who pull from a given branch of a given
repository repeatedly without ever tracking it.  Think of the
maintainer pulling from subsystem maintainers.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: git diff: support "-U" and "--unified" options properly
  2006-05-15  0:58           ` Linus Torvalds
  2006-05-15  1:51             ` Junio C Hamano
@ 2006-05-15  3:49             ` Linus Torvalds
  1 sibling, 0 replies; 21+ messages in thread
From: Linus Torvalds @ 2006-05-15  3:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git



On Sun, 14 May 2006, Linus Torvalds wrote:
> 
> So I would actually assume that the solution is to simply make 
> grep_cache() have a simple
> 
> 	#ifdef __unix__
> 		if (!cached) {
> 			hit = external_grep(opt, paths, cached);
> 			if (hit >= 0)
> 				return hit;
> 		}
> 	#endif
> 
> at the top, so that we'd have the best of both worlds.

Ok. Here's a slightly tested version, my out-lined thing from the email 
was pretty close to working already.

It's not perfect, but it gets the "git grep some-random-string" down to 
the good old half-a-second range for the kernel.

It should convert more of the argument flags for "grep", that should be 
trivial to expand (I did a few just as an example). It should also bother 
to try to return the right "hit" value (which it doesn't, right now - the 
code is kind of there, but I didn't actually bother to do it _right_).

Also, right now it _just_ limits by number of arguments, but it should 
also strictly speaking limit by total argument size (ie add up the length 
of the filenames, and do the "exec_grep()" flush call if it's bigger than 
some random value like 32kB).

But I think that it's _conceptually_ doing all the right things, and it 
seems to work. So maybe somebody else can do some of the final polish.

		Linus

----
diff --git a/builtin-grep.c b/builtin-grep.c
index fead356..14471db 100644
--- a/builtin-grep.c
+++ b/builtin-grep.c
@@ -12,6 +12,7 @@ #include "tree-walk.h"
 #include "builtin.h"
 #include <regex.h>
 #include <fnmatch.h>
+#include <sys/wait.h>
 
 /*
  * git grep pathspecs are somewhat different from diff-tree pathspecs;
@@ -409,12 +410,90 @@ static int grep_file(struct grep_opt *op
 	return i;
 }
 
+static int exec_grep(int argc, const char **argv)
+{
+	pid_t pid;
+	int status;
+
+	argv[argc] = NULL;
+	pid = fork();
+	if (pid < 0)
+		return pid;
+	if (!pid) {
+		execvp("grep", (char **) argv);
+		exit(255);
+	}
+	while (waitpid(pid, &status, 0) < 0) {
+		if (errno == EINTR)
+			continue;
+		return -1;
+	}
+	if (WIFEXITED(status)) {
+		if (!WEXITSTATUS(status))
+			return 1;
+		return 0;
+	}
+	return -1;
+}
+
+#define MAXARGS 1000
+
+static int external_grep(struct grep_opt *opt, const char **paths, int cached)
+{
+	int i, nr, argc, hit;
+	const char *argv[MAXARGS+1];
+	struct grep_pat *p;
+
+	nr = 0;
+	argv[nr++] = "grep";
+	if (opt->word_regexp)
+		argv[nr++] = "-w";
+	if (opt->name_only)
+		argv[nr++] = "-l";
+	for (p = opt->pattern_list; p; p = p->next) {
+		argv[nr++] = "-e";
+		argv[nr++] = p->pattern;
+	}
+	argv[nr++] = "--";
+
+	hit = 0;
+	argc = nr;
+	for (i = 0; i < active_nr; i++) {
+		struct cache_entry *ce = active_cache[i];
+		if (ce_stage(ce) || !S_ISREG(ntohl(ce->ce_mode)))
+			continue;
+		if (!pathspec_matches(paths, ce->name))
+			continue;
+		argv[argc++] = ce->name;
+		if (argc < MAXARGS)
+			continue;
+		hit += exec_grep(argc, argv);
+		argc = nr;
+	}
+	if (argc > nr)
+		hit += exec_grep(argc, argv);
+	return 0;
+}
+
 static int grep_cache(struct grep_opt *opt, const char **paths, int cached)
 {
 	int hit = 0;
 	int nr;
 	read_cache();
 
+#ifdef __unix__
+	/*
+	 * Use the external "grep" command for the case where
+	 * we grep through the checked-out files. It tends to
+	 * be a lot more optimized
+	 */
+	if (!cached) {
+		hit = external_grep(opt, paths, cached);
+		if (hit >= 0)
+			return hit;
+	}
+#endif
+
 	for (nr = 0; nr < active_nr; nr++) {
 		struct cache_entry *ce = active_cache[nr];
 		if (ce_stage(ce) || !S_ISREG(ntohl(ce->ce_mode)))

^ permalink raw reply related	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2006-05-15  3:49 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-05-13 20:23 git diff: support "-U" and "--unified" options properly Linus Torvalds
2006-05-13 20:30 ` Junio C Hamano
2006-05-13 20:59 ` Junio C Hamano
2006-05-13 21:05   ` Linus Torvalds
2006-05-13 21:22     ` Junio C Hamano
2006-05-13 22:39       ` Linus Torvalds
2006-05-15  0:39         ` Junio C Hamano
2006-05-15  0:58           ` Linus Torvalds
2006-05-15  1:51             ` Junio C Hamano
2006-05-15  3:49             ` Linus Torvalds
2006-05-14  7:00       ` Martin Langhoff
2006-05-14 12:57       ` Josef Weidendorfer
2006-05-14 17:36         ` Junio C Hamano
2006-05-14 20:49           ` Branch relationships (was:Re: git diff: support "-U" and "--unified" options properly) Josef Weidendorfer
2006-05-14 21:20             ` Branch relationships Junio C Hamano
2006-05-14 22:01               ` Josef Weidendorfer
2006-05-14 22:19                 ` Junio C Hamano
2006-05-14 23:04                   ` Josef Weidendorfer
2006-05-14 23:55                     ` Junio C Hamano
2006-05-15  1:48                       ` Josef Weidendorfer
2006-05-15  2:11                         ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).