Git development
 help / color / mirror / Atom feed
* Re: Gitk strangeness..
From: Junio C Hamano @ 2006-03-29  6:41 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: git
In-Reply-To: <17449.48630.370867.10251@cargo.ozlabs.ibm.com>

Paul Mackerras <paulus@samba.org> writes:

> That's brilliant.  Thank you!  With the patch to gitk below, the
> graph display on Linus' example looks much saner.

Indeed this looks much saner.  Thanks.

^ permalink raw reply

* Re: How to switch kernel customizations from 2.6.15.6 to 2.6.16?
From: Junio C Hamano @ 2006-03-29  3:58 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0603281749060.15714@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> writes:

>   Personally, I think the rebase syntax sucks, because the _natural_ way 
>   to do it is to just describe the set of commits to rebase the same way 
>   we describe all _other_ commit sets: as a "begin..end" sequence.

I'd agree in general, and I am not happy about them.

But I have an excuse.

rev-parse's A..B notation was invented on June 13th (178cb24).
But format-patch was originally posted on May 30th:

	http://article.gmane.org/gmane.comp.version-control.git/4279

before the convenience of rev-parse was invented ;-).

>   So I think rebase _should_ work something like this:
>
> 	git rebase origin.. [--onto] linus
>
>   ie just giving an arbitrary range.

In addition, both rebase and format-patch does a bit more than
straight his..mine.

    *---x---x---o---o---o---o
     \                      ^mine
      .---.---.---.
                  ^his

We do _not_ want to process all six of his..mine commits when
doing "format-patch his mine" in the above picture, because
upstream might have accepted some of them already, and we filter
them out with git-cherry.  

>   This is even more noticeable for "git-format-patch", where
>   that insane "<his> [<mine>]" syntax is even worse, for no
>   good reason, when again it should really just work like "git
>   diff" where giving a single revision implies a single
>   revision, and giving a range implies a range, and no strange
>   "mine" vs "his" rules ]

Having said that, you have been able to say format-patch A..B
C..D E..F for quite some time (since November 21, 2005).

Rebase is even more strange, especially with --onto.  When you do

    $ rebase --onto his origin mine

in this picture,

    *---x---x---o---o---o---o
     \      ^origin         ^mine
      .---.---.---.
                  ^his

you are discarding two 'x' commits, and lost-found is the only
thing that would help you to recover them.

Unlike format-patch which takes ranges, rebase does not let you
say "rebase --onto base A..B C..D E..F"; what happens might be
too confusing, especially if B, D, F are not coming from the
current branch.  The current branch is rewound to base and then
the chosen sets of patches are applied, which is kind-of scary.
It would feel safer to do:

	$ git checkout -b newbranch base
        $ git format-patch --stdout A..B C..D E..F | git am -3

and after making sure the result is really what you want
resetting the original branch to the current (newbranch) head.

^ permalink raw reply

* Re: How to switch kernel customizations from 2.6.15.6 to 2.6.16?
From: Junio C Hamano @ 2006-03-29  2:30 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0603281749060.15714@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> writes:

> Oh well. Syntax rant over.

Yeah, inertia, backward compatibility wart, craziness.

^ permalink raw reply

* Re: How to switch kernel customizations from 2.6.15.6 to 2.6.16?
From: Junio C Hamano @ 2006-03-29  2:26 UTC (permalink / raw)
  To: Matt McCutchen; +Cc: git
In-Reply-To: <1143596622.2481.10.camel@mattlaptop.metaesthetics.net>

Matt McCutchen <hashproduct@verizon.net> writes:

> I made a customized Linux kernel based on 2.6.15.6 by cloning the stable
> 2.6.15 kernel repository (which was then at version 2.6.15.6) and making
> several commits.  Now I would like a Linux kernel based on 2.6.16 with
> the same customizations.

Drawing ancestry graph would help visualizing what you want to
achieve.  You have:

       v2.6.15
    ---o
        \ 
         \
          \                 
           o---o---o v2.6.15.6
                    \
                     x---x---x v2.6.15.6-matt

where x---x---x are your own changes, and you want:

       v2.6.15           v2.6.16
    ---o---o---o---...---o---o
        \                     \
         \                     y---y---y 2.6.16-matt
          \                 
           o---o---o v2.6.15.6
                    \
                     x---x---x 2.6.15.6-matt

to happen, where y---y---y are analogous to x---x---x.

Assuming your branches are:

        origin - v2.6.15.6 (from stable team)
        master - your changes (2.6.15.6-matt)

you could:

        $ git fetch git://../torvalds/linux-2.6.git tag v2.6.16
        $ git checkout -b 2.6.16-matt v2.6.16
        $ git format-patch origin master | git am -3

Alternatively, you might want to do a real merge:

       v2.6.15           v2.6.16
    ---o---o---o---...---o---o
        \                     \
         \                     \ 
          \                     m 2.6.16-matt
           o---o---o v2.6.15.6 /
                    \         /
                     x---x---x 2.6.15.6-matt

Presumably the stable team backported safer changes from the
history between v2.6.15-v2.6.16, and the way things are fixed
are probably quite different from the equivalent fixes in the
development track that led to v2.6.16 (because what's being
patched has also changed), so it is very likely you would see
serious conflicts during this merge.  If you do not understand
what the stable team did in order to reimplement certain fixes,
you would have a very difficult time deciding on how to resolve
conflicts with this merge.

At that point it would not be a git question but the kernel
question I am not qualified to answer ;-), but it might be an
interesting exercise.

^ permalink raw reply

* Re: How to switch kernel customizations from 2.6.15.6 to 2.6.16?
From: Shawn Pearce @ 2006-03-29  2:23 UTC (permalink / raw)
  To: Matt McCutchen; +Cc: git
In-Reply-To: <1143596622.2481.10.camel@mattlaptop.metaesthetics.net>

Matt McCutchen <hashproduct@verizon.net> wrote:
> Dear git people,
> 
> I made a customized Linux kernel based on 2.6.15.6 by cloning the stable
> 2.6.15 kernel repository (which was then at version 2.6.15.6) and making
> several commits.  Now I would like a Linux kernel based on 2.6.16 with
> the same customizations.  This seems to be a very simple task, but I
> have been trying various combinations of git commands for several days
> and have not figured out how to do it.
> 
> I believe that means I should pull the 2.6.16 kernel into the "origin"
> branch and then rebase the "master" branch, merging my customizations
> with 2.6.16.  To this end, I switched my remote file to point to the
> 2.6.16 stable repository and tried to pull.  The result was not what I
> wanted.  The situation is complicated by the fact that 2.6.15.6 is not
> an ancestor of 2.6.16.  The warning in the man page about branches that
> are modified nonlinearly seems to apply.
> 
> How do I make my customized 2.6.16 kernel?

I think you want to use `git-fetch --force` to download origin but
not immediately merge it yet.  This will bypass the not-an-ancestor
check you are running into.

Then you can perform the rebase yourself with:

	# Export your local changes into a series of patches.
	#
	git-format-patch -k --stdout --full-index v2.6.16.6 >changes.mbox

	# Checkout the new origin (2.6.16) into master.
	#
	git-reset --hard origin

	# Now apply your patches.
	#
	git-am --binary -3 changes.mbox

If you get merge conflicts fix them up and restart with
`git-am --resolved`.


Note this is the logic of `git-rebase` except it doesn't require
you to actually have a common ancestor, while `git-rebase` does.

-- 
Shawn.

^ permalink raw reply

* Re: How to switch kernel customizations from 2.6.15.6 to 2.6.16?
From: Linus Torvalds @ 2006-03-29  2:10 UTC (permalink / raw)
  To: Matt McCutchen; +Cc: git
In-Reply-To: <1143596622.2481.10.camel@mattlaptop.metaesthetics.net>



On Tue, 28 Mar 2006, Matt McCutchen wrote:
> 
> I made a customized Linux kernel based on 2.6.15.6 by cloning the stable
> 2.6.15 kernel repository (which was then at version 2.6.15.6) and making
> several commits.  Now I would like a Linux kernel based on 2.6.16 with
> the same customizations.  This seems to be a very simple task, but I
> have been trying various combinations of git commands for several days
> and have not figured out how to do it.
> 
> I believe that means I should pull the 2.6.16 kernel into the "origin"
> branch and then rebase the "master" branch, 

Don't "Pull". "Fetch".

"Pull" implies a merge, which is not what you want.

>		 To this end, I switched my remote file to point to the
> 2.6.16 stable repository and tried to pull.  The result was not what I
> wanted.  The situation is complicated by the fact that 2.6.15.6 is not
> an ancestor of 2.6.16.  The warning in the man page about branches that
> are modified nonlinearly seems to apply.

Just realize that you can have any number of branches, and instead of 
forcing "origin" to be something that it simply is not, just create a new 
branch called "linus".

Make that point to my tree, and do

	git fetch linus

to update it. It really is that easy.

Exact commands something like this:

 (1) Edit your .git/remotes/linus file so that it has the following 
     contents:

	URL: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
	Pull: refs/heads/master:refs/heads/linus

 (2) Use that to fetch my tree as the "linus" branch:

	git fetch linus

and now you have at least a nice new branch that contains the "standard" 
kernel. At which point you just need to do

 (3) rebase the work since "origin" onto "linus":

	git rebase --onto linus origin

I really really despise the "git rebase" command line syntax, and I find 
it very non-inuitive, but what this does is take your current branch, 
compare its contents to "origin" (ie where you started from), and then 
just rebase the commits onto the state that is "linus".

[ Junio: syntax comments:

  Personally, I think the rebase syntax sucks, because the _natural_ way 
  to do it is to just describe the set of commits to rebase the same way 
  we describe all _other_ commit sets: as a "begin..end" sequence.

  So I think rebase _should_ work something like this:

	git rebase origin.. [--onto] linus

  ie just giving an arbitrary range. This is even more noticeable for 
  "git-format-patch", where that insane "<his> [<mine>]" syntax is even 
  worse, for no good reason, when again it should really just work like 
  "git diff" where giving a single revision implies a single revision, and 
  giving a range implies a range, and no strange "mine" vs "his" rules ]

Oh well. Syntax rant over.

			Linus

^ permalink raw reply

* How to switch kernel customizations from 2.6.15.6 to 2.6.16?
From: Matt McCutchen @ 2006-03-29  1:43 UTC (permalink / raw)
  To: git

Dear git people,

I made a customized Linux kernel based on 2.6.15.6 by cloning the stable
2.6.15 kernel repository (which was then at version 2.6.15.6) and making
several commits.  Now I would like a Linux kernel based on 2.6.16 with
the same customizations.  This seems to be a very simple task, but I
have been trying various combinations of git commands for several days
and have not figured out how to do it.

I believe that means I should pull the 2.6.16 kernel into the "origin"
branch and then rebase the "master" branch, merging my customizations
with 2.6.16.  To this end, I switched my remote file to point to the
2.6.16 stable repository and tried to pull.  The result was not what I
wanted.  The situation is complicated by the fact that 2.6.15.6 is not
an ancestor of 2.6.16.  The warning in the man page about branches that
are modified nonlinearly seems to apply.

How do I make my customized 2.6.16 kernel?

-- 
Matt McCutchen
hashproduct@verizon.net
http://hashproduct.metaesthetics.net/

^ permalink raw reply

* Re: git pull fails
From: Petr Baudis @ 2006-03-29  0:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vu09igk1t.fsf@assigned-by-dhcp.cox.net>

Dear diary, on Wed, Mar 29, 2006 at 02:40:30AM CEST, I got a letter
where Junio C Hamano <junkio@cox.net> said that...
> Petr Baudis <pasky@suse.cz> writes:
> 
> > If your current branch would really be a remote branch and you simply
> > git-fetched, your HEAD would change but not your working tree, and at
> > that moment things would become very confusing. Cogito would start
> > showing nonsensical stuff for cg-status and cg-diff (as well as
> > git-diff-tree HEAD output), but your index would at least still be
> > correct so I'm not sure how much attention do tools like git-diff pay to
> > it, the level of messup would be proportional to that.
> 
> People want to leave tracking branches checked out, especially
> when they are not developers but are "update to the latest and
> compile the bleeding edge" types.  Support for that mode of
> operation was invented long time ago and git-pull knows about
> it, and the idea was ported to git-cvsimport recently.

Why can't such people just have two branches, _especially_ if they are
the "update to the latest and compile the bleeding edge" types?
(Therefore well not likely to be familiar with the Git branching model
at all.)

I mean, sure, it's Core Git so the extra flexibility is nice. But I now
wonder, can you think of any plausible workflow where having one branch
instead of two would be an advantage?

Waah, cg-log git-fetch.sh, /update-head just showed me the change in
git-fetch-script from last August, with no extra work for me. The big
rename barrier annoyances finally gone forever!

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Right now I am having amnesia and deja-vu at the same time.  I think
I have forgotten this before.

^ permalink raw reply

* Re: Gitk strangeness..
From: Junio C Hamano @ 2006-03-29  0:50 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: git, Linus Torvalds
In-Reply-To: <17449.48630.370867.10251@cargo.ozlabs.ibm.com>

Paul Mackerras <paulus@samba.org> writes:

> Junio C Hamano writes:
>
>> How about this alternative patch, then?  It turned out to be
>> quite convoluted as I feared.
>
> That's brilliant.  Thank you!  With the patch to gitk below, the
> graph display on Linus' example looks much saner.
>
> Could you check in your patch to the git.git repository, please?

The patch I sent was a total mess, and the one in "pu" right now
was somewhat cleaned up but was still far suboptimal.  **Blush**

Most notably, the code from yesterday was re-injecting the
parents of the boundary commits into the list marked as
UNINTERESTING, which was unnecessary and stupid.  This one just
pops boundary commits off the list after consuming it.

Here is a cleaned-up one for eyeballing.

Although I am reasonably sure that this does not affect the way
it works when --boundary is not given, I'd pretty much
appreciate an independent sanity check on this one.  rev-list is
so fundamental to git.

-- >8 --
diff --git a/rev-list.c b/rev-list.c
index 441c437..f3a989c 100644
--- a/rev-list.c
+++ b/rev-list.c
@@ -7,9 +7,9 @@ #include "blob.h"
 #include "diff.h"
 #include "revision.h"
 
-/* bits #0-4 in revision.h */
+/* bits #0-5 in revision.h */
 
-#define COUNTED		(1u<<5)
+#define COUNTED		(1u<<6)
 
 static const char rev_list_usage[] =
 "git-rev-list [OPTION] <commit-id>... [ -- paths... ]\n"
@@ -51,6 +51,8 @@ static void show_commit(struct commit *c
 		printf("%lu ", commit->date);
 	if (commit_prefix[0])
 		fputs(commit_prefix, stdout);
+	if (commit->object.flags & BOUNDARY)
+		putchar('-');
 	fputs(sha1_to_hex(commit->object.sha1), stdout);
 	if (show_parents) {
 		struct commit_list *parents = commit->parents;
diff --git a/revision.c b/revision.c
index d7678cf..745b0d2 100644
--- a/revision.c
+++ b/revision.c
@@ -419,6 +419,27 @@ static void limit_list(struct rev_info *
 			continue;
 		p = &commit_list_insert(commit, p)->next;
 	}
+	if (revs->boundary) {
+		list = newlist;
+		while (list) {
+			struct commit *commit = list->item;
+			struct object *obj = &commit->object;
+			struct commit_list *parent = commit->parents;
+			if (obj->flags & (UNINTERESTING|BOUNDARY)) {
+				list = list->next;
+				continue;
+			}
+			while (parent) {
+				struct commit *pcommit = parent->item;
+				parent = parent->next;
+				if (!(pcommit->object.flags & UNINTERESTING))
+					continue;
+				pcommit->object.flags |= BOUNDARY;
+				p = &commit_list_insert(pcommit, p)->next;
+			}
+			list = list->next;
+		}
+	}
 	revs->commits = newlist;
 }
 
@@ -591,6 +612,10 @@ int setup_revisions(int argc, const char
 				revs->no_merges = 1;
 				continue;
 			}
+			if (!strcmp(arg, "--boundary")) {
+				revs->boundary = 1;
+				continue;
+			}
 			if (!strcmp(arg, "--objects")) {
 				revs->tag_objects = 1;
 				revs->tree_objects = 1;
@@ -731,13 +756,17 @@ struct commit *get_revision(struct rev_i
 	do {
 		struct commit *commit = revs->commits->item;
 
-		if (commit->object.flags & (UNINTERESTING|SHOWN))
+		if (commit->object.flags & SHOWN)
+			goto next;
+		if (!(commit->object.flags & BOUNDARY) &&
+		    (commit->object.flags & UNINTERESTING))
 			goto next;
 		if (revs->min_age != -1 && (commit->date > revs->min_age))
 			goto next;
 		if (revs->max_age != -1 && (commit->date < revs->max_age))
 			return NULL;
-		if (revs->no_merges && commit->parents && commit->parents->next)
+		if (revs->no_merges &&
+		    commit->parents && commit->parents->next)
 			goto next;
 		if (revs->prune_fn && revs->dense) {
 			if (!(commit->object.flags & TREECHANGE))
@@ -745,8 +774,19 @@ struct commit *get_revision(struct rev_i
 			rewrite_parents(commit);
 		}
 		/* More to go? */
-		if (revs->max_count)
-			pop_most_recent_commit(&revs->commits, SEEN);
+		if (revs->max_count) {
+			if (commit->object.flags & BOUNDARY) {
+				/* this is already uninteresting,
+				 * so there is no point popping its
+				 * parents into the list.
+				 */
+				struct commit_list *it = revs->commits;
+				revs->commits = it->next;
+				free(it);
+			}
+			else
+				pop_most_recent_commit(&revs->commits, SEEN);
+		}
 		commit->object.flags |= SHOWN;
 		return commit;
 next:
diff --git a/revision.h b/revision.h
index 6c2beca..61e6bc9 100644
--- a/revision.h
+++ b/revision.h
@@ -6,6 +6,7 @@ #define UNINTERESTING   (1u<<1)
 #define TREECHANGE	(1u<<2)
 #define SHOWN		(1u<<3)
 #define TMP_MARK	(1u<<4) /* for isolated cases; clean after use */
+#define BOUNDARY	(1u<<5)
 
 struct rev_info;
 
@@ -32,7 +33,8 @@ struct rev_info {
 			blob_objects:1,
 			edge_hint:1,
 			limited:1,
-			unpacked:1;
+			unpacked:1,
+			boundary:1;
 
 	/* special limits */
 	int max_count;

^ permalink raw reply related

* Re: git pull fails
From: Junio C Hamano @ 2006-03-29  0:40 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git
In-Reply-To: <20060329002415.GG27689@pasky.or.cz>

Petr Baudis <pasky@suse.cz> writes:

> If your current branch would really be a remote branch and you simply
> git-fetched, your HEAD would change but not your working tree, and at
> that moment things would become very confusing. Cogito would start
> showing nonsensical stuff for cg-status and cg-diff (as well as
> git-diff-tree HEAD output), but your index would at least still be
> correct so I'm not sure how much attention do tools like git-diff pay to
> it, the level of messup would be proportional to that.

People want to leave tracking branches checked out, especially
when they are not developers but are "update to the latest and
compile the bleeding edge" types.  Support for that mode of
operation was invented long time ago and git-pull knows about
it, and the idea was ported to git-cvsimport recently.

^ permalink raw reply

* Re: git pull fails
From: Petr Baudis @ 2006-03-29  0:24 UTC (permalink / raw)
  To: Timo Hirvonen; +Cc: astralstorm, git, ralf
In-Reply-To: <20060329031136.e0389c00.tihirvon@gmail.com>

Dear diary, on Wed, Mar 29, 2006 at 02:11:36AM CEST, I got a letter
where Timo Hirvonen <tihirvon@gmail.com> said that...
> Exactly.  Maybe git-fetch should abort only if it could not update the
> currently checked out branch?

That should _never_ be the case. Any modern porcelain shouldn't let you
switch your current branch to a remote one, hopefully. It's just wrong.
The supported setup is that you have a remote branch reflecting where
the upstream is and a local branch reflecting where your current tree
is, and you update your local branch by git-pull (or git-merge if you
want to avoid fetching).

If your current branch would really be a remote branch and you simply
git-fetched, your HEAD would change but not your working tree, and at
that moment things would become very confusing. Cogito would start
showing nonsensical stuff for cg-status and cg-diff (as well as
git-diff-tree HEAD output), but your index would at least still be
correct so I'm not sure how much attention do tools like git-diff pay to
it, the level of messup would be proportional to that.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Right now I am having amnesia and deja-vu at the same time.  I think
I have forgotten this before.

^ permalink raw reply

* Re: git pull fails
From: Junio C Hamano @ 2006-03-29  0:22 UTC (permalink / raw)
  To: Timo Hirvonen; +Cc: git
In-Reply-To: <20060329031136.e0389c00.tihirvon@gmail.com>

Timo Hirvonen <tihirvon@gmail.com> writes:

> Petr Baudis <pasky@suse.cz> wrote:
>
>> If I understand it right, Timo complains that git-fetch got
>> non-fastforward commits for "pu" and "next" and a good fastforward
>> commit for "master", but it didn't update the ref for ANY head, not even
>> the "master".
>
> Exactly.  Maybe git-fetch should abort only if it could not update the
> currently checked out branch?

The erroring-out is there so that the user can take notice.

^ permalink raw reply

* Re: [PATCH] xdiff: Show function names in hunk headers.
From: Junio C Hamano @ 2006-03-29  0:21 UTC (permalink / raw)
  To: Mark Wooding; +Cc: git
In-Reply-To: <slrne2ik1i.s3g.mdw@metalzone.distorted.org.uk>

Mark Wooding <mdw@distorted.org.uk> writes:

> Junio C Hamano <junkio@cox.net> wrote:
>
>> GNU diff -p does "^[[:alpha:]$_]"; personally I think any line
>> that does not begin with a whitespace is good enough.
>...
> The second suggestion is slightly refined, but a little more
> complicated.  We ask for a line which starts /either/ with two
> non-whitespace characters, or with an alphanumeric.  Why?  Because text
> documents have a tendency to have headings of the form `7 Heading!' and
> I want to catch them.

Asciidoc?

        . enumerated one
          this is one item

        . enumerated two
          this is another item

> I think I like option 2 best, as a nice compromise between stupidity and
> actually working.  Opinions, anyone?

It's just a heuristic, so there are only two things we could
sensibly do.  Either we keep it absolutely stupid to save our
code and sanity, or we give full configurability via -F regexp
to the end users.

I suspect feeping creaturism would eventually push us to go the
latter route, but for now I'd vote for doing exactly the same as
what default GNU does, by looking at the first letter without
using regexps.  When we add regexps later, the users can
customize the pattern to match the languages they use, and we
might end up having to have a set of (file-suffix -> default
regexp) mappings, with full end user configurability via
.git/config -- gaaah but true X-<.

	[diff]
        	functionline = "^\w" for .c
                functionline = "^(?i)\s*(?:function|procedure)" for .f77
                functionline = "^\(defun " for .el
                ...

^ permalink raw reply

* [PATCH] Support for pickaxe matching regular expressions
From: Petr Baudis @ 2006-03-29  0:16 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Mark Wooding, git
In-Reply-To: <Pine.LNX.4.64.0603281500280.15714@g5.osdl.org>

Dear diary, on Wed, Mar 29, 2006 at 01:03:05AM CEST, I got a letter
where Linus Torvalds <torvalds@osdl.org> said that...
> On Tue, 28 Mar 2006, Mark Wooding wrote:
> > Urgh.  So, which regex library do people want to use? ;-)  (My vote's
> > for pcre.)
> 
> ... No regexps, ...

To toss a random feature idea around, in the recent days I've found
myself thinking about regexp pickaxe several times.

And while already tossing stuff, what about a naive proof-of-concept
patch?  A silly example:

	git-whatchanged --pickaxe-regex -p -S' +$' | less -p '^[-+ ].* +$'

Then keep hitting 'n'. Good that most of the matches are deletions. :)
(Or commit messages.)

---

git-diff-* --pickaxe-regex will change the -S pickaxe to match
POSIX extended regular expressions instead of fixed strings.

The regex.h library is a rather stupid interface and I like pcre too, but
with any luck it will be everywhere we will want to run Git on, it being
POSIX.2 and all. I'm not sure if we can expect platforms like AIX to
conform to POSIX.2 or if win32 has regex.h. We might add a flag to
Makefile if there is a portability trouble potential.

Signed-off-by: Petr Baudis <pasky@suse.cz>
---

 Documentation/diff-options.txt |    4 ++
 diff.c                         |    2 +
 diff.h                         |    1 +
 diffcore-pickaxe.c             |   68 ++++++++++++++++++++++++++++++----------
 4 files changed, 58 insertions(+), 17 deletions(-)

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 2a0275e..ec6811c 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -69,6 +69,10 @@
 	changeset, not just the files that contain the change
 	in <string>.
 
+--pickaxe-regex::
+	Make the <string> not a plain string but an extended POSIX
+	regex to match.
+
 -O<orderfile>::
 	Output the patch in the order specified in the
 	<orderfile>, which has one shell glob pattern per line.
diff --git a/diff.c b/diff.c
index 8b37477..e006adb 100644
--- a/diff.c
+++ b/diff.c
@@ -883,6 +883,8 @@ int diff_opt_parse(struct diff_options *
 		options->filter = arg + 14;
 	else if (!strcmp(arg, "--pickaxe-all"))
 		options->pickaxe_opts = DIFF_PICKAXE_ALL;
+	else if (!strcmp(arg, "--pickaxe-regex"))
+		options->pickaxe_opts = DIFF_PICKAXE_REGEX;
 	else if (!strncmp(arg, "-B", 2)) {
 		if ((options->break_opt =
 		     diff_scoreopt_parse(arg)) == -1)
diff --git a/diff.h b/diff.h
index 8fac465..564c94f 100644
--- a/diff.h
+++ b/diff.h
@@ -112,6 +112,7 @@ #define DIFF_DETECT_RENAME	1
 #define DIFF_DETECT_COPY	2
 
 #define DIFF_PICKAXE_ALL	1
+#define DIFF_PICKAXE_REGEX	2
 
 extern void diffcore_std(struct diff_options *);
 
diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
index 50e46ab..d89f314 100644
--- a/diffcore-pickaxe.c
+++ b/diffcore-pickaxe.c
@@ -1,12 +1,15 @@
 /*
  * Copyright (C) 2005 Junio C Hamano
  */
+#include <regex.h>
+
 #include "cache.h"
 #include "diff.h"
 #include "diffcore.h"
 
 static unsigned int contains(struct diff_filespec *one,
-			     const char *needle, unsigned long len)
+			     const char *needle, unsigned long len,
+			     regex_t *regexp)
 {
 	unsigned int cnt;
 	unsigned long offset, sz;
@@ -17,15 +20,28 @@ static unsigned int contains(struct diff
 	sz = one->size;
 	data = one->data;
 	cnt = 0;
-
-	/* Yes, I've heard of strstr(), but the thing is *data may
-	 * not be NUL terminated.  Sue me.
-	 */
-	for (offset = 0; offset + len <= sz; offset++) {
-		/* we count non-overlapping occurrences of needle */
-		if (!memcmp(needle, data + offset, len)) {
-			offset += len - 1;
+
+	if (regexp) {
+		regmatch_t regmatch;
+		int flags = 0;
+
+		while (*data && !regexec(regexp, data, 1, &regmatch, flags)) {
+			flags |= REG_NOTBOL;
+			data += regmatch.rm_so;
+			if (*data) data++;
 			cnt++;
+		}
+
+	} else { /* Classic exact string match */
+		/* Yes, I've heard of strstr(), but the thing is *data may
+		 * not be NUL terminated.  Sue me.
+		 */
+		for (offset = 0; offset + len <= sz; offset++) {
+			/* we count non-overlapping occurrences of needle */
+			if (!memcmp(needle, data + offset, len)) {
+				offset += len - 1;
+				cnt++;
+			}
 		}
 	}
 	return cnt;
@@ -36,10 +52,24 @@ void diffcore_pickaxe(const char *needle
 	struct diff_queue_struct *q = &diff_queued_diff;
 	unsigned long len = strlen(needle);
 	int i, has_changes;
+	regex_t regex, *regexp = NULL;
 	struct diff_queue_struct outq;
 	outq.queue = NULL;
 	outq.nr = outq.alloc = 0;
 
+	if (opts & DIFF_PICKAXE_REGEX) {
+		int err;
+		err = regcomp(&regex, needle, REG_EXTENDED | REG_NEWLINE);
+		if (err) {
+			/* The POSIX.2 people are surely sick */
+			char errbuf[1024];
+			regerror(err, &regex, errbuf, 1024);
+			regfree(&regex);
+			die("invalid pickaxe regex: %s", errbuf);
+		}
+		regexp = &regex;
+	}
+
 	if (opts & DIFF_PICKAXE_ALL) {
 		/* Showing the whole changeset if needle exists */
 		for (i = has_changes = 0; !has_changes && i < q->nr; i++) {
@@ -48,16 +78,16 @@ void diffcore_pickaxe(const char *needle
 				if (!DIFF_FILE_VALID(p->two))
 					continue; /* ignore unmerged */
 				/* created */
-				if (contains(p->two, needle, len))
+				if (contains(p->two, needle, len, regexp))
 					has_changes++;
 			}
 			else if (!DIFF_FILE_VALID(p->two)) {
-				if (contains(p->one, needle, len))
+				if (contains(p->one, needle, len, regexp))
 					has_changes++;
 			}
 			else if (!diff_unmodified_pair(p) &&
-				 contains(p->one, needle, len) !=
-				 contains(p->two, needle, len))
+				 contains(p->one, needle, len, regexp) !=
+				 contains(p->two, needle, len, regexp))
 				has_changes++;
 		}
 		if (has_changes)
@@ -80,16 +110,16 @@ void diffcore_pickaxe(const char *needle
 				if (!DIFF_FILE_VALID(p->two))
 					; /* ignore unmerged */
 				/* created */
-				else if (contains(p->two, needle, len))
+				else if (contains(p->two, needle, len, regexp))
 					has_changes = 1;
 			}
 			else if (!DIFF_FILE_VALID(p->two)) {
-				if (contains(p->one, needle, len))
+				if (contains(p->one, needle, len, regexp))
 					has_changes = 1;
 			}
 			else if (!diff_unmodified_pair(p) &&
-				 contains(p->one, needle, len) !=
-				 contains(p->two, needle, len))
+				 contains(p->one, needle, len, regexp) !=
+				 contains(p->two, needle, len, regexp))
 				has_changes = 1;
 
 			if (has_changes)
@@ -97,6 +127,10 @@ void diffcore_pickaxe(const char *needle
 			else
 				diff_free_filepair(p);
 		}
+
+	if (opts & DIFF_PICKAXE_REGEX) {
+		regfree(&regex);
+	}
 
 	free(q->queue);
 	*q = outq;



-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Right now I am having amnesia and deja-vu at the same time.  I think
I have forgotten this before.

^ permalink raw reply related

* Re: git pull fails
From: Timo Hirvonen @ 2006-03-29  0:11 UTC (permalink / raw)
  To: Petr Baudis; +Cc: astralstorm, git, ralf
In-Reply-To: <20060328224807.GC27689@pasky.or.cz>

Petr Baudis <pasky@suse.cz> wrote:

> If I understand it right, Timo complains that git-fetch got
> non-fastforward commits for "pu" and "next" and a good fastforward
> commit for "master", but it didn't update the ref for ANY head, not even
> the "master".

Exactly.  Maybe git-fetch should abort only if it could not update the
currently checked out branch?

-- 
http://onion.dynserv.net/~timo/

^ permalink raw reply

* Re: [PATCH] Add ALL_LDFLAGS to the git target.
From: Junio C Hamano @ 2006-03-29  0:01 UTC (permalink / raw)
  To: Jason Riedy; +Cc: git
In-Reply-To: <16397.1143590377@lotus.CS.Berkeley.EDU>

Jason Riedy <ejr@EECS.Berkeley.EDU> writes:

> And Junio C Hamano writes:
>  - My preference is to ignore FORTRAN, keep Mark's current rules,
>  - perhaps with a way to turn it off if people really find it
>  - annoying (I do not mind having it always on).

Sorry I forgot to add smiley to the above ;-).

^ permalink raw reply

* Re: [PATCH] Add ALL_LDFLAGS to the git target.
From: Jason Riedy @ 2006-03-28 23:59 UTC (permalink / raw)
  To: git
In-Reply-To: <7vbqvqjgvi.fsf@assigned-by-dhcp.cox.net>

And Junio C Hamano writes:
 - My preference is to ignore FORTRAN, keep Mark's current rules,
 - perhaps with a way to turn it off if people really find it
 - annoying (I do not mind having it always on).

Sorry; I had meant my comment as an aside and not a 
request.  I had never noticed the function definition 
in patches, and now I typically use Emacs's tools.

And as of Fortran 90, it's now officially Fortran and
not FORTRAN.

Jason

^ permalink raw reply

* Re: Cherry-pick particular object
From: Petr Baudis @ 2006-03-28 23:28 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: sean, Sébastien Pierre, git
In-Reply-To: <Pine.LNX.4.64.0603281512260.15714@g5.osdl.org>

Dear diary, on Wed, Mar 29, 2006 at 01:24:13AM CEST, I got a letter
where Linus Torvalds <torvalds@osdl.org> said that...
> On Wed, 29 Mar 2006, Petr Baudis wrote:
> 
> > Dear diary, on Wed, Mar 29, 2006 at 12:44:02AM CEST, I got a letter
> > where Linus Torvalds <torvalds@osdl.org> said that...
> > > Ie you can have a tree like this:
> > > 
> > > 	100644 blob f2ba8f84ab5c1bce84a7b441cb1959cfc7093b7f    abc
> > > 	120000 blob f2ba8f84ab5c1bce84a7b441cb1959cfc7093b7f    file
> > > 
> > > where the first one is a regular file called "abc" (which contains the 
> > > string "abc"), and the second is the _symlink_ that points to "abc".
> > > 
> > > They share the exact same blob, and what distinguishes them is the 
> > > filemode info from git-read-tree.
> > 
> > Huh? Didn't you rather want to say that "file" will point to a blob
> > containing just the "abc" string (the symlink target)? ;-)
> 
> Well no, maybe I should have called the first file something else.
> 
> Both "abc" and "file" from a git perspective have the same _contents_ (the 
> blob containing the data 'abc'). 

Oh, I've totally missed the '(which contains the string "abc")' part,
sorry. Apparently it's time to sleep for me. :/

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Right now I am having amnesia and deja-vu at the same time.  I think
I have forgotten this before.

^ permalink raw reply

* Re: Cherry-pick particular object
From: Linus Torvalds @ 2006-03-28 23:24 UTC (permalink / raw)
  To: Petr Baudis; +Cc: sean, Sébastien Pierre, git
In-Reply-To: <20060328225429.GD27689@pasky.or.cz>



On Wed, 29 Mar 2006, Petr Baudis wrote:

> Dear diary, on Wed, Mar 29, 2006 at 12:44:02AM CEST, I got a letter
> where Linus Torvalds <torvalds@osdl.org> said that...
> > Ie you can have a tree like this:
> > 
> > 	100644 blob f2ba8f84ab5c1bce84a7b441cb1959cfc7093b7f    abc
> > 	120000 blob f2ba8f84ab5c1bce84a7b441cb1959cfc7093b7f    file
> > 
> > where the first one is a regular file called "abc" (which contains the 
> > string "abc"), and the second is the _symlink_ that points to "abc".
> > 
> > They share the exact same blob, and what distinguishes them is the 
> > filemode info from git-read-tree.
> 
> Huh? Didn't you rather want to say that "file" will point to a blob
> containing just the "abc" string (the symlink target)? ;-)

Well no, maybe I should have called the first file something else.

Both "abc" and "file" from a git perspective have the same _contents_ (the 
blob containing the data 'abc'). 

But the filemode means that those contents have totally different meaning. 
For the pth "file", it means that it's a _symlink_ to "abc", while for the 
path "abc" it's a regular file that just has the _contents_ "abc".

So the end _result_ of this is that "file" points to a file called "abc" 
that also has the contents "abc", and "cat file abc" will result in 
"abcabc".

IOW, this is the result of doing

	echo -n abc > abc
	ln -s abc file

and importing the mess into git.

		Linus

^ permalink raw reply

* Re: [PATCH] Add ALL_LDFLAGS to the git target.
From: Mark Wooding @ 2006-03-28 23:21 UTC (permalink / raw)
  To: git
In-Reply-To: <Pine.LNX.4.64.0603281500280.15714@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> wrote:

> I'd really just prefer to make the "-p" switch configurable, the way
> it was before. No regexps, just the same rules as for GNU diff,

The rules for GNU diff aren't actually good enough if you can't
configure them.  We used to be able to put runes in GIT_DIFF_OPTS.

> perhaps with the difference being that it would be on by default.

I thought it /was/ on by default:

: static const char *diff_opts = "-pu";

(killed in cebff98db).

> Another possible approach is to say
>  - if the first line of the real diff matches the rules, do NOT add 
>    another line that matches the rule at the @@-line.
>
> since the simple @@-line rule really doesn't make sense for any file that 
> is "dense" (ie where most lines start with non-whitespace).

It's true, and that's an easy fix.  But it doesn't do any actual harm.

-- [mdw]

^ permalink raw reply

* Re: [PATCH] Add ALL_LDFLAGS to the git target.
From: Junio C Hamano @ 2006-03-28 23:20 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0603281500280.15714@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> writes:

> On Tue, 28 Mar 2006, Mark Wooding wrote:
>
>> Jason Riedy <ejr@EECS.Berkeley.EDU> wrote:
>> 
>> > P.S. For the whole finding-a-function-name business, some of 
>> > us are using git on fixed-format Fortran.  Every non-comment
>> > line begins with whitespace...  ;)  And in free format, many
>> > people don't add that first indentation within subroutines.
>> 
>> Urgh.  So, which regex library do people want to use? ;-)  (My vote's
>> for pcre.)
>
> I'd really just prefer to make the "-p" switch configurable, the way it 
> was before. No regexps, just the same rules as for GNU diff, perhaps with 
> the difference being that it would be on by default.

Strictly speaking, "No regexps" and "same rules as for GNU diff"
are mutually incompatible, since GNU diff -p defaults to
"^[[:alpha:]$_]" but the regexp is configurable.

My preference is to ignore FORTRAN, keep Mark's current rules,
perhaps with a way to turn it off if people really find it
annoying (I do not mind having it always on).

> Another possible approach is to say
>  - if the first line of the real diff matches the rules, do NOT add 
>    another line that matches the rule at the @@-line.
>
> since the simple @@-line rule really doesn't make sense for any file that 
> is "dense" (ie where most lines start with non-whitespace).

I think this is a good rule.  If "the first non-empty line" may
be even better; we do not want to see the name of previous
function for a huke like this:

	@@ -a,b +c,d @@

        int frotz(void)
        {
            ...

^ permalink raw reply

* Re: [PATCH] Add ALL_LDFLAGS to the git target.
From: Linus Torvalds @ 2006-03-28 23:03 UTC (permalink / raw)
  To: Mark Wooding; +Cc: git
In-Reply-To: <slrne2jf9t.s3g.mdw@metalzone.distorted.org.uk>



On Tue, 28 Mar 2006, Mark Wooding wrote:

> Jason Riedy <ejr@EECS.Berkeley.EDU> wrote:
> 
> > P.S. For the whole finding-a-function-name business, some of 
> > us are using git on fixed-format Fortran.  Every non-comment
> > line begins with whitespace...  ;)  And in free format, many
> > people don't add that first indentation within subroutines.
> 
> Urgh.  So, which regex library do people want to use? ;-)  (My vote's
> for pcre.)

I'd really just prefer to make the "-p" switch configurable, the way it 
was before. No regexps, just the same rules as for GNU diff, perhaps with 
the difference being that it would be on by default.

Another possible approach is to say
 - if the first line of the real diff matches the rules, do NOT add 
   another line that matches the rule at the @@-line.

since the simple @@-line rule really doesn't make sense for any file that 
is "dense" (ie where most lines start with non-whitespace).

		Linus

^ permalink raw reply

* Re: Cherry-pick particular object
From: Petr Baudis @ 2006-03-28 22:54 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: sean, Sébastien Pierre, git
In-Reply-To: <Pine.LNX.4.64.0603281435410.15714@g5.osdl.org>

Dear diary, on Wed, Mar 29, 2006 at 12:44:02AM CEST, I got a letter
where Linus Torvalds <torvalds@osdl.org> said that...
> Ie you can have a tree like this:
> 
> 	100644 blob f2ba8f84ab5c1bce84a7b441cb1959cfc7093b7f    abc
> 	120000 blob f2ba8f84ab5c1bce84a7b441cb1959cfc7093b7f    file
> 
> where the first one is a regular file called "abc" (which contains the 
> string "abc"), and the second is the _symlink_ that points to "abc".
> 
> They share the exact same blob, and what distinguishes them is the 
> filemode info from git-read-tree.

Huh? Didn't you rather want to say that "file" will point to a blob
containing just the "abc" string (the symlink target)? ;-)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Right now I am having amnesia and deja-vu at the same time.  I think
I have forgotten this before.

^ permalink raw reply

* Re: Gitk strangeness..
From: Paul Mackerras @ 2006-03-28 22:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Linus Torvalds
In-Reply-To: <7vzmjbj9a1.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano writes:

> How about this alternative patch, then?  It turned out to be
> quite convoluted as I feared.

That's brilliant.  Thank you!  With the patch to gitk below, the
graph display on Linus' example looks much saner.

Could you check in your patch to the git.git repository, please?

Thanks,
Paul.

diff --git a/gitk b/gitk
index 03cd475..1989aa5 100755
--- a/gitk
+++ b/gitk
@@ -46,7 +46,7 @@ proc start_rev_list {rlargs} {
     }
     if {[catch {
 	set commfd [open [concat | git-rev-list --header $order \
-			      --parents $rlargs] r]
+			      --parents --boundary $rlargs] r]
     } err]} {
 	puts stderr "Error executing git-rev-list: $err"
 	exit 1
@@ -114,8 +114,13 @@ proc getcommitlines {commfd}  {
 	set start [expr {$i + 1}]
 	set j [string first "\n" $cmit]
 	set ok 0
+	set listed 1
 	if {$j >= 0} {
 	    set ids [string range $cmit 0 [expr {$j - 1}]]
+	    if {[string range $ids 0 0] == "-"} {
+		set listed 0
+		set ids [string range $ids 1 end]
+	    }
 	    set ok 1
 	    foreach id $ids {
 		if {[string length $id] != 40} {
@@ -133,8 +138,12 @@ proc getcommitlines {commfd}  {
 	    exit 1
 	}
 	set id [lindex $ids 0]
-	set olds [lrange $ids 1 end]
-	set commitlisted($id) 1
+	if {$listed} {
+	    set olds [lrange $ids 1 end]
+	    set commitlisted($id) 1
+	} else {
+	    set olds {}
+	}
 	updatechildren $id $olds
 	set commitdata($id) [string range $cmit [expr {$j + 1}] end]
 	set commitrow($id) $commitidx

^ permalink raw reply related

* Re: [PATCH] Add ALL_LDFLAGS to the git target.
From: Mark Wooding @ 2006-03-28 22:48 UTC (permalink / raw)
  To: git
In-Reply-To: <15693.1143575188@lotus.CS.Berkeley.EDU>

Jason Riedy <ejr@EECS.Berkeley.EDU> wrote:

> P.S. For the whole finding-a-function-name business, some of 
> us are using git on fixed-format Fortran.  Every non-comment
> line begins with whitespace...  ;)  And in free format, many
> people don't add that first indentation within subroutines.

Urgh.  So, which regex library do people want to use? ;-)  (My vote's
for pcre.)

-- [mdw]

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox