Git development


* [STGIT] stg refresh wish (splitting patches/removing files from a patch)
From: Peter Oberndorfer @ 2007-12-30 19:03 UTC (permalink / raw)
  To: Git Mailing List

Hi,
I recently tried to split a stgit patch into 2 parts,
and it was not as easy as I would like it to be.

How do I exclude a file from a patch (use the version of the file present in HEAD^)
without modifying the working dir?

With plain git I would use something like:
git reset HEAD^ files_i_do_not_want_in_first_patch
git commit --amend
git add files_i_do_not_want_in_first_patch
git commit

So my idea was to add a --use-index [1] option to stg refresh.
When it is passed, stg refresh will use the current index for the contents of the refreshed patch
instead of looking at the working dir.
This would solve my problem [2] and also make it possible to use git-gui for
staging hunks.

Do you think this would be a useful/good idea?
Or do we want a separate command for removing files from a patch anyway?

Another thing that might be useful (in my scenario) would be a stg commit --top extension
which commits at the top end of the stack
(unfortunately this will lose the patch history for splitting commits).
Then I can edit these commits without being afraid of confusing stgit,
and then run stg assimilate / stg repair to make them managed by stg again.

Greetings Peter

[1] rename as desired

[2] new way of splitting a patch with the extension:
git reset HEAD^ files_i_do_not_want_in_first_patch
stg refresh --use-index
stg refresh -e
git add files_i_do_not_want_in_first_patch
stg new
stg refresh --use-index


* [PATCH] git-filter-branch could be confused by similar names
From: Dmitry Potapov @ 2007-12-30 18:51 UTC (permalink / raw)
  To: git, Johannes Schindelin; +Cc: Dmitry Potapov
In-Reply-To: <Pine.LNX.4.64.0712301700580.14355@wbgn129.biozentrum.uni-wuerzburg.de>

'git-filter-branch branch' could fail with the error
"Which ref do you want to rewrite?" if another branch or tag
existed whose name was 'branch-something' or 'something/branch'.

Signed-off-by: Dmitry Potapov <dpotapov@gmail.com>
---

I have corrected my previous patch to allow "heads" or "tags"
in the name of a branch or tag, i.e. to write it like this:
   git filter-branch heads/master

 git-filter-branch.sh     |    2 +-
 t/t7003-filter-branch.sh |   10 ++++++++++
 2 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/git-filter-branch.sh b/git-filter-branch.sh
index dbab1a9..5de8b12 100755
--- a/git-filter-branch.sh
+++ b/git-filter-branch.sh
@@ -219,7 +219,7 @@ do
 	;;
 	*)
 		ref="$(git for-each-ref --format='%(refname)' |
-			grep /"$ref")"
+			grep '^refs/\([^/]\+/\)\?'"$ref"'$')"
 	esac
 
 	git check-ref-format "$ref" && echo "$ref"
diff --git a/t/t7003-filter-branch.sh b/t/t7003-filter-branch.sh
index 5f60b22..c3e5207 100755
--- a/t/t7003-filter-branch.sh
+++ b/t/t7003-filter-branch.sh
@@ -36,6 +36,16 @@ test_expect_success 'result is really identical' '
 	test $H = $(git rev-parse HEAD)
 '
 
+test_expect_success 'rewrite branch with similar names' '
+	git branch my &&
+	git tag my/orig &&
+	git tag my-orig &&
+	git tag orig/my &&
+	git tag orig-my &&
+	git-filter-branch my &&
+	test $H = $(git rev-parse HEAD)
+'
+
 test_expect_success 'rewrite, renaming a specific file' '
 	git-filter-branch -f --tree-filter "mv d doh || :" HEAD
 '
-- 
1.5.3.5


* Re: [PATCH] git-filter-branch could be confused by similar names
From: Dmitry Potapov @ 2007-12-30 18:40 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0712301700580.14355@wbgn129.biozentrum.uni-wuerzburg.de>

On Sun, Dec 30, 2007 at 05:03:32PM +0100, Johannes Schindelin wrote:
> 
> On Sun, 30 Dec 2007, Dmitry Potapov wrote:
> 
> > How about this:
> > 
> > +			grep '^refs/\([^/]\+/\)\?'"$ref"'$')"
> 
> Maybe.  I wonder whether just adding a "$" (which I obviously forgot) 
> would not be enough...

Adding '$' will certainly make things much better, but you will still
have the same problem if you want to filter "master", but you have
"origin/master" in your repo.

Dmitry


* Re: git-svn in 1.5.4~rc2 somewhat broken?
From: Florian Weimer @ 2007-12-30 16:33 UTC (permalink / raw)
  To: Steven Walter; +Cc: git
In-Reply-To: <20071230160758.GA7520@dervierte>

* Steven Walter:

>> Last fetched revision of refs/remotes/git-svn was r45313, but we are
>> about to fetch: r851!
>
> Messages like these usually mean you've changed refs/remotes/trunk,
> which will confuse git-svn unless you know what you're doing.

Uhm, I don't recall doing such a thing.

> Fortunately, you can usually "rm -rf .git/svn" and git-svn will sort
> itself out on the next fetch.

Yeah, this has fixed it for me.  Thanks.


* Re: git-svn in 1.5.4~rc2 somewhat broken?
From: Steven Walter @ 2007-12-30 16:07 UTC (permalink / raw)
  To: Florian Weimer; +Cc: git
In-Reply-To: <87wsqw49dj.fsf@mid.deneb.enyo.de>

On Sun, Dec 30, 2007 at 02:09:28PM +0100, Florian Weimer wrote:
> I just tried to run "git svn fetch" on a clone of a Subversion
> repository that used to work fine with 1.5.3.  After trying to fix some
> things up (sorry, scrollback buffer is not deep enough), it now reports
> an index mismatch:
> 
> Index mismatch: efcc3165fb64519ff1784903112d725c8682d5d2 != b3e7f07b5e4b79f682718fe6a31107d74ca35ce6
> 
> And it finally bails out with:
> 
> Last fetched revision of refs/remotes/git-svn was r45313, but we are about to fetch: r851!

Messages like these usually mean you've changed refs/remotes/trunk,
which will confuse git-svn unless you know what you're doing.
Fortunately, you can usually "rm -rf .git/svn" and git-svn will sort
itself out on the next fetch.
-- 
-Steven Walter <stevenrwalter@gmail.com>
Freedom is the freedom to say that 2 + 2 = 4
B2F1 0ECC E605 7321 E818  7A65 FC81 9777 DC28 9E8F 


* Re: [PATCH] git-filter-branch could be confused by similar names
From: Johannes Schindelin @ 2007-12-30 16:03 UTC (permalink / raw)
  To: Dmitry Potapov; +Cc: git
In-Reply-To: <20071230135428.GW13968@dpotapov.dyndns.org>

Hi,

On Sun, 30 Dec 2007, Dmitry Potapov wrote:

> How about this:
> 
> +			grep '^refs/\([^/]\+/\)\?'"$ref"'$')"

Maybe.  I wonder whether just adding a "$" (which I obviously forgot) 
would not be enough...

Ciao,
Dscho


* Re: [PATCH] Optimize prefixcmp()
From: Johannes Schindelin @ 2007-12-30 15:54 UTC (permalink / raw)
  To: Marco Costalba; +Cc: Git Mailing List
In-Reply-To: <e5bfff550712300502p543680b9jbeb9469a5a970f0@mail.gmail.com>

Hi,

On Sun, 30 Dec 2007, Marco Costalba wrote:

> Initial patch by Johannes Schindelin.

Not true ;-)

Ciao,
Dscho


* Re: [PATCH] Force new line at end of commit message
From: Johannes Schindelin @ 2007-12-30 15:51 UTC (permalink / raw)
  To: しらいしななこ
  Cc: Junio C Hamano, Shawn O. Pearce, Bernt Hansen, git
In-Reply-To: <200712301158.lBUBwT3u004608@mi1.bluebottle.com>

[-- Attachment #1: Type: TEXT/PLAIN, Size: 880 bytes --]

Hi,

On Sun, 30 Dec 2007, しらいしななこ wrote:

> Quoting Junio C Hamano <gitster@pobox.com>:
> 
> > Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> >
> >> Not that I care too deeply, but does that not add a newline regardless 
> >> whether it is needed or not?
> >
> > Heh, I can see that you do not care---the original did not even
> > add a newline when necessary (and that is why we have this
> > thread).  Instead you were adding a newline regardless to the
> > end of the first commit, but not doing so for the other ones.
> 
> Aren't you being too harsh on Johannes these days?
> 
> Everybody knows that you are capable of rewriting that part in Perl or 
> Python yourself to fix the issue.

Hehe.  I think _that_ would be harsh on me ;-)

As it is, I am quite fine with the communication between Junio and me, but 
thanks for your concern.

Ciao,
Dscho


* Re: [PATCH] Force new line at end of commit message
From: Johannes Schindelin @ 2007-12-30 15:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Shawn O. Pearce, Bernt Hansen, git
In-Reply-To: <7vprwo8kzd.fsf@gitster.siamese.dyndns.org>

Hi,

[Bernt, your mail filter is less than intelligent and rejects my mails.]

On Sun, 30 Dec 2007, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > Not that I care too deeply, but does that not add a newline regardless 
> > whether it is needed or not?
> 
> Heh, I can see that you do not care---the original did not even
> add a newline when necessary (and that is why we have this
> thread).

Umm.  It was on purpose, since I found the empty lines between the commit 
messages and the comment more pleasing than no empty space.

> The patch just moves that unconditional "echo"; instead of adding one to 
> the end of the first commit (and only the first one), it adds before the 
> new commit's title message.

Well, ACK from me, then.

Ciao,
Dscho


* Re: [PATCH] Optimize prefixcmp()
From: Marco Costalba @ 2007-12-30 15:17 UTC (permalink / raw)
  To: Pierre Habouzit, Marco Costalba, Johannes Schindelin,
	Git Mailing List
In-Reply-To: <e5bfff550712300650j2ea70032jaca893b734592184@mail.gmail.com>

On Dec 30, 2007 3:50 PM, Marco Costalba <mcostalba@gmail.com> wrote:
>
> If *prefix == "" case is to be considered I vote for your/Johannes
> version because it's "better code" (tm).
>
OK, this is fast and correct:

static inline int prefixcmp(const char *str, const char *prefix)
{
	while (*str == *prefix && *prefix)
		str++, prefix++;

	return (*prefix ? *(unsigned const char *)prefix - *(unsigned const char *)str : 0);
}


This is the last one, I promise ;-)

Marco


* How to bypass the post-commit hook?
From: Ping Yin @ 2007-12-30 15:12 UTC (permalink / raw)
  To: Git Mailing List

--no-verify can bypass the pre-commit hook. How can I bypass the post-commit hook?

Usually I want post-commit to take effect. However, in the middle of
git-rebase, I want to bypass post-commit when running 'git-commit --amend',
since my post-commit hook modifies the working directory and so
makes the following rebase troublesome.

-- 
Ping Yin


* Re: [PATCH] Optimize prefixcmp()
From: Marco Costalba @ 2007-12-30 14:50 UTC (permalink / raw)
  To: Pierre Habouzit, Marco Costalba, Johannes Schindelin,
	Git Mailing List
In-Reply-To: <20071230135820.GB25917@artemis.madism.org>

On Dec 30, 2007 2:58 PM, Pierre Habouzit <madcoder@debian.org> wrote:
> >
> >   This code doesn't work if prefix is "". You want something like:
> >
> >     for (; *prefix; prefix++, str++) {
> >         if (*str != *prefix)
> >             return *(unsigned const char *)prefix - *(unsigned const char *)str;
> >     }
> >     return 0;
>
>   Which happens to be basically the same than what Dscho wrote, though I
> suppose the compiler can compile that more efficiently than his code.
>

Yes, your version covers the *prefix == "" case too. If this case is
important for us we could use something like

static inline int prefixcmp(const char *str, const char *prefix)
{
	do {
		if (*str != *prefix)
			return (!*prefix ? 0 : *(unsigned const char *)prefix - *(unsigned const char *)str);

		if (!*(++prefix))
			return 0;

		str++;

	} while (1);
}

Your code is *surely* nicer than this one. But, for unknown
reasons, this code happens to be faster; probably, as you say, the
compiler optimizes away the second check in the return statement, so
this version is slightly faster than the 'for' loop one. Admittedly
we are getting too academic now.

If *prefix == "" case is to be considered I vote for your/Johannes
version because it's "better code" (tm).

Marco


* Re: [PATCH] Optimize prefixcmp()
From: Pierre Habouzit @ 2007-12-30 13:58 UTC (permalink / raw)
  To: Marco Costalba, Johannes Schindelin, Git Mailing List
In-Reply-To: <20071230135557.GA25917@artemis.madism.org>

[-- Attachment #1: Type: text/plain, Size: 1748 bytes --]

On Sun, Dec 30, 2007 at 01:55:57PM +0000, Pierre Habouzit wrote:
> On Sun, Dec 30, 2007 at 01:02:28PM +0000, Marco Costalba wrote:
> > Subject: [PATCH] Certain codepaths (notably "git log --pretty=format...") use
> > 
> > prefixcmp() extensively, with very short prefixes.  In those cases,
> > calling strlen() is a wasteful operation, so avoid it.
> > 
> > Initial patch by Johannes Schindelin.
> > 
> > Signed-off-by: Marco Costalba <mcostalba@gmail.com>
> > ---
> >  git-compat-util.h |   11 ++++++++++-
> >  1 files changed, 10 insertions(+), 1 deletions(-)
> > 
> > diff --git a/git-compat-util.h b/git-compat-util.h
> > index 79eb10e..843a8f5 100644
> > --- a/git-compat-util.h
> > +++ b/git-compat-util.h
> > @@ -398,7 +398,16 @@ static inline int sane_case(int x, int high)
> > 
> >  static inline int prefixcmp(const char *str, const char *prefix)
> >  {
> > -	return strncmp(str, prefix, strlen(prefix));
> > +	do {
> > +		if (*str != *prefix)
> > +			return *(unsigned const char *)prefix - *(unsigned const char *)str;
> > +
> > +		if (!*(++prefix))
> > +			return 0;
> > +
> > +		str++;
> > +
> > +	} while (1);
> 
>   This code doesn't work if prefix is "". You want something like:
> 
>     for (; *prefix; prefix++, str++) {
>         if (*str != *prefix)
>             return *(unsigned const char *)prefix - *(unsigned const char *)str;
>     }
>     return 0;

  Which happens to be basically the same as what Dscho wrote, though I
suppose the compiler can compile that more efficiently than his code.


-- 
·O·  Pierre Habouzit
··O                                                madcoder@debian.org
OOO                                                http://www.madism.org



* Re: [PATCH] Optimize prefixcmp()
From: Pierre Habouzit @ 2007-12-30 13:55 UTC (permalink / raw)
  To: Marco Costalba; +Cc: Johannes Schindelin, Git Mailing List
In-Reply-To: <e5bfff550712300502p543680b9jbeb9469a5a970f0@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1453 bytes --]

On Sun, Dec 30, 2007 at 01:02:28PM +0000, Marco Costalba wrote:
> Subject: [PATCH] Certain codepaths (notably "git log --pretty=format...") use
> 
> prefixcmp() extensively, with very short prefixes.  In those cases,
> calling strlen() is a wasteful operation, so avoid it.
> 
> Initial patch by Johannes Schindelin.
> 
> Signed-off-by: Marco Costalba <mcostalba@gmail.com>
> ---
>  git-compat-util.h |   11 ++++++++++-
>  1 files changed, 10 insertions(+), 1 deletions(-)
> 
> diff --git a/git-compat-util.h b/git-compat-util.h
> index 79eb10e..843a8f5 100644
> --- a/git-compat-util.h
> +++ b/git-compat-util.h
> @@ -398,7 +398,16 @@ static inline int sane_case(int x, int high)
> 
>  static inline int prefixcmp(const char *str, const char *prefix)
>  {
> -	return strncmp(str, prefix, strlen(prefix));
> +	do {
> +		if (*str != *prefix)
> +			return *(unsigned const char *)prefix - *(unsigned const char *)str;
> +
> +		if (!*(++prefix))
> +			return 0;
> +
> +		str++;
> +
> +	} while (1);

  This code doesn't work if prefix is "". You want something like:

    for (; *prefix; prefix++, str++) {
        if (*str != *prefix)
            return *(unsigned const char *)prefix - *(unsigned const char *)str;
    }
    return 0;

-- 
·O·  Pierre Habouzit
··O                                                madcoder@debian.org
OOO                                                http://www.madism.org



* Re: [PATCH] git-filter-branch could be confused by similar names
From: Dmitry Potapov @ 2007-12-30 13:54 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0712301145360.14355@wbgn129.biozentrum.uni-wuerzburg.de>

On Sun, Dec 30, 2007 at 11:46:59AM +0100, Johannes Schindelin wrote:
> 
> On Sun, 30 Dec 2007, Dmitry Potapov wrote:
> 
> > On Sat, Dec 29, 2007 at 11:36:51PM +0100, Johannes Schindelin wrote:
> > > 
> > > On Tue, 25 Dec 2007, Dmitry Potapov wrote:
> > > 
> > > > 'git-filter-branch branch' could fail producing the error: "Which 
> > > > ref do you want to rewrite?" if existed another branch or tag, which 
> > > > name was 'branch-something' or 'something/branch'.
> > > > 
> > > > Signed-off-by: Dmitry Potapov <dpotapov@gmail.com>
> > > > ---
> > > >  git-filter-branch.sh     |    2 +-
> > > >  t/t7003-filter-branch.sh |   10 ++++++++++
> > > >  2 files changed, 11 insertions(+), 1 deletions(-)
> > > > 
> > > > diff --git a/git-filter-branch.sh b/git-filter-branch.sh
> > > > index dbab1a9..b89a720 100755
> > > > --- a/git-filter-branch.sh
> > > > +++ b/git-filter-branch.sh
> > > > @@ -219,7 +219,7 @@ do
> > > >  	;;
> > > >  	*)
> > > >  		ref="$(git for-each-ref --format='%(refname)' |
> > > > -			grep /"$ref")"
> > > > +			grep '^refs/[^/]\+/'"$ref"'$')"
> > > 
> > > Hmm.  I wonder if this is a proper solution.  It still does not error 
> > > out when you have a tag and a branch of the same name.
> > 
> > Are you sure? I had created a tag and a branch with the same name, and
> > then tried git filter-branch on it, and it did error out:
> > ===
> > warning: refname 'test1' is ambiguous.
> > Which ref do you want to rewrite?
> > ===
> 
> Okay, bad example.  But try "heads/master". 

You are right. Somehow, I forgot about this possibility. How about this:

+			grep '^refs/\([^/]\+/\)\?'"$ref"'$')"

> Or "origin" in a repository 
> which has "refs/remotes/origin/HEAD".

Well, it does not work, but it would not work before either, because you
are very likely to have something else in origin. Actually, I doubt that
anyone will want to filter "origin", but if you insist, here is another
grep expression, which should accommodate that case too:

+			grep '^refs/\([^/]\+/\)\?'"$ref"'\(/HEAD\)\?$')"

In any case, I believe it would be better to have a more strict grep
expression than one that is used by git-filter-branch now, because now
you either have a very confusing error message, or accidentally you
could filter a wrong branch. And as you said before, the proper C
solution is not feasible for 1.5.4, so I believe a better grep
expression is the right thing to do for now.

If you have no other objection, I will resend the patch with the
corrected version of the grep expression.

Dmitry


* [PATCH] Avoid a useless prefix lookup in strbuf_expand()
From: Marco Costalba @ 2007-12-30 13:46 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Junio C Hamano, Johannes Schindelin

Currently the --pretty=format prefix is looked up in a
tight loop in strbuf_expand(); if found, it is passed as a parameter
to format_commit_item(), which does another search using a
switch statement to select the proper operation according to
the kind of prefix.

Because the switch statement is already able to discard unknown
matches, we don't need the prefix lookup before calling format_commit_item().

This patch removes a useless loop in a very fast path,
used, for example, by 'git log' with the --pretty=format option.

Signed-off-by: Marco Costalba <mcostalba@gmail.com>
---

This patch is somewhat experimental and is not intended to be merged as is.

That's what is missing:

- Matching of multi-char prefixes is not 100% reliable. For example, to match
  the prefix "Cgreen" only the first char 'C' and the 'e' at index 3 are checked.
  This could lead to aliases in case of malformed prefixes; for example,
  something like "Cxxexxxx" will match the same.


- With this patch the placeholders array defined in format_commit_message() becomes
  useless. That code should be refactored to remove the vector and perhaps add some
  stricter checking rules directly inside format_commit_item()


Anyhow, with this patch we go from the timing obtained with the
super-optimized prefixcmp() patch (see the other thread):

[marco@localhost linux-2.6]$ time git log --topo-order --no-color
--parents -z --log-size --boundary
--pretty=format:"%m%HX%PX%n%an<%ae>%n%at%n%s%n%b" HEAD > /dev/null
2.89user 0.07system 0:02.96elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+27154minor)pagefaults 0swaps

to the current:

[marco@localhost linux-2.6]$ time git log --topo-order --no-color
--parents -z --log-size --boundary
--pretty=format:"%m%HX%PX%n%an<%ae>%n%at%n%s%n%b" HEAD > /dev/null
2.76user 0.08system 0:02.85elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+27153minor)pagefaults 0swaps


 pretty.c |   43 ++++++++++++++++++++++---------------------
 strbuf.c |   16 +++++++---------
 strbuf.h |    2 +-
 3 files changed, 30 insertions(+), 31 deletions(-)

diff --git a/pretty.c b/pretty.c
index 5b1078b..6225042 100644
--- a/pretty.c
+++ b/pretty.c
@@ -432,7 +432,7 @@ static void parse_commit_header(struct
format_commit_context *context)
 	context->commit_header_parsed = 1;
 }

-static void format_commit_item(struct strbuf *sb, const char *placeholder,
+static int format_commit_item(struct strbuf *sb, const char *placeholder,
                                void *context)
 {
 	struct format_commit_context *c = context;
@@ -446,20 +446,20 @@ static void format_commit_item(struct strbuf *sb,
 		switch (placeholder[3]) {
 		case 'd':	/* red */
 			strbuf_addstr(sb, "\033[31m");
-			return;
+			return 4;
 		case 'e':	/* green */
 			strbuf_addstr(sb, "\033[32m");
-			return;
+			return 6;
 		case 'u':	/* blue */
 			strbuf_addstr(sb, "\033[34m");
-			return;
+			return 5;
 		case 's':	/* reset color */
 			strbuf_addstr(sb, "\033[m");
-			return;
+			return 6;
 		}
 	case 'n':		/* newline */
 		strbuf_addch(sb, '\n');
-		return;
+		return 1;
 	}

 	/* these depend on the commit */
@@ -469,34 +469,34 @@ static void format_commit_item(struct strbuf *sb,
 	switch (placeholder[0]) {
 	case 'H':		/* commit hash */
 		strbuf_addstr(sb, sha1_to_hex(commit->object.sha1));
-		return;
+		return 1;
 	case 'h':		/* abbreviated commit hash */
 		if (add_again(sb, &c->abbrev_commit_hash))
-			return;
+			return 1;
 		strbuf_addstr(sb, find_unique_abbrev(commit->object.sha1,
 		                                     DEFAULT_ABBREV));
 		c->abbrev_commit_hash.len = sb->len - c->abbrev_commit_hash.off;
-		return;
+		return 1;
 	case 'T':		/* tree hash */
 		strbuf_addstr(sb, sha1_to_hex(commit->tree->object.sha1));
-		return;
+		return 1;
 	case 't':		/* abbreviated tree hash */
 		if (add_again(sb, &c->abbrev_tree_hash))
-			return;
+			return 1;
 		strbuf_addstr(sb, find_unique_abbrev(commit->tree->object.sha1,
 		                                     DEFAULT_ABBREV));
 		c->abbrev_tree_hash.len = sb->len - c->abbrev_tree_hash.off;
-		return;
+		return 1;
 	case 'P':		/* parent hashes */
 		for (p = commit->parents; p; p = p->next) {
 			if (p != commit->parents)
 				strbuf_addch(sb, ' ');
 			strbuf_addstr(sb, sha1_to_hex(p->item->object.sha1));
 		}
-		return;
+		return 1;
 	case 'p':		/* abbreviated parent hashes */
 		if (add_again(sb, &c->abbrev_parent_hashes))
-			return;
+			return 1;
 		for (p = commit->parents; p; p = p->next) {
 			if (p != commit->parents)
 				strbuf_addch(sb, ' ');
@@ -505,14 +505,14 @@ static void format_commit_item(struct strbuf *sb,
 		}
 		c->abbrev_parent_hashes.len = sb->len -
 		                              c->abbrev_parent_hashes.off;
-		return;
+		return 1;
 	case 'm':		/* left/right/bottom */
 		strbuf_addch(sb, (commit->object.flags & BOUNDARY)
 		                 ? '-'
 		                 : (commit->object.flags & SYMMETRIC_LEFT)
 		                 ? '<'
 		                 : '>');
-		return;
+		return 1;
 	}

 	/* For the rest we have to parse the commit header. */
@@ -522,22 +522,23 @@ static void format_commit_item(struct strbuf *sb,
 	switch (placeholder[0]) {
 	case 's':
 		strbuf_add(sb, msg + c->subject.off, c->subject.len);
-		return;
+		return 1;
 	case 'a':
 		format_person_part(sb, placeholder[1],
 		                   msg + c->author.off, c->author.len);
-		return;
+		return 2;
 	case 'c':
 		format_person_part(sb, placeholder[1],
 		                   msg + c->committer.off, c->committer.len);
-		return;
+		return 2;
 	case 'e':
 		strbuf_add(sb, msg + c->encoding.off, c->encoding.len);
-		return;
+		return 1;
 	case 'b':
 		strbuf_addstr(sb, msg + c->body_off);
-		return;
+		return 1;
 	}
+	return 0; /* unknown prefix */
 }

 void format_commit_message(const struct commit *commit,
diff --git a/strbuf.c b/strbuf.c
index b9b194b..3c2a3a7 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -141,7 +141,8 @@ void strbuf_expand(struct strbuf *sb, const char
                    const char **placeholders, expand_fn_t fn, void *context)
 {
 	for (;;) {
-		const char *percent, **p;
+		const char *percent;
+		int prefix_len;

 		percent = strchrnul(format, '%');
 		strbuf_add(sb, format, percent - format);
@@ -149,14 +150,11 @@ void strbuf_expand(struct strbuf *sb, const char
 			break;
 		format = percent + 1;

-		for (p = placeholders; *p; p++) {
-			if (!prefixcmp(format, *p))
-				break;
-		}
-		if (*p) {
-			fn(sb, *p, context);
-			format += strlen(*p);
-		} else
+		prefix_len = fn(sb, format, context);
+
+		if (prefix_len)
+			format += prefix_len;
+		else
 			strbuf_addch(sb, '%');
 	}
 }
diff --git a/strbuf.h b/strbuf.h
index 36d61db..e6d09fc 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -103,7 +103,7 @@ static inline void strbuf_addbuf(struct strbuf *sb,
 }
 extern void strbuf_adddup(struct strbuf *sb, size_t pos, size_t len);

-typedef void (*expand_fn_t) (struct strbuf *sb, const char
*placeholder, void *context);
+typedef int (*expand_fn_t) (struct strbuf *sb, const char
*placeholder, void *context);
 extern void strbuf_expand(struct strbuf *sb, const char *format,
const char **placeholders, expand_fn_t fn, void *context);

 __attribute__((format(printf,2,3)))
-- 
1.5.4.rc2.1.gec59-dirty


* git-svn in 1.5.4~rc2 somewhat broken?
From: Florian Weimer @ 2007-12-30 13:09 UTC (permalink / raw)
  To: git

I just tried to run "git svn fetch" on a clone of a Subversion
repository that used to work fine with 1.5.3.  After trying to fix some
things up (sorry, scrollback buffer is not deep enough), it now reports
an index mismatch:

Index mismatch: efcc3165fb64519ff1784903112d725c8682d5d2 != b3e7f07b5e4b79f682718fe6a31107d74ca35ce6

And it finally bails out with:

Last fetched revision of refs/remotes/git-svn was r45313, but we are about to fetch: r851!

This doesn't make sense because all paths (both log messages and
.git/config) refer to:

  http://llvm.org/svn/llvm-project/llvm/trunk

And this repository certainly contains revisions after r851.

With other repositories, it also performs an index rebuild, but succeeds.


* Re: [PATCH] Optimize prefixcmp()
From: Marco Costalba @ 2007-12-30 13:02 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Git Mailing List
In-Reply-To: <Pine.LNX.4.64.0712292307210.14355@wbgn129.biozentrum.uni-wuerzburg.de>

On Dec 29, 2007 11:15 PM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>
> However, since you already seem to have a profiling setup ready, I would
> be interested in some numbers, i.e. if this patch is faster for you or
> slower, or shows no effect at all.
>

Yes Johannes, your patch is faster than mine ;-)


These are the results tested on Linux tree:

Vanilla

[marco@localhost linux-2.6]$ time git log --topo-order --no-color
--parents -z --log-size --boundary
--pretty=format:"%m%HX%PX%n%an<%ae>%n%at%n%s%n%b" HEAD > /dev/null
3.61user 0.09system 0:03.70elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+27155minor)pagefaults 0swaps


Marco's patch

[marco@localhost linux-2.6]$ time git log --topo-order --no-color
--parents -z --log-size --boundary
--pretty=format:"%m%HX%PX%n%an<%ae>%n%at%n%s%n%b" HEAD > /dev/null
3.21user 0.08system 0:03.30elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+27154minor)pagefaults 0swaps


Johannes's patch

[marco@localhost linux-2.6]$ time git log --topo-order --no-color
--parents -z --log-size --boundary
--pretty=format:"%m%HX%PX%n%an<%ae>%n%at%n%s%n%b" HEAD > /dev/null
2.92user 0.08system 0:03.01elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+27155minor)pagefaults 0swaps



But that's not the end of the story....

After profiling I have found an even better patch :-)

-------------------- CUT ABOVE --------------------

Subject: [PATCH] Certain codepaths (notably "git log --pretty=format...") use

prefixcmp() extensively, with very short prefixes.  In those cases,
calling strlen() is a wasteful operation, so avoid it.

Initial patch by Johannes Schindelin.

Signed-off-by: Marco Costalba <mcostalba@gmail.com>
---
 git-compat-util.h |   11 ++++++++++-
 1 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/git-compat-util.h b/git-compat-util.h
index 79eb10e..843a8f5 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -398,7 +398,16 @@ static inline int sane_case(int x, int high)

 static inline int prefixcmp(const char *str, const char *prefix)
 {
-	return strncmp(str, prefix, strlen(prefix));
+	do {
+		if (*str != *prefix)
+			return *(unsigned const char *)prefix - *(unsigned const char *)str;
+
+		if (!*(++prefix))
+			return 0;
+
+		str++;
+
+	} while (1);
 }

 static inline int strtoul_ui(char const *s, int base, unsigned int *result)
-- 
1.5.4.rc2-dirty

BTW, the results with this profiled patch are the following:

Marco's patch TAKE 2 (profiled one)

[marco@localhost linux-2.6]$ time git log --topo-order --no-color
--parents -z --log-size --boundary
--pretty=format:"%m%HX%PX%n%an<%ae>%n%at%n%s%n%b" HEAD > /dev/null
2.89user 0.07system 0:02.96elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+27154minor)pagefaults 0swaps


Not a big improvement, but an improvement in any case, because the
checks for (*prefix==0) and for (*str != *prefix) are swapped relative to
your patch. This means that in the common case of a failing match (as
happens when you are looking for a specific prefix in a string
vector) this patch avoids the (*prefix==0) comparison, because
prefixcmp() exits just after the (*str != *prefix) check.


Of course we need *prefix not to be "", but we have already
ruled out prefix == NULL, so it does not seem a biggie...

Thanks...it was very fun!
Marco


* Re: [PATCH] Force new line at end of commit message
From: Junio C Hamano @ 2007-12-30 12:21 UTC (permalink / raw)
  To: しらいしななこ
  Cc: Johannes Schindelin, Shawn O. Pearce, Bernt Hansen, git
In-Reply-To: <200712301158.lBUBwT3r004608@mi1.bluebottle.com>

しらいしななこ  <nanako3@bluebottle.com> writes:

>> Heh, I can see that you do not care---the original did not even
>> add a newline when necessary (and that is why we have this
>> thread).  Instead you were adding a newline regardless to the
>> end of the first commit, but not doing so for the other ones.
>
> Aren't you being too harsh on Johannes these days?

Not on purpose, but perhaps I might have been.

> Everybody knows that you are capable of rewriting that part in Perl or Python yourself to fix the issue.

I actually have been trying to avoid Perl (let alone Python or
Ruby) as "rebase -i" is primarily Johannes's bailiwick, and I
had an impression that he avoided them for Windows portability.

Unfortunately, sed does not handle incomplete lines well, at
least portably.  POSIX says very little about it, except that
its input shall be "text files" (i.e. no NUL is allowed, each
line separated with <newline> and with less than {LINE_MAX}
bytes in length), and its default operation shall read each line
less its terminating <newline> and after manipulation spit it
out and immediately follow it with a <newline>.  But a popular
implementation (e.g. GNU) actually does not follow the output
with a <newline> if the input ended with an incomplete line [*1*].

[Footnote]

*1* Otherwise, this would have been a way to add a
missing newline to a file that could end with an incomplete
line:

    $ sed -e '' <$file_that_may_end_with_an_incomplete_line

^ permalink raw reply

* Re: [PATCH] Force new line at end of commit message
From: Junio C Hamano @ 2007-12-30 12:05 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Shawn O. Pearce, Bernt Hansen, git
In-Reply-To: <7vprwo8kzd.fsf@gitster.siamese.dyndns.org>

Junio C Hamano <gitster@pobox.com> writes:

> ...  Instead you were adding a newline regardless to the
> end of the first commit, but not doing so for the other ones.

To illustrate, this is what I get when trying to squash four
commits:

    # This is a combination of 4 commits.
    # The first commit's message is:

    Documentation/git-submodule.txt: typofix

    Signed-off-by: Junio C Hamano <gitster@pobox.com>

    # This is the 2nd commit message:

    git-sh-setup: document git_editor() and get_author_ident_from_commit()

    These 2 functions were missing from the manpage.

    Signed-off-by: Miklos Vajna <vmiklos@frugalware.org>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    # This is the 3rd commit message:

    "git pull --tags": error out with a better message.

    When "git pull --tags" is run without any other arguments, the
    ...

Notice that there is a gap before "# This is the 2nd commit" but
there isn't any gap before "# This is the 3rd commit"?

The patch under discussion happens to fix this inconsistency as
a side effect.

^ permalink raw reply

* Re: [PATCH] Force new line at end of commit message
From: しらいしななこ @ 2007-12-30 11:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Schindelin, Shawn O. Pearce, Bernt Hansen, git
In-Reply-To: <7vprwo8kzd.fsf@gitster.siamese.dyndns.org>

Quoting Junio C Hamano <gitster@pobox.com>:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
>> Not that I care too deeply, but does that not add a newline regardless
>> of whether it is needed or not?
>
> Heh, I can see that you do not care---the original did not even
> add a newline when necessary (and that is why we have this
> thread).  Instead you were adding a newline regardless to the
> end of the first commit, but not doing so for the other ones.

Aren't you being too harsh on Johannes these days?

Everybody knows that you are capable of rewriting that part in Perl or Python yourself to fix the issue.

-- 
Nanako Shiraishi
http://ivory.ap.teacup.com/nanako3/


^ permalink raw reply

* Re: [PATCH] Force new line at end of commit message
From: Junio C Hamano @ 2007-12-30 11:45 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Shawn O. Pearce, Bernt Hansen, git
In-Reply-To: <Pine.LNX.4.64.0712301201570.14355@wbgn129.biozentrum.uni-wuerzburg.de>

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Not that I care too deeply, but does that not add a newline regardless
> of whether it is needed or not?

Heh, I can see that you do not care---the original did not even
add a newline when necessary (and that is why we have this
thread).  Instead you were adding a newline regardless to the
end of the first commit, but not doing so for the other ones.

The patch just moves that unconditional "echo"; instead of
adding a newline to the end of the first commit (and only the first
one), it adds one before the new commit's title message.

^ permalink raw reply

* [PATCH WIP] sha1-lookup: make selection of 'middle' less aggressive
From: Junio C Hamano @ 2007-12-30 11:38 UTC (permalink / raw)
  To: git
In-Reply-To: <7vd4soa3cw.fsf@gitster.siamese.dyndns.org>

If we pick 'mi' between 'lo' and 'hi' at 50%, which was what the
simple binary search did, we are halving the search space
whether the entry at 'mi' is lower or higher than the target.

The previous patch was about picking not the middle but closer
to 'hi', when we know the target is a lot closer to 'hi' than it
is to 'lo'.  However, if it turns out that the entry at 'mi' is
higher than the target, we would end up reducing the search
space only by the difference between 'mi' and 'hi' (which by
definition is less than 50% --- that was the whole point of not
using the simple binary search), which made the search less
efficient.  And the risk of overshooting is high, because we try
to be too precise.

This tweaks the selection of 'mi' to be a bit closer to the
middle than we would otherwise pick to avoid the problem.
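
The tweak can be tried in isolation with a small sketch. The arithmetic is
the same as in the diff in this message, but the helper names hedge_key()
and pick_mi(), the use of unsigned long, and the assertions are assumptions
made for illustration; the real sha1_entry_pos() derives lov, hiv and kyv
from bytes of the SHA-1 keys, which is not reproduced here.

```c
#include <assert.h>

/*
 * Hypothetical helper: pull the key value 'kyv' slightly towards the
 * midpoint of [lov, hiv], then keep it off the two endpoints when there
 * is room to do so.  Assumes lov < hiv and lov <= kyv <= hiv, and that
 * kyv * 1022 does not overflow.
 */
unsigned long hedge_key(unsigned long lov, unsigned long hiv,
			unsigned long kyv)
{
	assert(lov < hiv);
	assert(lov <= kyv && kyv <= hiv);

	/* 1022 parts of kyv plus 2 parts of the midpoint (lov + hiv) / 2. */
	kyv = (kyv * 1022 + lov + hiv) / 1024;
	if (lov < hiv - 1) {
		if (kyv == lov)
			kyv++;
		else if (kyv == hiv)
			kyv--;
	}
	return kyv;
}

/*
 * Hypothetical helper: interpolate the probe position 'mi' between the
 * table indices 'lo' and 'hi' from the hedged key value.
 */
unsigned long pick_mi(unsigned long lo, unsigned long hi,
		      unsigned long lov, unsigned long hiv,
		      unsigned long kyv)
{
	unsigned long range = hi - lo;

	kyv = hedge_key(lov, hiv, kyv);
	return (range - 1) * (kyv - lov) / (hiv - lov) + lo;
}
```

With lov = 0, hiv = 256 and kyv = 222 (the 87% example from the code
comment), hedge_key() returns 221. For a table of 1000 entries,
pick_mi(0, 1000, 0, 256, 222) then returns 862, where the same
interpolation without the hedging step would give 866.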

With this patch, we actually see slight improvements in
execution time as well.  In the same partial kde repository
(3.0GB pack, 95MB idx; the numbers are from the same machine as
before, best of 5 runs):

    $ GIT_USE_LOOKUP=t git log -800 --stat HEAD >/dev/null
    3.88user 0.18system 0:04.07elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+56378minor)pagefaults 0swaps

    $ git log -800 --stat HEAD >/dev/null
    3.93user 0.18system 0:04.11elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+60258minor)pagefaults 0swaps

    $ GIT_USE_LOOKUP=t git log -2000 HEAD >/dev/null
    0.05user 0.00system 0:00.06elapsed 95%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+4517minor)pagefaults 0swaps

    $ git log -2000 HEAD >/dev/null
    0.10user 0.03system 0:00.14elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+8505minor)pagefaults 0swaps

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 * This is in no way close to even 'pu' yet, but I found it an
   interesting mental exercise with a bit of random hackery.

 sha1-lookup.c |   30 +++++++++++++++++++++++++-----
 1 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/sha1-lookup.c b/sha1-lookup.c
index f5c9094..b309270 100644
--- a/sha1-lookup.c
+++ b/sha1-lookup.c
@@ -50,6 +50,12 @@
  * the midway of the table.  It can reasonably be expected to be near
  * 87% (222/256) from the top of the table.
  *
+ * However, we do not want to pick "mi" too precisely.  If the entry at
+ * the 87% in the above example turns out to be higher than the target
+ * we are looking for, we would end up narrowing the search space down
+ * only by 13%, instead of 50% we would get if we did a simple binary
+ * search.  So we would want to hedge our bets by being less aggressive.
+ *
  * The table at "table" holds at least "nr" entries of "elem_size"
  * bytes each.  Each entry has the SHA-1 key at "key_offset".  The
  * table is sorted by the SHA-1 key of the entries.  The caller wants
@@ -119,11 +125,25 @@ int sha1_entry_pos(const void *table,
 		if (hiv < kyv)
 			return -1 - hi;
 
-		if (kyv == lov && lov < hiv - 1)
-			kyv++;
-		else if (kyv == hiv - 1 && lov < kyv)
-			kyv--;
-
+		/*
+		 * Even if we know the target is much closer to 'hi'
+		 * than 'lo', if we pick too precisely and overshoot
+		 * (e.g. when we know 'mi' is closer to 'hi' than to
+		 * 'lo', pick 'mi' that is higher than the target), we
+		 * end up narrowing the search space by a smaller
+		 * amount (i.e. the distance between 'mi' and 'hi')
+		 * than what we would have (i.e. about half of 'lo'
+		 * and 'hi').  Hedge our bets to pick 'mi' less
+		 * aggressively, i.e. make 'mi' a bit closer to the
+		 * middle than we would otherwise pick.
+		 */
+		kyv = (kyv * 1022 + lov + hiv) / 1024;
+		if (lov < hiv - 1) {
+			if (kyv == lov)
+				kyv++;
+			else if (kyv == hiv)
+				kyv--;
+		}
 		mi = (range - 1) * (kyv - lov) / (hiv - lov) + lo;
 
 		if (debug_lookup) {
-- 
1.5.4.rc2.3.g441ed

^ permalink raw reply related

* Re: Why 'git commit --amend' generates different HEAD sha1 each time when no content changes
From: Ping Yin @ 2007-12-30 11:20 UTC (permalink / raw)
  To: Matthias Kestenholz; +Cc: Git Mailing List
In-Reply-To: <1199012360.15996.6.camel@futex>

On Dec 30, 2007 6:59 PM, Matthias Kestenholz <mk@spinlock.ch> wrote:
>
> On Sun, 2007-12-30 at 18:56 +0800, Ping Yin wrote:
> > AFAIK, commit sha1 is only determined by commit object content (say
> > parent commit, tree sha1 and so on). So why does 'git commit --amend'
> > change the commit sha1 when no content changes, as the following shows?
> >
> The full commit includes a timestamp too, which changed. Try setting the
> GIT_AUTHOR_DATE and GIT_COMMITTER_DATE environment variables, you should
> get the same SHA-1 every time.
>

Thanks. With 'git show --pretty=fuller', I find that the commit date
changes each time while the author date stays the same.



-- 
Ping Yin

^ permalink raw reply

* Re: Why 'git commit --amend' generates different HEAD sha1 each time when no content changes
From: Ping Yin @ 2007-12-30 11:19 UTC (permalink / raw)
  To: Matthias Kestenholz; +Cc: Git Mailing List
In-Reply-To: <1199012360.15996.6.camel@futex>

On Dec 30, 2007 6:59 PM, Matthias Kestenholz <mk@spinlock.ch> wrote:
>
> On Sun, 2007-12-30 at 18:56 +0800, Ping Yin wrote:
> > AFAIK, commit sha1 is only determined by commit object content (say
> > parent commit, tree sha1 and so on). So why does 'git commit --amend'
> > change the commit sha1 when no content changes, as the following shows?
> >
>
> The full commit includes a timestamp too, which changed. Try setting the
> GIT_AUTHOR_DATE and GIT_COMMITTER_DATE environment variables, you should
> get the same SHA-1 every time.
>



-- 
Ping Yin

^ permalink raw reply
