Git development
* Re: VCS comparison table
From: Jakub Narebski @ 2006-10-20 15:28 UTC
  To: Johannes Schindelin; +Cc: git, bazaar-ng
In-Reply-To: <Pine.LNX.4.63.0610201715040.14200@wbgn013.biozentrum.uni-wuerzburg.de>

Johannes Schindelin wrote:

> On Fri, 20 Oct 2006, Jakub Narebski wrote:
> 
>> Christian MICHON wrote:
>> 
>>> - git is the fastest scm around
>> 
>> Mercurial also claims that.
> 
> Funny. When you type in "mercurial" and "benchmark" into Google, the 
> _first_ hit is into "git Archives: Mercurial 0.4b vs git patchbomb 
> benchmark". Performed by the good Mercurial people.
> 
> Leaving git as winner.
 
Check out http://git.or.cz/gitwiki/GitBenchmarks section "Quilt import 
comparison of Git and Mercurial" for the latest (OLS2006) benchmark
by Mercurial. It is probably not indexed by Google, or does not have a high
PageRank, because it is a PDF and fairly new (and therefore has few
"citations").

-- 
Jakub Narebski
Poland

* Re: VCS comparison table
From: J. Bruce Fields @ 2006-10-20 15:33 UTC
  To: Jeff King
  Cc: Carl Worth, Aaron Bentley, Linus Torvalds, Jakub Narebski,
	Andreas Ericsson, bazaar-ng, git
In-Reply-To: <20061020143111.GB17497@coredump.intra.peff.net>

On Fri, Oct 20, 2006 at 10:31:11AM -0400, Jeff King wrote:
> On Thu, Oct 19, 2006 at 01:14:09PM -0400, J. Bruce Fields wrote:
> > So in this case you can certainly lose the launch codes.  But you have
> > forever granted everyone a way to determine whether a given guess at the
> > launch codes is correct.  (Again, assuming some stuff about SHA1).
> 
> In what sense? Yes, you can make a guess if you have stored the SHA1
> that contained the launch codes. But the point is that that particular
> SHA1 is no longer part of the repository.

Well, I thought the discussion was about what meaning references have
after branches were modified or removed.  In which case the interesting
situation is one where an object is gone but someone somewhere still
holds a reference (because the SHA1 was mentioned in a bug report or an
email or whatever).

> Keeping that SHA1 is no easier than just keeping the launch codes in
> the first place.

Could be.

Anyway, the important difference between the SHA1 references and small
integers is that there's no aliasing in the former case.  Which is
important--I'd rather have a reference to nothing than a reference to
the wrong thing....
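
The aliasing point can be sketched in a few lines. This is only a toy model, not how any real VCS stores revisions: the two history lists and the helper content_id are hypothetical, and SHA-1 is used simply because the thread is about it.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Name a piece of content by the SHA-1 of its bytes (toy model)."""
    return hashlib.sha1(data).hexdigest()


# Branch history before and after someone rewrites it.
old_history = [b"initial", b"launch codes", b"fix typo"]
new_history = [b"initial", b"fix typo", b"add tests"]

# A reference recorded in a bug report while old_history was current.
saved_number = 1                          # "revision 1 on this branch"
saved_hash = content_id(old_history[1])

by_hash = {content_id(rev): rev for rev in new_history}

# The small integer silently aliases a different revision...
number_lookup = new_history[saved_number]
# ...while the hash either finds the same content or finds nothing.
hash_lookup = by_hash.get(saved_hash)

print(number_lookup)   # b'fix typo'
print(hash_lookup)     # None
```

The integer still resolves after the rewrite, but to the wrong thing; the hash resolves to nothing, which is the failure mode preferred above.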

--b.

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
From: Aaron Bentley @ 2006-10-20 15:34 UTC
  To: Jakub Narebski; +Cc: bazaar-ng, git
In-Reply-To: <ehao3e$2qv$1@sea.gmane.org>

[-- Attachment #1: Type: text/plain, Size: 2005 bytes --]

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jakub Narebski wrote:
> Aaron Bentley wrote:
>>In Bazaar bundles, the text of the diff is an integral part of the data.
>> It is used to generate the text of all the files in the revision.
> 
> 
> I thought that the diff was combined diff of changes.

It is.  It's a description of how to produce revision X given revision
Y, where Y is the last-merged mainline revision.

>>Bazaar bundles were designed to be used on mailing lists.  So you can
>>review the changes from the diff, comment on them, and if it seems
>>suitable, merge them.
> 
> 
> If you have only mega-diff, you can comment only on this mega-diff.

That is what we prefer to review.

>>>Although that might just make the email bigger for not a lot of
>>>gain.
>>
>>It's my understanding that most changes discussed on lkml are provided
>>as a series of patches.  Bazaar bundles are intended as a direct
>>replacement for patches in that use case.
> 
> 
> As _series_ of patches. You have git-format-patch + git-send-email
> to format and send them, git-am to apply them (as patches, not as branch).

If you want to do it exactly the same way, you send a series of bundles.

The bundle format can also support sending a single bundle that
displays the series of patches, though there's currently no UI to select
this.

> I was under an impression that user sees only mega-patch of all the
> revisions in bundle together, and rest is for machine consumption only.

All of it is for machine consumption.  The MIME-encoded sections are a
series of patches.  They're usually MIME-encoded to avoid confusion with
the overview patch, but this is optional.

I've attached an example of what a combined patch-by-patch bundle looks
like.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFOOyB0F+nu1YWqI0RAtU6AKCJndTNlTTPNnzxZX53lkBUUHTYkwCfePlG
7x3cjpYwh8LXEb5ZWXXmu6s=
=6Lgv
-----END PGP SIGNATURE-----

[-- Attachment #2: hello-world.patch --]
[-- Type: text/x-patch, Size: 1808 bytes --]

# Bazaar revision bundle v0.8
#
# message:
#   Added 'world'
# committer: Aaron Bentley <abentley@panoramicfeedback.com>
# date: Fri 2006-10-20 11:30:21.903000116 -0400

=== modified file world
--- world
+++ world
@@ -1,1 +1,1 @@
-Hello
+Hello, world

=== modified directory  // last-changed:abentley@panoramicfeedback.com-20061020
... 153021-b5fcea14e9cd2b34
# revision id: abentley@panoramicfeedback.com-20061020153021-b5fcea14e9cd2b34
# sha1: 6d553e72158aaa76c258d98c15cd24922d171cd9
# inventory sha1: 64af82c4d81d9d6ad4f33fc734d32c2a1eaa0df5
# parent ids:
#   abentley@panoramicfeedback.com-20061020152951-10cff5ff5a51e9a2
# properties:
#   branch-nick: bar

# message:
#   Capitalized
# committer: Aaron Bentley <abentley@panoramicfeedback.com>
# date: Fri 2006-10-20 11:29:51.953999996 -0400

=== modified file world
--- world
+++ world
@@ -1,1 +1,1 @@
-hello
+Hello

=== modified directory  // last-changed:abentley@panoramicfeedback.com-20061020
... 152951-10cff5ff5a51e9a2
# revision id: abentley@panoramicfeedback.com-20061020152951-10cff5ff5a51e9a2
# sha1: f7b79934bc3b0a944e35168b5df6b106c5b29ebf
# inventory sha1: 1400d56451752300cc31c9c94ff7ee2188e8ef8c
# parent ids:
#   abentley@panoramicfeedback.com-20061020152935-64bde004f622131f
# properties:
#   branch-nick: bar

# message:
#   initial commit
# committer: Aaron Bentley <abentley@panoramicfeedback.com>
# date: Fri 2006-10-20 11:29:35.536999941 -0400

=== added directory  // file-id:TREE_ROOT
=== added file world // file-id:world-20061020152929-12bknd8mm9mx48as-1
--- /dev/null
+++ world
@@ -0,0 +1,1 @@
+hello

# revision id: abentley@panoramicfeedback.com-20061020152935-64bde004f622131f
# sha1: 0728f761b891b257f0a71e2e360799eec080cd21
# inventory sha1: e52e030ea40f6bf5da78f4e8eb8efcd072b0930a
# properties:
#   branch-nick: bar


* Re: VCS comparison table
From: Jakub Narebski @ 2006-10-20 15:34 UTC
  To: Johannes Schindelin
  Cc: Jeff King, Aaron Bentley, Carl Worth, Linus Torvalds,
	Andreas Ericsson, bazaar-ng, git
In-Reply-To: <Pine.LNX.4.63.0610201647420.14200@wbgn013.biozentrum.uni-wuerzburg.de>

Johannes Schindelin wrote:
> On Fri, 20 Oct 2006, Jakub Narebski wrote:
> 
>> Jeff King wrote:
>>> 
>>> I was accustomed to doing such things in CVS, but I find the git way
>>> much more pleasant, since I don't have to do any arithmetic:
>>>   diff d8a60^..d8a60
>> 
>> By the way "diff d8a60" also works (unless d8a60 is merge commit, in
>> which case you would need "diff -c d8a60" or "diff -m d8a60").
> 
> I could be wrong, but I have the impression (even after actually testing 
> it) that "git diff d8a60" is equivalent to "git diff d8a60..HEAD", _not_ 
> "git diff d8a60^..d8a60".

Oops, I mixed up git-diff-tree (which behaves as described above) with
git-diff, which according to the documentation compares against the working
tree (and not HEAD) if only one <tree-ish> is given.

git-diff(1):
       *  When one <tree-ish> is given, the working tree and the named tree are
          compared, using git-diff-index. The option --cached can be given to
          compare the index file and the named tree.

git-diff-tree(1):
       If there is only one <tree-ish> given, the commit is compared with its
       parents (see --stdin below).
-- 
Jakub Narebski
Poland

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
From: Sean @ 2006-10-20 15:37 UTC
  To: Aaron Bentley; +Cc: Alexander Belchenko, bazaar-ng, git
In-Reply-To: <4538D724.5040508@utoronto.ca>

On Fri, 20 Oct 2006 10:03:16 -0400
Aaron Bentley <aaron.bentley@utoronto.ca> wrote:

> In Bazaar bundles, the text of the diff is an integral part of the data.
> It is used to generate the text of all the files in the revision.
> 
> Bazaar bundles were designed to be used on mailing lists.  So you can
> review the changes from the diff, comment on them, and if it seems
> suitable, merge them.

Perhaps I missed something in the earlier mails about this feature.
As I understood it, the email sent has a combined diff that shows
the net effect of all the commits included in the bundle.  (Whereas
the current Cogito version only shows a diffstat)

If the recipient of such a bundle is unable to extract the diff of
each separate commit included in the bundle then I can't see any
value in the feature at all.  But showing a combined diff in the
email may have marginal value, so long as when the bundle is 
imported into the recipient repository the individual commits
are available.

> It's my understanding that most changes discussed on lkml are provided
> as a series of patches.  Bazaar bundles are intended as a direct
> replacement for patches in that use case.

A combined diff of a bunch of changes would usually be most _unwelcome_
for review on lkml.  The constant refrain is to ask people to split their
changes up into smallish individual patches for review.

Sean

* Re: VCS comparison table
From: Johannes Schindelin @ 2006-10-20 15:39 UTC
  To: Jakub Narebski; +Cc: git, bazaar-ng
In-Reply-To: <200610201728.13327.jnareb@gmail.com>

Hi,

On Fri, 20 Oct 2006, Jakub Narebski wrote:

> Johannes Schindelin wrote:
> 
> > On Fri, 20 Oct 2006, Jakub Narebski wrote:
> > 
> >> Christian MICHON wrote:
> >> 
> >>> - git is the fastest scm around
> >> 
> >> Mercurial also claims that.
> > 
> > Funny. When you type in "mercurial" and "benchmark" into Google, the 
> > _first_ hit is into "git Archives: Mercurial 0.4b vs git patchbomb 
> > benchmark". Performed by the good Mercurial people.
> > 
> > Leaving git as winner.
>  
> Check out http://git.or.cz/gitwiki/GitBenchmarks section "Quilt import 
> comparison of Git and Mercurial" for the latest (OLS2006) benchmark
> by Mercurial.

Thanks for the hint!

BTW the tests in Clone/status/pull make sense, especially the "4 times 
slower on pull/merge". In my tests, merge-recur (the default merge 
strategy, which was written in Python, and is now in C) was substantially 
faster.

> Probably not indexed by Google, or doesn't have high pagerank because it 
> is in PDF and fairly new (therefore has low "citations" number).

I hope these posts boost the pagerank.

Ciao,
Dscho

* Re: VCS comparison table
From: Jeff King @ 2006-10-20 15:43 UTC
  To: J. Bruce Fields
  Cc: Carl Worth, Aaron Bentley, Linus Torvalds, Jakub Narebski,
	Andreas Ericsson, bazaar-ng, git
In-Reply-To: <20061020153323.GA12886@fieldses.org>

On Fri, Oct 20, 2006 at 11:33:23AM -0400, J. Bruce Fields wrote:

> Well, I thought the discussion was about what meaning references have
> after branches were modified or removed.  In which case the interesting
> situation is one where an object is gone but someone somewhere still
> holds a reference (because the SHA1 was mentioned in a bug report or an
> email or whatever).

Git tries very hard to make sure you don't have a reference to something
that doesn't exist. But yes, you could have a reference to the SHA1 in
another, non-git source, and try to guess the data from it. However,
there's a bit of a two-step procedure, since the SHA1 will likely be of
the commit. You have to guess the commit author, date, message, and
the contents of the rest of the tree to make a correct guess.
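
The two-step nature of the guess can be made concrete. The sketch below assumes the usual git object naming, SHA-1 over a "<type> <size>\0" header followed by the body; the helper names, the file name, the identity and the timestamps are made up for illustration.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    """SHA-1 over '<type> <size>' + NUL + body, the way git names objects."""
    header = f"{obj_type} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree: str, ident: str, when: str, message: str) -> bytes:
    """Build the body of a parentless commit (hypothetical helper)."""
    return (
        f"tree {tree}\n"
        f"author {ident} {when}\n"
        f"committer {ident} {when}\n"
        f"\n{message}\n"
    ).encode()


# Step 1: guessing the secret content only yields a blob id...
blob = git_object_id("blob", b"launch codes: 0000\n")

# ...which is buried inside a tree (mode, name, NUL, raw 20-byte id)...
tree = git_object_id("tree", b"100644 codes\0" + bytes.fromhex(blob))

# Step 2: ...and the SHA1 quoted in an email is usually a commit id, which
# also covers the identities, the timestamps and the message.
ident = "A U Thor <author@example.com>"
first = git_object_id("commit", commit_body(tree, ident, "1161358991 -0400", "add codes"))
second = git_object_id("commit", commit_body(tree, ident, "1161358992 -0400", "add codes"))

print(first == second)  # False: one second of difference changes the id
```

So a correct guess at the file content alone is not enough to confirm a commit id; every other field of the commit has to be guessed exactly as well.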

In practice I think most "launch code" scenarios are less about
guessable confidentiality, and more about ceasing to publish things you
shouldn't be (like copyright or patent encumbered code).

-Peff

* Re: [PATCH] Don't use $author_name undefined when $from contains no /\s</.
From: Jakub Narebski @ 2006-10-20 15:48 UTC
  To: git; +Cc: bug-gnu-utils
In-Reply-To: <87pscnj29t.fsf@penguin.cs.ucla.edu>

Paul Eggert wrote:

> Junio C Hamano <junkio@cox.net> writes:
> 
>> If "trailing space" highlighting picks up the first column blank
>> in "diff -u" output, that highlighting feature is *broken*.
> 
> If the buffer contains arbitrary text, some of which is diff -u output
> and some of which is not, then it isn't possible in general for the
> highlighting mode to distinguish between the diff -u part and the
> other part.

Not true. If GNU patch (and git-apply) can detect where a diff begins,
and can detect whether a diff was truncated, then a highlighting mode can
distinguish between the diff -u part and the rest... well, unless you
intermix diff -u output and arbitrary text (in which case the patch would
not apply, but that is what happens when commenting on a patch).

Still I'd rather relax highlighting code to not highlight "SPC LF"
than to change diff -u format.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

* Re: VCS comparison table
From: Linus Torvalds @ 2006-10-20 15:58 UTC
  To: Jakub Narebski; +Cc: git
In-Reply-To: <eha9no$5t7$1@sea.gmane.org>



On Fri, 20 Oct 2006, Jakub Narebski wrote:
> Junio C Hamano wrote:
> > 
> > An interesting effect on this is when people have a column for
> > merge performance in a SCM comparison table, they would include
> > time to run the diffstat as part of the time spent for merging
> > when they fill in the number for git, but not for any other SCM.
> 
> So if you want to compare merge performance with other SCM, you should
> either add time to run diffstat for other SCM, or subtract time to
> run "git diff-tree --stat".

Naah. Just run "git pull -n". It's even documented:

	OPTIONS
	       -n, --no-summary
	              Do not show diffstat at the end of the merge.

so while the _default_ is to always show the diffstat, you certainly can 
easily do without it.

		Linus

* Re: [PATCH 2/2] Remove dead code after direct graph drawing
From: Jakub Narebski @ 2006-10-20 16:01 UTC
  To: git
In-Reply-To: <e5bfff550610200449j245f9014r984b8372fcd602d0@mail.gmail.com>

Marco Costalba wrote:

> On 10/20/06, Josef Weidendorfer <Josef.Weidendorfer@gmx.de> wrote:
>> On Thursday 19 October 2006 16:13, Josef Weidendorfer wrote:
>> > Signed-off-by: Josef Weidendorfer <Josef.Weidendorfer@gmx.de>
>>
>> Hmmm...
>>
>> Is the git mailing list the right place for qgit patches?
> 
> Yes, I don't see other competitors ;-)
> 
>> Probably, I should have prefixed them with "qgit:" ...

Or use [PATCH (qgit)] or equivalent...
 
> No problem, I should find them anyway, and I don't need to manually
> remove the "qgit" prefix before applying them to the repository.

...which would be stripped automatically
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

* Re: VCS comparison table
From: Jakub Narebski @ 2006-10-20 16:05 UTC
  To: Johannes Schindelin; +Cc: git, bazaar-ng
In-Reply-To: <Pine.LNX.4.63.0610201736440.14200@wbgn013.biozentrum.uni-wuerzburg.de>

Johannes Schindelin wrote:
> On Fri, 20 Oct 2006, Jakub Narebski wrote:
>> Johannes Schindelin wrote:
>>> On Fri, 20 Oct 2006, Jakub Narebski wrote:
>>> 
>>>> Christian MICHON wrote:
>>>> 
>>>>> - git is the fastest scm around
>>>> 
>>>> Mercurial also claims that.
>>> 
>>> Funny. When you type in "mercurial" and "benchmark" into Google, the 
>>> _first_ hit is into "git Archives: Mercurial 0.4b vs git patchbomb 
>>> benchmark". Performed by the good Mercurial people.
>>> 
>>> Leaving git as winner.
>>  
>> Check out http://git.or.cz/gitwiki/GitBenchmarks section "Quilt import 
>> comparison of Git and Mercurial" for the latest (OLS2006) benchmark
>> by Mercurial.
> 
> Thanks for the hint!
> 
> BTW the tests in Clone/status/pull make sense, especially the "4 times 
> slower on pull/merge". In my tests, merge-recur (the default merge 
> strategy, which was written in Python, and is now in C) was substantially 
> faster.

As was mentioned elsewhere in this thread, to compare pull/merge times
in git with other SCMs one should in principle subtract the time for
diffstat (git diff --stat).

-- 
Jakub Narebski
Poland

* [PATCH] cogito: Honor either post-commit script name; fail if both are executable
From: Jim Meyering @ 2006-10-20 16:15 UTC
  To: git

Hi,

I promised this patch some time ago, made the changes,
and then never sent them.  This is slightly different
from the current implementation in that it fails when both
scripts are executable.  Also, it factors out the script names and
adds tests.

Signed-off-by: Jim Meyering <jim@meyering.net>
---
 cg-commit        |   43 ++++++++++++++++++++++++++++++-------------
 t/t9800-hooks.sh |   37 +++++++++++++++++++++++++++++++++++++
 2 files changed, 67 insertions(+), 13 deletions(-)

diff --git a/cg-commit b/cg-commit
index 01a4eb7..ee4769e 100755
--- a/cg-commit
+++ b/cg-commit
@@ -148,11 +148,11 @@ #	If the file exists it will be used as
 #	the commit message. The template file makes it possible to
 #	automatically add `Signed-off-by` line to the log message.
 #
-# $GIT_DIR/hooks/commit-post::
+# $GIT_DIR/hooks/post-commit:: (legacy: commit-post)
 #	If the file exists and is executable it will be executed upon
 #	completion of the commit. The script is passed two arguments.
 #	The first argument is the commit ID and the second is the
-#	branchname. A sample `commit-post` script might look like:
+#	branchname. A sample `post-commit` script might look like:
 #
 #	#!/bin/sh
 #	id=$1
@@ -680,18 +680,35 @@ elif [ "$newhead" ]; then
 		branchname="$(cat "$_git/branch-name")"
 	fi
 	[ -z "$branchname" ] && [ "$_git_head" != "master" ] && branchname="$_git_head"
-	if [ -x "$_git/hooks/commit-post" -o -x "$_git/hooks/post-commit" ] && [ ! "$no_hooks" ]; then
-		if [ "$(git-repo-config --bool cogito.hooks.commit.post.allmerged)" = "true" ]; then
-			# We just hope that for the initial commit, the user didn't
-			# manage to install the hook yet.
-			for merged in $(git-rev-list $newhead ^$oldhead | tac); do
-				[ -x "$_git/hooks/commit-post" ] && "$_git/hooks/commit-post" "$merged" "$branchname"
-				[ -x "$_git/hooks/post-commit" ] && "$_git/hooks/post-commit" "$merged" "$branchname"
-			done
-		else
-			[ -x "$_git/hooks/commit-post" ] && "$_git/hooks/commit-post" "$newhead" "$branchname"
-			[ -x "$_git/hooks/post-commit" ] && "$_git/hooks/post-commit" "$newhead" "$branchname"
+
+	if [ "$no_hooks" ]; then
+		exit 0
+	fi
+
+	# Decide which spelling to use for the post-commit hook script name.
+	post_commit="$_git/hooks/post-commit"
+	old_name="$_git/hooks/commit-post"
+	if [ -x "$post_commit" ]; then
+		if [ -x "$old_name" ]; then
+			die "both $post_commit and $old_name are executable."
 		fi
+		# This is the expected case: use $post_commit
+	else
+		if [ ! -x "$old_name" ]; then
+			# neither is executable: do nothing
+			exit 0
+		fi
+		post_commit=$old_name
+	fi
+
+	if [ "$(git-repo-config --bool cogito.hooks.commit.post.allmerged)" = "true" ]; then
+		# We just hope that for the initial commit, the user didn't
+		# manage to install the hook yet.
+		for merged in $(git-rev-list $newhead ^$oldhead | tac); do
+			"$post_commit" "$merged" "$branchname"
+		done
+	else
+		"$post_commit" "$newhead" "$branchname"
 	fi
 
 	exit 0
diff --git a/t/t9800-hooks.sh b/t/t9800-hooks.sh
new file mode 100755
index 0000000..a54e35f
--- /dev/null
+++ b/t/t9800-hooks.sh
@@ -0,0 +1,37 @@
+#!/usr/bin/env bash
+#
+# Copyright (c) 2006 Jim Meyering
+#
+test_description="Test the commit hooks."
+
+. ./test-lib.sh
+rm -rf .git
+
+echo x > f
+test_expect_success 'initialize repo' \
+	'(cg-init -m"Initial commit")'
+
+commit_post=.git/hooks/commit-post
+post_commit=.git/hooks/post-commit
+cat > $post_commit <<\EOF
+#!/bin/sh
+echo $0
+exit 0
+EOF
+cp $post_commit $commit_post
+
+chmod a+x $post_commit
+test_expect_success 'test the post-commit name' \
+	'(echo 1 >> f; test "`cg-commit -m x|grep git/hooks`" = $post_commit)'
+
+chmod a-x $post_commit
+chmod a+x $commit_post
+
+test_expect_success 'test the legacy commit-post name' \
+	'(echo 2 >> f; test "`cg-commit -m x|grep git/hooks`" = $commit_post)'
+
+chmod a+x $post_commit
+test_expect_failure 'fail when both are executable' \
+	'(echo 3 >> f; cg-commit -m x)'
+
+test_done
-- 
1.4.3.ge193
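
For readers skimming the diff, the selection rule it introduces can be restated outside of shell. The following is a rough Python restatement, not part of the patch: choose_hook, HookConflict and make_script are hypothetical names, and only the two hook file names and the error text are taken from the patch.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


class HookConflict(Exception):
    """Both spellings of the hook are executable."""


def choose_hook(hooks_dir: Path) -> Optional[Path]:
    """Return the post-commit hook to run, or None if neither is executable."""
    new_name = hooks_dir / "post-commit"
    old_name = hooks_dir / "commit-post"      # legacy spelling
    new_ok = os.access(new_name, os.X_OK)
    old_ok = os.access(old_name, os.X_OK)
    if new_ok and old_ok:
        raise HookConflict(f"both {new_name} and {old_name} are executable.")
    if new_ok:
        return new_name
    if old_ok:
        return old_name
    return None


def make_script(path: Path, executable: bool) -> None:
    """Write a trivial hook script and set or clear its execute bits."""
    path.write_text("#!/bin/sh\nexit 0\n")
    path.chmod(0o755 if executable else 0o644)


with tempfile.TemporaryDirectory() as tmp:
    hooks = Path(tmp)
    make_script(hooks / "post-commit", executable=False)
    make_script(hooks / "commit-post", executable=True)
    chosen = choose_hook(hooks)
    print(chosen.name)  # commit-post
```

Raising on the ambiguous case, rather than running both scripts as the old code did, mirrors the die call in the patch.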

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
From: Jakub Narebski @ 2006-10-20 16:21 UTC
  To: Aaron Bentley; +Cc: bazaar-ng, git
In-Reply-To: <4538EC8F.7020502@utoronto.ca>

Aaron Bentley wrote:

> === added directory  // file-id:TREE_ROOT

Gaaah, so rename detection in bzr is done using file-ids?
Linus will tell you the inherent problems with that "solution".
-- 
Jakub Narebski
Poland

* Re: [PATCH] Don't use $author_name undefined when $from contains no /\s</.
From: Linus Torvalds @ 2006-10-20 16:21 UTC
  To: Junio C Hamano; +Cc: Paul Eggert, git
In-Reply-To: <7vhcxzpgot.fsf@assigned-by-dhcp.cox.net>



On Fri, 20 Oct 2006, Junio C Hamano wrote:
> 
> Coding a workaround is not a big deal; the change is simple and
> trivial.

Yeah, I sent Junio a patch that _should_ make git accept the patches 
already, so technically it was easy.

What irritates me personally about the new format for "-u" is that

 - Maybe "-u" is new as far as _POSIX_ is concerned, but daamn, it's been 
   a standard format for a hell of a long time in real life, and this was 
   a totally gratuitous change.

 - The new format is very much a new "special case". Now a totally empty 
   line means exactly the same as a line that is " \n", so we have a new 
   special case that simply didn't use to exist - we used to be able to 
   just always skip the first character on a line, and consider the rest 
   of the line to be "the data". Now you can't do that any more.

   The fact that GNU patch has always accepted total crap patches, has 
   always been a thorn in my side: GNU patch is simply too accepting by 
   default if you care about the integrity of the end result (I always ran 
   it with "-p1 --fuzz=0" just to at least fix the most egregious cases of 
   "we'll accept anything that looks even _remotely_ likely to apply")

 - git-apply was being very strict with patches on purpose. The "empty 
   line in a patch" error has triggered several times for me, and at least 
   so far it has _not_ ever been due to a new GNU patch, but every time 
   due to a broken mailer or somebody not being careful when editing the 
   patch by hand.  So triggering an error has been the _right_ thing to 
   do so far - it's been a big red sign saying "somebody did something bad 
   to this patch".

so I think the new format is strictly speaking a regression. It takes away 
a good sanity-check, and we're stuck with having to handle old-style 
patches _anyway_ for the foreseeable future, so we can't replace it with a 
new sanity check.

But it does seem like we have no choice, simply because people apparently 
already use the broken version.
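
The lost sanity check is easy to see in code. This is a sketch of the idea only, not git-apply's actual parser; classify_hunk_line and its lenient flag are hypothetical.

```python
def classify_hunk_line(line: str, lenient: bool = False):
    """Split one unified-diff hunk line into (kind, data).

    Strict mode relies on the old invariant: the first character is the
    marker and everything after it is the data. Lenient mode adds the
    special case for a completely empty line, read as empty context.
    """
    text = line.rstrip("\n")
    if text == "":
        if lenient:
            return ("context", "")
        raise ValueError("empty line in hunk: patch may be corrupted")
    kinds = {" ": "context", "+": "add", "-": "remove", "\\": "no-newline"}
    marker, data = text[0], text[1:]
    if marker not in kinds:
        raise ValueError(f"unexpected hunk line marker {marker!r}")
    return (kinds[marker], data)


print(classify_hunk_line(" \n"))                 # ('context', '')
print(classify_hunk_line("\n", lenient=True))    # ('context', '')
```

In lenient mode the two inputs are indistinguishable, so a line emptied by a broken mailer can no longer be told apart from legitimate context.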

			Linus

* Re: VCS comparison table
From: Jakub Narebski @ 2006-10-20 16:24 UTC
  To: git; +Cc: bazaar-ng
In-Reply-To: <200610201805.40235.jnareb@gmail.com>

Jakub Narebski wrote:

> Johannes Schindelin wrote:
>> On Fri, 20 Oct 2006, Jakub Narebski wrote:
>>> Johannes Schindelin wrote:
>>>> On Fri, 20 Oct 2006, Jakub Narebski wrote:
>>>> 
>>>>> Christian MICHON wrote:
>>>>> 
>>>>>> - git is the fastest scm around
>>>>> 
>>>>> Mercurial also claims that.
>>>> 
>>>> Funny. When you type in "mercurial" and "benchmark" into Google, the 
>>>> _first_ hit is into "git Archives: Mercurial 0.4b vs git patchbomb 
>>>> benchmark". Performed by the good Mercurial people.
>>>> 
>>>> Leaving git as winner.
>>>  
>>> Check out http://git.or.cz/gitwiki/GitBenchmarks section "Quilt import 
>>> comparison of Git and Mercurial" for the latest (OLS2006) benchmark
>>> by Mercurial.
>> 
>> Thanks for the hint!
>> 
>> BTW the tests in Clone/status/pull make sense, especially the "4 times 
>> slower on pull/merge". In my tests, merge-recur (the default merge 
>> strategy, which was written in Python, and is now in C) was substantially 
>> faster.
> 
> As it was mentioned somewhere else in this thread, to compare times
> for pull/merge in git with other SCM one should in principle subtract
> time for diffstat/git diff --stat.

Or, as Linus pointed out, use the -n (--no-summary) option to git pull.

BTW. I'd rather have -n == --no-commit for git pull...
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

* Re: Signed git-tag doesn't find default key
From: Linus Torvalds @ 2006-10-20 16:32 UTC
  To: Andy Parkins; +Cc: git
In-Reply-To: <200610201004.17263.andyparkins@gmail.com>



On Fri, 20 Oct 2006, Andy Parkins wrote:
> 
> I did this:
> 
> $ git tag -s adp-sign-tag
> gpg: skipped "Andy Parkins <andyparkins@gmail.com>": secret key not available
> gpg: signing failed: secret key not available
> failed to sign the tag with GPG.

I would suggest one of two things:

 - specify the signing entity explicitly:

	git tag -u "andyparkins@gmail.com" adp-sign-tag

 - or just add a new alternate user ID to match the full git user ID.

Currently, your pgp key has the full ID "Andy Parkins (Google) 
<andyparkins@gmail.com>", and the way gpg matches ID's, that will _not_ 
match an ID of "Andy Parkins <andyparkins@gmail.com>"

But you can just do something like

	gpg --edit-key andyparkins@gmail.com

and then do an "adduid", and then add your UID _without_ the "(Google)" in 
there, and that should solve all your problems.

> So when git-tag looks for "Andy Parkins <andyparkins@gmail.com>"; it's not 
> found.  The answer is (I think) to search only on the email address when 
> looking for a key.  I've simply changed git-tag to have
> 
> username=$(git-repo-config user.email)
> 
> However, this is clearly wrong as what it actually wants is the committer 
> email.  Am I safe to simply process the $tagger variable to extract it?

You're probably better off with something like

	git var GIT_COMMITTER_IDENT | sed 's/\(.*\)<\(.*\)>\(.*\)/\2/'

which should work, but see above: I think you literally are better off 
just adding an alias to your PGP key that doesn't have the comment field.
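
To see what the sed expression extracts, and why the original lookup fails, here is a small Python sketch. The regular expression mirrors the sed command above; the substring test is an assumption about how gpg matches a plain user ID string by default, the timestamp in ident is invented, and key_uid is copied from the message.

```python
import re


def committer_email(ident: str) -> str:
    r"""Python equivalent of: sed 's/\(.*\)<\(.*\)>\(.*\)/\2/'"""
    return re.sub(r"(.*)<(.*)>(.*)", r"\2", ident.strip())


# Shape of `git var GIT_COMMITTER_IDENT` output; the timestamp is made up.
ident = "Andy Parkins <andyparkins@gmail.com> 1161361937 +0100"
key_uid = "Andy Parkins (Google) <andyparkins@gmail.com>"

email = committer_email(ident)
print(email)  # andyparkins@gmail.com

# Assumed gpg behaviour: case-insensitive substring match on the user ID.
full_id_matches = "Andy Parkins <andyparkins@gmail.com>".lower() in key_uid.lower()
email_matches = email.lower() in key_uid.lower()
print(full_id_matches)  # False: "(Google)" sits between name and address
print(email_matches)    # True
```

Under that assumption, searching by address alone works with the existing key, and adding a UID without the comment makes the full-ID search work too.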

That said, I've never understood why gpg matches on the comment field. 
Dammit, it _should_ find the key anyway. Stupid program.

		Linus

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
From: Aaron Bentley @ 2006-10-20 17:03 UTC
  To: Jakub Narebski; +Cc: bazaar-ng, git
In-Reply-To: <200610201821.34712.jnareb@gmail.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jakub Narebski wrote:
> Aaron Bentley wrote:
> 
> 
>>=== added directory  // file-id:TREE_ROOT
> 
> 
> Gaaah, so rename detection in bzr is done using file-ids?
> Linus will tell you the inherent problems with that "solution".

All solutions have disadvantages.  We prefer the disadvantages that come
from using file-ids over the disadvantages that come from using
content-based rename detection.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFOQFo0F+nu1YWqI0RAlCnAJwIqwuPG/IPBBQWaGyEImTm4GMP6QCfTV89
QZaMQsTqXBH8wrt7VKAHpII=
=Qx2i
-----END PGP SIGNATURE-----

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
From: Linus Torvalds @ 2006-10-20 17:18 UTC
  To: Aaron Bentley; +Cc: Jakub Narebski, bazaar-ng, git
In-Reply-To: <45390168.6020502@utoronto.ca>



On Fri, 20 Oct 2006, Aaron Bentley wrote:
> 
> All solutions have disadvantages.  We prefer the disadvantages that come
> from using file-ids over the disadvantages that come from using
> content-based rename detection.

That's fine, but please don't call the git rename handling "maybe" or 
"partial", like a lot of people seem to do. 

Git _definitely_ handles renames, both in everyday life and when merging. 
Some people may not like how it's done, but other (I'll say "equally 
informed", even though obviously I know better ;) people really don't like 
the way bzr or others do their rename handling.

			Linus

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
From: Shawn Pearce @ 2006-10-20 17:21 UTC
  To: Aaron Bentley; +Cc: bazaar-ng, git, Jakub Narebski
In-Reply-To: <45390168.6020502@utoronto.ca>

Aaron Bentley <aaron.bentley@utoronto.ca> wrote:
> Jakub Narebski wrote:
> > Gaaah, so rename detection in bzr is done using file-ids?
> > Linus will tell you the inherent problems with that "solution".
> 
> All solutions have disadvantages.  We prefer the disadvantages that come
> from using file-ids over the disadvantages that come from using
> content-based rename detection.

As good as the content-based rename detection is, I got burned
recently by it.

I renamed hundreds of small files in one shot and also did a few
hundred adds and deletes of other small XML files.  Git generated
a lot of those unrelated adds/deletes as rename/modifies, as their
content was very similar.  Some people involved in the project
freaked as the files actually had nothing in common with one
another... except for a lot of XML elements (as they shared the
same DTD).
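
The effect is easy to reproduce with any line-based similarity score. The sketch below uses Python's difflib, which is not the metric git uses; the 50% threshold is git's documented default for -M, and the two XML files are invented.

```python
from difflib import SequenceMatcher

deleted_file = """<?xml version="1.0"?>
<!DOCTYPE config SYSTEM "app.dtd">
<config>
  <name>printer</name>
  <enabled>true</enabled>
  <retries>3</retries>
</config>
""".splitlines()

added_file = """<?xml version="1.0"?>
<!DOCTYPE config SYSTEM "app.dtd">
<config>
  <name>scanner</name>
  <enabled>true</enabled>
  <retries>3</retries>
</config>
""".splitlines()

RENAME_THRESHOLD = 0.5  # git's default -M similarity is 50%

# Fraction of lines the two files share, in order.
similarity = SequenceMatcher(None, deleted_file, added_file).ratio()
looks_like_rename = similarity >= RENAME_THRESHOLD
print(looks_like_rename)  # True
```

With small files, the boilerplate shared through a common DTD can outweigh the one line that actually distinguishes them.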

* Re: VCS comparison table
From: David Lang @ 2006-10-20 17:23 UTC
  To: Johannes Schindelin; +Cc: Jakub Narebski, git, bazaar-ng
In-Reply-To: <Pine.LNX.4.63.0610201335420.14200@wbgn013.biozentrum.uni-wuerzburg.de>

On Fri, 20 Oct 2006, Johannes Schindelin wrote:

> On Fri, 20 Oct 2006, Jakub Narebski wrote:
>
>> Johannes Schindelin wrote:
>>
>>> On Fri, 20 Oct 2006, Lachlan Patrick wrote:
>>>
>>>> How does git disambiguate SHA1 hash collisions?
>>>
>>> It does not. You can fully expect the universe to go down before that
>>> happens.
>>
>> Or you can compile git with COLLISION_CHECK
>>
>>> From Makefile:
>> # Define COLLISION_CHECK below if you believe that SHA1's
>> # 1461501637330902918203684832716283019655932542976 hashes do not give you
>> # sufficient guarantee that no collisions between objects will ever happen.
>
> You can document your disbelief.
>
> But it does not change a thing. Since v0.99~653, we do not have any
> collision check, even if compiled with COLLISION_CHECK.

I had the same disbelief as you about this; however, the last time this came up 
Linus pointed out something that satisfied me.

any action in git that could create or recreate an object will not overwrite 
an object that it thinks it already has.

so

if you create a new local file that would conflict and save it, git will accept 
your save and throw away the new file.

if you pull from a remote repository and there is a file there that conflicts 
with a file you already have it will throw away the new file.

if you pull from a remote repository and someone has hacked it to replace a file 
with a bad one, if you already have the good one git will throw away the bad 
one.

as a result the worst case is that a new file being checked in doesn't really 
get in, and when someone checks it out and tries to use it they get the old 
contents. In the case of code, it's extremely unlikely that the wrong code will 
even compile, let alone do anything remotely close to working correctly. At this 
point the fix is to go back to the original developer to get the correct 
version while additional changes are made to git (and remember that, unless this 
is a brand new file, the prior version is readily available, so only the latest 
diff needs to be recovered).

so the odds are extremely low and the consequences of a collision are fairly 
minor.

git has (or had) an option to actually check the full contents before throwing 
away the new copy instead of just checking the hash (and throwing an error if 
the contents don't match), but the performance cost of this is pretty high.
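
The behaviour described above can be sketched as a toy content-addressed
store. This is only an illustration, not git's real object code: the
ObjectStore class, the collision_check flag and the deliberately weak
weak_hash function are all made up for the example, the latter so that a
collision can actually be produced.

```python
import hashlib


class CollisionError(Exception):
    """Raised when two different contents map to the same object name."""


class ObjectStore:
    """Toy content-addressed store (hypothetical, not git's on-disk format)."""

    def __init__(self, hash_fn=None, collision_check=False):
        # hash_fn is injectable only so that a collision can be simulated.
        self._hash = hash_fn or (lambda data: hashlib.sha1(data).hexdigest())
        self._objects = {}
        self._collision_check = collision_check

    def write(self, data):
        name = self._hash(data)
        if name in self._objects:
            # Optional (and more expensive) full-content comparison.
            if self._collision_check and self._objects[name] != data:
                raise CollisionError(name)
            # Never overwrite: the object already in the store wins.
            return name
        self._objects[name] = data
        return name

    def read(self, name):
        return self._objects[name]


def weak_hash(data):
    """Deliberately bad 8-bit hash, so b"ab" and b"ba" collide."""
    return "%02x" % (sum(data) % 256)


# The figure quoted from the Makefile is the number of 160-bit values.
SHA1_BITS = hashlib.sha1().digest_size * 8
SHA1_NAMES = 2 ** SHA1_BITS
```

With weak_hash, writing b"ab" and then b"ba" returns the same name and a
later read gives back b"ab": the new content is silently dropped. With
collision_check=True the second write raises CollisionError instead.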

David Lang

^ permalink raw reply

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
From: Jakub Narebski @ 2006-10-20 17:45 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Aaron Bentley, bazaar-ng, git
In-Reply-To: <Pine.LNX.4.64.0610201016490.3962@g5.osdl.org>

Linus Torvalds wrote:
> 
> On Fri, 20 Oct 2006, Aaron Bentley wrote:
>> 
>> All solutions have disadvantages.  We prefer the disadvantages that come
>> from using file-ids over the disadvantages that come from using
>> content-based rename detection.

If I remember correctly, git chose rename detection based on content
(plus filename) similarity because 1) it is more generic, as it covers
(or can cover) content moving between files and not only the wholesale
rename of a file, and 2) file-id based rename handling works only if
you explicitly use an SCM command to rename the file, which is not the
case for a non-SCM-aware channel such as patches (and accepting
ordinary patches is important for the Linux kernel, the project git
was created for).

Another problem with file-id based rename handling is that it does not
handle file copying (correct me if I'm wrong), and that it has trouble
when a file is removed or renamed and a new file then takes the old name.
 
> That's fine, but please don't call the git rename handling "maybe" or 
> "partial", like a lot of people seem to do. 
> 
> Git _definitely_ handles renames, both in everyday life and when merging. 
> Some people may not like how it's done, but other (I'll say "equally 
> informed", even though obviously I know better ;) people really don't like 
> the way bzr or others do their rename handling.

I think that "partial" refers to the incomplete handling of renames
in file history: limiting by pathspec doesn't follow a file across
renames. The information is there in the SCM; it's the tools that need
extending (see the proposal for a --follow option that follows renames
under a single-file pathspec limit).

There was also a suggestion of rr2-cache, which would record corrections
to automatic rename detection (resolving rename/copy conflicts),
if I remember correctly.
-- 
Jakub Narebski
Poland

^ permalink raw reply

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
From: Aaron Bentley @ 2006-10-20 17:47 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jakub Narebski, bazaar-ng, git
In-Reply-To: <Pine.LNX.4.64.0610201016490.3962@g5.osdl.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Linus Torvalds wrote:
> 
> On Fri, 20 Oct 2006, Aaron Bentley wrote:
> 
>>All solutions have disadvantages.  We prefer the disadvantages that come
>>from using file-ids over the disadvantages that come from using
>>content-based rename detection.
> 
> 
> That's fine, but please don't call the git rename handling "maybe" or 
> "partial", like a lot of people seem to do. 
> 
> Git _definitely_ handles renames, both in everyday life and when merging.

Hmm.  Could you say more here?  The only examples I can think of for
handling renames are situations that can be expressed as a merge.

For example, populating a working tree can be expressed as:
BASE: nothing
THIS: nothing
OTHER: aabbccddee

Or revert can be expressed as

BASE: current
THIS: current
OTHER: aabbccddee

Or fast-forward pull

BASE: last-commit
THIS: current
OTHER: aabbccddee

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFOQuv0F+nu1YWqI0RAotBAKCEEzvh1Cc2jJH4NIEBwoYrDJlbUQCgiPBF
DZ4+hSbkjbvgOwbT4+oLzFA=
=wSgK
-----END PGP SIGNATURE-----
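
The three BASE/THIS/OTHER cases in the message above can be tried out with
a rough whole-file three-way merge. This is a sketch under simplifying
assumptions: trees are plain {path: content} dicts, files are merged as
opaque blobs with no line-level merging and no rename handling, and the
name merge3 is invented for the example. It is neither bzr's nor git's
merge code.

```python
def merge3(base, this, other):
    """Whole-file three-way merge over {path: content} dicts.

    A missing path means "file absent". Returns (merged, conflicts),
    where conflicts lists paths changed differently on both sides.
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(this) | set(other)):
        b, t, o = base.get(path), this.get(path), other.get(path)
        if t == o:
            result = t            # both sides agree
        elif t == b:
            result = o            # only OTHER changed: take it
        elif o == b:
            result = t            # only THIS changed: keep it
        else:
            conflicts.append(path)
            result = t            # keep THIS and report the conflict
        if result is not None:
            merged[path] = result
    return merged, conflicts
```

Populating a working tree is then merge3({}, {}, target), a revert is
merge3(current, current, target), and a fast-forward pull is
merge3(last_commit, current, target), which keeps uncommitted local edits
to files that the pulled revision did not touch.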

^ permalink raw reply

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
From: Linus Torvalds @ 2006-10-20 17:48 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Aaron Bentley, Jakub Narebski, bazaar-ng, git
In-Reply-To: <20061020172125.GF18019@spearce.org>



On Fri, 20 Oct 2006, Shawn Pearce wrote:
> 
> I renamed hundreds of small files in one shot and also did a few
> hundred adds and deletes of other small XML files.  Git generated
> a lot of those unrelated adds/deletes as rename/modifies, as their
> content was very similar.  Some people involved in the project
> freaked as the files actually had nothing in common with one
> another... except for a lot of XML elements (as they shared the
> same DTD).

Heh. We can probably tweak the heuristics (one of the _great_ things about 
content detection is that you can fix it after the fact, unlike the 
alternative).

That said, I've personally actually found the content-based similarity 
analysis to often be quite informative, even when (and perhaps 
_especially_ when) it ended up showing something that the actual author of 
the thing didn't intend.

So yeah, I've seen a few strange cases myself, but they've actually been 
interesting. Like seeing how much of a file was just a copyright license, 
and then a file being considered a "copy" just because it didn't actually 
introduce any real new code.

			Linus

^ permalink raw reply

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
From: David Lang @ 2006-10-20 17:58 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Shawn Pearce, Aaron Bentley, Jakub Narebski, bazaar-ng, git
In-Reply-To: <Pine.LNX.4.64.0610201045550.3962@g5.osdl.org>

On Fri, 20 Oct 2006, Linus Torvalds wrote:

> On Fri, 20 Oct 2006, Shawn Pearce wrote:
>>
>> I renamed hundreds of small files in one shot and also did a few
>> hundred adds and deletes of other small XML files.  Git generated
>> a lot of those unrelated adds/deletes as rename/modifies, as their
>> content was very similar.  Some people involved in the project
>> freaked as the files actually had nothing in common with one
>> another... except for a lot of XML elements (as they shared the
>> same DTD).
>
> Heh. We can probably tweak the heuristics (one of the _great_ things about
> content detection is that you can fix it after the fact, unlike the
> alternative).
>
> That said, I've personally actually found the content-based similarity
> analysis to often be quite informative, even when (and perhaps
> _especially_ when) it ended up showing something that the actual author of
> the thing didn't intend.
>
> So yeah, I've seen a few strange cases myself, but they've actually been
> interesting. Like seeing how much of a file was just a copyright license,
> and then a file being considered a "copy" just because it didn't actually
> introduce any real new code.
>

isn't the default to consider them a copy if they are 80% the same, with a 
command line option to tweak this (IIRC -m, but I could easily be wrong)?
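
For what it's worth, the git-diff documentation describes the option as
-M[<n>] with a default similarity index of 50%. The threshold idea itself
can be sketched as below. This is not git's similarity algorithm: the
score here comes from Python's difflib, and the names similarity and
detect_renames and the sample XML files are invented for the example.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Line-based similarity in [0, 1]; a stand-in for git's own score."""
    return SequenceMatcher(None, a.splitlines(), b.splitlines()).ratio()


def detect_renames(deleted, added, threshold=0.5):
    """Pair deleted and added files whose contents are similar enough.

    deleted, added: {path: content}. Returns [(old, new, score)], best
    match first; each path is used at most once.
    """
    candidates = sorted(
        ((similarity(old_text, new_text), old, new)
         for old, old_text in deleted.items()
         for new, new_text in added.items()),
        reverse=True,
    )
    renames, used_old, used_new = [], set(), set()
    for score, old, new in candidates:
        if score < threshold or old in used_old or new in used_new:
            continue
        renames.append((old, new, score))
        used_old.add(old)
        used_new.add(new)
    return renames


# Two unrelated files that share most of their XML boilerplate.
HEADER = "<?xml version='1.0'?>\n<config>\n  <meta/>\n  <defaults/>\n"
FOOTER = "</config>\n"
DELETED = {"users.xml": HEADER + "  <user name='a'/>\n" + FOOTER}
ADDED = {"printers.xml": HEADER + "  <printer id='p'/>\n" + FOOTER}
```

With the default threshold, detect_renames(DELETED, ADDED) pairs users.xml
with printers.xml, because five of their six lines are shared boilerplate.
Raising the threshold to 0.9 makes them show up as an unrelated delete and
add, which is the kind of tweak being discussed.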

David Lang

^ permalink raw reply

