Git development


* Re: [PATCH 1/3] git-gui: Adapt discovery of oguilib to execdir 'libexec/git-core'
From: Steffen Prohaska @ 2008-07-30  5:39 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: git, Johannes Sixt
In-Reply-To: <20080730052517.GF7225@spearce.org>


On Jul 30, 2008, at 7:25 AM, Shawn O. Pearce wrote:

> Steffen Prohaska <prohaska@zib.de> wrote:
>> Isn't only the computation of sharedir based on gitexecdir wrong?
>>
>>> ifndef sharedir
>>> 	sharedir := $(dir $(gitexecdir))share
>>
>> and could be replaced with this (instead of your patch):
>>
>> ifndef sharedir
>> +ifeq (git-core,$(notdir $(gitexecdir)))
>> +       sharedir := $(dir $(patsubst %/,%,$(dir $(gitexecdir))))share
>> +else
>>        sharedir := $(dir $(gitexecdir))share
>> endif
>> +endif
>
> Oh, damn good catch.  Thanks.
>
> How about this then?  It's your patch above, my message, and me
> forging your SOB...

looks good.  SOB ok.

Thanks,
Steffen

^ permalink raw reply

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
From: Steffen Prohaska @ 2008-07-30  5:35 UTC (permalink / raw)
  To: Eyvind Bernhardsen
  Cc: Dmitry Potapov, Johannes Schindelin, Avery Pennarun,
	Joshua Jensen, Junio C Hamano, git
In-Reply-To: <A8BF9951-AB9D-4391-A6CB-E9778064F4A8@orakel.ntnu.no>

On Jul 29, 2008, at 11:17 PM, Eyvind Bernhardsen wrote:

>>> As you say, the reason I want the setting to be per-repository is that
>>> I don't think the cost is worthwhile for every repository.
>>
>> Side note: Personally, I am not very concerned about this cost, but some
>> people are...
>
> Yeah :)
>
> I think the real penalty is that with autocrlf enabled, Git no longer
> stores exactly what I committed.

Git *never* stores exactly what you committed.  Git compresses
your data and creates packs containing many of your individual
files in a single pack.

What matters is that git gives you exactly back what you committed.  It
does so with core.autocrlf=true, unless you check out with a different
setting for autocrlf.  There is a small chance that git decides that
a file is text even though it should be binary and that the content of
this file does not allow for reversible CRLF-conversion.  In this case
git warns about the irreversible conversion and the user gets a chance
to correct git's choice.

We accept this slight chance of irreversible conversion because we do
want to handle line-endings of text files for cross-platform use.  For
this, the goal of "giving you *exactly* back what you committed" is
modified.  Instead, we want to give you exactly back what you committed,
except for line-endings (in text files), which should be converted to
the platform-dependent line-endings (LF or CRLF), depending on the
user's setting.

Because of a design choice we made, CRLF must be converted on Windows.
We decided that the token that git uses *internally* to represent
a line-ending in a text file is LF.  We made this choice because git
originally supported only Unix and so we chose the Unix line-ending for
representing line-endings internally.  Now, Windows uses CRLF to
indicate line-endings but git internally uses LF, so we must convert
them.  Note that if we had users that completely ignored their native
Windows environment and only used well-selected tools, all configured to
*never* write native Windows line-endings, for these users we could set
autocrlf=false and the repository would nonetheless only contain LFs.
Those exceptional super-expert users could manually modify their
settings.  The average user (including me) will not be able to guarantee
that he will never create CRLF in text files on Windows.  Those users
simply accept that they work on Windows and use the native line-endings
(CRLF) and because we care about these average users we set
autocrlf=true.

In contrast, setting autocrlf=input on Unix is only a safety valve.  The
average user who is only working on Unix will most likely *never* create
CRLF line-endings.  In a Unix-only environment it is actually very hard
to create CRLF line-endings.  Thus, the current default (autocrlf unset)
assumes that all text files on Unix contain only LF, and git wants LF
internally, which means we do not need to convert the line-endings.  In
cross-platform environments however, our assumption that all files on
Unix contain only LFs probably no longer holds.  In a cross-platform
environment you can easily copy files from Windows to Unix and thus
*easily* create files on Unix that contain CRLF.  In this case
autocrlf=input can save you, by correcting the line-endings for you.  In
this case, git *does not* give you exactly back what you committed, but
gives you back the very same text you committed, only with the native
LF line-endings.

Personally I believe that our assumption that it is virtually impossible
to unintentionally create CRLF line-endings on Unix is wrong; but the
prevailing opinion on the list is different.  Personally, I believe that
autocrlf=input should be the default on Unix to shield the repository
from CRLFs.  I have been using autocrlf=input for some time now and it has
already saved me several times.  Note that I am not working in a
Unix-only environment, but in a mixed Unix/Mac/Windows environment, so
unintentionally creating CRLFs is quite easy.

Another valid concern is speed.  But the timings that Dmitry presented
indicate that the overhead of autocrlf is so small that it is hard to
measure in practice.  I think we should stop raising this concern unless
someone comes up with timings that indicate a larger overhead than
measured by Dmitry.

	Steffen
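
The round trip described above can be illustrated with a small sketch.
This is not git's conversion code: the names to_repository, to_worktree
and is_reversible are made up for illustration, and git's text/binary
detection is ignored entirely.

```python
def to_repository(data: bytes) -> bytes:
    """Normalize CRLF to LF, as on commit with autocrlf=true or input."""
    return data.replace(b"\r\n", b"\n")


def to_worktree(data: bytes) -> bytes:
    """Write LF as CRLF, as on checkout with autocrlf=true."""
    return to_repository(data).replace(b"\n", b"\r\n")


def is_reversible(data: bytes) -> bool:
    """True if commit followed by checkout gives back the same bytes."""
    return to_worktree(to_repository(data)) == data


windows_text = b"one\r\ntwo\r\n"
mixed = b"one\r\ntwo\n"  # the lone LF is lost once everything is LF

print(is_reversible(windows_text), is_reversible(mixed))
```

A file with consistent CRLF endings survives the round trip; a file that
mixes CRLF and bare LF does not, which is the case where a warning about
irreversible conversion would be expected.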

^ permalink raw reply

* Re: [PATCH 1/3] git-gui: Adapt discovery of oguilib to execdir 'libexec/git-core'
From: Shawn O. Pearce @ 2008-07-30  5:25 UTC (permalink / raw)
  To: Steffen Prohaska; +Cc: git, Johannes Sixt
In-Reply-To: <AF6C526A-57ED-4386-A4CF-5260D82026B7@zib.de>

Steffen Prohaska <prohaska@zib.de> wrote:
> Isn't only the computation of sharedir based on gitexecdir wrong?
>
>> ifndef sharedir
>> 	sharedir := $(dir $(gitexecdir))share
>
> and could be replaced with this (instead of your patch):
>
>  ifndef sharedir
> +ifeq (git-core,$(notdir $(gitexecdir)))
> +       sharedir := $(dir $(patsubst %/,%,$(dir $(gitexecdir))))share
> +else
>         sharedir := $(dir $(gitexecdir))share
>  endif
> +endif

Oh, damn good catch.  Thanks.

How about this then?  It's your patch above, my message, and me
forging your SOB...

--8<--
From: Steffen Prohaska <prohaska@zib.de>
Subject: git-gui: Correct installation of library to be $prefix/share

We always wanted the library for git-gui to install into the
$prefix/share directory, not $prefix/libexec/share.  All of
the files in our library are platform independent and may
be reused across systems, like any other content stored in
the share directory.

Our computation of where our library should install to was broken
when git itself started installing to $prefix/libexec/git-core,
which was one level down from where we expected it to be.

Signed-off-by: Steffen Prohaska <prohaska@zib.de>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
 Makefile |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/Makefile b/Makefile
index b19fb2d..c9d67fe 100644
--- a/Makefile
+++ b/Makefile
@@ -34,8 +34,12 @@ ifndef gitexecdir
 endif

 ifndef sharedir
+ifeq (git-core,$(notdir $(gitexecdir)))
+	sharedir := $(dir $(patsubst %/,%,$(dir $(gitexecdir))))share
+else
 	sharedir := $(dir $(gitexecdir))share
 endif
+endif

 ifndef INSTALL
 	INSTALL = install
-- 
1.6.0.rc1.166.gbbfa8

-- 
Shawn.
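
To see what the two branches of the Makefile conditional compute, here is
a rough Python model of the GNU make functions involved.  The helper
names make_dir, make_notdir, old_sharedir and new_sharedir are invented
for this sketch, and the example paths are assumptions, not taken from
any real installation.

```python
def make_dir(path: str) -> str:
    """Like $(dir ...): everything up to and including the last slash."""
    head, sep, _ = path.rpartition("/")
    return head + sep if sep else "./"


def make_notdir(path: str) -> str:
    """Like $(notdir ...): everything after the last slash."""
    return path.rpartition("/")[2]


def old_sharedir(gitexecdir: str) -> str:
    """The computation before the patch."""
    return make_dir(gitexecdir) + "share"


def new_sharedir(gitexecdir: str) -> str:
    """The computation after the patch."""
    if make_notdir(gitexecdir) == "git-core":
        parent = make_dir(gitexecdir)
        if parent.endswith("/"):          # $(patsubst %/,%,...)
            parent = parent[:-1]
        return make_dir(parent) + "share"
    return make_dir(gitexecdir) + "share"


print(old_sharedir("/usr/libexec/git-core"))
print(new_sharedir("/usr/libexec/git-core"))
print(new_sharedir("/usr/bin"))
```

With an execdir ending in git-core the old rule lands one level too deep
(under libexec), while the patched rule climbs one more directory before
appending share.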

^ permalink raw reply related

* Re: [PATCH] format-patch: Produce better output with --inline or --attach
From: Jeff King @ 2008-07-30  5:24 UTC (permalink / raw)
  To: Kevin Ballard; +Cc: git, Junio C Hamano
In-Reply-To: <1217375356-80287-1-git-send-email-kevin@sb.org>

On Tue, Jul 29, 2008 at 04:49:16PM -0700, Kevin Ballard wrote:

> The first is to write a newline preceding the boundary. This is needed
> because MIME defines the encapsulation boundary as including the
> preceding CRLF (or in this case, just LF), so we should be writing
> one. Without this, the last newline in the pre-diff content is
> consumed instead.

Hmm. I am surprised we haven't had more bug reports about this, but
perhaps these features aren't used too frequently. But I double checked
the MIME specs just to be sure, and you are definitely right.

> The second change is to always write the line termination character (default: newline)

Can you please wrap your commit messages at some more reasonable length?
This line is 86 characters long.

-Peff
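
The boundary point can be shown with a toy writer and reader.  This is
only a sketch: BOUNDARY, write_part and read_part are hypothetical names,
and real MIME parsing handles far more than this.

```python
BOUNDARY = "example-boundary"  # hypothetical boundary string


def write_part(body: str, extra_newline: bool) -> str:
    """Emit one body part followed by the closing delimiter line."""
    sep = "\n" if extra_newline else ""
    return body + sep + "--" + BOUNDARY + "--\n"


def read_part(message: str) -> str:
    """Recover the body; the delimiter is the newline plus the dashes."""
    body, _, _ = message.partition("\n--" + BOUNDARY)
    return body


body = "commit message\n---\n"

print(repr(read_part(write_part(body, extra_newline=True))))
print(repr(read_part(write_part(body, extra_newline=False))))
```

Because the reader treats the preceding newline as part of the delimiter,
the writer that omits the extra newline loses the last newline of the
pre-diff content.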

^ permalink raw reply

* Re: What is 'git BRANCH'?
From: Jeff King @ 2008-07-30  5:14 UTC (permalink / raw)
  To: sverre; +Cc: Junio C Hamano, Jurko Gospodnetić, git
In-Reply-To: <bd6139dc0807291549y66c56fbah928a854f37573680@mail.gmail.com>

On Wed, Jul 30, 2008 at 12:49:00AM +0200, Sverre Rabbelier wrote:

> Why does handle_internal_command not complain after the "	for (i = 0;
> i < ARRAY_SIZE(commands); i++) {" that no matching commands were
> found? Is that not an implicit assertion that would do well with being
> asserted here?

Because it is called from two places. In one, we _know_ that this must
be internal, so we die right after. In the other, we try internal, then
external, then alias. So we don't want to die. Grep for
handle_internal_command in git.c.

-Peff
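
The lookup order described above could be modelled roughly like this.  It
is not the structure of git.c: the tables, run_external and run are all
invented for the sketch, and the only point is that the internal lookup
reports "not found" to its caller instead of dying.

```python
import sys

# Hypothetical tables; git's real builtin list lives in git.c.
INTERNAL = {"status": lambda args: 0}
ALIASES = {"st": "status"}


def handle_internal_command(name, args):
    """Return an exit code if `name` is a builtin, else None (no complaint)."""
    handler = INTERNAL.get(name)
    return None if handler is None else handler(args)


def run_external(name, args):
    """Stand-in for running git-<name> from the exec path; finds nothing."""
    return None


def run(name, args):
    """Try internal, then external, then alias; only this caller gives up."""
    for attempt in (handle_internal_command, run_external):
        code = attempt(name, args)
        if code is not None:
            return code
    if name in ALIASES:
        return run(ALIASES[name], args)
    print(f"'{name}' is not a known command", file=sys.stderr)
    return 1
```

Keeping the failure report in the caller is what lets the same lookup be
reused both where a miss is fatal and where it is just the first of
several attempts.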

^ permalink raw reply

* Re: [PATCH v2] Advertise the ability to abort a commit
From: Jeff King @ 2008-07-30  5:11 UTC (permalink / raw)
  To: Anders Melchiorsen; +Cc: Junio C Hamano, git
In-Reply-To: <20080730050715.GA4034@sigill.intra.peff.net>

On Wed, Jul 30, 2008 at 01:07:15AM -0400, Jeff King wrote:

> > -		die("no commit message?  aborting commit.");
> > +		die("no commit message.  aborting commit.");
> 
> I don't think the change of punctuation makes a big difference here,
> but this could probably stand to be reworded. Maybe:
> 
>   Aborting commit due to empty commit message.

Using "die" also prepends "fatal: " which is perhaps a bit much for an
expected feature. So maybe:

  fprintf(stderr, "Aborting commit due to empty commit message.\n");
  exit(1); /* or even some specific "intentional abort" exit code */

-Peff

^ permalink raw reply

* Re: [PATCH v2] Advertise the ability to abort a commit
From: Jeff King @ 2008-07-30  5:07 UTC (permalink / raw)
  To: Anders Melchiorsen; +Cc: Junio C Hamano, git
In-Reply-To: <1217362342-30370-1-git-send-email-mail@cup.kalibalik.dk>

On Tue, Jul 29, 2008 at 10:12:22PM +0200, Anders Melchiorsen wrote:

>  			"# Please enter the commit message for your changes.\n"
> +			"# To abort the commit, use an empty commit message.\n"
>  			"# (Comment lines starting with '#' will ");

I think this is a good thing to mention, but this text has been getting
longer lately. Maybe we can compact it like this:

  # Please enter the commit message for your changes. Lines starting
  # with '#' will be ignored, and an empty message aborts the commit.

?

> -		die("no commit message?  aborting commit.");
> +		die("no commit message.  aborting commit.");

I don't think the change of punctuation makes a big difference here,
but this could probably stand to be reworded. Maybe:

  Aborting commit due to empty commit message.

-Peff

^ permalink raw reply

* [RFC/PATCH v4 2/2] documentation: merge-base: explain "git merge-base" with more than 2 args
From: Christian Couder @ 2008-07-30  5:04 UTC (permalink / raw)
  To: git, Junio C Hamano, Johannes Schindelin; +Cc: Miklos Vajna, Jakub Narebski

From: Junio C Hamano <gitster@pobox.com>

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Documentation/git-merge-base.txt |   77 ++++++++++++++++++++++++++++++++-----
 1 files changed, 66 insertions(+), 11 deletions(-)

diff --git a/Documentation/git-merge-base.txt b/Documentation/git-merge-base.txt
index 1a7ecbf..fd4b5c9 100644
--- a/Documentation/git-merge-base.txt
+++ b/Documentation/git-merge-base.txt
@@ -8,26 +8,81 @@ git-merge-base - Find as good common ancestors as possible for a merge
 
 SYNOPSIS
 --------
-'git merge-base' [--all] <commit> <commit>
+'git merge-base' [--all] <commit> <commit>...
 
 DESCRIPTION
 -----------
 
-'git-merge-base' finds as good a common ancestor as possible between
-the two commits. That is, given two commits A and B, `git merge-base A
-B` will output a commit which is reachable from both A and B through
-the parent relationship.
+'git-merge-base' finds best common ancestor(s) between two commits to use
+in a three-way merge.  One common ancestor is 'better' than another common
+ancestor if the latter is an ancestor of the former.  A common ancestor
+that does not have any better common ancestor than it is a 'best common
+ancestor', i.e. a 'merge base'.  Note that there can be more than one
+merge bases between two commits.
 
-Given a selection of equally good common ancestors it should not be
-relied on to decide in any particular way.
-
-The 'git-merge-base' algorithm is still in flux - use the source...
+Among the two commits to compute their merge bases, one is specified by
+the first commit argument on the command line; the other commit is a
+(possibly hypothetical) commit that is a merge across all the remaining
+commits on the command line.  As the most common special case, giving only
+two commits from the command line means computing the merge base between
+the given two commits.
 
 OPTIONS
 -------
 --all::
-	Output all common ancestors for the two commits instead of
-	just one.
+	Output all merge bases for the commits, instead of just one.
+
+DISCUSSION
+----------
+
+Given two commits 'A' and 'B', `git merge-base A B` will output a commit
+which is reachable from both 'A' and 'B' through the parent relationship.
+
+For example, with this topology:
+
+                 o---o---o---B
+                /
+        ---o---1---o---o---o---A
+
+the merge base between 'A' and 'B' is '1'.
+
+Given three commits 'A', 'B' and 'C', `git merge-base A B C` will compute the
+merge base between 'A' and an hypothetical commit 'M', which is a merge
+between 'B' and 'C'.  For example, with this topology:
+
+               o---o---o---o---C
+              /
+             /   o---o---o---B
+            /   /
+        ---2---1---o---o---o---A
+
+the result of `git merge-base A B C` is '1'.  This is because the
+equivalent topology with a merge commit 'M' between 'B' and 'C' is:
+
+
+               o---o---o---o---o
+              /                 \
+             /   o---o---o---o---M
+            /   /
+        ---2---1---o---o---o---A
+
+and the result of `git merge-base A M` is '1'.  Commit '2' is also a
+common ancestor between 'A' and 'M', but '1' is a better common ancestor,
+because '2' is an ancestor of '1'.  Hence, '2' is not a merge base.
+
+When the history involves criss-cross merges, there can be more than one
+'best' common ancestors between two commits.  For example, with this
+topology:
+
+       ---1---o---A
+	   \ /
+	    X
+	   / \
+       ---2---o---o---B
+
+both '1' and '2' are merge-base of A and B.  Neither one is better than
+the other (both are 'best' merge base).  When `--all` option is not given,
+it is unspecified which best one is output.
 
 Author
 ------
-- 
1.6.0.rc0.42.g186458.dirty
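
The definition in the DISCUSSION section can be checked against the
three-commit topology with a brute-force sketch.  This is not the
algorithm in commit.c; ancestors and merge_bases are illustrative
helpers, and the unnamed 'o' commits are given made-up names.

```python
def ancestors(parents, commit):
    """All commits reachable from `commit` via parent links, itself included."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen


def merge_bases(parents, first, *others):
    """Best common ancestors of `first` and a hypothetical merge of `others`."""
    merged = set().union(*(ancestors(parents, o) for o in others))
    common = ancestors(parents, first) & merged
    # Drop every common ancestor that is an ancestor of a better one.
    return {c for c in common
            if not any(c != d and c in ancestors(parents, d) for d in common)}


# The A/B/C topology from the text, with the 'o' commits named a*, b*, c*.
three = {
    "2": [], "1": ["2"],
    "a1": ["1"], "a2": ["a1"], "a3": ["a2"], "A": ["a3"],
    "b1": ["1"], "b2": ["b1"], "b3": ["b2"], "B": ["b3"],
    "c1": ["2"], "c2": ["c1"], "c3": ["c2"], "c4": ["c3"], "C": ["c4"],
}

# The criss-cross topology: each side merges the other side's start.
criss = {
    "1": [], "2": [],
    "a": ["1", "2"], "A": ["a"],
    "b1": ["2", "1"], "b2": ["b1"], "B": ["b2"],
}

print(merge_bases(three, "A", "B", "C"))
print(merge_bases(criss, "A", "B"))
```

The hypothetical merge 'M' never needs to be created here, since it
cannot itself be an ancestor of 'A'; only the union of the ancestors of
'B' and 'C' matters.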

^ permalink raw reply related

* [RFC/PATCH v4 1/2] merge-base: teach "git merge-base" to accept more than 2 arguments
From: Christian Couder @ 2008-07-30  5:04 UTC (permalink / raw)
  To: git, Junio C Hamano, Johannes Schindelin; +Cc: Miklos Vajna, Jakub Narebski

Before this patch "git merge-base" accepted only 2 arguments, so
only merge bases between 2 references could be computed.

The purpose of this patch is to make "git merge-base" accept more
than 2 arguments, so that the merge bases between the first given
reference and all the other references can be computed.

This is easily implemented because the "get_merge_bases_many"
function in "commit.c" already implements the needed computation.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 builtin-merge-base.c |   24 ++++++++++++++++--------
 1 files changed, 16 insertions(+), 8 deletions(-)

	Changes since previous version are:

	- the patch is simpler because it now goes on top of
	a fix that added the "get_commit_reference" function,

	- documentation has been removed from the first patch
	and Junio's documentation has been added in a new 2/2
	patch.

	There are still no tests. I need to find time to work
	on them before Friday, otherwise it will probably have
	to wait until I come back from vacation.

diff --git a/builtin-merge-base.c b/builtin-merge-base.c
index 3382b13..b08da51 100644
--- a/builtin-merge-base.c
+++ b/builtin-merge-base.c
@@ -2,9 +2,11 @@
 #include "cache.h"
 #include "commit.h"
 
-static int show_merge_base(struct commit *rev1, struct commit *rev2, int show_all)
+static int show_merge_base(struct commit **rev, int rev_nr, int show_all)
 {
-	struct commit_list *result = get_merge_bases(rev1, rev2, 0);
+	struct commit_list *result;
+
+	result = get_merge_bases_many(rev[0], rev_nr - 1, rev + 1, 0);
 
 	if (!result)
 		return 1;
@@ -20,7 +22,7 @@ static int show_merge_base(struct commit *rev1, struct commit *rev2, int show_al
 }
 
 static const char merge_base_usage[] =
-"git merge-base [--all] <commit-id> <commit-id>";
+"git merge-base [--all] <commit-id> <commit-id>...";
 
 static struct commit *get_commit_reference(const char *arg)
 {
@@ -38,7 +40,8 @@ static struct commit *get_commit_reference(const char *arg)
 
 int cmd_merge_base(int argc, const char **argv, const char *prefix)
 {
-	struct commit *rev1, *rev2;
+	struct commit **rev;
+	int rev_nr = 0;
 	int show_all = 0;
 
 	git_config(git_default_config, NULL);
@@ -51,10 +54,15 @@ int cmd_merge_base(int argc, const char **argv, const char *prefix)
 			usage(merge_base_usage);
 		argc--; argv++;
 	}
-	if (argc != 3)
+	if (argc < 3)
 		usage(merge_base_usage);
-	rev1 = get_commit_reference(argv[1]);
-	rev2 = get_commit_reference(argv[2]);
 
-	return show_merge_base(rev1, rev2, show_all);
+	rev = xmalloc((argc - 1) * sizeof(*rev));
+
+	do {
+		rev[rev_nr++] = get_commit_reference(argv[1]);
+		argc--; argv++;
+	} while (argc > 1);
+
+	return show_merge_base(rev, rev_nr, show_all);
 }
-- 
1.6.0.rc0.42.g186458.dirty

^ permalink raw reply related

* Re: Bizarre missing changes (git bug?)
From: Linus Torvalds @ 2008-07-30  4:52 UTC (permalink / raw)
  To: Jeff King; +Cc: Roman Zippel, Tim Harper, git
In-Reply-To: <20080730042609.GB3350@sigill.intra.peff.net>

On Wed, 30 Jul 2008, Jeff King wrote:
> 
> I agree with you, btw. It is definitely correct and useful; however, I
> am curious if there is some "in between" level of simplification that
> might produce an alternate graph that has interesting features. And that
> is why I am trying to get Roman to lay out exactly what it is he wants.

Actually, I know what he wants, since I tried to describe it for the 
filter-branch discussion. It's really not that conceptually complex.

Basically, the stupid model is to just do this:

 - start with --full-history

 - for each merge, look at both parents. If one parent leads directly to 
   a commit that can be reached from the other, just remove that 
   parent as being redundant. And if that removal leads to a merge now 
   becoming a non-merge, and it has no changes wrt its single remaining 
   parent, remove the commit entirely (rewriting any parenthood to make 
   the rest all stay together, of course)

 - repeat until you cannot do any more simplification (removing one commit 
   can actually cause its children to now become targets for this 
   simplification).

and I suspect that

 (a) the stupid model is probably at least O(n^3) if done stupidly and 
     O(n^2) with some modest amount of smarts (keeping a list of at least 
     potential targets of simplification and expanding it only when 
     actually simplifying), but that
 (b) you can concentrate on just the merges that the current optimizing 
     algorithm would have removed, so 'n' is not the total number of 
     commits, but at most the number of merges, and more likely actually 
     just the number of trivial merges in that file, and finally
 (c) there is likely some smart and efficient graph minimization algorithm 
     that is O(nlogn) or something.

so I don't think it's likely to be hugely more expensive than the 
topo-sort is. All the real expense is in the same place as the topo-sort 
expense, namely in generating the list up-front.

I bet googling for "minimal directed acyclic graph" will give pointers.

And despite the fact that I've argued against Roman's world-view, I 
actually _do_ think it would be nice to have that third mode, the same way 
that we have --topo-order. It wouldn't be good for the _default_ view, but 
then neither is --full-history, so that's not a big argument.

That said, I'd like to (again) repeat the caveat that it's probably best 
done in the tool that actually visualizes the mess - exactly for the same 
reason that I argued for the topological sort being done in gitk. It's 
very painful to have to wait for the first few commits to start appearing 
in the history window.

Admittedly most of my work is actually done on machines that are pretty 
fast, but every once in a while I travel with a laptop. And more 
importantly, not everybody gets new hardware from Intel for testing even 
before the CPU has been released. So others will still appreciate 
incremental history updates, even if my machine might be fast enough (and 
my kernel tree always in the caches) that I myself could live with a 
synchronous version a-la --topo-order.

			Linus
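
One possible reading of the "stupid model" above, written out as a naive
fixed-point loop.  Nothing here is git code: simplify, is_ancestor, the
parents map and the blob map (file content per commit) are assumptions
made for the sketch, and a parent is treated as redundant when it is
reachable from another parent of the same merge.

```python
def is_ancestor(parents, a, b):
    """True if `a` is reachable from `b` through parent links (a != b)."""
    seen, stack = set(), list(parents[b])
    while stack:
        c = stack.pop()
        if c == a:
            return True
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return False


def simplify(parents, blob):
    """Repeat the two removal rules until nothing changes."""
    parents = {c: list(ps) for c, ps in parents.items()}
    changed = True
    while changed:
        changed = False
        for c in list(parents):
            ps = parents[c]
            if len(ps) < 2:
                continue
            unique = list(dict.fromkeys(ps))
            kept = [p for p in unique
                    if not any(q != p and is_ancestor(parents, p, q)
                               for q in unique)]
            if len(kept) == len(ps):
                continue
            parents[c] = kept
            changed = True
            # A former merge with no change against its one parent goes away.
            if len(kept) == 1 and blob[c] == blob[kept[0]]:
                del parents[c]
                for other in parents.values():
                    other[:] = [kept[0] if p == c else p for p in other]
    return parents


# A is the root, B and C make the same change, D merges them,
# and E is a later merge of D with A that changes nothing.
history = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"], "E": ["D", "A"]}
content = {"A": "v0", "B": "v1", "C": "v1", "D": "v1", "E": "v1"}

print(simplify(history, content))
```

On this small history the later merge E disappears while the duplicate
changes and their merge D stay, and the repeated full scans are what make
the naive version so expensive.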

^ permalink raw reply

* Re: q: git-fetch a tad slow?
From: Shawn O. Pearce @ 2008-07-30  4:48 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: git
In-Reply-To: <20080729090802.GA11373@elte.hu>

Ingo Molnar <mingo@elte.hu> wrote:
> * Shawn O. Pearce <spearce@spearce.org> wrote:
> > Ingo Molnar <mingo@elte.hu> wrote:
> > > 
> > > Setup/background: distributed kernel testing cluster, [...]
> > > 
> > > Problem: i noticed that git-fetch is a tad slow:
> > > 
> > >   titan:~/tip> time git-fetch
> > >   real    0m2.372s
>
> note that titan is a very beefy box, almost 3 GHz Core2Duo:

That isn't going to matter if you have a quadratic algorithm and a
large dataset.  Especially when the inner loops are doing multiple
system calls per item in a long list of items.  :-|   Linux is fast,
but it isn't magic pixie dust.  It cannot fix broken applications.

> [...] So if we have a quadratic overhead on number of 
> branches, that's going to be quite a PITA.

Right.

> > I wonder if git-pack-refs + fetching only a single branch will get you 
> > closer to the tip-fetch time.
> 
> should i pack on both repos? I don't explicitly pack anything, but on the 
> server it goes into regular gc runs. (which will pack most stuff, 
> right?)

git-gc automatically runs `git pack-refs --all --prune` like I
recommended, unless you disabled it with config gc.packrefs = false.
So it's probably already packed.

What does `find .git/refs -type f | wc -l` give for the repository
on the central server?  If it's more than a handful (~20) I would
suggest running git-gc before testing again.

But I'm really suspecting that this is just our quadratic matching
algorithm running up against a large number of branches, causing
it to suck.

jgit at least uses an O(N) algorithm here, but since it is written
in Java it's of course slow compared to C Git.  Takes a while to
get that JVM running.

I'll try to find some time to reproduce the issue and look at the
bottleneck here.  I'm two days into a new job so my git time has
been really quite short this week.  :-|

-- 
Shawn.
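
For illustration only, the difference between a quadratic and a linear
ref match might look like the following.  Neither function is taken from
git or jgit; match_quadratic and match_hashed are hypothetical, and refs
are modelled as (name, sha) pairs.

```python
def match_quadratic(remote, local):
    """For every remote ref, scan the whole local list: O(N*M) compares."""
    updates = []
    for name, sha in remote:
        for lname, lsha in local:
            if lname == name:
                if lsha != sha:
                    updates.append(name)
                break
        else:
            updates.append(name)  # not present locally at all
    return updates


def match_hashed(remote, local):
    """Build one lookup table, then do a single pass: O(N+M)."""
    known = dict(local)
    return [name for name, sha in remote if known.get(name) != sha]


local = [(f"refs/heads/b{i}", f"sha{i}") for i in range(300)]
remote = [(n, s) for n, s in local]
remote[7] = ("refs/heads/b7", "changed")
remote.append(("refs/heads/new", "sha-new"))

print(match_hashed(remote, local))
```

Both give the same answer; the nested scan simply gets slower in
proportion to the product of the two list lengths as branches
accumulate.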

^ permalink raw reply

* Re: [RFC/PATCH v3] merge-base: teach "git merge-base" to accept more than 2 arguments
From: Christian Couder @ 2008-07-30  4:52 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Junio C Hamano, Miklos Vajna, Jakub Narebski
In-Reply-To: <alpine.DEB.1.00.0807281328520.2725@eeepc-johanness>

Hi,

On Monday 28 July 2008, Johannes Schindelin wrote:
> Hi,
>
> On Mon, 28 Jul 2008, Christian Couder wrote:
> > +	rev = xmalloc((argc - 1) * sizeof(*rev));
> > +
> > +	do {
> > +		struct commit *r = get_commit_reference(argv[1]);
> > +		if (!r)
> > +			return 1;
> > +		rev[rev_nr++] = r;
> > +		argc--; argv++;
> > +	} while (argc > 1);
> > +
> > +	return show_merge_base(rev, rev_nr, show_all);
>
> 	rev = xmalloc((argc - 1) * sizeof(*rev));
>
> 	for (rev_nr = 0; rev_nr + 1 < argc; rev_nr++) {
> 		rev[rev_nr] = get_commit_reference(argv[rev_nr + 1]);
> 		if (!rev[rev_nr])
> 			return !!error("Does not refer to a commit: '%s'",
> 				argv[rev_nr + 1]);
> 	}
>
> 	return show_merge_base(rev, rev_nr, show_all);
>
> I do not know about you, but I think this is not only shorter (in spite
> of adding a helpful error message), but also simpler to understand (not
> using convoluted do { } while logic), and therefore superior.

In my last version the loop is reduced to:

+	do {
+		rev[rev_nr++] = get_commit_reference(argv[1]);
+		argc--; argv++;
+	} while (argc > 1);

so it's very simple.

And the stop condition is simpler in my version.

> Your performance argument is weak IMHO, as this is not a big performance
> hit, and command line parameter parsing is definitely not performance
> critical.

It feels a bit sloppy though.

Regards,
Christian.

^ permalink raw reply

* Re: git-svn does not seems to work with crlf convertion enabled.
From: Alexander Litvinov @ 2008-07-30  4:37 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git
In-Reply-To: <alpine.DEB.1.00.0807231117290.2830@eeepc-johanness>

> This is a known issue, but since nobody with that itch seems to care
> enough to fix it, I doubt it will ever be fixed.

Hello again.

I have investigated this problem. Short result: git-svn will currently
not work with ANY file conversion.

In my case I have found that the problem is in the
SVN::Git::Fetcher::apply_textdelta() function. To be more precise, it is the
call to SVN::TxDelta::apply(). We fetch the previous version of the file from
git and then apply svn's delta to it. Because we have modified the source
file, SVN fails to apply its delta. If I modify the last commit and put the
original version of the file back, everything works.

So it seems to me there are two solutions:
1. Store the original file somehow and use it to construct the new file
version;
2. In case of this error we could fetch the full blob with the new (or old)
version of the file.

I did not find a way to fetch the full file content, nor do I feel ready to
rewrite git-svn to store the original file somewhere.

Can anybody help or comment on this?
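
Why a converted base breaks delta application can be shown with a toy
delta format.  This is not the svn delta format or the SVN::TxDelta API;
apply_delta and its ("copy", offset, length) / ("insert", data) ops are
invented for the sketch.

```python
import hashlib


def apply_delta(base: bytes, ops) -> bytes:
    """Rebuild a file from byte ranges of `base` plus literal inserts."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += base[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)


original = b"alpha\r\nbeta\r\n"                 # what the server diffed against
stored = original.replace(b"\r\n", b"\n")       # what we hold after conversion

# Offsets were computed by the server against `original`.
ops = [("copy", 0, 7), ("insert", b"gamma\r\n"), ("copy", 7, 6)]
expected = hashlib.md5(b"alpha\r\ngamma\r\nbeta\r\n").hexdigest()

for base in (original, stored):
    result = apply_delta(base, ops)
    print(hashlib.md5(result).hexdigest() == expected)
```

The byte offsets in the delta only make sense against the exact bytes the
server used, so any conversion of the stored copy shifts them and the
checksum no longer matches.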

^ permalink raw reply

* Re: Bizarre missing changes (git bug?)
From: Jeff King @ 2008-07-30  4:26 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Roman Zippel, Tim Harper, git
In-Reply-To: <alpine.LFD.1.10.0807291006070.3334@nehalem.linux-foundation.org>

On Tue, Jul 29, 2008 at 10:25:35AM -0700, Linus Torvalds wrote:

> On Tue, 29 Jul 2008, Jeff King wrote:
> > 
> > I glanced briefly over "gitk kernel/printk.c" and it looks pretty sane.
> 
> Jeff, it _is_ sane. When Roman says it's "incorrect", he is just wrong.

I agree with you, btw. It is definitely correct and useful; however, I
am curious if there is some "in between" level of simplification that
might produce an alternate graph that has interesting features. And that
is why I am trying to get Roman to lay out exactly what it is he wants.

-Peff

^ permalink raw reply

* Re: Bizarre missing changes (git bug?)
From: Jeff King @ 2008-07-30  4:23 UTC (permalink / raw)
  To: Roman Zippel; +Cc: Linus Torvalds, Tim Harper, git
In-Reply-To: <Pine.LNX.4.64.0807300430590.6791@localhost.localdomain>

On Wed, Jul 30, 2008 at 04:48:54AM +0200, Roman Zippel wrote:

> Now compare the output of "git-log file1", "git-log --full-history file1" 
> and "git-log --full-history --parents file1". You get either both merge 
> commits or none, but only one of it is relevant to file1.

Ah, I see.

So if I understand you, you wanted to see something like:

A--B
 \  \
  C--D

where

 A = initial commit
 B = duplicate change 1
 C = duplicate change 2
 D = merge branch 'test2' into HEAD

where the simplification isn't as aggressive (you still see the
duplicate commits and the merge), but we can get rid of the later merge
between A and D because A is already an ancestor of D.

So do you have a proposed set of simplification rules that will produce
that output?

-Peff

^ permalink raw reply

* Re: Bizarre missing changes (git bug?)
From: Linus Torvalds @ 2008-07-30  3:35 UTC (permalink / raw)
  To: Roman Zippel; +Cc: Jeff King, Tim Harper, git
In-Reply-To: <alpine.LFD.1.10.0807292002520.3334@nehalem.linux-foundation.org>

On Tue, 29 Jul 2008, Linus Torvalds wrote:
> 
> In other words, that change - in a VERY REAL WAY - never actually mattered 
> for the current state of kernel/printk.c. And the history simplification 
> sees that, and avoids showing the whole pointless branch.

Btw, Roman, this is a really really important thing for you to realize. 

You need to realize that your "perfect" output really REALLY is totally 
inferior, if what you are actually interested in is "how did things get to 
be the way they are".

It's a _feature_. It's not a bug. And it's a really good one.

If side branches didn't matter for the contents of the file, those side 
branches simply don't matter, and showing them is just a distraction.

Yes, you can ask for the history that doesn't matter for the end result. 
And yes, I acknowledge freely that it would be good to then have a 
separate cleanup phase to make that thing more readable. In fact, in the 
very first reply to you I pointed you to a thread where I said exactly 
that, long before this thread even started.

But no, the current default isn't broken. No, it's not "lazy" either. No, 
it was not an "accident". And no, it's not "incorrect".

And until you can see that (along with all the reasons I've outlined why 
your "fixed" approach is a total piece of sh*t from a performance angle), 
you're just being stupid.

				Linus

^ permalink raw reply

* Re: Bizarre missing changes (git bug?)
From: Linus Torvalds @ 2008-07-30  3:21 UTC (permalink / raw)
  To: Roman Zippel; +Cc: Jeff King, Tim Harper, git
In-Reply-To: <Pine.LNX.4.64.0807300430590.6791@localhost.localdomain>

On Wed, 30 Jul 2008, Roman Zippel wrote:
> 
> For printk.c look for commit 02630a12c7f72fa294981c8d86e38038781c25b7 and 
> try to find it in the graphical outputs.

Umm.

Why would you? Yes, it's there, if you ask for --full-history. And no, I 
don't think --full-history is actually useful to humans - it's very much 
there as a "here's all the data" thing where you could have the tools 
post-process it, where often "post-processing" is actually just searching 
for it.

And no, it's not there if you don't use --full-history.

But now, instead of _complaining_ about this, I would suggest you think 
about why it's a _good_ thing, and why it's so useful?

In other words, you're arriving at all your complaints from the wrong 
angle entirely, and because you have convinced yourself that things have 
to work a certain way, and then you're upset when they don't.

But you should _unconvince_ yourself - and look at whether maybe all your 
initial preconceptions were perhaps totally wrong? Because they were.

The reason that commit 02630a12c7f72fa294981c8d86e38038781c25b7 doesn't 
show up in the normal log when looking at kernel/printk.c is that it 
really doesn't exist as a _relevant_ part of history for the current state 
of that file. It exists only as a side-branch for the GFS2 quota code 
that first adds a line

	+EXPORT_SYMBOL_GPL(tty_write_message);

(in commit b346671fa196a), and then removes the line not long after (in 
that commit 02630a12c7f). And both of them go away (along with the whole 
side-branch), because they didn't end up mattering for the end result: 
they only ever existed in that side branch, and by the time it was merged 
back into the main branch, all changes had been undone.

In other words, that change - in a VERY REAL WAY - never actually mattered 
for the current state of kernel/printk.c. And the history simplification 
sees that, and avoids showing the whole pointless branch.

This is such an obviously _good_ thing that I really am surprised at how 
you can continue to argue against it. Especially as the examples you give 
"for" your argument are such wonderful examples _against_ it.

And yes, you can actually force gitk to show the state of that commit and 
thus force it to acknowledge that that state was relevant (although you 
won't necessarily force it to acknowledge that the relevance ties together 
with the final end result). You do that by just telling it that you're not 
just interested in HEAD, but in that commit too.

So I would literally suggest that anybody interested in this subject 
really just do

	gitk kernel/printk.c &
	gitk HEAD 02630a12c7f72fa294981c8d86e38038781c25b7 kernel/printk.c &

in the kernel, and now compare the two side-by-side. Notice where they 
differ (hint: look for the commit a0f1ccfd8d37457a6d8a9e01acebeefcdfcc306e 
- "[PATCH] lockdep: do not recurse in printk" - which is in both, and look 
below it).
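
The situation described above (a side branch adds a line, removes it again, and is then merged) can be reproduced in a throwaway repository. What follows is only a sketch: the file names, branch names and commit messages are made up for illustration and are not taken from the kernel tree. It compares the default simplified log, the --full-history log, and a log where the side-branch tip is named explicitly, in the spirit of the "gitk HEAD <commit> <path>" suggestion.

```shell
#!/bin/sh
# Toy reproduction: a side branch adds a line to a file and later removes it
# again before being merged. All names here are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
main=$(git symbolic-ref --short HEAD)   # name of the initial branch

echo base > printk.c
echo other > other.c
git add printk.c other.c
git commit -q -m "initial"

# Side branch: add a line, then take it out again.
git checkout -q -b side
echo export >> printk.c
git commit -q -a -m "side: add export line"
echo base > printk.c
git commit -q -a -m "side: remove export line"

# Main line: touch a different file, then merge the side branch.
git checkout -q "$main"
echo more >> other.c
git commit -q -a -m "main: unrelated change"
git merge -q --no-ff -m "merge side" side

# Default: simplified history for the path.
simplified=$(git log --format=%s -- printk.c)
# Every commit that changed the path, on any branch.
full=$(git log --full-history --format=%s -- printk.c)
# Start from the side-branch tip as well as HEAD.
forced=$(git log --format=%s HEAD side -- printk.c)

printf 'simplified:\n%s\n\nfull:\n%s\n\nforced:\n%s\n' \
    "$simplified" "$full" "$forced"
```

In this toy history the simplified log should list only the initial commit, while the other two also list the add/remove pair from the side branch.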

Now, which graph is the more relevant and understandable one from the 
standpoint of what the current state of kernel/printk.c is?

Honestly now, Roman.

Because if you were actually willing to see this as a _feature_ (which it 
very much is), you'd admit that it's a damn clever and useful one. But I 
suspect you have dug yourself so deep into a hole that you can't admit 
that even to yourself any more.

				Linus

^ permalink raw reply

* Re: Bizarre missing changes (git bug?)
From: Kevin Ballard @ 2008-07-30  3:20 UTC (permalink / raw)
  To: Roman Zippel; +Cc: Jeff King, Linus Torvalds, Tim Harper, git
In-Reply-To: <Pine.LNX.4.64.0807300430590.6791@localhost.localdomain>

On Jul 29, 2008, at 7:48 PM, Roman Zippel wrote:

> For printk.c look for commit 02630a12c7f72fa294981c8d86e38038781c25b7 and
> try to find it in the graphical outputs.
> Here is a bit better example than Linus gave:
>
> [snip]
>
> Now compare the output of "git-log file1", "git-log --full-history file1"
> and "git-log --full-history --parents file1". You get either both merge
> commits or none, but only one of them is relevant to file1.
>
> The problem is that in practice "git-log --full-history --parents"
> produces way too much information to be useful right away.

Output looks correct to me. And of course --full-history --parents  
gives lots of output - that's what it's for. You seem to believe that  
the appropriate output is, what, to display the initial commit, both  
commits that modified file1, and the first merge, yes? Can you please  
clarify the logic that states that the first merge commit should be  
shown but the second should not?

-Kevin Ballard

-- 
Kevin Ballard
http://kevin.sb.org
kevin@sb.org
http://www.tildesoft.com

^ permalink raw reply

* Re: Bizarre missing changes (git bug?)
From: Roman Zippel @ 2008-07-30  2:48 UTC (permalink / raw)
  To: Jeff King; +Cc: Linus Torvalds, Tim Harper, git
In-Reply-To: <20080729125247.GC12069@sigill.intra.peff.net>

Hi,

On Tue, 29 Jul 2008, Jeff King wrote:

> > > Perhaps I am just slow, but I haven't been able to figure out what that
> > > history is, or what the "correct" output should be. Can you try to state
> > > more clearly what it is you are looking for?
> > 
> > Most frequently this involves changes where the same change is merged 
> > twice. Another interesting example is kernel/printk.c where a change is 
> > added and later removed again before it's merged.
> 
> I glanced briefly over "gitk kernel/printk.c" and it looks pretty sane.
> I was really hoping for you to make your case as something like:
> 
>   1. here is an ascii diagram of an actual history graph (or a recipe of
>      git commands for making one)
>   2. here is what git-log (or gitk) produces for this history by
>      default; and here is why it is not optimal (presumably some
>      information it fails to convey)
>   3. here is what git-log (or gitk) with --full-history produces; and
>      here is why it is not optimal (presumably because it is too messy)
>   4. here is what output I would like to see. Bonus points for "and here
>      is an algorithm that accomplishes it."

For printk.c look for commit 02630a12c7f72fa294981c8d86e38038781c25b7 and 
try to find it in the graphical outputs.
Here is a bit better example than Linus gave:

mkdir test
cd test
git init

echo 1 > file1
echo a > file2

git add file1 file2
git commit -m "initial commit"
git tag base

git branch test1 base
git checkout test1
echo 2 > file1
git commit -a -m "duplicate change 1"

git branch test2 base
git checkout test2
echo 2 > file1
git commit -a -m "duplicate change 2"

git branch test3 base
git checkout test3
echo b > file2
git commit -a -m "some other change"

git checkout base

git merge test1
git merge test2
git merge test3

Now compare the output of "git-log file1", "git-log --full-history file1" 
and "git-log --full-history --parents file1". You get either both merge 
commits or none, but only one of them is relevant to file1.

The problem is that in practice "git-log --full-history --parents" 
produces way too much information to be useful right away.

bye, Roman

^ permalink raw reply

* Re: Bizarre missing changes (git bug?)
From: Linus Torvalds @ 2008-07-30  2:05 UTC (permalink / raw)
  To: Roman Zippel; +Cc: Jeff King, Tim Harper, git
In-Reply-To: <Pine.LNX.4.64.0807300315280.6791@localhost.localdomain>

On Wed, 30 Jul 2008, Roman Zippel wrote:
> > 
> > The "gitk file" history is the simplest one BY FAR, because it has very 
> > aggressively simplified history to the point where it tried to find the 
> > _simplest_ history that explains the current contents of 'file'[*]
> 
> It's "aggressively simplified" by not even bothering to look for more.

Yes and no.

It's aggressively simplified because that's the right output with the 
minimal unnecessary irrelevant information. It explains how the file came 
to a particular state, with the simplest possible self-consistent history.

(Again, the caveat about "simplest possible" always being a local 
minimization, not a global one).

The fact that it also obviously involved less work (so git can do it 
faster, and with fewer disk and memory accesses) is a huge bonus, of 
course.

Are you complaining about the fact that I'm smart, and I get the right 
result I want with less work and with a simpler algorithm?

What's your point?

> "simplified" implies there is something more complex beforehand, but all 
> it does is simply scan through the history as fast as possible without 
> bothering to look left or right.

You're just being stupid.

It's not that it's not "bothering" looking left or right. It very much 
*does* bother to look left or right. But once it finds that one or the 
other explains the situation entirely, it then says "screw left, I already 
know that right gives me the information I want".

In other words, it's doing the _smart_ thing. 

I don't understand why you complain about intelligence.

It's *not* just looking at one single history. Look at

	gitk kernel/sched.c

and notice that the simplified history is not linear. It tries to make it 
AS LINEAR AS POSSIBLE, BUT NO MORE.

    "Make everything as simple as possible, but not simpler."
			- Albert Einstein

You seem to complain about the fact that it's doing that. That's stupid of 
you.
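
To see the "as linear as possible, but no more" behaviour on a small scale, here is a sketch with two branches that each make a different change to the same file. The file name, branch names and messages are hypothetical. Because the merge result differs from both parents for that path, neither side alone explains the file, so the path-limited log is expected to keep the merge and both sides.

```shell
#!/bin/sh
# Toy history where both sides of a merge contribute to the same file.
# All names here are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
main=$(git symbolic-ref --short HEAD)   # name of the initial branch

printf '%s\n' 1 2 3 4 5 6 7 8 9 > sched.c
git add sched.c
git commit -q -m "initial"

# Side branch changes the first line.
git checkout -q -b side
printf '%s\n' one 2 3 4 5 6 7 8 9 > sched.c
git commit -q -a -m "side: change first line"

# Main line changes the last line, far enough away to merge cleanly.
git checkout -q "$main"
printf '%s\n' 1 2 3 4 5 6 7 8 nine > sched.c
git commit -q -a -m "main: change last line"
git merge -q --no-ff -m "merge side" side

# Path-limited log with default simplification.
log=$(git log --format=%s -- sched.c)
printf '%s\n' "$log"

# The same history drawn as a graph, to show that it is not linear.
git log --graph --format=%s -- sched.c
```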

> "simplified" implies to me it's something intentional, but this is more of 
> an accidental optimization which happens to work in most situations and in 
> the special cases it just picks a random change and hopes for the best.

You're just crazy. There is nothing accidental there what-so-ever.

> "git-log --full-history file" at least produces the full change history, 
> but it has a performance impact and it doesn't produce a complete graph 
> usable for graphical front ends.

Umm. You have to add "--parents" if you want a full graph. Without that, 
you can never re-generate the graph anyway.

And when you do that, it _does_ give all the commits needed to complete 
the picture.
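
The effect of adding --parents can be sketched on a tiny history in which the same change is made on two branches and both are merged. The branch names and messages below are made up, loosely following the recipe posted elsewhere in this thread. Without --parents the merge that is identical to both of its parents is expected to be left out; with --parents it should be kept so that the two sides can be tied together in a graph.

```shell
#!/bin/sh
# Toy history: the same change is committed on two branches, both merged.
# All names here are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
main=$(git symbolic-ref --short HEAD)   # name of the initial branch

echo 1 > file1
git add file1
git commit -q -m "initial commit"

git checkout -q -b test1
echo 2 > file1
git commit -q -a -m "duplicate change 1"

git checkout -q -b test2 "$main"
echo 2 > file1
git commit -q -a -m "duplicate change 2"

git checkout -q "$main"
git merge -q --ff-only test1           # fast-forward, no merge commit
git merge -q -m "merge test2" test2    # true merge, same content on both sides

# Default simplification: follows one side of the merge only.
simplified=$(git log --format=%s -- file1)
# All changes to the path, but no merge to connect them.
full=$(git log --full-history --format=%s -- file1)
# Parent rewriting keeps the merge that joins the two sides.
with_parents=$(git log --full-history --parents --format=%s -- file1)

printf 'simplified:\n%s\n\nfull:\n%s\n\nwith parents:\n%s\n' \
    "$simplified" "$full" "$with_parents"
```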

In other words, git (once again) is actually smarter than you, and does 
the right thing, and (once again) you complain about something that you 
just don't understand.

> I gave more general examples. Tracking upstream source can produce this 
> problem frequently. Another example is stable/unstable branches, where 
> occasionally merging the stable branch into the unstable branch can 
> produce this problem.

You call it a "problem", but you don't actually give any reason for 
calling it that. IT IS NOT A PROBLEM. It's very much by design, and it's 
what you want.

Use --full-history if you want the full history. 

> This is your _subjective_ interpretation of this problem, because it's not a 
> problem for you, nobody else can possibly have this problem (or they're just 
> crazy).

No, Roman. You're not crazy because you have some issue that I cannot 
understand. You're crazy because you make the same mistake over and over, 
and don't listen when people tell you what the mistake was.

	"Insanity is doing the same thing over and over again and 
	 expecting different results."
			- Various

Please. People have told you where you go wrong. Many times. So why do you 
keep repeating it?

Take the time to slow down, listen, and realize that you're on the wrong 
track, and that others really _have_ spent time and thought on this.

		Linus

^ permalink raw reply

* Re: Bizarre missing changes (git bug?)
From: Kevin Ballard @ 2008-07-30  1:32 UTC (permalink / raw)
  To: Roman Zippel; +Cc: Linus Torvalds, Tim Harper, git
In-Reply-To: <Pine.LNX.4.64.0807300223010.6791@localhost.localdomain>

On Jul 29, 2008, at 6:14 PM, Roman Zippel wrote:

>> So here's my challenge again, which you seem to have TOTALLY MISSED.
>>
>> Make this be fast:
>>
>> 	time sh -c "git log <filename> | head"
>>
>> nothing else matters. If you can make that one be fast, I'm happy.
>
> I already explained it, but you simply dismissed it. It's possible, but it
> requires a bit of cached information (e.g. as part of the pack file, which
> is needed for decent performance anyway).

As an outside observer, this argument is basically akin to "it's easy  
to fly, you just need some faerie dust". Basically, you're dismissing  
the entire complexity of the problem by saying "oh, that's easy, just  
use some cached data" without any proof that this would work, or any  
sample code, or really any evidence at all. Given that the path  
simplification can be arbitrarily complex (I can pass any set of paths  
I want), I don't believe that you can just use "a bit of cached  
information" for this. If you did rely on cached information, said  
information would probably be orders of magnitude larger than the  
object graph itself (for repos with lots of files).

>> In fact, you can see what I'm talking about by trying --topo-order in the
>> above timing test.
>
> Please give me a full example.
> gitk --topo-order kernel/printk.c shows no difference (e.g. it doesn't
> show 02630a12c7f72fa294981c8d86e38038781c25b7), several experiments with
> git-rev-list show no improvement either.

He's not saying it changes what commits are shown, he's saying it has  
a performance impact - topo order has to post-process the graph. For a  
quick demonstration, run `time sh -c 'git log | head'` vs `time sh -c  
'git log --topo-order | head'`.

-Kevin Ballard

-- 
Kevin Ballard
http://kevin.sb.org
kevin@sb.org
http://www.tildesoft.com

^ permalink raw reply

* Re: Bizarre missing changes (git bug?)
From: Linus Torvalds @ 2008-07-30  1:49 UTC (permalink / raw)
  To: Roman Zippel; +Cc: Tim Harper, git
In-Reply-To: <Pine.LNX.4.64.0807300223010.6791@localhost.localdomain>

On Wed, 30 Jul 2008, Roman Zippel wrote:
> > 
> > 	time sh -c "git log <filename> | head"
> > 
> > nothing else matters. If you can make that one be fast, I'm happy. 
> 
> I already explained it, but you simply dismissed it. It's possible, but it 
> requires a bit of cached information (e.g. as part of the pack file, which 
> is needed for decent performance anyway).

Bzzt. Wrong. Try again.

> > In fact, you can see what I'm talking about by trying --topo-order in the 
> > above timing test.
> 
> Please give me a full example.
> gitk --topo-order kernel/printk.c shows no difference (e.g. it doesn't 
> show 02630a12c7f72fa294981c8d86e38038781c25b7), several experiments with 
> git-rev-list show no improvement either.

Roman, what the f*ck is wrong with you? Let me repeat that thing one more 
time:

	you can see what I'm talking about by trying --topo-order in the
	above timing test.
	      ^^^^^^^^^^^

The fact is, --topo-order is a post-processing thing, exactly the way your 
half-way simplification would be. It requires _all_ commits, and it 
requires them because we cannot guarantee that we output all children 
before the parents when there are multiple threads without a central clock 
(ie any distributed environment).

So for --topo-order, we generate the whole history, and then we sort it. 

As a result, it has horrible interactivity behavior. Try it. Here's some 
random command lines, and the times:

	time git log --topo-order drivers/scsi/scsi_lib.c | head

	real    0m0.688s
	user    0m0.652s
	sys     0m0.036s

and without:

	time git log drivers/scsi/scsi_lib.c | head

	real    0m0.033s
	user    0m0.024s
	sys     0m0.008s

do you see the difference? They happen to output _exactly_ the same ten 
lines, but one of them takes the better part of a second (and that's on 
pretty much the fastest machine you can find right now - on a laptop with 
a slow disk and without things in cache, it would take many many seconds).

The other one is instantaneous.

Now, I realize that 0.033s vs 0.688s doesn't sound like a big deal, even 
though that's a 20x difference, but that 20x difference is a _really_ big 
deal when the machine is slower, or when "old history" isn't in the disk 
cache any more.

For example, try doing the timings after flushing the disk caches to 
simulate cold-cache behavior. Do it with a slow disk. Or do it over NFS. 
Yes, even the "fast" case will actually be painfully slow (well, it is for 
me, people who are used to CVS probably think it's just "normal"). 

And yes, it will depend a lot on the file in question too. Obviously, if 
the first change is far back in history, it will be slow _regardless_, but 
I've at least personally found that in practice, you tend to look at logs 
of _recent_ things much much much more than you look at things that 
haven't changed lately.	

It will also depend a lot on whether you are packed or not. For example, 
if you are well packed, the pack-file IO locality is really really good, 
and the 20x slowdown is much less. I just tested with a laptop with a slow 
disk, and the --topo-order case was "only" 2.5x slower, almost certainly 
because the IO required to bring in the first part of the history ended up 
being a large portion of the total IO, and so the "whole history" case was 
not 20x slower, because there was not 20x more IO due to the good locality 
and the kernel doing readahead etc.

But 2.5x slower is really bad, wouldn't you agree? We're not talking about 
a few percent here, we're talking about more than twice as long. It's very 
noticeable, especially when the end result was --topo-order: 29.8s, no 
topo-order 12.1s

(Yeah, that wasn't a very realistic example, but on that same machine, 
once it's in the cache, it's 0.13s vs 1.6s: one is "instant", the other is 
very much a "wait for it" kind of thing.)

THAT is the kind of performance difference you see.

And trust me, it's a performance difference that you can really notice in 
real life. I'm not kidding you. Just try it:

	git log kernel/sched.c
vs
	git log --topo-order kernel/sched.c

and one is instant, the other one pauses before it starts showing 
something. One feels fast, the other feels slow.

At the same time, if you actually time the _whole_ log, it's all exactly 
the same speed:

	[torvalds@nehalem linux]$ time git log --topo-order kernel/sched.c > /dev/null 
	real	0m0.708s
	user	0m0.684s
	sys	0m0.020s

	[torvalds@nehalem linux]$ time git log kernel/sched.c > /dev/null 
	real	0m0.703s
	user	0m0.672s
	sys	0m0.032s

Notice? The cost of the topological sort itself is basically zero. But 
from an interactivity standpoint, it's _deadly_.

And please note that here "--topo-order" is just an example of a random 
"global history post-processing" thing. It's not that I want you to use 
the topological sort per se, it's just an example of the whole issue with 
_any_ post-factum operation. The topological sort is not expensive as a 
sort. What is expensive is that it needs to get the whole history to work.
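
As a small sketch of the point that the sort changes latency rather than content: on a linear toy history (the file name and commit messages are made up), the first few lines of the log are expected to be identical with and without --topo-order. The difference described above would only show up in how long the first line takes to appear, which needs a large real repository to observe.

```shell
#!/bin/sh
# Toy linear history; names are hypothetical. The two logs should match,
# even though --topo-order has to walk the whole history before printing.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q

i=1
while [ "$i" -le 20 ]; do
    echo "$i" > scsi_lib.c
    git add scsi_lib.c
    git commit -q -m "change $i"
    i=$((i + 1))
done

plain=$(git log --format=%s -- scsi_lib.c | head -n 5)
topo=$(git log --topo-order --format=%s -- scsi_lib.c | head -n 5)

if [ "$plain" = "$topo" ]; then
    echo "same first five entries"
fi

# On a large repository, time-to-first-output can be compared with:
#   time sh -c 'git log <path> | head'
#   time sh -c 'git log --topo-order <path> | head'
```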

And also please notice that this is a huge scalability issue. "git log" 
should not become slower as a project gets more history. Sure, the full 
log will take longer to generate (because there's _more_ of it), but the 
top commits should always show up immediately.

Again, if you have a filter (where "topological sort" is just an example 
of such a filter) that requires the full history to work, it simply 
_fundamentally_ cannot scale well. It very fundamentally will slow down 
with bigger history.

> The problem is that your picture doesn't include my specific problem, I'm 
> very interested in the big picture, but I'd like to be in it.

Roman, I've been trying to explain this "interactive" thing for _days_ 
now. That's the big picture. The whole "you have to be able to generate 
history incrementally" thing.

First generating the whole global history, and then simplifying it, is 
simply not acceptable. It's too slow, and it doesn't scale.

			Linus

^ permalink raw reply

* Re: Bizarre missing changes (git bug?)
From: Roman Zippel @ 2008-07-30  1:50 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jeff King, Tim Harper, git
In-Reply-To: <alpine.LFD.1.10.0807291006070.3334@nehalem.linux-foundation.org>

Hi,

On Tue, 29 Jul 2008, Linus Torvalds wrote:

> Now, do these three things
> 
> 	gitk
> 	gitk file
> 	gitk --full-history file
> 
> and compare them. They all show _different_ histories.
> 
> Which one is "correct"? They all are. It just depends on what you want to 
> see.
> 
> The "gitk file" history is the simplest one BY FAR, because it has very 
> aggressively simplified history to the point where it tried to find the 
> _simplest_ history that explains the current contents of 'file'[*]

It's "aggressively simplified" by not even bothering to look for more.
"simplified" implies there is something more complex beforehand, but all 
it does is simply scan through the history as fast as possible without 
bothering to look left or right.
"simplified" implies to me it's something intentional, but this is more of 
an accidental optimization which happens to work in most situations and in 
the special cases it just picks a random change and hopes for the best.

"git-log --full-history file" at least produces the full change history, 
but it has a performance impact and it doesn't produce a complete graph 
usable for graphical front ends.

> From a practical standpoint, and from having used this a long time, I'd 
> argue that the simple history is the one that you want 99.9% of all time. 
> But not _always_. Sometimes, the things that got simplified away actually 
> matter. It's rare, but it happens.
> 
> For example, maybe you had a bug-fix that you _know_ you did, and it 
> doesn't show up in the simplified history. That really pisses you off, and 
> it apparently really pisses Roman off that it can happen. But the fact is, 
> that still doesn't mean that the simple history is "wrong" or even 
> "incomplete".

I gave more general examples. Tracking upstream source can produce this 
problem frequently. Another example is stable/unstable branches, where 
occasionally merging the stable branch into the unstable branch can 
produce this problem.

> No, it's actually meaningful data in itself. If the bug-fix doesn't show 
> in the simplified history, then that simply means that the bug-fix was not 
> on a branch that could _possibly_ have mattered for the current contents. 
> 
> So once you are _aware_ of history simplification and are mentally able to 
> accept it, the fact that history got simplified is actually just another 
> tool.

This is your _subjective_ interpretation of this problem, because it's not a 
problem for you, nobody else can possibly have this problem (or they're just 
crazy).
Even if I know about this limitation it still doesn't solve the problem, 
that _none_ of the graphical interfaces can show me a useful history graph 
of these situations.

bye, Roman

^ permalink raw reply

* [PATCH v2] Documentation: Remove mentions of git-svnimport.
From: Brian Gernhardt @ 2008-07-30  1:16 UTC (permalink / raw)
  To: Pieter de Bie; +Cc: Git List, Junio C Hamano, Jurko Gospodnetić

git-svnimport is no longer supported, so don't mention it in the
documentation.  This also updates the description, removing the
historical discussion, since it mostly dealt with how it differed from
svnimport.  The new description gives some starting points into the
rest of the documentation.

Noticed by Jurko Gospodnetić <jurko.gospodnetic@docte.hr>

Signed-off-by: Brian Gernhardt <benji@silverinsanity.com>
---

 Replaces the remaining comparison to git-svnimport with pointers
 to the rest of the documentation.

 Documentation/git-svn.txt |   26 ++++++++++++--------------
 1 files changed, 12 insertions(+), 14 deletions(-)

diff --git a/Documentation/git-svn.txt b/Documentation/git-svn.txt
index e7c0f1c..f230125 100644
--- a/Documentation/git-svn.txt
+++ b/Documentation/git-svn.txt
@@ -12,18 +12,18 @@ SYNOPSIS
 DESCRIPTION
 -----------
 'git-svn' is a simple conduit for changesets between Subversion and git.
-It is not to be confused with linkgit:git-svnimport[1], which is
-read-only.
+It provides a bidirectional flow of changes between a Subversion and a git
+repository.
 
-'git-svn' was originally designed for an individual developer who wants a
-bidirectional flow of changesets between a single branch in Subversion
-and an arbitrary number of branches in git.  Since its inception,
-'git-svn' has gained the ability to track multiple branches in a manner
-similar to 'git-svnimport'.
+'git-svn' can track a single Subversion branch simply by using a
+URL to the branch, follow branches laid out in the Subversion recommended
+method (trunk, branches, tags directories) with the --stdlayout option, or
+follow branches in any layout with the -T/-t/-b options (see options to
+'init' below, and also the 'clone' command).
 
-'git-svn' is especially useful when it comes to tracking repositories
-not organized in the way Subversion developers recommend (trunk,
-branches, tags directories).
+Once tracking a Subversion branch (with any of the above methods), the git
+repository can be updated from Subversion by the 'fetch' command and
+Subversion updated from git by the 'dcommit' command.
 
 COMMANDS
 --------
@@ -218,8 +218,7 @@ Any other arguments are passed directly to 'git-log'
 
 'commit-diff'::
 	Commits the diff of two tree-ish arguments from the
-	command-line.  This command is intended for interoperability with
-	'git-svnimport' and does not rely on being inside an `git-svn
+	command-line.  This command does not rely on being inside an `git-svn
 	init`-ed repository.  This command takes three arguments, (a) the
 	original tree to diff against, (b) the new tree result, (c) the
 	URL of the target Subversion repository.  The final argument
@@ -317,8 +316,7 @@ config key: svn.findcopiesharder
 -A<filename>::
 --authors-file=<filename>::
 
-Syntax is compatible with the files used by 'git-svnimport' and
-'git-cvsimport':
+Syntax is compatible with the file used by 'git-cvsimport':
 
 ------------------------------------------------------------------------
 	loginname = Joe User <user@example.com>
-- 
1.6.0.rc1.154.ge3fc

^ permalink raw reply related

* Re: Bizarre missing changes (git bug?)
From: Roman Zippel @ 2008-07-30  1:14 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Tim Harper, git
In-Reply-To: <alpine.LFD.1.10.0807290838360.3334@nehalem.linux-foundation.org>

Hi,

On Tue, 29 Jul 2008, Linus Torvalds wrote:

> On Tue, 29 Jul 2008, Roman Zippel wrote:
> > 
> > I'm not dismissing it, but your focus is on how to get this result.
> 
> No, you misunderstand.
> 
> My focus is really on one single thing:
> 
>  - performance
> 
> with a smaller focus on the fact that I simply don't see how it's 
> _possible_ to do better than our current all-or-nothing approach of 
> simplification (eg either extreme simplification or none at all: nothing 
> or --full-history).

That's exactly what I'm not dismissing as you claim, but I've hit the 
problem where this approach simply produces crap, so I'm foremost 
interested in getting a useful result, only after that I'm interested in 
the performance (which I think is possible).

> So here's my challenge again, which you seem to have TOTALLY MISSED.
> 
> Make this be fast:
> 
> 	time sh -c "git log <filename> | head"
> 
> nothing else matters. If you can make that one be fast, I'm happy. 

I already explained it, but you simply dismissed it. It's possible, but it 
requires a bit of cached information (e.g. as part of the pack file, which 
is needed for decent performance anyway).

> In fact, you can see what I'm talking about by trying --topo-order in the 
> above timing test.

Please give me a full example.
gitk --topo-order kernel/printk.c shows no difference (e.g. it doesn't 
show 02630a12c7f72fa294981c8d86e38038781c25b7), several experiments with 
git-rev-list show no improvement either.

> > > And quite frankly, I've seen that behaviour from you before, when it comes 
> > > to other things.
> > 
> > What exact behaviour is that? That I dare to disagree with you?
> 
> No. The fact that you like arguing _pointlessly_, and just being abrasive, 
> without actually helping or understanding the big picture.

The problem is that your picture doesn't include my specific problem, I'm 
very interested in the big picture, but I'd like to be in it.

> I'm thinking 
> back on the whole scheduler thing. You weren't arguing with _me_, but you 
> had the same modus operandi.

Well, it seems I have a talent for finding the special cases, e.g. last time 
I tested the scheduler it was performing twice as badly as the old scheduler 
on m68k. I've also seen cases where it sacrifices throughput for 
interactivity.
This is the wrong place for that anyway; the problem I'm hitting is 
these "good enough" solutions, which work in most situations, but fail in 
a few special situations, and nobody is interested in getting these right 
unless your name is Linus.

bye, Roman

^ permalink raw reply
