git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] rev-list docs: clarify --topo-order description
@ 2012-08-13 22:21 Junio C Hamano
  2012-08-13 22:46 ` Martin von Zweigbergk
                   ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Junio C Hamano @ 2012-08-13 22:21 UTC (permalink / raw)
  To: git; +Cc: Martin von Zweigbergk

We said "--date-order" still does not violate the topology, but it
was still not clear enough.

Reword the description for both "--date-order" and "--topo-order",
and add an illustration to it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 * Let's do this before I forget...; came up in discussion $gmane/203370

 Documentation/rev-list-options.txt | 29 +++++++++++++++++++++++------
 1 file changed, 23 insertions(+), 6 deletions(-)

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 6a4b635..dc501ee 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -579,15 +579,32 @@ Commit Ordering
 By default, the commits are shown in reverse chronological order.
 
 --topo-order::
-
-	This option makes them appear in topological order (i.e.
-	descendant commits are shown before their parents).
+	This option makes them appear in topological order.  Even
+	without this option, descendant commits are shown before
+	their parents, but this tries to avoid showing commits on
+	multiple lines of history intermixed.
 
 --date-order::
 
-	This option is similar to '--topo-order' in the sense that no
-	parent comes before all of its children, but otherwise things
-	are still ordered in the commit timestamp order.
+	Show no parents before all of its children, but otherwise
+	show commits in the commit timestamp order.
++
+For example, in a commit history like this:
++
+----------------------------------------------------------------
+
+    ---1----2----4----7
+	\	       \
+	 3----5----6----8---
+
+----------------------------------------------------------------
++
+where the numbers denote the order of commit timestamps, `git
+rev-list` and friends with `--date-order` show the commits in the
+timestamp order: 8 7 6 5 4 3 2 1.
++
+With `--topo-order`, they would show 8 6 5 3 7 4 2 1 (or 8 7 4 2 6 5
+3 1), to avoid commits from two branches mixed together.
 
 --reverse::
 
-- 
1.7.12.rc2.85.g1de7134

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH] rev-list docs: clarify --topo-order description
  2012-08-13 22:21 [PATCH] rev-list docs: clarify --topo-order description Junio C Hamano
@ 2012-08-13 22:46 ` Martin von Zweigbergk
  2012-08-13 23:05   ` Junio C Hamano
  2012-08-14  8:22 ` Michael Haggerty
  2012-08-14  8:45 ` Thomas Rast
  2 siblings, 1 reply; 26+ messages in thread
From: Martin von Zweigbergk @ 2012-08-13 22:46 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Mon, Aug 13, 2012 at 3:21 PM, Junio C Hamano <gitster@pobox.com> wrote:
>  * Let's do this before I forget...; came up in discussion $gmane/203370

Thanks! That definitely confused me (and I suppose I stupidly didn't
test with a proper range).

>
>  Documentation/rev-list-options.txt | 29 +++++++++++++++++++++++------
>  1 file changed, 23 insertions(+), 6 deletions(-)
>
> diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
> index 6a4b635..dc501ee 100644
> --- a/Documentation/rev-list-options.txt
> +++ b/Documentation/rev-list-options.txt
> @@ -579,15 +579,32 @@ Commit Ordering
>  By default, the commits are shown in reverse chronological order.

It seems likely that those reading the above sentence will continue on
to read about --topo-order, but still, do you think the "descendant
commits are shown before parents" part belong here instead?

>  --topo-order::
> -
> -       This option makes them appear in topological order (i.e.
> -       descendant commits are shown before their parents).
> +       This option makes them appear in topological order.  Even
> +       without this option, descendant commits are shown before
> +       their parents, but this tries to avoid showing commits on
> +       multiple lines of history intermixed.
>
>  --date-order::
>
> -       This option is similar to '--topo-order' in the sense that no
> -       parent comes before all of its children, but otherwise things
> -       are still ordered in the commit timestamp order.
> +       Show no parents before all of its children, but otherwise
> +       show commits in the commit timestamp order.
> ++
> +For example, in a commit history like this:
> ++
> +----------------------------------------------------------------
> +
> +    ---1----2----4----7
> +       \              \
> +        3----5----6----8---
> +
> +----------------------------------------------------------------
> ++
> +where the numbers denote the order of commit timestamps, `git
> +rev-list` and friends with `--date-order` show the commits in the
> +timestamp order: 8 7 6 5 4 3 2 1.
> ++
> +With `--topo-order`, they would show 8 6 5 3 7 4 2 1 (or 8 7 4 2 6 5
> +3 1), to avoid commits from two branches mixed together.

It would help at least me to also know what the output would be
without either of --date-order or --topo-order. (Does the default
order have a name, by the way?)

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] rev-list docs: clarify --topo-order description
  2012-08-13 22:46 ` Martin von Zweigbergk
@ 2012-08-13 23:05   ` Junio C Hamano
  2012-08-14  5:33     ` Martin von Zweigbergk
  0 siblings, 1 reply; 26+ messages in thread
From: Junio C Hamano @ 2012-08-13 23:05 UTC (permalink / raw)
  To: Martin von Zweigbergk; +Cc: git

Martin von Zweigbergk <martin.von.zweigbergk@gmail.com> writes:

>> diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
>> index 6a4b635..dc501ee 100644
>> --- a/Documentation/rev-list-options.txt
>> +++ b/Documentation/rev-list-options.txt
>> @@ -579,15 +579,32 @@ Commit Ordering
>>  By default, the commits are shown in reverse chronological order.
>
> It seems likely that those reading the above sentence will continue on
> to read about --topo-order, but still, do you think the "descendant
> commits are shown before parents" part belong here instead?

I do not think so.  When you are not limited (i.e. limit_list() is
not called), you could do something like "git rev-list 4 5" in a
history like this:

	--1---5---2---3---4

and get end up getting "5 4 3 2 1", and "2" certainly doesn't get
shown before "5" does.

In your series where cherry-pick runs prepare_revision_walk() and
makes the outcome sort in reverse, the list has to be limited, so
the above is a non-issue, but in the context of this document, we
cannot assume that.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] rev-list docs: clarify --topo-order description
  2012-08-13 23:05   ` Junio C Hamano
@ 2012-08-14  5:33     ` Martin von Zweigbergk
  2012-08-14 14:54       ` Junio C Hamano
  0 siblings, 1 reply; 26+ messages in thread
From: Martin von Zweigbergk @ 2012-08-14  5:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Mon, Aug 13, 2012 at 4:05 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Martin von Zweigbergk <martin.von.zweigbergk@gmail.com> writes:
>
>>> diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
>>> index 6a4b635..dc501ee 100644
>>> --- a/Documentation/rev-list-options.txt
>>> +++ b/Documentation/rev-list-options.txt
>>> @@ -579,15 +579,32 @@ Commit Ordering
>>>  By default, the commits are shown in reverse chronological order.
>>
>> It seems likely that those reading the above sentence will continue on
>> to read about --topo-order, but still, do you think the "descendant
>> commits are shown before parents" part belong here instead?
>
> I do not think so.  When you are not limited (i.e. limit_list() is
> not called), you could do something like "git rev-list 4 5" in a
> history like this:
>
>         --1---5---2---3---4
>
> and get end up getting "5 4 3 2 1", and "2" certainly doesn't get
> shown before "5" does.

Oh, interesting. I had no idea, although that does make sense. Thanks.

Still, the "Even without this option" strongly suggests to me that
what follows ("descendant commits are shown before parents") applies
to the "By default" case. Would it be correct to say something like
"By default, the commits are shown in reverse chronological order.
When commit limiting is in effect, descendant commits are shown before
parents."? I'm not sure the "commit limiting" section in the man page
involves the same options as "limit_list" (I rather think they don't),
but I don't know if there's a better term to use in the documentation
either.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] rev-list docs: clarify --topo-order description
  2012-08-13 22:21 [PATCH] rev-list docs: clarify --topo-order description Junio C Hamano
  2012-08-13 22:46 ` Martin von Zweigbergk
@ 2012-08-14  8:22 ` Michael Haggerty
  2012-08-14  8:45 ` Thomas Rast
  2 siblings, 0 replies; 26+ messages in thread
From: Michael Haggerty @ 2012-08-14  8:22 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Martin von Zweigbergk

On 08/14/2012 12:21 AM, Junio C Hamano wrote:
> We said "--date-order" still does not violate the topology, but it
> was still not clear enough.
>
> Reword the description for both "--date-order" and "--topo-order",
> and add an illustration to it.
>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>

Thanks for this change.  I was recently trying to figure out the meaning 
of these ordering options myself, and found the old text confusing.

> ---
>
>   * Let's do this before I forget...; came up in discussion $gmane/203370
>
>   Documentation/rev-list-options.txt | 29 +++++++++++++++++++++++------
>   1 file changed, 23 insertions(+), 6 deletions(-)
>
> diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
> index 6a4b635..dc501ee 100644
> --- a/Documentation/rev-list-options.txt
> +++ b/Documentation/rev-list-options.txt
> @@ -579,15 +579,32 @@ Commit Ordering
>   By default, the commits are shown in reverse chronological order.
>
>   --topo-order::
> -
> -	This option makes them appear in topological order (i.e.
> -	descendant commits are shown before their parents).
> +	This option makes them appear in topological order.  Even
> +	without this option, descendant commits are shown before
> +	their parents, but this tries to avoid showing commits on
> +	multiple lines of history intermixed.
>
>   --date-order::
>
> -	This option is similar to '--topo-order' in the sense that no
> -	parent comes before all of its children, but otherwise things
> -	are still ordered in the commit timestamp order.
> +	Show no parents before all of its children, but otherwise
> +	show commits in the commit timestamp order.
> ++
> +For example, in a commit history like this:
> ++
> +----------------------------------------------------------------
> +
> +    ---1----2----4----7
> +	\	       \
> +	 3----5----6----8---
> +
> +----------------------------------------------------------------
> ++
> +where the numbers denote the order of commit timestamps, `git
> +rev-list` and friends with `--date-order` show the commits in the
> +timestamp order: 8 7 6 5 4 3 2 1.
> ++
> +With `--topo-order`, they would show 8 6 5 3 7 4 2 1 (or 8 7 4 2 6 5
> +3 1), to avoid commits from two branches mixed together.

Is it possible to predict which of the two orders would be taken here? 
It would be nice for the results to be deterministic.  For example, 
topology "ties" could be broken by choosing the commit with the most 
recent timestamp.

>   --reverse::
>
>

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] rev-list docs: clarify --topo-order description
  2012-08-13 22:21 [PATCH] rev-list docs: clarify --topo-order description Junio C Hamano
  2012-08-13 22:46 ` Martin von Zweigbergk
  2012-08-14  8:22 ` Michael Haggerty
@ 2012-08-14  8:45 ` Thomas Rast
  2012-08-14 14:30   ` Junio C Hamano
  2 siblings, 1 reply; 26+ messages in thread
From: Thomas Rast @ 2012-08-14  8:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Martin von Zweigbergk

Junio C Hamano <gitster@pobox.com> writes:

>  --topo-order::
> -
> -	This option makes them appear in topological order (i.e.
> -	descendant commits are shown before their parents).
> +	This option makes them appear in topological order.  Even
> +	without this option, descendant commits are shown before
> +	their parents, but this tries to avoid showing commits on
> +	multiple lines of history intermixed.

I don't think that is true in general.  Without any -order options, we
process commits in date order, which *usually* means topological order,
but not always.  You can easily verify this:

  $ git init
  $ date
  Tue Aug 14 10:39:49 CEST 2012
  $ echo initial >file
  $ git add file
  $ GIT_COMMITTER_DATE="Tue Aug 14 11:39:49 2012" git commit
  $ echo foo >file
  $ git commit -mfoo file
  $ git checkout -bside HEAD^
  $ echo bar >file
  $ git commit -mbar file
  $ git log --all --oneline
  8c71325 bar
  e5072d7 initial
  1be702c foo

So the --topo-order switch *ensures* that we process commits in
topological order even in the face of skewed clocks.

I suspect that

> +	their parents, but this tries to avoid showing commits on
> +	multiple lines of history intermixed.

is just a fortunate side effect of the topological sort.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] rev-list docs: clarify --topo-order description
  2012-08-14  8:45 ` Thomas Rast
@ 2012-08-14 14:30   ` Junio C Hamano
  2012-08-14 14:51     ` Thomas Rast
  0 siblings, 1 reply; 26+ messages in thread
From: Junio C Hamano @ 2012-08-14 14:30 UTC (permalink / raw)
  To: Thomas Rast; +Cc: git, Martin von Zweigbergk

Thomas Rast <trast@student.ethz.ch> writes:

> So the --topo-order switch *ensures* that we process commits in
> topological order even in the face of skewed clocks.

Yes, I *think* that I attempted to show with the illustration.

> I suspect that
>
>> +	their parents, but this tries to avoid showing commits on
>> +	multiple lines of history intermixed.
>
> is just a fortunate side effect of the topological sort.

I am not sure if it is "side effect"; I *think* it was the "primary
objective" we added topo-order in the first place.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] rev-list docs: clarify --topo-order description
  2012-08-14 14:30   ` Junio C Hamano
@ 2012-08-14 14:51     ` Thomas Rast
  2012-08-14 15:47       ` Junio C Hamano
  2012-08-15 20:02       ` [PATCH v2] " Junio C Hamano
  0 siblings, 2 replies; 26+ messages in thread
From: Thomas Rast @ 2012-08-14 14:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Thomas Rast, git, Martin von Zweigbergk

Junio C Hamano <gitster@pobox.com> writes:

> Thomas Rast <trast@student.ethz.ch> writes:
>
>> So the --topo-order switch *ensures* that we process commits in
>> topological order even in the face of skewed clocks.
>
> Yes, I *think* that I attempted to show with the illustration.

But then the new description is wrong.  It claims that children are
always before parents, which is not true in the face of clock skew.  Or
am I missing something?

>> I suspect that
>>
>>> +	their parents, but this tries to avoid showing commits on
>>> +	multiple lines of history intermixed.
>>
>> is just a fortunate side effect of the topological sort.
>
> I am not sure if it is "side effect"; I *think* it was the "primary
> objective" we added topo-order in the first place.

I won't judge that, since it's waaaay before my time :-)

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] rev-list docs: clarify --topo-order description
  2012-08-14  5:33     ` Martin von Zweigbergk
@ 2012-08-14 14:54       ` Junio C Hamano
  0 siblings, 0 replies; 26+ messages in thread
From: Junio C Hamano @ 2012-08-14 14:54 UTC (permalink / raw)
  To: Martin von Zweigbergk; +Cc: git

Martin von Zweigbergk <martin.von.zweigbergk@gmail.com> writes:

> Still, the "Even without this option" strongly suggests to me that
> what follows ("descendant commits are shown before parents") applies
> to the "By default" case. Would it be correct to say something like
> "By default, the commits are shown in reverse chronological order.
> When commit limiting is in effect, descendant commits are shown before
> parents."?

I wonder if spelling out that level of detail is unnecessarily
overspecifying the behaviour.

In general, I'd prefer to keep the insn to end users to the minimum:
i.e. use the default when you do not care too deeply about the order
but you want to get commits in the range in reverse time order in
general and let the default take care of the detail, and use one of
the --foo-order options if a script wants to be more specific about
the order.  That way we do not have to get our hands tied to an
unnecessary degree and keep the door open for us to improve the
implementation.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] rev-list docs: clarify --topo-order description
  2012-08-14 14:51     ` Thomas Rast
@ 2012-08-14 15:47       ` Junio C Hamano
  2012-08-15 20:02       ` [PATCH v2] " Junio C Hamano
  1 sibling, 0 replies; 26+ messages in thread
From: Junio C Hamano @ 2012-08-14 15:47 UTC (permalink / raw)
  To: Thomas Rast; +Cc: git, Martin von Zweigbergk

Thomas Rast <trast@student.ethz.ch> writes:

> Junio C Hamano <gitster@pobox.com> writes:
>
>> Thomas Rast <trast@student.ethz.ch> writes:
>>
>>> So the --topo-order switch *ensures* that we process commits in
>>> topological order even in the face of skewed clocks.
>>
>> Yes, I *think* that I attempted to show with the illustration.
>
> But then the new description is wrong.  It claims that children are
> always before parents, which is not true in the face of clock skew.

Ah, you are talking about the "even without this option" part.
Yeah, that does not hold true.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2] rev-list docs: clarify --topo-order description
  2012-08-14 14:51     ` Thomas Rast
  2012-08-14 15:47       ` Junio C Hamano
@ 2012-08-15 20:02       ` Junio C Hamano
  2012-08-16  6:06         ` Martin von Zweigbergk
  2012-08-16  8:42         ` Thomas Rast
  1 sibling, 2 replies; 26+ messages in thread
From: Junio C Hamano @ 2012-08-15 20:02 UTC (permalink / raw)
  To: Thomas Rast; +Cc: git, Martin von Zweigbergk

It was unclear what "--topo-order" was really about in the
documentation.  It is not just about "children before parent", but
also about "don't mix lineages".

Reword the description for both "--date-order" and "--topo-order",
and add an illustration to it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
  Thomas Rast <trast@student.ethz.ch> writes:

  > But then the new description is wrong.  It claims that children are
  > always before parents, which is not true in the face of clock skew.

  Thanks for sanity-check.  Here is an updated one.

 Documentation/rev-list-options.txt | 31 ++++++++++++++++++++++++-------
 1 file changed, 24 insertions(+), 7 deletions(-)

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 6a4b635..9404d08 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -578,16 +578,33 @@ Commit Ordering
 
 By default, the commits are shown in reverse chronological order.
 
---topo-order::
+--date-order::
+	Show no parents before all of its children are shown, but
+	otherwise show commits in the commit timestamp order.
 
-	This option makes them appear in topological order (i.e.
-	descendant commits are shown before their parents).
+--topo-order::
+	Show no parents before all of its children are shown, and
+	avoid showing commits on multiple lines of history
+	intermixed.
++
+For example, in a commit history like this:
++
+----------------------------------------------------------------
 
---date-order::
+    ---1----2----4----7
+	\	       \
+	 3----5----6----8---
 
-	This option is similar to '--topo-order' in the sense that no
-	parent comes before all of its children, but otherwise things
-	are still ordered in the commit timestamp order.
+----------------------------------------------------------------
++
+where the numbers denote the order of commit timestamps, `git
+rev-list` and friends with `--date-order` show the commits in the
+timestamp order: 8 7 6 5 4 3 2 1.
++
+With `--topo-order`, they would show 8 6 5 3 7 4 2 1 (or 8 7 4 2 6 5
+3 1); some older commits are shown before newer ones in order to
+avoid showing the commits from two parallel development track mixed
+together.
 
 --reverse::
 
-- 
1.7.12.rc2.85.g1de7134

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] rev-list docs: clarify --topo-order description
  2012-08-15 20:02       ` [PATCH v2] " Junio C Hamano
@ 2012-08-16  6:06         ` Martin von Zweigbergk
  2012-08-16  6:20           ` Junio C Hamano
  2012-08-16  8:42         ` Thomas Rast
  1 sibling, 1 reply; 26+ messages in thread
From: Martin von Zweigbergk @ 2012-08-16  6:06 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Thomas Rast, git

On Wed, Aug 15, 2012 at 1:02 PM, Junio C Hamano <gitster@pobox.com> wrote:
> diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
> index 6a4b635..9404d08 100644
> --- a/Documentation/rev-list-options.txt
> +++ b/Documentation/rev-list-options.txt
> @@ -578,16 +578,33 @@ Commit Ordering
>
>  By default, the commits are shown in reverse chronological order.

As I said before, this led me to believe that "the commits are shown
in reverse chronological order" when neither of --topo-order or
--date-order is passed. I agree that we should avoid specifying more
than necessary about the default case. But in this case, what is
specified is not exactly true (in the face of clock skew). Or am I
misunderstanding or misinterpreting something? I still don't
understand the revision walker well enough to be able to provide a
better wording, but I think even the rather crude "By default, the
order is unspecified." would at least not be as easy to misinterpret
(if that's what I did) and is definitely not over-specified. Of
course, if someone can think of something better, I'm all for it.

Regardless of the above comment about the patch context, what your
patch actually changes looks good. Thanks again.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] rev-list docs: clarify --topo-order description
  2012-08-16  6:06         ` Martin von Zweigbergk
@ 2012-08-16  6:20           ` Junio C Hamano
  2012-08-16  6:26             ` Junio C Hamano
  0 siblings, 1 reply; 26+ messages in thread
From: Junio C Hamano @ 2012-08-16  6:20 UTC (permalink / raw)
  To: Martin von Zweigbergk; +Cc: Thomas Rast, git

Martin von Zweigbergk <martin.von.zweigbergk@gmail.com> writes:

> On Wed, Aug 15, 2012 at 1:02 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
>> index 6a4b635..9404d08 100644
>> --- a/Documentation/rev-list-options.txt
>> +++ b/Documentation/rev-list-options.txt
>> @@ -578,16 +578,33 @@ Commit Ordering
>>
>>  By default, the commits are shown in reverse chronological order.
>
> As I said before, this led me to believe that "the commits are shown
> in reverse chronological order" when neither of --topo-order or
> --date-order is passed. I agree that we should avoid specifying more
> than necessary about the default case. But in this case, what is
> specified is not exactly true (in the face of clock skew). Or am I
> misunderstanding or misinterpreting something? I still don't
> understand the revision walker well enough to be able to provide a
> better wording, but I think even the rather crude "By default, the
> order is unspecified." would at least not be as easy to misinterpret
> (if that's what I did) and is definitely not over-specified. Of
> course, if someone can think of something better, I'm all for it.
>
> Regardless of the above comment about the patch context, what your
> patch actually changes looks good. Thanks again.

We could remove it if you find it confusing.

I think the original motivation that line was added was to help
people who see "git log" (without any frills) output for the first
time not to be alarmed when they see newer things first: "In
general, the "time" flows from bottom to top", but the "time" in
that sentence is not necessarily the timestamp of either committer
nor author field.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] rev-list docs: clarify --topo-order description
  2012-08-16  6:20           ` Junio C Hamano
@ 2012-08-16  6:26             ` Junio C Hamano
  2012-08-16  8:51               ` Thomas Rast
  0 siblings, 1 reply; 26+ messages in thread
From: Junio C Hamano @ 2012-08-16  6:26 UTC (permalink / raw)
  To: Martin von Zweigbergk; +Cc: Thomas Rast, git

Junio C Hamano <gitster@pobox.com> writes:

> We could remove it if you find it confusing.
>
> I think the original motivation that line was added was to help
> people who see "git log" (without any frills) output for the first
> time not to be alarmed when they see newer things first: "In
> general, the "time" flows from bottom to top", but the "time" in
> that sentence is not necessarily the timestamp of either committer
> nor author field.

Just to clarify, I am not defending the current wording that I did
not touch in my patch with the above.  I am just giving historical
background to help _other_ people (including you) to come up with a
better wording, as I do not think of a better replacement myself.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] rev-list docs: clarify --topo-order description
  2012-08-15 20:02       ` [PATCH v2] " Junio C Hamano
  2012-08-16  6:06         ` Martin von Zweigbergk
@ 2012-08-16  8:42         ` Thomas Rast
  1 sibling, 0 replies; 26+ messages in thread
From: Thomas Rast @ 2012-08-16  8:42 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Thomas Rast, git, Martin von Zweigbergk

Junio C Hamano <gitster@pobox.com> writes:

>   Thomas Rast <trast@student.ethz.ch> writes:
>
>   > But then the new description is wrong.  It claims that children are
>   > always before parents, which is not true in the face of clock skew.
>
>   Thanks for sanity-check.  Here is an updated one.
[...]
> ---topo-order::
> +--date-order::
> +	Show no parents before all of its children are shown, but
> +	otherwise show commits in the commit timestamp order.
>  
> -	This option makes them appear in topological order (i.e.
> -	descendant commits are shown before their parents).
> +--topo-order::
> +	Show no parents before all of its children are shown, and
> +	avoid showing commits on multiple lines of history
> +	intermixed.

Ack, thanks.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] rev-list docs: clarify --topo-order description
  2012-08-16  6:26             ` Junio C Hamano
@ 2012-08-16  8:51               ` Thomas Rast
  2012-08-16 10:01                 ` Michael Haggerty
  0 siblings, 1 reply; 26+ messages in thread
From: Thomas Rast @ 2012-08-16  8:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Martin von Zweigbergk, git

Junio C Hamano <gitster@pobox.com> writes:

> Junio C Hamano <gitster@pobox.com> writes:
>
>> We could remove it if you find it confusing.
>>
>> I think the original motivation that line was added was to help
>> people who see "git log" (without any frills) output for the first
>> time not to be alarmed when they see newer things first: "In
>> general, the "time" flows from bottom to top", but the "time" in
>> that sentence is not necessarily the timestamp of either committer
>> nor author field.
>
> Just to clarify, I am not defending the current wording that I did
> not touch in my patch with the above.  I am just giving historical
> background to help _other_ people (including you) to come up with a
> better wording, as I do not think of a better replacement myself.

I tend to agree with Martin, the existing header for the list

>>>  By default, the commits are shown in reverse chronological order.

is misleading.  I suppose the real problem is that the "true" ordering
is completely obvious as the one ordering that does not require
preprocessing, but ugly to specify in words.  Perhaps we can bikeshed a
little?  How about

  By default, commits are shown in an order that coincides with
  `--date-order` on well-behaved history, but is faster to compute.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] rev-list docs: clarify --topo-order description
  2012-08-16  8:51               ` Thomas Rast
@ 2012-08-16 10:01                 ` Michael Haggerty
  2012-08-16 12:00                   ` Thomas Rast
  0 siblings, 1 reply; 26+ messages in thread
From: Michael Haggerty @ 2012-08-16 10:01 UTC (permalink / raw)
  To: Thomas Rast; +Cc: Junio C Hamano, Martin von Zweigbergk, git

On 08/16/2012 10:51 AM, Thomas Rast wrote:
> [...]
> is misleading.  I suppose the real problem is that the "true" ordering
> is completely obvious as the one ordering that does not require
> preprocessing, but ugly to specify in words.  Perhaps we can bikeshed a
> little?  How about
>
>    By default, commits are shown in an order that coincides with
>    `--date-order` on well-behaved history, but is faster to compute.

Maybe the problem is not the description of the options, but the options 
themselves.  Why does the behavior default to some mysterious order that 
we don't even want to document?  Only for the sake of computational 
efficiency.  This is the tail wagging the dog.

Why not turn the behavior on its head:

* Change the default behavior to be something well-defined, easy to 
document, and convenient for humans, such as "topological order with 
ties broken by timestamp" or "approximate timestamp order, but 
respecting dependencies".

* Add a new option, --arbitrary-order, that explicitly chooses 
efficiency instead of a defined order.

That way the easiest thing to type is also the most convenient, whereas 
when you care about efficiency and *don't* care about order (i.e., 
mainly in scripts) you can explicitly request the high-performance option.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] rev-list docs: clarify --topo-order description
  2012-08-16 10:01                 ` Michael Haggerty
@ 2012-08-16 12:00                   ` Thomas Rast
  2012-08-16 16:10                     ` Junio C Hamano
  2012-08-16 16:35                     ` Michael Haggerty
  0 siblings, 2 replies; 26+ messages in thread
From: Thomas Rast @ 2012-08-16 12:00 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Thomas Rast, Junio C Hamano, Martin von Zweigbergk, git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> On 08/16/2012 10:51 AM, Thomas Rast wrote:
>> I suppose the real problem is that the "true" ordering
>> is completely obvious as the one ordering that does not require
>> preprocessing, but ugly to specify in words.  Perhaps we can bikeshed a
>> little?  How about
>>
>>    By default, commits are shown in an order that coincides with
>>    `--date-order` on well-behaved history, but is faster to compute.
>
> Maybe the problem is not the description of the options, but the
> options themselves.  Why does the behavior default to some mysterious
> order that we don't even want to document?  Only for the sake of
> computational efficiency.  This is the tail wagging the dog.
>
> Why not turn the behavior on its head:
>
> * Change the default behavior to be something well-defined, easy to
> document, and convenient for humans, such as "topological order with
> ties broken by timestamp" or "approximate timestamp order, but
> respecting dependencies".
>
> * Add a new option, --arbitrary-order, that explicitly chooses
> efficiency instead of a defined order.

I think that would be a rather bad decision, largely because (taking my
git.git as an example):

  $ time git log | head -1
  commit e5e6172f9060c958e3f0d679cd7049d4007eed2c

  real    0m0.033s
  user    0m0.026s
  sys     0m0.007s

  $ time git log --date-order | head -1
  commit e5e6172f9060c958e3f0d679cd7049d4007eed2c

  real    0m0.429s
  user    0m0.359s
  sys     0m0.031s

That is, even in medium-sized projects like git.git, any -order option
incurs a significant preprocessing time until git-log can show the first
commit.  It scales linearly with the number of commits in the range, and
in a linux.git lying around here is already around 3.9s for the same
command.

So if --date-order or --topo-order were the default, you'd have a
significant delay even if you just use 'git log' to quickly see the last
few commits.  At least to me, the optimization makes perfect sense.

The right fix would be to dig up Peff's work on generation number
caching, and modify the algorithm to take generation numbers into
account.  Then the default order would magically become the same as
--date-order at no speed loss, *and* we'd get a huge speed boost to 'git
log --graph' startup times.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] rev-list docs: clarify --topo-order description
  2012-08-16 12:00                   ` Thomas Rast
@ 2012-08-16 16:10                     ` Junio C Hamano
  2012-08-17  9:34                       ` Thomas Rast
  2012-08-16 16:35                     ` Michael Haggerty
  1 sibling, 1 reply; 26+ messages in thread
From: Junio C Hamano @ 2012-08-16 16:10 UTC (permalink / raw)
  To: Thomas Rast; +Cc: Michael Haggerty, Martin von Zweigbergk, git

Thomas Rast <trast@student.ethz.ch> writes:

>> Why not turn the behavior on its head:
>>
>> * Change the default behavior to be something well-defined, easy to
>> document, and convenient for humans, such as "topological order with
>> ties broken by timestamp" or "approximate timestamp order, but
>> respecting dependencies".
>>
>> * Add a new option, --arbitrary-order, that explicitly chooses
>> efficiency instead of a defined order.
>
> I think that would be a rather bad decision, largely because (taking my
> git.git as an example):
>
>   $ time git log | head -1
>   $ time git log --date-order | head -1

You are correct to point out that introducing "--arbitrary-order"
and force sorting by default is stupid for one reason, but forgot to
stress the other equally important reason, I think.  Even though you
came close to it here:

> ... if you just use 'git log' to quickly see the last
> few commits.  At least to me, the optimization makes perfect sense.

you may not have fully internalized that other reason yourself, I
suspect, for the reason I mention in my last two paragraphs below.

When you run "git log", you are asking only to see "the last few"
commits.  The size of "few" actually depends on the occasion and the
user, but the important thing to notice is that the definition of
"the last" is fuzzily defined.  In such a request over a history
like this:

      ---A---B---C---D
                      \
    ---1---2---3---4---* = HEAD

the user does not care the exact order, as long as the ones "closer"
(again, a fuzzy definition) to HEAD come out earlier than the ones
"farther" (and all of them have to come out eventually but that goes
without saying).

In the case of the above sample history, even with clock skews, the
ones labeled with alphabets appear in the expected order among
themselves, and the ones labeled with numbers also do, with or
without sorting.  And all three orders match the use case of "git
log" to view "the last few" commits just fine.  When we have the
default, topo and date orders, all of which would give us output
suitable for the purpose of showing "the last few", picking the one
with the least latency is the right thing to do.

In other words, latency is important, but a short-latency solution
is acceptable and preferred only if it gives a reasonable result.
Giving useless output with small latency is not what we are aiming
to do.

> The right fix would be to dig up Peff's work on generation number
> caching, and modify the algorithm to take generation numbers into
> account.

I think you are totally wrong here, unless you are talking about a
generation number that is different from what I recall from the
older discussion.  Think of the sample history above, and imagine
that the numbered ones are based on the current 'master', but that
the alphabet ones are based on an ancient maintenance release that
is 1000 generations behind (think of me running the command after
finishing the day's integration cycle, sitting at the tip of 'pu',
where the last topic merged is meant to be eventually merged to
maint-1.7.9).  All of the commits depicted in the picture will have
the commit timestamps in the past few hours.  Ancestors of A and 1,
not drawn in the picture, were made yesterday or before.

The current "default" output will give all of the depicted commits
before showing the older commits that are not in the picture, and
that is absolutely the right thing to do when you ask "git log" to
view "the last few" commits.  Imagine what you will see if you used
generation numbers instead of the commit timestamp.  You will see
commits on the maintenance topic that can later be merged to an
older codebase, only after you saw all the development history on
the master branch since 1.7.9 release.  I do not think we want to
call such an output "the right fix".

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] rev-list docs: clarify --topo-order description
  2012-08-16 12:00                   ` Thomas Rast
  2012-08-16 16:10                     ` Junio C Hamano
@ 2012-08-16 16:35                     ` Michael Haggerty
  1 sibling, 0 replies; 26+ messages in thread
From: Michael Haggerty @ 2012-08-16 16:35 UTC (permalink / raw)
  To: Thomas Rast; +Cc: Junio C Hamano, Martin von Zweigbergk, git

On 08/16/2012 02:00 PM, Thomas Rast wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
>
>> On 08/16/2012 10:51 AM, Thomas Rast wrote:
>>> I suppose the real problem is that the "true" ordering
>>> is completely obvious as the one ordering that does not require
>>> preprocessing, but ugly to specify in words.  Perhaps we can bikeshed a
>>> little?  How about
>>>
>>>     By default, commits are shown in an order that coincides with
>>>     `--date-order` on well-behaved history, but is faster to compute.
>>
>> Maybe the problem is not the description of the options, but the
>> options themselves.  Why does the behavior default to some mysterious
>> order that we don't even want to document?  Only for the sake of
>> computational efficiency.  This is the tail wagging the dog.
>>
>> Why not turn the behavior on its head:
>>
>> * Change the default behavior to be something well-defined, easy to
>> document, and convenient for humans, such as "topological order with
>> ties broken by timestamp" or "approximate timestamp order, but
>> respecting dependencies".
>>
>> * Add a new option, --arbitrary-order, that explicitly chooses
>> efficiency instead of a defined order.
>
> I think that would be a rather bad decision, largely because (taking my
> git.git as an example):
>
>    $ time git log | head -1
>    commit e5e6172f9060c958e3f0d679cd7049d4007eed2c
>
>    real    0m0.033s
>    user    0m0.026s
>    sys     0m0.007s
>
>    $ time git log --date-order | head -1
>    commit e5e6172f9060c958e3f0d679cd7049d4007eed2c
>
>    real    0m0.429s
>    user    0m0.359s
>    sys     0m0.031s
>
> That is, even in medium-sized projects like git.git, any -order option
> incurs a significant preprocessing time until git-log can show the first
> commit.  It scales linearly with the number of commits in the range, and
> in a linux.git lying around here is already around 3.9s for the same
> command.

Thanks for timing this; I didn't realize how costly this would be.  Just 
to make it even more obvious that this performance regression would bite 
in daily life, consider

     $ time git log -1

     real    0m0.013s
     user    0m0.000s
     sys     0m0.004s

     $ time git log -1 --topo-order

     real    0m0.334s
     user    0m0.316s
     sys     0m0.012s

Ouch.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] rev-list docs: clarify --topo-order description
  2012-08-16 16:10                     ` Junio C Hamano
@ 2012-08-17  9:34                       ` Thomas Rast
  2012-08-17  9:50                         ` Thomas Rast
                                           ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Thomas Rast @ 2012-08-17  9:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael Haggerty, Martin von Zweigbergk, git

Junio C Hamano <gitster@pobox.com> writes:

> Thomas Rast <trast@student.ethz.ch> writes:
>
>> The right fix would be to dig up Peff's work on generation number
>> caching, and modify the algorithm to take generation numbers into
>> account.
>
> I think you are totally wrong here, unless you are talking about a
> generation number that is different from what I recall from the
> older discussion.

Hrm, now you're making me feel like I missed something.  But my
reasoning goes roughly like this.

Consider the commit graph as a DAG, where the parent pointers are
directed edges from child to parent.  Thus, tip commits are sources and
root commits are sinks.  One way to get a valid topo ordering, knowing
the complete DAG, is:

  while there are vertices left:
    remove any source and emit it

(You can also generate it from the sinks instead; I haven't actually
checked what the code currently does.  But for the following, we need to
start from the sources.)

To implement this efficiently, one usually keeps track of the set S of
sources, and inserts vertices that (by removal of their last
in-neighbour) have become a source.

The distinction between topo-order and date-order is the preference
given in the choice of source.  IIUC, date-order prefers the newest
source by committer date (using a priority queue or such for S), and
topo-order prefers the source(s) which were parents of the vertex
removed in the previous iteration (using a stack for S).

The problem with this algorithm is that it requires knowledge of the
DAG[1].  So with --topo-order, git dutifully loads the whole commit
graph, and then runs the algorithm, leading to the discussed startup
delay.

However, suppose we knew generation numbers.  I haven't actually looked
into the old threads again, but my understanding was that they are
numbers g(C) attached to each commit C such that

  g(C) = 1 + max(g(P) for P a parent of C)   for non-root commits

  g(C) = 0                                   for root commits

They are invariant given the commit, so they can be cached.  And they
allow us to infer something about ancestry: if g(C)<=g(D), then C cannot
possibly be a descendant of D.

The topo order algorithm can be modified to take advantage of them, in
order to provide incremental processing:

  Let S be the set of tentative sources

  Let U be the set of vertices whose out-edges are no known yet
    (i.e., the set of commits which haven't been loaded yet)

  Let max g(U) denote "max g(C) taken over all C in U".

  Fill S and U from the revisions given:
    load all positive revisions, let's call this set P
    put all zero-indegree members of P in S
    put all parents of members of P in U (as far as they are not also
    members of P, and therefore loaded already)

  while there are any vertices left:

    pick any tentative source C from S that we "want to emit"

    # Ascertain that no unknown commit (from U or further beyond) can be
    # a descendant of C
    while there is a D in U such that g(D) > g(C):
      load D
      remove D from U
      add the parents of D to U if they were not already loaded
      possibly remove some elements of S if their indegree became nonzero

    if C was removed from S:
      continue

    remove C from the graph and emit it

I hope I got that right.  The order of commits is still entirely
determined by the choice of "any tentative source", but the algorithm
should now stream nicely once the generation numbers are known.


Footnotes: 

[1] If there are any parent pointers not yet followed, they may in fact
wind up (over at least one more commit) pointing to the commit you
wanted to emit next, making that an invalid choice.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] rev-list docs: clarify --topo-order description
  2012-08-17  9:34                       ` Thomas Rast
@ 2012-08-17  9:50                         ` Thomas Rast
  2012-08-17 17:18                         ` Junio C Hamano
  2012-08-17 17:40                         ` Junio C Hamano
  2 siblings, 0 replies; 26+ messages in thread
From: Thomas Rast @ 2012-08-17  9:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael Haggerty, Martin von Zweigbergk, git

Thomas Rast <trast@inf.ethz.ch> writes:

> Junio C Hamano <gitster@pobox.com> writes:
>
> The topo order algorithm can be modified to take advantage of
> [generation numbers], in order to provide incremental processing:
>
>   Let S be the set of tentative sources
>
>   Let U be the set of vertices whose out-edges are no known yet
>     (i.e., the set of commits which haven't been loaded yet)
[...]
>   while there are any vertices left:
>
>     pick any tentative source C from S that we "want to emit"
>
>     # Ascertain that no unknown commit (from U or further beyond) can be
>     # a descendant of C
>     while there is a D in U such that g(D) > g(C):
>       load D
>       remove D from U
>       add the parents of D to U if they were not already loaded
>       possibly remove some elements of S if their indegree became nonzero
>
>     if C was removed from S:
>       continue
>
>     remove C from the graph and emit it

By the way, this does bump the runtime of the algorithm a bit, depending
on the data structure used for U.  Recall that ordinary topo-sort with a
stack for S (i.e., --topo-order) runs linearly with the number of
vertices.

If we use a priority queue for U, which lets us get at the
highest-generation unknown commits easily, it potentially goes to n log n 
if U reaches linear size at some point.

That shouldn't hurt too much of course, since on the one hand it should
rarely actually get that big, and OTOH --date-order has n log n runtime
anyway (using a priority queue for S).

Thanks for challenging me on my "it should work" feeling.  It was quite
interesting to actually think it through and write down a workable
algorithm.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] rev-list docs: clarify --topo-order description
  2012-08-17  9:34                       ` Thomas Rast
  2012-08-17  9:50                         ` Thomas Rast
@ 2012-08-17 17:18                         ` Junio C Hamano
  2012-08-17 17:37                           ` Thomas Rast
  2012-08-17 17:40                         ` Junio C Hamano
  2 siblings, 1 reply; 26+ messages in thread
From: Junio C Hamano @ 2012-08-17 17:18 UTC (permalink / raw)
  To: Thomas Rast; +Cc: Michael Haggerty, Martin von Zweigbergk, git

Thomas Rast <trast@inf.ethz.ch> writes:

> However, suppose we knew generation numbers.  I haven't actually looked
> into the old threads again, but my understanding was that they are
> numbers g(C) attached to each commit C such that
>
>   g(C) = 1 + max(g(P) for P a parent of C)   for non-root commits
>
>   g(C) = 0                                   for root commits
>
> They are invariant given the commit, so they can be cached.
> ...
> I hope I got that right.  The order of commits is still entirely
> determined by the choice of "any tentative source", but the algorithm
> should now stream nicely once the generation numbers are known.

That matches the definition of generation number I remember from the
old discussion.  Now look at the illustration in this discussion
again:

      ---A---B---C---D
                      \
    ---1---2---3---4---* = HEAD

and an excerpt from the message you are respoding to:

  Think of the sample history above, and imagine
  that the numbered ones are based on the current 'master', but that
  the alphabet ones are based on an ancient maintenance release that
  is 1000 generations behind (think of me running the command after
  finishing the day's integration cycle, sitting at the tip of 'pu',
  where the last topic merged is meant to be eventually merged to
  maint-1.7.9).  All of the commits depicted in the picture will have
  the commit timestamps in the past few hours.  Ancestors of A and 1,
  not drawn in the picture, were made yesterday or before.

The numbered commits 1 2 3 4 are building on top of recent "master",
while alphabetized A B C D are building on aged maintenance track.
The difference in generation numbers between 1 and 2, 2 and 3,... A
and B, B and C, C and D are all one, and HEAD (the tip of 'pu') would
have generation number of commit 4 plus 1, as commit 4's generation
number would be a thousand or more ahead of that of commit D.  And
there are a thousand ancestors of '1' with larger generation numbers
than 'D'.

When the user runs "git log" (i.e. the casual "the last few commit"
macthes), the expectation of the user is "I want to see what I did
recently".  If you substituted the commit timestamp with such a
generation number, how would that expectation satisified?

Generation numbers is a clean solution to avoid giving an incorrect
range result when there are clock skews (cf. the use of SLOP in
revision.c::still_interesting()), but it is not a good substitute
when the user is more interested in absolutely correct topology (in
other words, the user is more interested in both rougly wallclock
time order and seeing the result soon --- which brings us back to
the original "short latency is vastly preferred _as long as_ it
produces the result the user wants to see").

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] rev-list docs: clarify --topo-order description
  2012-08-17 17:18                         ` Junio C Hamano
@ 2012-08-17 17:37                           ` Thomas Rast
  2012-08-17 18:11                             ` Junio C Hamano
  0 siblings, 1 reply; 26+ messages in thread
From: Thomas Rast @ 2012-08-17 17:37 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael Haggerty, Martin von Zweigbergk, git

Junio C Hamano <gitster@pobox.com> writes:

> Thomas Rast <trast@inf.ethz.ch> writes:
>
>> However, suppose we knew generation numbers.  I haven't actually looked
>> into the old threads again, but my understanding was that they are
>> numbers g(C) attached to each commit C such that
>>
>>   g(C) = 1 + max(g(P) for P a parent of C)   for non-root commits
>>
>>   g(C) = 0                                   for root commits
>>
>> They are invariant given the commit, so they can be cached.
>> ...
>> I hope I got that right.  The order of commits is still entirely
>> determined by the choice of "any tentative source", but the algorithm
>> should now stream nicely once the generation numbers are known.
>
> That matches the definition of generation number I remember from the
> old discussion.  Now look at the illustration in this discussion
> again:
>
>       ---A---B---C---D
>                       \
>     ---1---2---3---4---* = HEAD
[...]
> The numbered commits 1 2 3 4 are building on top of recent "master",
> while alphabetized A B C D are building on aged maintenance track.
> The difference in generation numbers between 1 and 2, 2 and 3,... A
> and B, B and C, C and D are all one, and HEAD (the tip of 'pu') would
> have generation number of commit 4 plus 1, as commit 4's generation
> number would be a thousand or more ahead of that of commit D.  And
> there are a thousand ancestors of '1' with larger generation numbers
> than 'D'.
>
> When the user runs "git log" (i.e. the casual "the last few commit"
> macthes), the expectation of the user is "I want to see what I did
> recently".  If you substituted the commit timestamp with such a
> generation number, how would that expectation satisified?

Umm, have you looked at the algorithm I proposed?

It does not substitute the generation numbers for anything, let alone
the date.  It merely uses them to determine a point where it knows
"enough" of the history to be able to emit the next commit; that is,
where it can use the generation numbers to prove that no unknown commit
can be a descendant of what it wants to emit next.

It does *not* have to use the generation numbers in the final ordering
of the commits.  That final order is determined by how the algorithm
chooses the next candidate commit.  If you use a stack, it winds up
being --topo-order.  If you use a date-ordered priority queue, it
becomes --date-order.

So really, this is only about modifying the algorithm that generates the
existing order to allow for streaming output as it reads through
history.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] rev-list docs: clarify --topo-order description
  2012-08-17  9:34                       ` Thomas Rast
  2012-08-17  9:50                         ` Thomas Rast
  2012-08-17 17:18                         ` Junio C Hamano
@ 2012-08-17 17:40                         ` Junio C Hamano
  2 siblings, 0 replies; 26+ messages in thread
From: Junio C Hamano @ 2012-08-17 17:40 UTC (permalink / raw)
  To: Thomas Rast; +Cc: Michael Haggerty, Martin von Zweigbergk, git

Thomas Rast <trast@inf.ethz.ch> writes:

> I hope I got that right.  The order of commits is still entirely
> determined by the choice of "any tentative source", but the algorithm
> should now stream nicely once the generation numbers are known.

Thanks for an intereseting read.

Even though generation numbers are not always a good subsitute for
timestamps depending on the use (see the other message), the algorithm
to compute "topo-order" can take advantage of the generation numbers
when they are available, as you illustrated in your message.

We may want to think about adding a new field to the commit object,
with a fallback mechanism to an external "cache" that is generated
on-demand.

Once the mechanism is in place so that any caller of parse_commit()
can depend on having a reliable .generation field in the commit
object it receives from the function, we can start updating various
traversal algorithms to take advantage of this new information.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] rev-list docs: clarify --topo-order description
  2012-08-17 17:37                           ` Thomas Rast
@ 2012-08-17 18:11                             ` Junio C Hamano
  0 siblings, 0 replies; 26+ messages in thread
From: Junio C Hamano @ 2012-08-17 18:11 UTC (permalink / raw)
  To: Thomas Rast; +Cc: Michael Haggerty, Martin von Zweigbergk, git

Thomas Rast <trast@student.ethz.ch> writes:

> Umm, have you looked at the algorithm I proposed?
> ...
> So really, this is only about modifying the algorithm that generates the
> existing order to allow for streaming output as it reads through
> history.

Sorry, I thought you were optimizing sort_in_topological_order()
found in commit.c, not the callchain that forms get_revision()
in revision.c.

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2012-08-17 18:12 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-08-13 22:21 [PATCH] rev-list docs: clarify --topo-order description Junio C Hamano
2012-08-13 22:46 ` Martin von Zweigbergk
2012-08-13 23:05   ` Junio C Hamano
2012-08-14  5:33     ` Martin von Zweigbergk
2012-08-14 14:54       ` Junio C Hamano
2012-08-14  8:22 ` Michael Haggerty
2012-08-14  8:45 ` Thomas Rast
2012-08-14 14:30   ` Junio C Hamano
2012-08-14 14:51     ` Thomas Rast
2012-08-14 15:47       ` Junio C Hamano
2012-08-15 20:02       ` [PATCH v2] " Junio C Hamano
2012-08-16  6:06         ` Martin von Zweigbergk
2012-08-16  6:20           ` Junio C Hamano
2012-08-16  6:26             ` Junio C Hamano
2012-08-16  8:51               ` Thomas Rast
2012-08-16 10:01                 ` Michael Haggerty
2012-08-16 12:00                   ` Thomas Rast
2012-08-16 16:10                     ` Junio C Hamano
2012-08-17  9:34                       ` Thomas Rast
2012-08-17  9:50                         ` Thomas Rast
2012-08-17 17:18                         ` Junio C Hamano
2012-08-17 17:37                           ` Thomas Rast
2012-08-17 18:11                             ` Junio C Hamano
2012-08-17 17:40                         ` Junio C Hamano
2012-08-16 16:35                     ` Michael Haggerty
2012-08-16  8:42         ` Thomas Rast

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).