* Re: [PATCH] Documentation: 'cherry' does not cope well with merges from upstream
2010-07-01 21:09 ` [PATCH] Documentation: 'cherry' does not cope well with merges from upstream Jonathan Nieder
@ 2010-07-01 21:33 ` Andrew Pimlott
2010-07-01 21:35 ` Jonathan Nieder
` (3 subsequent siblings)
4 siblings, 0 replies; 13+ messages in thread
From: Andrew Pimlott @ 2010-07-01 21:33 UTC (permalink / raw)
To: Jonathan Nieder
Cc: git, Frédéric Brière, Michael J Gruber, Jeff King,
Björn Steinbrink
Excerpts from Jonathan Nieder's message of Thu Jul 01 14:09:19 -0700 2010:
> Example:
>
> o---o---F---X'---G---U [upstream]
> \ \
> X----Y---M---T [topic]
>
> Suppose the author of the âtopicâ branch starts from upstream
> commit F and makes a few changes. One is applied upstream, and
> additionally there is some other useful upstream change, so he
> performs a merge to include the upstream updates into topic.
> The expected output from âcherryâ is:
>
> + T
> + Y
> - X
>
> Consider the author of a different branch, also called âtopicâ, but
> this one starts from commit G. Some infrastructure from an existing
> branch is needed, so first she merges that. Then she adds her own
> commit. The expected output from âcherryâ is:
>
> + T
> + Y
> + X
>
> since none of the new commits have been applied upstream since
> the fork point.
>
> âcherryâ cannot distinguish between these two cases
Thanks for the awesome explanation! (I looked at the code but would not
have pulled this understanding.) I would still say the first output is
the more reasonable: it's more likely (in my estimate) the wanted
result, and in the case where it's not it's at least easily
comprehended.
Anyway, the doc patch helps, and I would love git cherry --full.
Andrew
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] Documentation: 'cherry' does not cope well with merges from upstream
2010-07-01 21:09 ` [PATCH] Documentation: 'cherry' does not cope well with merges from upstream Jonathan Nieder
2010-07-01 21:33 ` Andrew Pimlott
@ 2010-07-01 21:35 ` Jonathan Nieder
2010-07-01 23:52 ` Junio C Hamano
` (2 subsequent siblings)
4 siblings, 0 replies; 13+ messages in thread
From: Jonathan Nieder @ 2010-07-01 21:35 UTC (permalink / raw)
To: Andrew Pimlott
Cc: git, Frédéric Brière, Michael J Gruber, Jeff King,
Björn Steinbrink
Jonathan Nieder wrote:
> it makes sense semantically
> since even if a new patch patches an old patch from <upstream>,
err, for 'patches' read 'matches'. I was not trying to say
something that complicated. :)
Sorry for the noise.
Jonathan
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] Documentation: 'cherry' does not cope well with merges from upstream
2010-07-01 21:09 ` [PATCH] Documentation: 'cherry' does not cope well with merges from upstream Jonathan Nieder
2010-07-01 21:33 ` Andrew Pimlott
2010-07-01 21:35 ` Jonathan Nieder
@ 2010-07-01 23:52 ` Junio C Hamano
2010-07-02 0:51 ` Jonathan Nieder
2010-07-02 1:04 ` Jonathan Nieder
2010-07-02 7:46 ` Michael J Gruber
4 siblings, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2010-07-01 23:52 UTC (permalink / raw)
To: Jonathan Nieder
Cc: Andrew Pimlott, git, Frédéric Brière,
Michael J Gruber, Jeff King, Björn Steinbrink
Jonathan Nieder <jrnieder@gmail.com> writes:
> Example:
>
> o---o---F---X'---G---U [upstream]
> \ \
> X----Y---M---T [topic]
>
> Suppose the author of the ‘topic’ branch starts from upstream
> commit F and makes a few changes. One is applied upstream, and
> additionally there is some other useful upstream change, so he
> performs a merge to include the upstream updates into topic.
> The expected output from ‘cherry’ is:
>
> + T
> + Y
> - X
>
> Consider the author of a different branch, also called ‘topic’, but
> this one starts from commit G. Some infrastructure from an existing
> branch is needed, so first she merges that. Then she adds her own
> commit.
Sorry, but it is unclear to me what kind of history you have in mind at
this point. What "existing branch" are you talking about? Presumably it
is not the [topic] in an earlier example, nor it is [upstream] right?
o---o---o---o----G-------.---U [upstream]
\ \
X---Y---M---T
Something like this?
> The expected output from ‘cherry’ is:
>
> + T
> + Y
> + X
>
> since none of the new commits have been applied upstream since
> the fork point.
>
> ‘cherry’ cannot distinguish between these two cases, in part because
> it does not distinguish between parents in a merge commit.
Now you completely lost me. I guess the biggest reason is you only talk
about "the expected output" without talking about "what it actually
gives". Hence it is unclear what the significant difference "between
these two cases" you are trying to stress here.
> Thoughts? Improvements?
I think the actual patch text has the same problem. You say "these
commits" without saying which ones they are; perhaps saing "the commits
represented by asterisks in the picture" or something may help, but I
dunno.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] Documentation: 'cherry' does not cope well with merges from upstream
2010-07-01 23:52 ` Junio C Hamano
@ 2010-07-02 0:51 ` Jonathan Nieder
0 siblings, 0 replies; 13+ messages in thread
From: Jonathan Nieder @ 2010-07-02 0:51 UTC (permalink / raw)
To: Junio C Hamano
Cc: Andrew Pimlott, git, Frédéric Brière,
Michael J Gruber, Jeff King, Björn Steinbrink
Junio C Hamano wrote:
> Jonathan Nieder <jrnieder@gmail.com> writes:
>> Example:
>>
>> o---o---F---X'---G---U [upstream]
>> \ \
>> X----Y---M---T [topic]
[...]
>> Consider the author of a different branch, also called ‘topic’, but
>> this one starts from commit G. Some infrastructure from an existing
>> branch is needed, so first she merges that. Then she adds her own
>> commit.
>
> Sorry, but it is unclear to me what kind of history you have in mind at
> this point. What "existing branch" are you talking about? Presumably it
> is not the [topic] in an earlier example, nor it is [upstream] right?
>
> o---o---o---o----G-------.---U [upstream]
> \ \
> X---Y---M---T
>
> Something like this?
Sorry for the lack of clarity. The "existing branch" was the history
ending at commit Y in the original picture. The resulting topic would
have the shape of the branch labelled [topic] in my diagram.
And except for the shapes being the same, this has nothing to do with
the earlier example.
What I was trying to get at with the two examples is that in histories
like the above, the concept of "fork point" is not well defined.
Where did topic fork from upstream? It could have been at G, or F, or
any other merge-base of upstream and topic for that matter; the
recorded history does not give enough information to say.
The choice of fork point might look like it is only for optimization,
but it affects the semantics, too.
Example: reviving a reverted patch
... o---F---P---R---G---o [upstream]
\
P' [alice]
Upstream, a certain patch (P) was accepted, found to introduce horrible
problems, and then reverted (R). Patch submitter Alice still believes
that is a good patch, though, so she creates a branch to start work on
it, cherry-picking the change (P'). ‘git cherry’ correctly reports
P' as not merged upstream; that it has the same patch-id as commit P
is simply irrelevant.
$ git cherry
+ P'
Alice might merge a branch with a fork point that does not have P as
an ancestor:
... o---o---P---R---G---o [upstream]
\ /
x
/ \
o P'
/ \
... o---o---o---F---o---M [alice]
How can ‘git cherry’ tell that the fork point was G? That
knowledge determines whether P' should be considered to be merged
upstream or not:
* If the fork point was F, then the patch for P' has been applied
upstream since then (indirectly through the merge of G).
* If the fork point is G, then the patch for P' was upstream all
along, and P' has no upstream analog since then.
In reality, ‘git cherry’ does not choose; instead of doing arithmetic
based on fork points, it just says _no_ commit reachable from the tip
of alice can be used as evidence that a patch from alice has been
merged.
Plenty of other heuristics are possible, but it is hard to find a
more intuitive efficient one. I suspect I would find it useful to be
able to explicitly set a commit ‘prefork’ and examine all commits
prefork..upstream in the search for evidence that a patch has been
merged.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] Documentation: 'cherry' does not cope well with merges from upstream
2010-07-01 21:09 ` [PATCH] Documentation: 'cherry' does not cope well with merges from upstream Jonathan Nieder
` (2 preceding siblings ...)
2010-07-01 23:52 ` Junio C Hamano
@ 2010-07-02 1:04 ` Jonathan Nieder
2010-07-02 7:46 ` Michael J Gruber
4 siblings, 0 replies; 13+ messages in thread
From: Jonathan Nieder @ 2010-07-02 1:04 UTC (permalink / raw)
To: Andrew Pimlott
Cc: git, Frédéric Brière, Michael J Gruber, Jeff King,
Björn Steinbrink
Jonathan Nieder wrote:
> +'git cherry' does not cope well with merges from upstream on the
> +working branch. Any commits after the original fork point and
> +before the latest merge from upstream will be reported as not found
> +in <upstream>.
> +
> + ____*____*____*_____*__> <upstream>
> + / \
> + original fork point \
> + \__+__+__+__+__+__+__+__*_> <head>
> +
> +While these commits are part of upstream..head, their upstream
> +counterparts are not in head..upstream.
As Junio mentioned, the text and diagram are hard to reconcile since
“these commits” are not clearly explained to be “the commits after the
original fork point and before the latest merge, marked with a +”.
Actually, the whole text is kind of awkward, which is why I left this
hanging for so long[1]. Sorry.
Adding “(marked +)” after “these commits” sounds reasonable to me as
a quick fix.
[1] http://bugs.debian.org/575577
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] Documentation: 'cherry' does not cope well with merges from upstream
2010-07-01 21:09 ` [PATCH] Documentation: 'cherry' does not cope well with merges from upstream Jonathan Nieder
` (3 preceding siblings ...)
2010-07-02 1:04 ` Jonathan Nieder
@ 2010-07-02 7:46 ` Michael J Gruber
2010-07-02 8:18 ` Jonathan Nieder
4 siblings, 1 reply; 13+ messages in thread
From: Michael J Gruber @ 2010-07-02 7:46 UTC (permalink / raw)
To: Jonathan Nieder
Cc: Andrew Pimlott, git, Frédéric Brière, Jeff King,
Björn Steinbrink
Jonathan Nieder venit, vidit, dixit 01.07.2010 23:09:
> Although ‘git cherry’ is advertised simply to list commits from
> <topic> that have not been merged into <upstream>, internally it
> works by checking for patches in <upstream>..<topic> that do not
> match a patch in <topic>..<upstream>. This is fast because there
...and it says so in the very first line of the manpage.
> are not many patches to check, and it makes sense semantically
> since even if a new patch patches an old patch from <upstream>,
> it cannot be said to have been merged upstream unless it was
> applied there at some point after the <topic> and <upstream>
> branches diverged.
>
> But. If the <topic> branch later merges from <upstream>, it
> throws a spanner in the works and for such a history ‘git cherry’
> is not very useful at all.
I actually think I've reported this before, but I don't mind :)
I keep (topic) branches for my git.git patches. git cherry is helpful
here. When a patch is applied, I merge the corresponding commit (i.e.
Junio's version) to my topic branch to mark it as applied and to have a
nice way of comparing the applied version to the submitted one. At that
point git cherry stops being useful.
>
> Example:
>
> o---o---F---X'---G---U [upstream]
> \ \
> X----Y---M---T [topic]
>
> Suppose the author of the ‘topic’ branch starts from upstream
> commit F and makes a few changes. One is applied upstream, and
> additionally there is some other useful upstream change, so he
> performs a merge to include the upstream updates into topic.
> The expected output from ‘cherry’ is:
>
> + T
> + Y
> - X
>
> Consider the author of a different branch, also called ‘topic’, but
> this one starts from commit G. Some infrastructure from an existing
> branch is needed, so first she merges that. Then she adds her own
> commit. The expected output from ‘cherry’ is:
>
> + T
> + Y
> + X
>
> since none of the new commits have been applied upstream since
> the fork point.
>
> ‘cherry’ cannot distinguish between these two cases, in part because
> it does not distinguish between parents in a merge commit.
>
> Add a BUGS section to explain the problem.
This is not a bug. git cherry works exactly as described.
At worst, it is a misfeature.
git cherry would be more useful if you could specify a "limit" which is
an ancestor of "fork-point", not only descendants.
>
> Thoughts? Improvements?
Allow general "limit" :)
>
> Documentation/git-cherry.txt | 15 +++++++++++++++
> 1 files changed, 15 insertions(+), 0 deletions(-)
>
> diff --git a/Documentation/git-cherry.txt b/Documentation/git-cherry.txt
> index fed115a..83e3bdc 100644
> --- a/Documentation/git-cherry.txt
> +++ b/Documentation/git-cherry.txt
> @@ -59,6 +59,21 @@ OPTIONS
> <limit>::
> Do not report commits up to (and including) limit.
>
> +BUGS
+DISCUSSION
> +----
> +'git cherry' does not cope well with merges from upstream on the
> +working branch. Any commits after the original fork point and
...which is the wrong way round anyways ;)
> +before the latest merge from upstream will be reported as not found
> +in <upstream>.
> +
> + ____*____*____*_____*__> <upstream>
> + / \
> + original fork point \
> + \__+__+__+__+__+__+__+__*_> <head>
> +
> +While these commits are part of upstream..head, their upstream
> +counterparts are not in head..upstream.
git-cherry(1) never speaks about upstream..head nor head..upstream. It
uses "fork-point", and a merge creates a new "fork-point", i.e.
merge-base. I think it would be good to keep like that in order to avoid
that very misunderstanding. In fact, git cherry is about left and right
commits in upstream...head.
The second paragraph of "DESCRIPTION" may cause confusion when read
without the first one, though.
Michael
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] Documentation: 'cherry' does not cope well with merges from upstream
2010-07-02 7:46 ` Michael J Gruber
@ 2010-07-02 8:18 ` Jonathan Nieder
2010-07-02 9:23 ` Michael J Gruber
0 siblings, 1 reply; 13+ messages in thread
From: Jonathan Nieder @ 2010-07-02 8:18 UTC (permalink / raw)
To: Michael J Gruber
Cc: Andrew Pimlott, git, Frédéric Brière, Jeff King,
Björn Steinbrink
Michael J Gruber wrote:
> Jonathan Nieder venit, vidit, dixit 01.07.2010 23:09:
>> Add a BUGS section to explain the problem.
>
> This is not a bug. git cherry works exactly as described.
>
> At worst, it is a misfeature.
Unix man pages have a history of using BUGS sections to describe
misfeatures and even unavoidable design constraints.
One nice effect is to encourage people to think of programs as
fixable. But maybe it is bad PR. ;-)
> git cherry would be more useful if you could specify a "limit" which is
> an ancestor of "fork-point", not only descendants.
>
>> Thoughts? Improvements?
>
> Allow general "limit" :)
Hmm, I am not totally sure I understand. Conceptually ‘git cherry’
currently does something like the following:
1. List all commits limit..head and find their patch ids
(limit defaults to upstream if not specified)
2. List all commits head..upstream and find their patch ids
3. For each commit listed in step 1, check if it is in the
list from step 2 and print its commit id with a + or -
accordingly.
Are you suggesting that the limit should replace head in
step 2? Or something else?
> git-cherry(1) never speaks about upstream..head nor head..upstream. It
> uses "fork-point", and a merge creates a new "fork-point", i.e.
> merge-base.
This explanation becomes problematic when there is more than one merge-base:
http://thread.gmane.org/gmane.comp.version-control.git/150067/focus=150093
Thank you for the comments. I considered using the <limit> argument
to work around this but didn’t try the modification you suggest above.
I would be happy to find that it works (generalized for repos with
a more shallow history to --full).
Sleepily,
Jonathan
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] Documentation: 'cherry' does not cope well with merges from upstream
2010-07-02 8:18 ` Jonathan Nieder
@ 2010-07-02 9:23 ` Michael J Gruber
0 siblings, 0 replies; 13+ messages in thread
From: Michael J Gruber @ 2010-07-02 9:23 UTC (permalink / raw)
To: Jonathan Nieder
Cc: Andrew Pimlott, git, Frédéric Brière, Jeff King,
Björn Steinbrink
Jonathan Nieder venit, vidit, dixit 02.07.2010 10:18:
> Michael J Gruber wrote:
>> Jonathan Nieder venit, vidit, dixit 01.07.2010 23:09:
>
>>> Add a BUGS section to explain the problem.
>>
>> This is not a bug. git cherry works exactly as described.
>>
>> At worst, it is a misfeature.
>
> Unix man pages have a history of using BUGS sections to describe
> misfeatures and even unavoidable design constraints.
>
> One nice effect is to encourage people to think of programs as
> fixable. But maybe it is bad PR. ;-)
>
>> git cherry would be more useful if you could specify a "limit" which is
>> an ancestor of "fork-point", not only descendants.
>>
>>> Thoughts? Improvements?
>>
>> Allow general "limit" :)
>
> Hmm, I am not totally sure I understand. Conceptually ‘git cherry’
> currently does something like the following:
>
> 1. List all commits limit..head and find their patch ids
> (limit defaults to upstream if not specified)
>
> 2. List all commits head..upstream and find their patch ids
>
> 3. For each commit listed in step 1, check if it is in the
> list from step 2 and print its commit id with a + or -
> accordingly.
>
> Are you suggesting that the limit should replace head in
> step 2? Or something else?
I suggest that I was reading limit on the wrong branch :|
What I meant was specifiying a different lower limit in 2 would help:
then you could force git cherry to compare more commits, if you have a
rough idea about how far to go back. But even being able to say
"v1.6.0..upstream" here instead of head helps and is much more efficient
then going for --full.
>
>> git-cherry(1) never speaks about upstream..head nor head..upstream. It
>> uses "fork-point", and a merge creates a new "fork-point", i.e.
>> merge-base.
>
> This explanation becomes problematic when there is more than one merge-base:
> http://thread.gmane.org/gmane.comp.version-control.git/150067/focus=150093
I guess it always pays to read the full thread before jumping in... your
"prefork" there is what I meant above.
Michael
^ permalink raw reply [flat|nested] 13+ messages in thread