git.vger.kernel.org archive mirror
* my git problem
@ 2008-04-27 18:29 Andrew Morton
  2008-04-27 19:15 ` Linus Torvalds
  0 siblings, 1 reply; 21+ messages in thread
From: Andrew Morton @ 2008-04-27 18:29 UTC (permalink / raw)
  To: git


git is really really bad to me during the merge window.  Let's look at an
example:

y:/usr/src/git26> cat .git/branches/git-ia64 
git+ssh://master.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6.git#test

Now, I want to generate a plain patch against mainline which will add the
patches which are in git-ia64 and which aren't in mainline.  ie: when that
patch is applied to mainline, we get git-ia64.  Sounds simple.

A naive

	git-diff origin git-ia64

generates vast amounts of stuff which is already in mainline.  Things like

 b/drivers/media/video/au0828/au0828-dvb.c       |    2 
 b/drivers/media/video/au0828/au0828-i2c.c       |    6 
 b/drivers/media/video/au0828/au0828.h           |    8 
 b/drivers/media/video/cx23885/cx23885-dvb.c     |    4 
 b/drivers/media/video/cx88/Kconfig              |    1 
 b/drivers/media/video/cx88/cx88-blackbird.c     |    6 
 b/drivers/media/video/cx88/cx88-cards.c         |    1 
 b/drivers/media/video/cx88/cx88-dvb.c           |   32 
 b/drivers/media/video/em28xx/em28xx-core.c      |    2 
 b/drivers/media/video/ir-kbd-i2c.c              |   21 
 b/drivers/media/video/pvrusb2/Kconfig           |    1 


The appended script is what I usually use.  It was worked out by Junio and
me maybe a couple of years ago. It doesn't work very well: it still generates
large numbers of changes which are already in mainline.  Some of them are
ia64 changes, some are not.

When Tony resyncs his tree with mainline this problem will correct itself. 
I drop the tree and repoll it daily until this happens.


I don't really have a bottom line here - but I would like the git
developers to be aware that what is a fairly sensible usage scenario just
doesn't seem to be satisfied at all well...

Thanks.



#!/bin/sh

GIT_TREE=/usr/src/git26
PULL=/usr/src/pull

git_header()
{
	tree="$1"
	echo GIT $(cat .git/refs/heads/$tree) $(cat .git/branches/$tree)
	echo
}

doit()
{
	tree=$1
	upstream=$2

	cd $GIT_TREE
	git reset --hard "$upstream"
	git fetch "$tree" || exit 1
	git merge --no-commit 'test merge' HEAD FETCH_HEAD > /dev/null

	{
		git_header "$tree"
		git log --no-merges ORIG_HEAD..FETCH_HEAD
		git diff --patch-with-stat ORIG_HEAD
	} >$PULL/$tree.patch
	{
		echo DESC
		echo $tree.patch
		echo EDESC
		git_header "$tree"
		git log --no-merges ORIG_HEAD..FETCH_HEAD
	} >$PULL/$tree.txt
	git reset --hard "$upstream"
}

mkdir -p $PULL

if [ $1"x" = "-x" ]
then
	exit
fi

cd /usr/src

if [ $# -eq 0 ]
then
	trees=/usr/src/git-trees
else
	trees="$1"
fi

do_one()
{
	tree=$1
	upstream=$2
	if [ ! -e $PULL/$tree.patch ]
	then
		echo "*** doing $tree, based on $upstream"
		git-branch -D $tree
		doit $tree $upstream
	else
		echo skipping $tree
	fi
}
	
if [ $# -eq 2 ]
then
	do_one $1 $2
else
	while read x
	do
		if echo $x | grep '^#.*' > /dev/null
		then
			true
		else
			do_one $x
		fi
	done < $trees
fi


* Re: my git problem
  2008-04-27 18:29 my git problem Andrew Morton
@ 2008-04-27 19:15 ` Linus Torvalds
  2008-04-27 19:44   ` Andrew Morton
  0 siblings, 1 reply; 21+ messages in thread
From: Linus Torvalds @ 2008-04-27 19:15 UTC (permalink / raw)
  To: Andrew Morton; +Cc: git



On Sun, 27 Apr 2008, Andrew Morton wrote:
> 
> Now, I want to generate a plain patch against mainline which will add the
> patches which are in git-ia64 and which aren't in mainline.  ie: when that
> patch is applied to mainline, we get git-ia64.  Sounds simple.
> 
> A naive
> 
> 	git-diff origin git-ia64

Don't do that.

That will diff between the two branches, and if they both contain stuff 
(which they obviously do), you'll get all the things that are in origin 
(but not git-ia64) as a reversed diff.

What you _want_ is the diff from the last common point, aka the "merge 
base".

With git, you could do that as

	merge_base=$(git merge-base origin git-ia64)
	git diff $merge_base git-ia64

but there is a convenient shorthand for that, which is to use "a...b" 
(three dots!), so

	git diff -p --stat origin...git-ia64

should generally get you what you want.
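
The equivalence between the explicit merge-base form and the three-dot shorthand can be checked on a scratch repository. This is only a sketch: the branch names `mainline` and `topic` and the file names are made up for the demo and stand in for origin and git-ia64.

```shell
#!/bin/sh
# Throwaway demo; branch and file names are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/mainline   # name the first branch explicitly

echo base > common.txt
git add common.txt
git commit -q -m "common base"

# The topic branch forks here and adds one file.
git checkout -q -b topic
echo ia64 > topic.txt
git add topic.txt
git commit -q -m "topic work"

# Mainline moves on independently.
git checkout -q mainline
echo upstream > mainline.txt
git add mainline.txt
git commit -q -m "mainline work"

# Plain diff between the two tips: mainline's new file shows up too
# (as a removal, because topic does not have it).
plain=$(git diff --name-only mainline topic)

# Explicit merge base versus the three-dot shorthand.
base=$(git merge-base mainline topic)
explicit=$(git diff --name-only "$base" topic)
shorthand=$(git diff --name-only mainline...topic)

printf 'plain diff touches:\n%s\n' "$plain"
printf 'three-dot diff touches:\n%s\n' "$shorthand"
```

In this setup the plain diff lists both new files, while both merge-base forms list only the file added on the topic branch.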

I say *generally*, because there might be multiple merge-bases if there 
are crossing merges between the two branches and there is no well-defined 
single common point. But that criss-cross case almost never happens for 
the kernel, because I've been pretty good at trying to teach maintainers 
to not generate that kind of complex history (it doesn't just obfuscate 
the above kind of situation, it also makes gitk output harder-to-read than 
it otherwise would be).

That said, your script (that does a merge) should have been able to get 
the diff too, and in fact handle even the criss-cross case. It's written a 
bit strangely (like having that really old-fashioned way of using git 
merge, passing in HEAD explicitly etc).

			Linus


* Re: my git problem
  2008-04-27 19:15 ` Linus Torvalds
@ 2008-04-27 19:44   ` Andrew Morton
  2008-04-27 20:24     ` Linus Torvalds
  0 siblings, 1 reply; 21+ messages in thread
From: Andrew Morton @ 2008-04-27 19:44 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

On Sun, 27 Apr 2008 12:15:27 -0700 (PDT) Linus Torvalds <torvalds@linux-foundation.org> wrote:

> 
> 
> On Sun, 27 Apr 2008, Andrew Morton wrote:
> > 
> > Now, I want to generate a plain patch against mainline which will add the
> > patches which are in git-ia64 and which aren't in mainline.  ie: when that
> > patch is applied to mainline, we get git-ia64.  Sounds simple.
> > 
> > A naive
> > 
> > 	git-diff origin git-ia64
> 
> Don't do that.
> 
> That will diff between the two branches, and if they both contain stuff 
> (which they obviously do), you'll get all the things that are in origin 
> (but not git-ia64) as a reversed diff.
> 
> What you _want_ is the diff from the last common point, aka the "merge 
> base".
> 
> With git, you could do that as
> 
> 	merge_base=$(git merge-base origin git-ia64)
> 	git diff $merge_base git-ia64
> 
> but there is a convenient shorthand for that, which is to use "a...b" 
> (three dots!), so
> 
> 	git diff -p --stat origin...git-ia64
> 
> should generally get you what you want.

That generates no diff for several trees which I tried it on.  And
afaict from manual inspection, that's correct - they are empty.

git-sched is non-empty:

y:/usr/src/git26> cat .git/branches/git-sched 
git+ssh://master.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-devel.git#for-akpm

and seems to dtrt too.

But I'm pretty sure that the simple solutions were found wanting, but I
don't recall why.  I think it was because of a problem when
git-netdev-all is based on git-net which is based on origin.  I want to
extract the git-net -> git-netdev-all diff, but doing that generates
patches which reapply things which are already applied.

iirc this happens when git-netdev-all is resynced with origin at a
different time from when git-net is resynced with origin.  I get hunks
which reapply (or revert) changes which are in origin.

But I don't presently have any trees which are based on other non-origin
trees so I can't test that.

> I say *generally*, because there might be multiple merge-bases if there 
> are crossing merges between the two branches and there is no well-defined 
> single common point. But that criss-cross case almost never happens for 
> the kernel, because I've been pretty good at trying to teach maintainers 
> to not generate that kind of complex history (it doesn't just obfuscate 
> the above kind of situation, it also makes gitk output harder-to-read than 
> it otherwise would be).
> 
> That said, your script (that does a merge) should have been able to get 
> the diff too, and in fact handle even the criss-cross case. It's written a 
> bit strangely (like having that really old-fashioned way of using git 
> merge, passing in HEAD explicitly etc).

Well.  It is a couple of years old.

I'll try the simple version later, see what happens.  Thanks.


* Re: my git problem
  2008-04-27 19:44   ` Andrew Morton
@ 2008-04-27 20:24     ` Linus Torvalds
  2008-04-28 18:45       ` Andrew Morton
  2008-04-28 21:35       ` Andrew Morton
  0 siblings, 2 replies; 21+ messages in thread
From: Linus Torvalds @ 2008-04-27 20:24 UTC (permalink / raw)
  To: Andrew Morton; +Cc: git



On Sun, 27 Apr 2008, Andrew Morton wrote:
> 
> But I'm pretty sure that the simple solutions were found wanting, but I
> don't recall why.  I think it was because of a problem when
> git-netdev-all is based on git-net which is based on origin.  I want to
> extract the git-net -> git-netdev-all diff, but doing that generates
> patches which reapply things which are already applied.

Well, if a tree has patches that are already applied up-stream, then yes, 
you do actually have to do the merge in order to see that. Because 
obviously the diff is in two places, and if they merge cleanly, one of 
them has to be made to not count.

So it depends on what you want.

	git diff a...b

says literally "what has been added to 'b' since it diverged from 'a'". 

That is a useful and valid thing to ask, but it is very fundamentally also 
*not* the same thing as actually doing the merge, and asking what the 
merge added. Doing

	git merge --no-commit otherbranch
	git diff HEAD > diff
	git reset --hard

will do that: it will do the merge (which obviously squashes any diffs 
that existed in the other tree as different commits), and then diffs the 
HEAD against that resulting state.

So they are two fundamentally different things to do.

In some sense, the "git diff a...b" is closer to your "series of quilt 
patches" model, in that it just generates a patch - which may obviously 
conflict with the *other* patches you are also generating. It would then 
expect your quilt logic to do some sane patch merging.

Of course, we know that quilt doesn't do sane patch merging, so you are 
probably better off with the second version: letting git merge for you, 
and taking the resulting diff.
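
The difference between the two approaches shows up as soon as the same change exists on both sides as separate commits. The following is a minimal sketch on a scratch repository; the branch names `mainline` and `topic` and the file names are invented for the demo.

```shell
#!/bin/sh
# Throwaway demo; branch and file names are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/mainline

echo old > fix.txt
git add fix.txt
git commit -q -m "common base"

# The topic branch carries a fix plus a new feature.
git checkout -q -b topic
echo fixed > fix.txt
git commit -q -a -m "fix"
echo feature > feature.txt
git add feature.txt
git commit -q -m "feature"

# Mainline picks up the same fix as a separate commit.
git checkout -q mainline
echo fixed > fix.txt
git commit -q -a -m "fix (applied upstream)"

# Three-dot diff: everything topic changed since the fork, fix included.
three_dot=$(git diff --name-only mainline...topic)

# Trial merge: the duplicated fix drops out, only the feature remains.
git merge -q --no-commit topic >/dev/null
merged=$(git diff --name-only HEAD)
git reset -q --hard

printf 'three-dot diff touches:\n%s\n' "$three_dot"
printf 'merge diff touches:\n%s\n' "$merged"
```

Here the three-dot diff still reports the fix that mainline already has, while the diff taken after the trial merge reports only the feature.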

> But I don't presently have any trees which are based on other non-origin
> trees so I can't test that.

Yes, in that case you'll only get issues when somebody commits the same 
patch I have already applied. Which does happen, of course.

			Linus


* Re: my git problem
  2008-04-27 20:24     ` Linus Torvalds
@ 2008-04-28 18:45       ` Andrew Morton
  2008-04-28 18:49         ` Johannes Schindelin
                           ` (2 more replies)
  2008-04-28 21:35       ` Andrew Morton
  1 sibling, 3 replies; 21+ messages in thread
From: Andrew Morton @ 2008-04-28 18:45 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

On Sun, 27 Apr 2008 13:24:08 -0700 (PDT) Linus Torvalds <torvalds@linux-foundation.org> wrote:

> 
> 
> On Sun, 27 Apr 2008, Andrew Morton wrote:
> > 
> > But I'm pretty sure that the simple solutions were found wanting, but I
> > don't recall why.  I think it was because of a problem when
> > git-netdev-all is based on git-net which is based on origin.  I want to
> > extract the git-net -> git-netdev-all diff, but doing that generates
> > patches which reapply things which are already applied.
> 
> Well, if a tree has patches that are already applied up-stream, then yes, 
> you do actually have to do the merge in order to see that. Because 
> obviously the diff is in two places, and if they merge cleanly, one of 
> them has to be made to not count.
> 
> So it depends on what you want.
> 
> 	git diff a...b
> 
> says literally "what has been added to 'b' since it diverged from 'a'". 
> 
> That is a useful and valid thing to ask, but it is very fundamentally also 
> *not* the same thing as actually doing the merge, and asking what the 
> merge added. Doing
> 
> 	git merge --no-commit otherbranch
> 	git diff HEAD > diff
> 	git reset --hard
> 
> will do that: it will do the merge (which obviously squashes any diffs 
> that existed in the other tree as different commits), and then diffs the 
> HEAD against that resulting state.
> 

hm, weirdness.

y:/usr/src/git26> git-diff origin...git-ia64   
y:/usr/src/git26> git-log origin...git-ia64 | wc -l
15574

I'd have expected git-log to operate on the same patches as git-diff.

I'm sure there's a logical explanation for this ;)


* Re: my git problem
  2008-04-28 18:45       ` Andrew Morton
@ 2008-04-28 18:49         ` Johannes Schindelin
  2008-04-28 19:09           ` Andrew Morton
  2008-04-28 19:21         ` Linus Torvalds
  2008-04-28 19:52         ` Daniel Barkalow
  2 siblings, 1 reply; 21+ messages in thread
From: Johannes Schindelin @ 2008-04-28 18:49 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linus Torvalds, git

Hi,

On Mon, 28 Apr 2008, Andrew Morton wrote:

> hm, weirdness.
> 
> y:/usr/src/git26> git-diff origin...git-ia64   
> y:/usr/src/git26> git-log origin...git-ia64 | wc -l
> 15574
> 
> I'd have expected git-log to operate on the same patches as git-diff.
> 
> I'm sure there's a logical explanation for this ;)

Yes.  git-diff with "..." will show you the diff of git-ia64 since the 
branch point, whereas log will show you _all_ the logs (both in origin and 
in git-ia64) since the branch point.

Ciao,
Dscho


* Re: my git problem
  2008-04-28 18:49         ` Johannes Schindelin
@ 2008-04-28 19:09           ` Andrew Morton
  2008-04-28 19:13             ` Johannes Schindelin
  0 siblings, 1 reply; 21+ messages in thread
From: Andrew Morton @ 2008-04-28 19:09 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Linus Torvalds, git

On Mon, 28 Apr 2008 19:49:28 +0100 (BST) Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:

> Hi,
> 
> On Mon, 28 Apr 2008, Andrew Morton wrote:
> 
> > hm, weirdness.
> > 
> > y:/usr/src/git26> git-diff origin...git-ia64   
> > y:/usr/src/git26> git-log origin...git-ia64 | wc -l
> > 15574
> > 
> > I'd have expected git-log to operate on the same patches as git-diff.
> > 
> > I'm sure there's a logical explanation for this ;)
> 
> Yes.  git-diff with "..." will show you the diff of git-ia64 since the 
> branch point, whereas log will show you _all_ the logs (both in origin and 
> in git-ia64) since the branch point.
> 

That's missing the "logical" bit :)

Oh well.  Can you suggest how I can extract the changelogs for the patches
which `git-diff origin...git-ia64' will print out?

Thanks.


* Re: my git problem
  2008-04-28 19:09           ` Andrew Morton
@ 2008-04-28 19:13             ` Johannes Schindelin
  2008-04-28 19:28               ` Linus Torvalds
  2008-04-28 19:33               ` Andrew Morton
  0 siblings, 2 replies; 21+ messages in thread
From: Johannes Schindelin @ 2008-04-28 19:13 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linus Torvalds, git

Hi,

On Mon, 28 Apr 2008, Andrew Morton wrote:

> On Mon, 28 Apr 2008 19:49:28 +0100 (BST) Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> 
> > On Mon, 28 Apr 2008, Andrew Morton wrote:
> > 
> > > hm, weirdness.
> > > 
> > > y:/usr/src/git26> git-diff origin...git-ia64   
> > > y:/usr/src/git26> git-log origin...git-ia64 | wc -l
> > > 15574
> > > 
> > > I'd have expected git-log to operate on the same patches as git-diff.
> > > 
> > > I'm sure there's a logical explanation for this ;)
> > 
> > Yes.  git-diff with "..." will show you the diff of git-ia64 since the 
> > branch point, whereas log will show you _all_ the logs (both in origin and 
> > in git-ia64) since the branch point.
> > 
> 
> That's missing the "logical" bit :)

Heh, you're right.  I am too used to Git to think how other people would 
feel about these things... :-)

> Oh well.  Can you suggest how I can extract the changelogs for the 
> patches which `git-diff origin...git-ia64' will print out?

I think you get what you want with

	$ git-log --cherry-pick origin...git-ia64

(although I might be wrong on the order of origin and git-ia64).
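
On its own, `--cherry-pick` over a three-dot range still lists commits from both sides. A sketch of narrowing it to one side follows, assuming a reasonably recent Git that has the `--right-only` option; the branch names `mainline` and `topic` are hypothetical stand-ins for origin and git-ia64.

```shell
#!/bin/sh
# Throwaway demo; branch and file names are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/mainline

echo old > fix.txt
git add fix.txt
git commit -q -m "common base"

# topic: a fix, then a feature
git checkout -q -b topic
echo fixed > fix.txt
git commit -q -a -m "fix"
echo feature > feature.txt
git add feature.txt
git commit -q -m "feature"

# mainline: the same fix, committed separately
git checkout -q mainline
echo fixed > fix.txt
git commit -q -a -m "fix (applied upstream)"

# Symmetric difference: commits from both sides.
all=$(git log --format=%s --no-merges mainline...topic)

# Drop commits whose patch also exists on the other side, and keep
# only what is left on the right-hand (topic) side.
filtered=$(git log --format=%s --cherry-pick --right-only --no-merges mainline...topic)

printf 'symmetric difference:\n%s\n' "$all"
printf 'topic-only, duplicates removed:\n%s\n' "$filtered"
```

Comparing patches is extra work per commit, so this form is likely to stay slower than a plain two-dot log on large ranges.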

Ciao,
Dscho


* Re: my git problem
  2008-04-28 18:45       ` Andrew Morton
  2008-04-28 18:49         ` Johannes Schindelin
@ 2008-04-28 19:21         ` Linus Torvalds
  2008-04-28 19:54           ` Andrew Morton
  2008-05-01  6:01           ` Carl Worth
  2008-04-28 19:52         ` Daniel Barkalow
  2 siblings, 2 replies; 21+ messages in thread
From: Linus Torvalds @ 2008-04-28 19:21 UTC (permalink / raw)
  To: Andrew Morton; +Cc: git



On Mon, 28 Apr 2008, Andrew Morton wrote:
> 
> hm, weirdness.
> 
> y:/usr/src/git26> git-diff origin...git-ia64   
> y:/usr/src/git26> git-log origin...git-ia64 | wc -l
> 15574
> 
> I'd have expected git-log to operate on the same patches as git-diff.

No, not at all.

 - "git log" shows each commit in a range.

 - "git diff" shows just the difference between two states.

The two have nothing in common. One operates on lots of individual commits 
(git log) individually, while the other one fundamentally operates on just 
two end-points (git diff).

And "a..b" and "a...b" mean two totally different things for the two 
totally different operations.

When doing "a..b" and looking at individual commits, it means "show all 
commits that are in b but *not* in a". And when doing "a..b" when asking 
for a "diff", it means "show the difference from 'a' to 'b'".

They are *very* different operations indeed. The log can be empty, even if 
the diff is not empty (example: b is _before_ a, so there is nothing in 
'b' that isn't in 'a', but that doesn't mean that 'b' is *equal* to 'a', 
so there is still a diff!). And the log can be non-empty, even if the diff 
is empty (example: 'b' and 'a' have the same actual tree, but two 
different ways of getting there: the diff is empty, but the log of commits 
in between them is not).

So anybody who thinks that 'diff' and 'log' have *anything* to do with 
each other is fundamentally confused. Not just about git, btw. It's true 
in any model - it's fundamental.

As to 'a...b', it also means something different for diff (two endpoints!) 
and log (set of commits). For diff, it means "show the difference between 
the common commit and 'b'", while for log it means "show all commits that 
are in either 'a' or 'b' but *not* in both".

So you should do

	# generate the diff from the common point
	git diff -p --stat a...b

	# show the commits that are in b but not in a
	git log a..b

where the difference between two dots and three dots is important, and 
stems directly from the fact that 'diff' and 'log' are two totally 
different operations that cannot _possibly_ have semantics that mean the 
same thing - because a "set of commits" is fundamentally different from 
"difference between two endpoints".

So both "a..b" and "a...b" have meaning for both diff and log, but which 
you want to use depends on what you are looking for.

They do have some relationship, of course. If you want to have a simple 
way to know which is which, then

 - "a..b" is a plain difference. Think of it as a subtraction. For "diff", 
   it is simply the diff between a and b, and for log it is the "set 
   difference" (shown as either just "-" or as "\" in set theory math) 
   between the commits that are in b and not in a.

 - "a...b" is a more complex difference. For log, it's no longer the 
   regular set difference, but a "symmetric difference" (usually shown as 
   a Greek capital "Delta" in set theory math). And for "diff", it's no 
   longer just the diff between two states, it's the diff from a third 
   state (the nearest common state) to the second state.

In short: It's easy to think that "log" and "diff" are related, but they 
really are very fundamentally different. 

		Linus


* Re: my git problem
  2008-04-28 19:13             ` Johannes Schindelin
@ 2008-04-28 19:28               ` Linus Torvalds
  2008-04-29 17:15                 ` J. Bruce Fields
  2008-04-28 19:33               ` Andrew Morton
  1 sibling, 1 reply; 21+ messages in thread
From: Linus Torvalds @ 2008-04-28 19:28 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Andrew Morton, git



On Mon, 28 Apr 2008, Johannes Schindelin wrote:
> 
> On Mon, 28 Apr 2008, Andrew Morton wrote:
> > 
> > That's missing the "logical" bit :)
> 
> Heh, you're right.  I am too used to Git to think how other people would 
> feel about these things... :-)

No, you are both wrong.

You're wrong because apparently you never did abstract algebra and set 
theory in school.

If you know math, git actually does the right and very much the *logical* 
thing.

So ".." is a simple difference, while "..." is a more complex difference. 

They mean different things for different operation types, but that is 
again something a math person takes for granted (ie in algebra, a "+" or 
"-" is just a random operation that follows certain rules: "a-b" means one 
thing for the set of real numbers, and something *totally* different if 
you are talking about set algebra).

And "diff" and "log" really are different operation types, exactly in the 
sense that "operations on real numbers" are different from "set 
operations".

A "diff" is more like a difference between two real numbers: it gives a 
single answer (admittedly the single answer is in a different domain from 
the arguments). While "log" is very close to a set operation, in that it 
gives a set as its result.

		Linus


* Re: my git problem
  2008-04-28 19:13             ` Johannes Schindelin
  2008-04-28 19:28               ` Linus Torvalds
@ 2008-04-28 19:33               ` Andrew Morton
  1 sibling, 0 replies; 21+ messages in thread
From: Andrew Morton @ 2008-04-28 19:33 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Linus Torvalds, git

On Mon, 28 Apr 2008 20:13:43 +0100 (BST) Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:

> > Oh well.  Can you suggest how I can extract the changelogs for the 
> > patches which `git-diff origin...git-ia64' will print out?
> 
> I think you get what you want with
> 
> 	$ git-log --cherry-pick origin...git-ia64
> 
> (although I might be wrong on the order of origin and git-ia64).

Nope, that still generates thousands of lines of wrongness.

It's also very slow, for a non-empty tree:

git-log --cherry-pick origin...git-audit-master > /dev/null  50.47s user 0.56s system 99% cpu 51.043 total

git-diff origin...git-audit-master > /dev/null  0.35s user 0.02s system 100% cpu 0.369 total

weird that it's hundreds of times slower than the corresponding git-diff. 
he-who-pulls-75-trees would be unhappy.

Back to my original problem...

From my old script:

doit()
{
	tree=$1
	upstream=$2

	cd $GIT_TREE
	git reset --hard "$upstream"
	git fetch "$tree" || exit 1
	git merge --no-commit 'test merge' HEAD FETCH_HEAD > /dev/null

	{
		git_header "$tree"
		git log --no-merges ORIG_HEAD..FETCH_HEAD
		git diff --patch-with-stat ORIG_HEAD
	} >$PULL/$tree.patch
	{
		echo DESC
		echo $tree.patch
		echo EDESC
		git_header "$tree"
		git log --no-merges ORIG_HEAD..FETCH_HEAD
	} >$PULL/$tree.txt
	git reset --hard "$upstream"
}

the `git log' here does what I want.

The new version:

doit()
{
	tree=$1
	upstream=$2

	cd $GIT_TREE
	git reset --hard "$upstream"
	git fetch "$tree" || exit 1

	{
		git_header "$tree"
		git log --no-merges $upstream...$tree
		git diff -p --stat --no-merges $upstream...$tree
	} >$PULL/$tree.patch
	{
		echo DESC
		echo $tree.patch
		echo EDESC
		git_header "$tree"
#		git log --no-merges $upstream...$tree
	} >$PULL/$tree.txt
	git reset --hard "$upstream"
}

loses the changelogs :(


* Re: my git problem
  2008-04-28 18:45       ` Andrew Morton
  2008-04-28 18:49         ` Johannes Schindelin
  2008-04-28 19:21         ` Linus Torvalds
@ 2008-04-28 19:52         ` Daniel Barkalow
  2 siblings, 0 replies; 21+ messages in thread
From: Daniel Barkalow @ 2008-04-28 19:52 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linus Torvalds, git

On Mon, 28 Apr 2008, Andrew Morton wrote:

> On Sun, 27 Apr 2008 13:24:08 -0700 (PDT) Linus Torvalds <torvalds@linux-foundation.org> wrote:
> > 
> > On Sun, 27 Apr 2008, Andrew Morton wrote:
> > > 
> > > But I'm pretty sure that the simple solutions were found wanting, but I
> > > don't recall why.  I think it was because of a problem when
> > > git-netdev-all is based on git-net which is based on origin.  I want to
> > > extract the git-net -> git-netdev-all diff, but doing that generates
> > > patches which reapply things which are already applied.
> > 
> > Well, if a tree has patches that are already applied up-stream, then yes, 
> > you do actually have to do the merge in order to see that. Because 
> > obviously the diff is in two places, and if they merge cleanly, one of 
> > them has to be made to not count.
> > 
> > So it depends on what you want.
> > 
> > 	git diff a...b
> > 
> > says literally "what has been added to 'b' since it diverged from 'a'". 
> > 
> > That is a useful and valid thing to ask, but it is very fundamentally also 
> > *not* the same thing as actually doing the merge, and asking what the 
> > merge added. Doing
> > 
> > 	git merge --no-commit otherbranch
> > 	git diff HEAD > diff
> > 	git reset --hard
> > 
> > will do that: it will do the merge (which obviously squashes any diffs 
> > that existed in the other tree as different commits), and then diffs the 
> > HEAD against that resulting state.
> > 
> 
> hm, weirdness.
> 
> y:/usr/src/git26> git-diff origin...git-ia64   
> y:/usr/src/git26> git-log origin...git-ia64 | wc -l
> 15574
> 
> I'd have expected git-log to operate on the same patches as git-diff.

I don't think this operation is even well-defined for git-log. What do you 
expect to see if origin contains some of the same changes as ia64, but 
the changes are split into patches differently on the two paths? 
Even more so, what if ia64 has some additional changes in a commit that 
had other changes that were left out of the origin version?

For the diff, it doesn't matter, but for log it would.

	-Daniel
*This .sig left intentionally blank*


* Re: my git problem
  2008-04-28 19:21         ` Linus Torvalds
@ 2008-04-28 19:54           ` Andrew Morton
  2008-05-01  6:01           ` Carl Worth
  1 sibling, 0 replies; 21+ messages in thread
From: Andrew Morton @ 2008-04-28 19:54 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

On Mon, 28 Apr 2008 12:21:17 -0700 (PDT) Linus Torvalds <torvalds@linux-foundation.org> wrote:

> 
> 
> On Mon, 28 Apr 2008, Andrew Morton wrote:
> > 
> > hm, weirdness.
> > 
> > y:/usr/src/git26> git-diff origin...git-ia64   
> > y:/usr/src/git26> git-log origin...git-ia64 | wc -l
> > 15574
> > 
> > I'd have expected git-log to operate on the same patches as git-diff.
> 
> No, not at all.
> 
>  - "git log" shows each commit in a range.
> 
>  - "git diff" shows just the difference between two states.
> 
> The two have nothing in common. One operates on lots of individual commits 
> (git log) individually, while the other one fundamentally operates on just 
> two end-points (git diff).
> 
> And "a..b" and "a...b" mean two totally different things for the two 
> totally different operations.
> 
> When doing "a..b" and looking at individual commits, it means "show all 
> commits that are in b but *not* in a". And when doing "a..b" when asking 
> for a "diff", it means "show the difference from 'a' to 'b'".
> 
> They are *very* different operations indeed. The log can be empty, even if 
> the diff is not empty (example: b is _before_ a, so there is nothing in 
> 'b' that isn't in 'a', but that doesn't mean that 'b' is *equal* to 'a', 
> so there is still a diff!). And the log can be non-empty, even if the diff 
> is empty (example: 'b' and 'a' have the same actual tree, but two 
> different ways of getting there: the diff is empty, but the log of commits 
> in between them is not).

whimper.

> So anybody who thinks that 'diff' and 'log' have *anything* to do with 
> each other is fundamentally confused.

hi, everyone.

>  Not just about git, btw. It's true 
> in any model - it's fundamental.
>
> As to 'a...b', it also means something different for diff (two endpoints!) 
> and log (set of commits). For diff, it means "show the difference between 
> the common commit and 'b'", while for log it means "show all commits that 
> are in either 'a' or 'b' but *not* in both".
> 
> So you should do
> 
> 	# generate the diff from the common point
> 	git diff -p --stat a...b
> 
> 	# show the commits that are in b but not in a
> 	git log a..b

That seems to work nicely, thanks.

> where the difference between two dots and three dots is important, and 
> stems directly from the fact that 'diff' and 'log' are two totally 
> different operations that cannot _possibly_ have semantics that mean the 
> same thing - because a "set of commits" is fundamentally different from 
> "difference between two endpoints".

yup.


* Re: my git problem
  2008-04-27 20:24     ` Linus Torvalds
  2008-04-28 18:45       ` Andrew Morton
@ 2008-04-28 21:35       ` Andrew Morton
  2008-04-28 21:47         ` Linus Torvalds
  1 sibling, 1 reply; 21+ messages in thread
From: Andrew Morton @ 2008-04-28 21:35 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

On Sun, 27 Apr 2008 13:24:08 -0700 (PDT) Linus Torvalds <torvalds@linux-foundation.org> wrote:

> So it depends on what you want.
> 
> 	git diff a...b
> 
> says literally "what has been added to 'b' since it diverged from 'a'". 

Confounded by Ingo.

origin.patch (generated via git-diff v2.6.25...origin) has:

commit 7f424a8b08c26dc14ac5c17164014539ac9a5c65
Author: Peter Zijlstra <peterz@infradead.org>
Date:   Fri Apr 25 17:39:01 2008 +0200

    fix idle (arch, acpi and apm) and lockdep


and git-x86 (generated via git-diff origin...git-x86) has:

commit 0a1679501624482a06c19af49d55b68d3973e2f0
Author: Peter Zijlstra <peterz@infradead.org>
Date:   Fri Apr 25 17:39:01 2008 +0200

    fix idle (arch, acpi and apm) and lockdep
    

which I assume is the same patch as a different commit.


The old `doit':

doit()
{
	tree=$1
	upstream=$2

	cd $GIT_TREE
	git reset --hard "$upstream"
	git fetch "$tree" || exit 1
	git merge --no-commit 'test merge' HEAD FETCH_HEAD > /dev/null

	{
		git_header "$tree"
		git log --no-merges ORIG_HEAD..FETCH_HEAD
		git diff --patch-with-stat ORIG_HEAD
	} >$PULL/$tree.patch
	{
		echo DESC
		echo $tree.patch
		echo EDESC
		git_header "$tree"
		git log --no-merges ORIG_HEAD..FETCH_HEAD
	} >$PULL/$tree.txt
	git reset --hard "$upstream"
}

prevented that by doing a merge.


git-diff was "wrong" to claim that this change is actually present in the
origin->git-x86 diff.  But I guess it cannot operate at that level and we
need to do the merge to resolve it.  Or something.

ho hum.


* Re: my git problem
  2008-04-28 21:35       ` Andrew Morton
@ 2008-04-28 21:47         ` Linus Torvalds
  2008-04-28 22:04           ` Johannes Schindelin
  2008-04-28 22:14           ` Linus Torvalds
  0 siblings, 2 replies; 21+ messages in thread
From: Linus Torvalds @ 2008-04-28 21:47 UTC (permalink / raw)
  To: Andrew Morton; +Cc: git



On Mon, 28 Apr 2008, Andrew Morton wrote:
> 
> origin.patch (generated via git-diff v2.6.25...origin) has:
> 
> commit 7f424a8b08c26dc14ac5c17164014539ac9a5c65
> Author: Peter Zijlstra <peterz@infradead.org>
> Date:   Fri Apr 25 17:39:01 2008 +0200
> 
>     fix idle (arch, acpi and apm) and lockdep
> 
> 
> and git-x86 (generated via git-diff origin...git-x86) has:
> 
> commit 0a1679501624482a06c19af49d55b68d3973e2f0
> Author: Peter Zijlstra <peterz@infradead.org>
> Date:   Fri Apr 25 17:39:01 2008 +0200
> 
>     fix idle (arch, acpi and apm) and lockdep
>     
> 
> which I assume is the same patch as a different commit.

Yes. 

So this is an example of the fact that that patch was merged in two 
different trees as separate patches, so when you do

	git diff origin...git-x86

then it shows literally the diff from the last common state, and does 
*not* take into account that since that last common state there have been 
updates to the origin branch that essentially conflict (and in this case 
trivially, by just duplicating the work).

> The old `doit' prevented that by doing a merge.

Yes. And it sounds like what you want is that merge, followed by the diff. 
You're not actually asking for "what has changed since the last common 
state". You are literally asking for "what would a merge result in".

> git-diff was "wrong" to claim that this change is actually present in the
> origin->git-x86 diff.  But I guess it cannot operate at that level and we
> need to do the merge to resolve it.  Or something.

I don't actually see what was wrong with the old script.  The merge was
really oddly done, but apart from that, something like this should work
(just your old script with some trivial fixes to 'git merge' and using
somewhat saner arguments):

	doit()
	{
		tree=$1
		upstream=$2
	
		cd "$GIT_TREE" || exit 1
		git reset --hard "$upstream"
		git fetch "$tree" || exit 1
		# trial merge: --no-commit stops before a merge commit is made
		git merge --no-commit FETCH_HEAD > /dev/null
	
		{
			git_header "$tree"
			git shortlog --no-merges ORIG_HEAD..FETCH_HEAD
			git diff -p --stat ORIG_HEAD
		} >"$PULL/$tree.patch"

		{
			echo DESC
			echo "$tree.patch"
			echo EDESC
			git_header "$tree"
			git log --no-merges ORIG_HEAD..FETCH_HEAD
		} >"$PULL/$tree.txt"
		# throw the trial merge away again
		git reset --hard "$upstream"
	}

but obviously that will still result in problems if there are real conflicts.
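
A minimal sketch of the difference between the two approaches, assuming a
throwaway repository. The branch names (mainline, topic) and file names
(idle.c, ia64.c) are made up for illustration and are not the real trees.
The same fix is committed separately on both branches: the three-dot diff
still reports it, while the diff of a trial merge against HEAD does not.

```shell
#!/bin/sh
set -e

# Hypothetical scratch repository; nothing here touches a real tree.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/mainline
git config user.name "Example"
git config user.email "example@example.com"

printf 'one\n' >idle.c
printf 'one\n' >ia64.c
git add idle.c ia64.c
git commit -q -m "base"
git branch topic

# mainline picks up the fix as its own commit
printf 'fixed\n' >idle.c
git commit -q -am "fix idle (mainline copy)"

# topic carries the same fix as a different commit, plus its own work
git checkout -q topic
printf 'fixed\n' >idle.c
git commit -q -am "fix idle (topic copy)"
printf 'two\n' >ia64.c
git commit -q -am "topic-only change"

git checkout -q mainline

# diff from the last common state: includes the duplicated fix
three_dot=$(git diff --name-only mainline...topic)

# trial merge, then diff the merge result against mainline
git merge -q --no-commit topic >/dev/null 2>&1
merged=$(git diff --name-only HEAD)

echo "three-dot diff touches:"
echo "$three_dot"
echo "merge result touches:"
echo "$merged"
```

The sketch compares against HEAD rather than ORIG_HEAD only to keep it
independent of which command last wrote ORIG_HEAD.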

			Linus

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: my git problem
  2008-04-28 21:47         ` Linus Torvalds
@ 2008-04-28 22:04           ` Johannes Schindelin
  2008-04-28 22:14           ` Linus Torvalds
  1 sibling, 0 replies; 21+ messages in thread
From: Johannes Schindelin @ 2008-04-28 22:04 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Andrew Morton, git

Hi,

On Mon, 28 Apr 2008, Linus Torvalds wrote:

> On Mon, 28 Apr 2008, Andrew Morton wrote:
> > 
> > origin.patch (generated via git-diff v2.6.25...origin) has:
> > 
> > commit 7f424a8b08c26dc14ac5c17164014539ac9a5c65
> > Author: Peter Zijlstra <peterz@infradead.org>
> > Date:   Fri Apr 25 17:39:01 2008 +0200
> > 
> >     fix idle (arch, acpi and apm) and lockdep
> > 
> > 
> > and git-x86 (generated via git-diff origin...git-x86) has:
> > 
> > commit 0a1679501624482a06c19af49d55b68d3973e2f0
> > Author: Peter Zijlstra <peterz@infradead.org>
> > Date:   Fri Apr 25 17:39:01 2008 +0200
> > 
> >     fix idle (arch, acpi and apm) and lockdep
> >     
> > 
> > which I assume is the same patch as a different commit.
> 
> Yes. 

FWIW that is why I suggested using --cherry-pick.  It tries to filter out 
the patches from "upstream" by calculating a patch-id (Basically a SHA-1 
of the patch), and leaving out all commits from "downstream" with the same 
patch-id.

Of course, it does not always work perfectly, because it wants a context 
of 3 lines to agree.
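
A small sketch of the patch-id filtering described above, again in a
scratch repository with made-up branch and file names. It assumes a
current Git: --right-only was added to git-log after this thread was
written. The two copies of the fix get the same patch-id, so
--cherry-pick leaves the downstream copy out.

```shell
#!/bin/sh
set -e

# Hypothetical scratch repository for illustration only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/mainline
git config user.name "Example"
git config user.email "example@example.com"

printf 'one\n' >idle.c
printf 'one\n' >ia64.c
git add idle.c ia64.c
git commit -q -m "base"
git branch topic

printf 'fixed\n' >idle.c
git commit -q -am "fix idle (mainline copy)"

git checkout -q topic
printf 'fixed\n' >idle.c
git commit -q -am "fix idle (topic copy)"
printf 'two\n' >ia64.c
git commit -q -am "topic-only change"

# Same patch text, different commits: the patch-ids agree.
id_main=$(git show mainline | git patch-id | cut -d' ' -f1)
id_topic=$(git show topic~1 | git patch-id | cut -d' ' -f1)

# Plain range: both topic commits are listed.
plain=$(git log --format=%s mainline..topic)

# With --cherry-pick the duplicated fix is dropped from the topic side.
filtered=$(git log --format=%s --cherry-pick --right-only mainline...topic)

echo "patch-ids: $id_main $id_topic"
echo "plain:"
echo "$plain"
echo "filtered:"
echo "$filtered"
```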

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: my git problem
  2008-04-28 21:47         ` Linus Torvalds
  2008-04-28 22:04           ` Johannes Schindelin
@ 2008-04-28 22:14           ` Linus Torvalds
  2008-04-29  2:14             ` Andrew Morton
  1 sibling, 1 reply; 21+ messages in thread
From: Linus Torvalds @ 2008-04-28 22:14 UTC (permalink / raw)
  To: Andrew Morton; +Cc: git



On Mon, 28 Apr 2008, Linus Torvalds wrote:
> 
> I don't actually see what was wrong with the old script.  The merge was
> really oddly done, but apart from that, something like this should work
> (just your old script with some trivial fixes to 'git merge' and using
> somewhat saner arguments):

Side note: you are still going to have *exactly* the same issue if you 
have other git trees (not origin) that contain the same patch. 

The fact is, your model of "merging by applying patches" is the real 
problem here. It's not a model that can work unless all trees are 
independent and they obviously aren't. So trying to merge with "origin" is 
only going to remove _one_ tree from the equation.

			Linus

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: my git problem
  2008-04-28 22:14           ` Linus Torvalds
@ 2008-04-29  2:14             ` Andrew Morton
  0 siblings, 0 replies; 21+ messages in thread
From: Andrew Morton @ 2008-04-29  2:14 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

On Mon, 28 Apr 2008 15:14:20 -0700 (PDT) Linus Torvalds <torvalds@linux-foundation.org> wrote:

> 
> 
> On Mon, 28 Apr 2008, Linus Torvalds wrote:
> > 
> > I don't actually see what was wrong with the old script.  The merge was
> > really oddly done, but apart from that, something like this should work
> > (just your old script with some trivial fixes to 'git merge' and using
> > somewhat saner arguments):
> 
> Side note: you are still going to have *exactly* the same issue if you 
> have other git trees (not origin) that contain the same patch. 
> 

yup.  The same-patch-in-two-trees problem is surprisingly rare.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: my git problem
  2008-04-28 19:28               ` Linus Torvalds
@ 2008-04-29 17:15                 ` J. Bruce Fields
  2008-04-30  8:17                   ` Jakub Narebski
  0 siblings, 1 reply; 21+ messages in thread
From: J. Bruce Fields @ 2008-04-29 17:15 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Johannes Schindelin, Andrew Morton, git

On Mon, Apr 28, 2008 at 12:28:38PM -0700, Linus Torvalds wrote:
> 
> 
> On Mon, 28 Apr 2008, Johannes Schindelin wrote:
> > 
> > On Mon, 28 Apr 2008, Andrew Morton wrote:
> > > 
> > > That's missing the "logical" bit :)
> > 
> > Heh, you're right.  I am too used to Git to think how other people would 
> > feel about these things... :-)
> 
> No, you are both wrong.
> 
> You're wrong because apparently you never did abstract algebra and set 
> theory in school.

Hmph.  I've got a PhD in algebra and still find that choice of operators
confusing.

(Which may just be further evidence that one can take a lot of classes
and still be an idiot.)

> If you know math, git actually does the right and very much the *logical* 
> thing.
> 
> So ".." is a simple difference, while "..." is a more complex difference. 
> 
> They mean different things for different operation types, but that is 
> again something a math person takes for granted (ie in algebra, a "+" or 
> "-" is just a random operation that follows certain rules: "a-b" means one 
> thing for the set of real numbers, and something *totally* different if 
> you are talking about set algebra).

I suspect one reason the set-difference operator is more commonly
written with a backslash than a minus sign is that set difference is
different enough from anything else usually called subtraction that most
people find it confusing to use the same notation.

I can sorta buy the argument that "A...B" means most generally "some
kind of difference between the three sets A, A^B, and B", and that in
the context of "git diff" it's most sensible to take ordering into
account and produce some approximation of a diff between A^B and B.  I'd
personally have found an entirely separate operator simpler to
understand.  But perhaps there's only so many keys on the keyboard.

--b.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: my git problem
  2008-04-29 17:15                 ` J. Bruce Fields
@ 2008-04-30  8:17                   ` Jakub Narebski
  0 siblings, 0 replies; 21+ messages in thread
From: Jakub Narebski @ 2008-04-30  8:17 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Linus Torvalds, Johannes Schindelin, Andrew Morton, git

"J. Bruce Fields" <bfields@fieldses.org> writes:

> On Mon, Apr 28, 2008 at 12:28:38PM -0700, Linus Torvalds wrote:
>> 
>> 
>> On Mon, 28 Apr 2008, Johannes Schindelin wrote:
>>> 
>>> On Mon, 28 Apr 2008, Andrew Morton wrote:
>>>> 
>>>> That's missing the "logical" bit :)
>>> 
>>> Heh, you're right.  I am too used to Git to think how other people would 
>>> feel about these things... :-)
>> 
>> No, you are both wrong.
>> 
>> You're wrong because apparently you never did abstract algebra and set 
>> theory in school.
[...]
>> If you know math, git actually does the right and very much the *logical* 
>> thing.
>> 
>> So ".." is a simple difference, while "..." is a more complex difference. 
>> 
>> They mean different things for different operation types, but that is 
>> again something a math person takes for granted (ie in algebra, a "+" or 
>> "-" is just a random operation that follows certain rules: "a-b" means one 
>> thing for the set of real numbers, and something *totally* different if 
>> you are talking about set algebra).
[...]
> I can sorta buy the argument that "A...B" means most generally "some
> kind of difference between the three sets A, A^B, and B", and that in
> the context of "git diff" it's most sensible to take ordering into
> account and produce some approximation of a diff between A^B and B.  I'd
> personally have found an entirely separate operator simpler to
> understand.  But perhaps there's only so many keys on the keyboard.

IMHO adding support for a..b and a...b to git-diff is a bit of a trick,
as a..b and a...b were created to represent a set of revisions (a
revision range).

If we have linear history:

   *---*---*---a---*---*---b

then a..b notation for a revision range is very natural, and having
git-diff interpret "a..b" as "a b" (for git-diff only endpoints
matter) to allow copy'n'pasting between git-log and git-diff, and
between git-fetch messages and git-diff was a good extension.

Now if the history is not linear, as in example below:

   *---*---*---x---*---*---b
                \
                 \-*---a

then "a..b", which is a shortcut for "b ^a" (b --not a), returns the
x..b range (set) of revisions.  If you read "a..b" as "what's in 'b'
since 'a'" it makes perfect sense.  But "git diff a..b" is still
"git diff a b", not "git diff x b". 


It would perhaps be equally good notation to have "git diff a..b" mean
"git diff x b", i.e. be the diff between the endpoints of "git log a..b",
and have "git diff a...b" be "git diff a b", i.e. be the diff between
the endpoints^W points of "git log a...b"... but if there is no clear
winner, simplicity of implementation wins. 
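
The endpoint behaviour described above can be checked in a scratch
repository shaped like the second diagram. This is only a sketch: the
file names are invented, and x is recovered with git merge-base.

```shell
#!/bin/sh
set -e

# Hypothetical scratch repository: x is the fork point of a and b.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/b
git config user.name "Example"
git config user.email "example@example.com"

printf 'x\n' >f.txt
git add f.txt
git commit -q -m "x"
git branch a

printf 'x\nb1\n' >f.txt
git commit -q -am "b1"

git checkout -q a
printf 'side\n' >g.txt
git add g.txt
git commit -q -m "a1"

x=$(git merge-base a b)

# "git diff a..b" only looks at the two endpoints
two_dot=$(git diff a..b)
two_args=$(git diff a b)

# "git diff a...b" starts from the merge base x instead of a
three_dot=$(git diff a...b)
from_base=$(git diff "$x" b)

# the revision range a..b is the same set of commits as x..b
range_ab=$(git rev-list a..b)
range_xb=$(git rev-list "$x"..b)

[ "$two_dot" = "$two_args" ] && echo "diff a..b  == diff a b"
[ "$three_dot" = "$from_base" ] && echo "diff a...b == diff x b"
[ "$range_ab" = "$range_xb" ] && echo "log a..b   == log x..b"
```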

-- 
Jakub Narebski
Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: my git problem
  2008-04-28 19:21         ` Linus Torvalds
  2008-04-28 19:54           ` Andrew Morton
@ 2008-05-01  6:01           ` Carl Worth
  1 sibling, 0 replies; 21+ messages in thread
From: Carl Worth @ 2008-05-01  6:01 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Andrew Morton, git


On Mon, 28 Apr 2008 12:21:17 -0700 (PDT), Linus Torvalds wrote:
> On Mon, 28 Apr 2008, Andrew Morton wrote:
> > I'd have expected git-log to operate on the same patches as git-diff.
>
> No, not at all.
>
>  - "git log" shows each commit in a range.
>
>  - "git diff" shows just the difference between two states.
>
> The two have nothing in common. One operates on lots of individual commits
> (git log) individually, while the other one fundamentally operates on just
> two end-points (git diff).

Yes, the two operations are internally operating on different things,
(exactly as described).

> And "a..b" and "a...b" means two totally different things for the two
> totally different operations.

But the internal difference doesn't justify the totally different
meaning of "a..b" and "a...b" for these two operations. And in the
rest of your message you didn't justify the difference at all, (just
the fact that there *can* be a difference).

As a concrete example, I often want to view a series of patches that
is unique to a branch. I can easily do that with:

	git log -p a..b

Now, if I want to view basically the same information, but in a
"combined" view, (a single patch from the beginning to the final
state), I have to instead do:

	git diff a...b

And that's the part that's really confusing people, I think, (see
Andrew Morton running into the problem here, and Havoc Pennington
elsewhere). Conceptually, "a..b" is a way to say "the commits that are
unique to 'b' compared to 'a'", and that works great for git-log. But
when a similar concept is often desired for git-diff it's spelled
"a...b" instead. What's the justification for that? (Other than
historical accident.)

Meanwhile, you could even point to a similar case with the opposite
forms of each command. That is, one can see a series of patches with:

	git log -p a...b

And again, one can get basically the same information in a
combined, single-patch form with:

	git diff a..b
or:	git diff b..a

depending on which direction one would like the combined version to be
in. Again, why the opposite syntax for basically the same information?

So git-log and git-diff are consistent in not treating the ".." and
"..." syntax uniformly, but I can't see any good justification for
that.
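
To make the pairing concrete, here is a rough sketch in a scratch
repository with two diverged branches named a and b (the file names are
made up). It shows which files each form of git-diff reports next to
which commits each form of git-log lists.

```shell
#!/bin/sh
set -e

# Hypothetical scratch repository with one commit unique to each branch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/a
git config user.name "Example"
git config user.email "example@example.com"

printf 'base\n' >shared.txt
git add shared.txt
git commit -q -m "common base"
git branch b

printf 'a\n' >only-a.txt
git add only-a.txt
git commit -q -m "work on a"

git checkout -q b
printf 'b\n' >only-b.txt
git add only-b.txt
git commit -q -m "work on b"

# commits unique to b, and the matching combined diff
log_two_dot=$(git log --format=%s a..b)
diff_three_dot=$(git diff --name-only a...b)

# commits on either side, and the matching endpoint-to-endpoint diff
log_three_dot=$(git log --left-right --format='%m %s' a...b)
diff_two_dot=$(git diff --name-only a..b)

echo "log a..b:";    echo "$log_two_dot"
echo "diff a...b:";  echo "$diff_three_dot"
echo "log a...b:";   echo "$log_three_dot"
echo "diff a..b:";   echo "$diff_two_dot"
```

In the --left-right output, "<" marks commits reachable only from a and
">" marks commits reachable only from b.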

I don't think this second case is causing much problem. The symmetric
difference that's described by "git log a...b" isn't something I end
up needing very often, anyway. Meanwhile, a single patch between two
arbitrary states is extremely common, but I've always expressed that
naturally as "git diff a b", (by analogy with good old "diff fileA
fileB"), and never felt a need to spell this command as "git diff
a..b".

So, I do think the handling of ".." and "..." in git-diff is
objectively backwards compared to the way that git-log, git-rev-list
and everything else treat these operators.

I don't know what can be done to fix this now, though. Just expect
users to get confused, and then try to get them to drill "git log
a..b" and "git diff a...b" into their wee little brains.

Or maybe go Elijah's route and invent a new top-level command name in
which issues like this can get fixed. (I've been lukewarm on the idea
after watching the cogito attempt eventually be abandoned. I'd really
much rather see Elijah's ideas get pushed down into git itself for the
most part. But it's tough when backwards-compatibility prevents fixing
some things that are obviously confusing people.)

-Carl


^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2008-05-01  6:21 UTC | newest]

Thread overview: 21+ messages
2008-04-27 18:29 my git problem Andrew Morton
2008-04-27 19:15 ` Linus Torvalds
2008-04-27 19:44   ` Andrew Morton
2008-04-27 20:24     ` Linus Torvalds
2008-04-28 18:45       ` Andrew Morton
2008-04-28 18:49         ` Johannes Schindelin
2008-04-28 19:09           ` Andrew Morton
2008-04-28 19:13             ` Johannes Schindelin
2008-04-28 19:28               ` Linus Torvalds
2008-04-29 17:15                 ` J. Bruce Fields
2008-04-30  8:17                   ` Jakub Narebski
2008-04-28 19:33               ` Andrew Morton
2008-04-28 19:21         ` Linus Torvalds
2008-04-28 19:54           ` Andrew Morton
2008-05-01  6:01           ` Carl Worth
2008-04-28 19:52         ` Daniel Barkalow
2008-04-28 21:35       ` Andrew Morton
2008-04-28 21:47         ` Linus Torvalds
2008-04-28 22:04           ` Johannes Schindelin
2008-04-28 22:14           ` Linus Torvalds
2008-04-29  2:14             ` Andrew Morton
