public inbox for linux-kernel@vger.kernel.org
* New SCM and commit list
@ 2005-04-10 23:10 Benjamin Herrenschmidt
  2005-04-10 23:26 ` Linus Torvalds
  2005-04-11  7:13 ` David Woodhouse
  0 siblings, 2 replies; 24+ messages in thread
From: Benjamin Herrenschmidt @ 2005-04-10 23:10 UTC
  To: Linus Torvalds; +Cc: Linux Kernel list

Hi Linus !

Do you intend to continue posting "committed" patches to a mailing list
like bk scripts did to bk-commits-head@vger ? As I said a while ago, I
find this very useful, especially with the actual patch included in the
commit message (which isn't the case with most other projects' CVS commit
lists, and I find that annoying).

If yes, then I would appreciate it if you could either keep the same list,
or if you want to change the list name, keep the subscriber list so
those of us who actually archive it don't miss anything ;)

Thanks !

Regards,
Ben.



* Re: New SCM and commit list
@ 2005-04-11 18:18 Adam J. Richter
  0 siblings, 0 replies; 24+ messages in thread
From: Adam J. Richter @ 2005-04-11 18:18 UTC
  To: linux-kernel, torvalds

On 2005-04-11 Linus Torvalds wrote:
>Then the bad news: the merge algorithm is going to suck. It's going to be
>just plain 3-way merge, the same RCS/CVS thing you've seen before. With no
>understanding of renames etc. I'll try to find the best parent to base the
>merge off of, although early testers may have to tell the piece of crud
>what the most recent common parent was.

	I've been surprised at how well it works to put each character on a
separate line, pipe the input into diff3 and then join the lines
back together.  For example, let's consider the case of
adding parameters to a function.  Here one version adds a parameter
before the existing parameter, and another version adds another parameter
after the existing parameter:

$ cat orig
call(bar);
$ cat ver1
call(foo,bar);
$ cat ver2
call(bar,baz);
$ charmerge ver1 orig ver2
call(foo,bar,baz);

	A more practically scaled application that I tried was with
another filter that I wrote that would automatically resolve certain
types of diff3 conflicts[1].  With that filter, I took the SCSI
FlashPoint driver, and made an edited version by piping it through GNU
indent, which not only reindents, but also splits and joins lines.
I made a second edited version by changing all 146 instances of
"SYNC" to "GROP" in the original.  It merged apparently successfully,
giving me a GNU indented version with all of the keyword changes.
The current version of this resolution program dies if it hits a diff3
conflict of a type that it is not prepared to resolve.  I'll post
it once I've got it properly preserving the conflicts that it
doesn't try to fix.  In the meantime, here is an illustrative
script to get diff3 to do character-based merges, although it
gives garbage results if there are any conflicts.

[1] The type of conflict that was automatically resolved is as follows:

	variant1 = <prepended-new-text><original><appended-new-text>

	result --> <prepended-new-text><variant2><appended-new-text>

	...this is actually exactly the order one would want in the
case where <original> also occurs in variant2, but it was close
enough for this test.

                    __     ______________ 
Adam J. Richter        \ /
adam@yggdrasil.com      | g g d r a s i l



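#!/bin/sh
# Usage: charmerge ver1_file orig_file ver2_file
# Character-based 3-way merge: put each character on its own line,
# run diff3 -m, then join the lines back together.

# One character per line; an empty line marks an original newline.
# "." is used instead of "[^\n]", which POSIX sed reads as "not \ or n".
lineify() {
	sed 's/./&\
/g'
}

# Empty lines become newlines again; everything else is joined up.
unlineify() {
	awk '/^$/ {print $0} /^..*/ { printf "%s", $0}'
}

tmpdir=/tmp/charmerge.$$

mkdir "$tmpdir" || exit 1
lineify < "$1" > "$tmpdir/1"
lineify < "$2" > "$tmpdir/2"
lineify < "$3" > "$tmpdir/3"
# Name the files explicitly: {1,2,3} brace expansion is not POSIX sh.
diff3 -m "$tmpdir/1" "$tmpdir/2" "$tmpdir/3" | unlineify
rm -rf "$tmpdir"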
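
	The one-character-per-line encoding that the script relies on can
be checked on its own, without diff3.  The following is only a minimal
sketch: it repeats the two helpers (with the sed pattern written as "."
for portability) and verifies that encoding and then decoding a small
sample gives back the original text.  The sample text and the variable
names are made up for illustration.

```shell
#!/bin/sh
# Helpers as in the charmerge script, with a portable sed pattern.
lineify() {
	sed 's/./&\
/g'
}

unlineify() {
	awk '/^$/ { print } /./ { printf "%s", $0 }'
}

# Hypothetical two-line sample, including leading spaces.
sample='call(bar);
  x = 1;'

# Encode, then encode and decode again.
encoded=$(printf '%s\n' "$sample" | lineify)
decoded=$(printf '%s\n' "$sample" | lineify | unlineify)

# Show the first five characters, one per line: c a l l (
printf '%s\n' "$encoded" | sed -n '1,5p'

# The round trip should be lossless for ordinary text.
if [ "$decoded" = "$sample" ]; then
	echo "round trip OK"
fi
```

	Each input line of N characters becomes N+1 lines (the extra empty
line stands for the newline), which is why unlineify can tell real line
breaks apart from the breaks that lineify added.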

* Re: New SCM and commit list
@ 2005-04-12  3:02 Adam J. Richter
  2005-04-12 21:54 ` Daniel Barkalow
  0 siblings, 1 reply; 24+ messages in thread
From: Adam J. Richter @ 2005-04-12  3:02 UTC
  To: barkalow
  Cc: benh, dwmw2, greg, james.bottomley, jgarzik, linux-kernel, mason,
	mingo, torvalds

On 2005-04-11, Daniel Barkalow wrote:
>If merge took trees instead of single files, and had some way of detecting
>renames (or it got additional information about the differences between
>files), would that give BK-quality performance? Or does BK also support
>cases like:
>
>orig ---> first ---> first-merge -
> |                /               \
> |------> second -                 -> final
> |                \               /
> |------> third ---> third-merge -
>
>where the final merge requires, for complete cleanliness, a comparison of
>more than 3 states (since some changes will have orig as the common
>ancestor and some will have second).

	With 3-way merge and the ability to regenerate the relevant
files from each step, this should be easy to handle as long
as you have a list of which patches are considered to have been
duplicated.  Let's detail your example:

orig ---> first 1a 1b 1c ---> first-merge - 1d 1e
 |                          /                    \
 |------> second 2a 2b 2c -                       -> final
 |                          \                    /
 |------> third 3a 3b 3c ---> third-merge - 3d 3e

Here, 1a, 1b, etc. refer to specific states of the source tree.
I will refer to differences by a notation like "1a->1b", which
is the difference to go from snapshot 1a to 1b.  All that the
merge algorithm for the final merge needs to know is that the
ends of the branches (that is, 1e and 3e) both contain the
following diffs:

		orig->2a
		2a->2b
		2b->2c

	The function merge(orig, ver1, ver2) can try to reverse
the duplicate merges in one of the branches:

		1e' = merge( 1e, 2c->2b);
		1e'' = merge(1e', 2b->2a);
		1e''' = merge(1e'', 2a->orig);
		return merge(1e''', 2c->3e)

	Of course, conflicts can happen, but that can happen
in any merge.  There are also other ways to calculate the
merge and because there are different ways one can write a
merge function, it is possible that merging in a different
order might produce slightly different results.  For example,
it would be possible to reverse the duplicates in your "third-merge"
branch instead of your "first merge" branch, or one could
reconstruct a branch without the duplicated merges by executing
the other changes forward from a common ancestor, like so:

		1e''' = merge(orig, 3d->3e);

	...regardless, the point is that all the information
that is absolutely needed is a list of instances of diffs
to be skipped.  It is not even necessary that the changes
have such a clearly explainable ancestry as the one you have
described.  All the merge program needs to know are the changes
to be skipped, although information like which changes the skipped
patches are duplicating may be useful for things like trying
to reverse a patch in your "third-merge" branch in your
example if reversing the patch in "first-merge" fails.
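
	The idea of reversing the duplicated changes in one branch before
the final merge can be tried on toy files with diff3 alone.  This is a
rough sketch under stated assumptions, not a description of how any
existing SCM does it: apply_diff is a hypothetical helper, the snapshots
are made-up nine-line files named after the states in the diagram, the
three diffs orig->2a->2b->2c are collapsed into a single orig->2c change,
and the final step uses orig as the common ancestor, which is one
possible choice among the orderings discussed above.

```shell
#!/bin/sh
# Hypothetical helper: apply the change FROM->TO to FILE by 3-way merge.
# "diff3 -m MINE OLDER YOURS" merges the OLDER->YOURS change into MINE.
apply_diff() {
	diff3 -m "$1" "$2" "$3"
}

dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1

# Toy snapshots; the edits are kept several lines apart on purpose.
for i in 1 2 3 4 5 6 7 8 9; do echo "line$i"; done > orig
sed 's/^line5$/line5 second/' orig > 2c   # end of the "second" branch
sed 's/^line1$/line1 first/'  2c   > 1e   # first-merge: first + second
sed 's/^line9$/line9 third/'  2c   > 3e   # third-merge: third + second

# Reverse the duplicated change (2c->orig) in the first branch ...
apply_diff 1e 2c orig > 1e_stripped
# ... then merge what is left with the third branch.
apply_diff 1e_stripped orig 3e > final

# List the lines of the result that differ from orig.
changed=$(diff orig final | grep '^>' | sed 's/^> //')
printf '%s\n' "$changed"

cd / && rm -rf "$dir"
```

	With GNU diff3 this should list "line1 first", "line5 second" and
"line9 third": the change from the second branch appears once in the
result even though both merged branches carried it.  Real trees will of
course hit conflicts that this toy case avoids.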

	I believe that at least bitkeeper, darcs, a free python-based
system that I can't remember at the moment, and possibly arch do this
sort of machination already.


>Does this happen in real life? [...]

	Yes.  Both individual users and Linux distributions incorporate
patches that they think are useful to them and then further patches
that they develop.  The time costs of rejecting such patches would
likely be paid for by other integration or development work not being
done.

>It seems like sane development processes
               ^^^^
>wouldn't have multiple mainline-candidate patch sets including the same
>patches, if for no other reason than that, should the merge fail, nobody
>with any clue about the original patches would be anywhere nearby.

	If you could avoid prejudicial subjective adjectives, it
would make it easier for the saneness or insaneness of an
approach to be apparent just by discussing your more objective criteria,
like the remainder of your sentence, which is where the focus should
be.

	(1) Does allowing duplicate patches really mean that
	   "nobody with any clue about the original patches would be
	   anywhere nearby?"  What attracts these clueful people
	   just by third parties having to rebase their patches?

	(2) Does this supposed benefit outweigh the cost of rejecting
	    many patches unnecessarily?  I know from my own experience
	    that I have either given up on or had to put into a very low
	    priority mode at least 66% of the patches that I haven't
	    gotten integrated, but which I am confident the kernel
	    would be better having (e.g.: devfs shrink, lookup()
	    trapping, ipv4 as a loadable (but not yet removable) module,
	    sysfs memory shrink, factoring much of the DMA mapping to
	    the common bus code from individual drivers, fewer kmap's
	    in crypto, I could go on).

>It
>seems better to throw something back to someone to rebase their diffs.
       ^^^^^^

	I try to avoid general subjective adjectives like "better"
unless I am claiming that I've covered the trade-offs fully, and, even
then, avoiding it keeps the focus on analyzing the trade-offs.

                    __     ______________ 
Adam J. Richter        \ /
adam@yggdrasil.com      | g g d r a s i l



Thread overview: 24+ messages
2005-04-10 23:10 New SCM and commit list Benjamin Herrenschmidt
2005-04-10 23:26 ` Linus Torvalds
2005-04-11  3:25   ` James Bottomley
2005-04-11 20:53     ` Greg KH
2005-04-11 21:26       ` Linus Torvalds
2005-04-11 21:31         ` James Bottomley
2005-04-12  4:24           ` Arjan van de Ven
2005-04-13 20:04         ` H. Peter Anvin
2005-04-11  5:53   ` Jeff Garzik
2005-04-11  6:15     ` Linus Torvalds
2005-04-11  6:40       ` Ryan Anderson
2005-04-11  6:47       ` Geert Uytterhoeven
2005-04-11  7:38       ` Ingo Molnar
2005-04-11 12:51         ` Chris Mason
2005-04-11 19:32           ` Chris Mason
2005-04-11 22:50       ` Daniel Barkalow
2005-04-12  8:36         ` Geert Uytterhoeven
2005-04-12  9:52       ` Catalin Marinas
2005-04-16  8:35         ` Paul Jackson
2005-04-18  8:18           ` Catalin Marinas
2005-04-11  7:13 ` David Woodhouse
  -- strict thread matches above, loose matches on Subject: below --
2005-04-11 18:18 Adam J. Richter
2005-04-12  3:02 Adam J. Richter
2005-04-12 21:54 ` Daniel Barkalow
