From: "Boyd Stephen Smith Jr." <bss03@volumehost.net>
To: "Ian Clarke" <ian.clarke@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: A better approach to diffing and merging
Date: Sat, 29 Nov 2008 17:40:02 -0600
Message-ID: <200811291740.06865.bss03@volumehost.net>
In-Reply-To: <823242bd0811291012g15c4d442qa5d7afc9cc762b20@mail.gmail.com>
On Saturday 2008 November 29 12:12, Ian Clarke wrote:
> While I'm no merging expert, it seems that most merging algorithms do
> it on a line-by-line basis, treating source code as nothing but a list
> of lines of text. It got me thinking, what if the merging algorithm
> understood the structure of the source code it is trying to merge?

Unfortunately, this is hard to do in general. Not impossible, but very hard.
Heck, some languages don't really have a formal grammar, or have one that
can't be parsed without doing deeper analysis. Perl 6 is supposed to have some
support for language constructs that change the grammar.

Also, this generally takes a lot of time. Automatic merges are only useful if
they take less (or only a little more) time than doing the merge manually.
If your mergetool has to think about something for 30 minutes that you could
have resolved in 5, it's not normally a "win".

Also, it slightly changes the format of a "patch" file. Currently, patch
files are a line-by-line diff. If you instead made changes based on mapping
parse trees to parse trees, you'd (probably) want to
store/transfer/communicate your patches using a different format, to preserve
the proper amount of "context" and make the patch easy to apply. (I.e., do
the hard work once.)
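
To make the contrast concrete, here is a minimal sketch of a line-based patch
next to a tree-based one. It is not what git or Darcs does. It assumes Python
3.9+ and uses only the standard difflib and ast modules; the names line_patch
and tree_patch, and the (path, old, new) tuple format, are made up for
illustration.

```python
import ast
import difflib

OLD = "x = 1\ny = 2\n"
NEW = "x = 1\ny = 3\n"


def line_patch(old: str, new: str) -> list[str]:
    """Classic line-by-line patch: a unified diff over lists of lines."""
    return list(difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile="a",
        tofile="b",
    ))


def tree_patch(old: str, new: str) -> list[tuple[str, str, str]]:
    """Hypothetical structural patch: one (path, old, new) entry per
    top-level statement whose parse tree differs.

    Assumption: both versions have the same number of top-level statements,
    so statements can be paired by position.
    """
    old_body = ast.parse(old).body
    new_body = ast.parse(new).body
    changes = []
    for i, (a, b) in enumerate(zip(old_body, new_body)):
        # ast.dump() leaves out line/column info by default, so this compares
        # structure only, not layout.
        if ast.dump(a) != ast.dump(b):
            changes.append((f"body[{i}]", ast.unparse(a), ast.unparse(b)))
    return changes
```

A whitespace-only edit such as "x = 1" to "x=1" yields a non-empty line_patch
but an empty tree_patch, which is the sort of "context" a tree-aware patch
format would have to record differently from a line-by-line diff.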

> Any takers? I've set up a Google Group for further discussion, please
> join if interested.

You might look deeper into Darcs development. This level of
pluggable "understanding" of the file(s) being modified fits in well with a
Grand Unified Theory of Patching. Also, "understanding" patches better allows
Darcs to reorder patches (and calculate "reverse patches") better -- reducing
the time to do existing automatic merging (or to reject the merge as
non-automatable) and making automatic some merges that currently are not.
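
For readers unfamiliar with the idea, reordering and reversing patches can be
sketched on a toy patch type that inserts or deletes a single line. This is
not Darcs's actual patch theory or code. The names Patch, apply, invert and
commute_inserts are invented for illustration, and only the insert/insert case
of reordering is covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    kind: str  # "insert" or "delete"
    pos: int   # 0-based line index
    line: str  # the line inserted or deleted


def apply(p: Patch, lines: list[str]) -> list[str]:
    """Return a new list of lines with patch p applied."""
    out = list(lines)
    if p.kind == "insert":
        out.insert(p.pos, p.line)
    else:
        if p.pos >= len(out) or out[p.pos] != p.line:
            raise ValueError("patch does not apply")
        del out[p.pos]
    return out


def invert(p: Patch) -> Patch:
    """The "reverse patch": undoes p when applied right after it."""
    return Patch("delete" if p.kind == "insert" else "insert", p.pos, p.line)


def commute_inserts(p: Patch, q: Patch) -> tuple[Patch, Patch]:
    """Given inserts p then q, return (q2, p2) such that applying q2 then p2
    gives the same result as applying p then q."""
    assert p.kind == q.kind == "insert"
    if q.pos <= p.pos:
        # q landed at or before p's line, so p's line shifts down by one.
        return q, Patch("insert", p.pos + 1, p.line)
    # q landed after p's line, so without p its index is one smaller.
    return Patch("insert", q.pos - 1, q.line), p
```

With base = ["a", "c"], p = Patch("insert", 1, "b") and
q = Patch("insert", 3, "d"), applying p then q gives the same lines as
applying the commuted pair in the other order, and apply(invert(p), ...)
restores base.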

I'm not going to come out and discourage you or others from adding the
functionality to git, but I think there are more useful and practical ways to
improve git. (Line-by-line merging is generally "good enough", the worst
enemy of "good" software.)
--
Boyd Stephen Smith Jr. ,= ,-_-. =.
bss03@volumehost.net ((_/)o o(\_))
ICQ: 514984 YM/AIM: DaTwinkDaddy `-'(. .)`-'
http://iguanasuicide.org/ \_/