git.vger.kernel.org archive mirror
From: Jeff King <peff@peff.net>
To: Stephen Bash <bash@genarts.com>
Cc: git@vger.kernel.org
Subject: Re: Merging split files
Date: Tue, 29 Mar 2011 11:16:23 -0400
Message-ID: <20110329151623.GD10771@sigill.intra.peff.net>
In-Reply-To: <2495196.195017.1300454556155.JavaMail.root@mail.hq.genarts.com>

On Fri, Mar 18, 2011 at 09:22:36AM -0400, Stephen Bash wrote:

> In our previous release foo.cxx contained both the base class and a
> few subclasses.  Since then the number of subclasses has grown, and
> we've split foo.cxx (base and sub-classes) into foo-base.cxx (base
> class) and foo-defs.cxx (sub-classes).  Since the release, we've had a
> few bug fixes in foo.cxx on the maintenance branch, and need to merge
> those back to development.  When I did the merge Git identified
> foo.cxx as moved to foo-defs.cxx, which worked for most changes, but a
> few needed to be in foo-base.cxx.  In this case it was a pretty
> trivial manual resolution, but is there a method for handling merges
> of split files?

I don't think there is currently a good way to do this automatically.

The problem is that the closest merge-recursive gets to understanding
content movement is detecting whole-file renames. So it sees
"foo.cxx became foo-defs.cxx", and applies the changes made to foo.cxx
to foo-defs.cxx, but it has no clue that foo-base.cxx also came from
foo.cxx. So at the very least, it would need to represent "foo.cxx has
split into foo-base.cxx and foo-defs.cxx", which is not something it
can currently handle. But more than that, you want to know _which_
parts moved to each file.

So I think the most flexible thing is to forget about file renames
altogether. They are just a rough version of the general idea of
content movement. In theory, we should be able to see that the content
we changed in foo.cxx no longer exists, and then start looking for
similar content elsewhere. Not similar _files_: take the chunk of
content that changed between the merge base and the maintenance branch
(plus some surrounding context), and find where that bit of content
went. And then try to merge our changes into that new bit of content.
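
As a rough illustration of that search (a sketch only, nothing git
does today), here is a small Python version built on the standard
difflib module. The function name find_best_location and the
dict-of-lines input are made up for the example; a real implementation
would work from the merge base and the two trees.

```python
import difflib

def find_best_location(chunk, candidates):
    """Find the window most similar to `chunk` across candidate files.

    chunk:      list of lines (the changed hunk plus some context)
    candidates: dict mapping a file name to its list of lines
    Returns (name, start_index, score); name is None if nothing matches.
    """
    best = (None, -1, 0.0)
    n = len(chunk)
    for name, lines in candidates.items():
        # Slide a window the size of the chunk over each candidate file.
        for start in range(max(1, len(lines) - n + 1)):
            window = lines[start:start + n]
            score = difflib.SequenceMatcher(None, chunk, window).ratio()
            if score > best[2]:
                best = (name, start, score)
    return best
```

Called with the lines touched on the maintenance branch and the
contents of foo-base.cxx and foo-defs.cxx, it reports which file (and
which offset) looks most like the old content. It is quadratic and
line-based, so it is only meant to show the shape of the idea.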

One problem is that when it fails, it fails pretty hard. With file
renames, your changes at least usually end up in the right file (your
present problem excluded), and you get some textual mess to clean up.
But with content-level renaming, I suspect in conflict cases we would
end up with no clue where the result goes (because the conflict means we
can't easily match up the content for similarity), and have to stick it
in the deleted file. On the other hand, it might simply work to keep
expanding the amount of context we consider for content similarity until
we find a match, which eventually would end up considering the whole
file, and generalize to a file rename.
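
The context-expansion idea could be sketched like this, again in
Python with difflib and again purely hypothetical: the names
best_window and locate_with_growing_context, and the 0.75 similarity
threshold, are assumptions for the example and not anything in git.

```python
import difflib

def best_window(chunk, lines):
    """Return (start, score) of the window in `lines` closest to `chunk`."""
    n = len(chunk)
    best_start, best_score = 0, 0.0
    for start in range(max(1, len(lines) - n + 1)):
        window = lines[start:start + n]
        score = difflib.SequenceMatcher(None, chunk, window).ratio()
        if score > best_score:
            best_start, best_score = start, score
    return best_start, best_score

def locate_with_growing_context(old, lo, hi, candidates, threshold=0.75):
    """old[lo:hi] is the changed hunk in the old file.

    Widen the context one line at a time on each side until some
    candidate file has a window scoring at least `threshold`. Returns
    (name, start, context) for the first such candidate, or None once
    the chunk has grown to the whole old file without a match.
    """
    context = 0
    while True:
        a = max(0, lo - context)
        b = min(len(old), hi + context)
        chunk = old[a:b]
        for name, lines in candidates.items():
            start, score = best_window(chunk, lines)
            if score >= threshold:
                return name, start, context
        if a == 0 and b == len(old):
            # At this point we compared the whole file, which is
            # roughly what file-level rename detection does.
            return None
        context += 1
```

With little context a conflicting hunk matches nothing, but each extra
line of unchanged context pulls the score up, so the search either
finds a home for the hunk or degrades into a whole-file comparison.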

Implementing that inside of merge-recursive is likely to be pretty nasty
(even the current file-rename code is already pretty nasty). But it may
be possible to prototype something that runs after we hit the conflicted
state, like mergetool.

I definitely think it's an interesting area to work in, but I would have
to give it a lot of thought.

-Peff
