* [RFC] Importing from a patch-oriented SCM
From: Martin Langhoff @ 2005-08-19 7:04 UTC
To: GIT
I am drafting an import script to turn a GNU Arch archive into a GIT
archive. Importing the branches and commits incrementally is reasonably
straightforward -- or so it seems so far. Note: the repository
manipulation is based on cvsimport -- so my knowledge of the git repo
internals is still pretty close to zero.
Each patchset has a unique identifier, and can carry metadata with the
identifiers of the patches it "includes". If you are using gnu arch,
when you merge across branches, it'll know to skip a particular
patchset if it has been applied already. AFAICT there is no such
concept in GIT, and I wonder what to do with all this metadata about
merges.
My proto-plan is to keep track of merged stuff (in a cache file
somewhere), and if a particular merge means that the branches are
fully merged up to the last patch of the series (if no commits from
the source branch have been skipped) mark it as a merge in GIT.
If the merges have been done out-of-order, that may show up in the
latest merge. For example, branch A and B of the same project each
have 10 commits from the branching point. If a merge A -> B does
commits 1,2,3,7,8 it gets imported to git as a merge up to commit "3",
although there is more there. The next merge, which does 4,5,6,10 will
show up as a merge of commit 8.
Yuk.
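The rule described above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: patches on the source branch are numbered 1, 2, 3, ... from the branching point, the set `merged` stands in for the cache file, and the name `merged_up_to` is hypothetical rather than part of any existing import script.

```python
def merged_up_to(merged_patches):
    """Return the highest N such that patches 1..N have all been merged.

    Returns 0 when patch 1 itself has not been merged yet.
    """
    n = 0
    while (n + 1) in merged_patches:
        n += 1
    return n


# Hypothetical cache of source-branch patches already merged into the target.
merged = set()

merged.update([1, 2, 3, 7, 8])      # first merge A -> B
first_merge_point = merged_up_to(merged)

merged.update([4, 5, 6, 10])        # second merge A -> B
second_merge_point = merged_up_to(merged)
```

With the numbers from the example, the first merge is recorded against commit 3 and the second against commit 8; patch 10 stays unrecorded until patch 9 arrives.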
If I remember correctly, Junio added some stuff in the merge & rebase
code that will identify if a particular patch has been seen and
applied, and skip it even if it's a bit out of order. But I don't know
what that is based on, and whether I can somehow maximize the chances
of the patch being identified as already merged across branches. If
it's based on the initial commit identifier being carried through
(does that travel with commits when you merge?) I stand a small
chance. Otherwise, I'm lost.
Suggestions?
cheers,
martin
* Re: [RFC] Importing from a patch-oriented SCM
From: Junio C Hamano @ 2005-08-19 7:49 UTC
To: Martin Langhoff; +Cc: git
Martin Langhoff <martin.langhoff@gmail.com> writes:
> If I remember correctly, Junio added some stuff in the merge & rebase
> code that will identify if a particular patch has been seen and
> applied, and skip it even if it's a bit out of order. But I don't know
I think you are talking about git-patch-id.
f97672225b3b1a2ca57cfc16f49239bed1efcd87
Author: Linus Torvalds <torvalds@ppc970.osdl.org>
Date: Thu Jun 23 15:06:04 2005 -0700
Add "git-patch-id" program to generate patch ID's.
A "patch ID" is nothing but a SHA1 of the diff associated
with a patch, with whitespace and line numbers ignored. As
such, it's "reasonably stable", but at the same time also
reasonably unique, ie two patches that have the same "patch
ID" are almost guaranteed to be the same thing.
IOW, you can use this thing to look for likely duplicate
commits.
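To make the idea in that commit message concrete, here is a rough Python sketch of a patch-ID-like hash. It is a simplification written for illustration, not the algorithm git-patch-id actually implements: the function name `simple_patch_id`, the hunk-header handling, and the sample diffs are all assumptions.

```python
import hashlib
import re

# Matches the "@@ -a,b +c,d @@" part of a unified-diff hunk header.
HUNK_HEADER = re.compile(r"^@@ .*? @@")


def simple_patch_id(diff_text):
    """Hash a unified diff while ignoring whitespace and hunk line numbers."""
    digest = hashlib.sha1()
    for line in diff_text.splitlines():
        # Drop the line numbers from hunk headers.
        line = HUNK_HEADER.sub("@@", line)
        # Remove all whitespace so re-indented patches hash identically.
        line = "".join(line.split())
        if line:
            digest.update(line.encode("utf-8"))
            digest.update(b"\n")
    return digest.hexdigest()


original = """\
--- a/hello.c
+++ b/hello.c
@@ -10,3 +10,3 @@
 int main(void) {
-    return 1;
+    return 0;
 }
"""

# Same change, applied at a different offset and indented with a tab.
shifted = """\
--- a/hello.c
+++ b/hello.c
@@ -42,3 +42,3 @@
 int main(void) {
-\treturn 1;
+\treturn 0;
 }
"""

same_patch = simple_patch_id(original) == simple_patch_id(shifted)
```

Here `same_patch` is true because the two diffs differ only in hunk offsets and indentation, which is the kind of stability the commit message describes.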
* Re: [RFC] Importing from a patch-oriented SCM
From: Johannes Schindelin @ 2005-08-19 8:29 UTC
To: Martin Langhoff; +Cc: GIT
Hi,
On Fri, 19 Aug 2005, Martin Langhoff wrote:
> Each patchset has a unique identifier, and can carry metadata with the
> identifiers of the patches it "includes". If you are using gnu arch,
> when you merge across branches, it'll know to skip a particular
> patchset if it has been applied already. AFAICT there is no such
> concept in GIT, and I wonder what to do with all this metadata about
> merges.
You should include the metadata in the commit object. If the information
is about parents, they should be parents in git, too. If the information
is something else, you should convert it to readable text and put it in
the comment part of the commit object.
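One way the "readable text" part of this advice might look, as a Python sketch. The field names `Arch-Revision` and `Arch-Includes`, the function name, and the revision strings are made up for illustration; nothing in git prescribes them, and parent information would still be recorded as real parents rather than as text.

```python
def build_commit_message(summary, arch_revision, included_patches):
    """Format a commit message that records Arch metadata as readable text."""
    lines = [summary, "", "Arch-Revision: " + arch_revision]
    # One line per patchset that the Arch metadata lists as included.
    lines.extend("Arch-Includes: " + patch for patch in included_patches)
    return "\n".join(lines) + "\n"


# Hypothetical identifiers, used only to show the resulting layout.
message = build_commit_message(
    "Fix off-by-one in parser",
    "example--main--1.0--patch-12",
    ["example--feature--1.0--patch-3", "example--feature--1.0--patch-4"],
)
```

Keeping one identifier per line makes the metadata easy to grep out of the log later if an importer needs to rebuild its cache.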
> If I remember correctly, Junio added some stuff in the merge & rebase
> code that will identify if a particular patch has been seen and
> applied, and skip it even if it's a bit out of order.
The usual way of git is to use a 3-way merge: given a common ancestor, try
to apply the changes between the ancestor and the second branch to the
first branch. And yes, this does not take history into account.
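The 3-way rule can be sketched in Python at whole-file granularity. This is a deliberately simplified illustration, not git's merge code: snapshots are modelled as hypothetical `{path: content}` dictionaries and the function name `three_way_merge` is an assumption.

```python
def three_way_merge(ancestor, ours, theirs):
    """Merge two {path: content} snapshots relative to their common ancestor.

    Returns (merged, conflicts). A path missing from a snapshot is treated
    as None, so additions and deletions follow the same rules as edits.
    """
    merged, conflicts = {}, []
    for path in sorted(set(ancestor) | set(ours) | set(theirs)):
        base, a, b = ancestor.get(path), ours.get(path), theirs.get(path)
        if a == b:
            result = a              # both sides agree, even if both changed
        elif a == base:
            result = b              # only their side changed
        elif b == base:
            result = a              # only our side changed
        else:
            conflicts.append(path)  # both sides changed, differently
            continue
        if result is not None:
            merged[path] = result
    return merged, conflicts


ancestor = {"README": "v1", "main.c": "v1", "Makefile": "v1"}
ours = {"README": "v2", "main.c": "v1", "Makefile": "ours"}
theirs = {"README": "v2", "main.c": "v2", "Makefile": "theirs"}

merged, conflicts = three_way_merge(ancestor, ours, theirs)
```

Only the two tips and the ancestor are consulted, which is the sense in which the merge ignores history: `README` merges cleanly because both sides ended up with the same content, regardless of how each got there.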
Originally, I wanted to write an "intelligent" merge, which turns the
history into patches and tries to merge these, but ultimately I got
convinced that this is too complicated to be worthwhile.
Ciao,
Dscho
* Re: [RFC] Importing from a patch-oriented SCM
From: Martin Langhoff @ 2005-08-19 8:52 UTC
To: Junio C Hamano; +Cc: git
On 8/19/05, Junio C Hamano <junkio@cox.net> wrote:
> Martin Langhoff <martin.langhoff@gmail.com> writes:
>
> > If I remember correctly, Junio added some stuff in the merge & rebase
> > code that will identify if a particular patch has been seen and
> > applied, and skip it even if it's a bit out of order. But I don't know
>
> I think you are talking about git-patch-id.
Is this used at commit time, and stored somewhere (doesn't seem to be)
or do you select older patches from the destination branch at merge
time?
If you only compare patches since the last merge, patches that were
merged but somehow unreported will fall into a black hole and cause a
conflict going forward anyway. Hmm. That seems to be a problem I
won't be able to avoid if merges happen out-of-order.
I'll try and work out how it's being used during the merge
(pointers/hints welcome) and see if I can do something smart with it.
Thanks!
cheers,
martin
* Re: [RFC] Importing from a patch-oriented SCM
From: Daniel Barkalow @ 2005-08-19 16:22 UTC
To: Martin Langhoff; +Cc: Junio C Hamano, git
On Fri, 19 Aug 2005, Martin Langhoff wrote:
> On 8/19/05, Junio C Hamano <junkio@cox.net> wrote:
> > Martin Langhoff <martin.langhoff@gmail.com> writes:
> >
> > > If I remember correctly, Junio added some stuff in the merge & rebase
> > > code that will identify if a particular patch has been seen and
> > > applied, and skip it even if it's a bit out of order. But I don't know
> >
> > I think you are talking about git-patch-id.
>
> Is this used at commit time, and stored somewhere (doesn't seem to be)
> or do you select older patches from the destination branch at merge
> time?
If a patch is applied verbatim, or a merge results in no conflicts (i.e.,
only offsets), then you can run git-patch-id on the diff caused by it and
compare the result with the git-patch-id of the diff caused by your local
change to see if you've found it. Of course, if there was any modification
to the patch or a conflict was resolved, you won't see a match, but that's
plausibly correct anyway: you don't know whether the content change that
resulted from your patch really matched the change you wanted to make.
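The comparison described here could be sketched in Python as below. It assumes the patch IDs have already been computed for every commit on both branches (for example by feeding each commit's diff to git-patch-id); the function name `already_applied`, the commit names, and the shortened IDs are hypothetical.

```python
def already_applied(source_ids, destination_ids):
    """Return source commits whose patch ID also appears on the destination.

    Both arguments map a commit name to the patch ID of that commit's diff.
    """
    seen = set(destination_ids.values())
    return [commit for commit, pid in source_ids.items() if pid in seen]


# Hypothetical data: patch-2 was applied verbatim upstream as commit "u7",
# while patch-3 was modified on the way in and so has a different ID.
source = {"patch-1": "aa11", "patch-2": "bb22", "patch-3": "cc33"}
destination = {"u5": "dd44", "u7": "bb22", "u9": "ee55"}

duplicates = already_applied(source, destination)
```

Only `patch-2` is reported; the modified `patch-3` does not match, which is the "plausibly correct" behaviour described above.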
> If you only compare patches since the last merge, patches that were
> merged but somehow unreported will fall into a black hole and cause a
> conflict going forward anyway. Hmm. That seems to be a problem I
> won't be able to avoid if merges happen out-of-order.
They might cause conflicts, but they're relatively unlikely to require
manual intervention, because the merging mechanism in git is stronger than
the one in arch (by virtue of identifying a common ancestor), and will
recognize when a section of changes made by both sides is the same and
produce a warning rather than a conflict. That's how the rebase stuff can
identify that your rebased patch is empty (when upstream applies your
patch): the content change that it would make has been made.
-Daniel
*This .sig left intentionally blank*