* Dividing up a large merge.
@ 2009-07-14 23:32 davidb
  2009-07-15 0:16 ` Bryan Donlan
  0 siblings, 1 reply; 13+ messages in thread
From: davidb @ 2009-07-14 23:32 UTC (permalink / raw)
  To: Git Mailing List

I'm trying to figure out a better way of dividing up the effort
involved in a merge amongst a group of people.  Right now, I
basically describe the merge to each of them, and ask them to
merge their part, and then 'git checkout HEAD' the other parts.
They tell me about the commits, along with the files that they've
merged correctly.  When everybody is done, I make a real merge
commit, and pull in all of their files.  It's a lot for me to
track, and confusing for each person.

I'd like to create a branch we can all push to that we gradually
work to become the result of a resolved merge.  Not only does git
not want to help me do the merge, but seems to actively be
fighting against me doing this.

What I thought of was something like telling people to do:

    $ git merge v2.6.30
    resolve some files
    $ git checkout HEAD ...rest of files...
    $ git commit; git push

but that 'rest of files' is fairly large and complicated.  I can
think of two ideas:

  - Something that basically does a partial 'git reset --hard
    HEAD' to put many of the files back.

  - The ability to specify subpaths on the 'git merge' to do the
    merge work but limited to a directory or set of files.

Obviously, either case will require someone to still track the
overall effort and make sure the final state of the tree really
represents the total merge.

Is there anything that can parse the output of 'git merge-tree'?
Even just splitting this up and then applying parts of it would
be helpful.  Would it be useful to write something that can apply
the results output of 'git merge-tree'?

Thanks,
David

^ permalink raw reply [flat|nested] 13+ messages in thread
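A rough sketch of the first idea above (a partial reset back to HEAD), offered as an illustration rather than anything posted in the thread. After a conflicted 'git merge' or 'git merge --squash', 'git checkout HEAD -- .' combined with exclude pathspecs puts back everything except the paths one person owns. The helper name `keep_only` is made up here. It assumes a Git recent enough to understand `:(exclude)` pathspecs and that it is run from the top of the work tree. Files that the merge added outside the owned paths are left staged, so this is not a complete 'git reset --hard' for those paths.

```shell
#!/bin/sh
# keep_only PATH...
# Hypothetical helper: restore every path that is NOT under one of the
# given PATHs to its HEAD version, in both the index and the work tree.
# Conflicted files outside the owned paths are thereby resolved to "ours";
# files under the owned paths keep their conflict state for manual work.
keep_only() {
    # Rewrite "$@" in place so each owned path becomes an exclude pathspec.
    n=$#
    while [ "$n" -gt 0 ]; do
        set -- "$@" ":(exclude)$1"
        shift
        n=$((n - 1))
    done
    # "." selects the whole tree; the exclude pathspecs carve out the
    # owned paths. With no arguments this restores every tracked path.
    git checkout HEAD -- . "$@"
}
```

A person responsible for a directory, say a/, would run 'git merge --squash v2.6.30', then `keep_only a`, resolve what is left under a/, and commit.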
* Re: Dividing up a large merge.
  2009-07-14 23:32 Dividing up a large merge davidb
@ 2009-07-15 0:16 ` Bryan Donlan
  2009-07-15 0:29 ` davidb
  0 siblings, 1 reply; 13+ messages in thread
From: Bryan Donlan @ 2009-07-15 0:16 UTC (permalink / raw)
  To: davidb; +Cc: Git Mailing List

On Tue, Jul 14, 2009 at 7:32 PM, <davidb@quicinc.com> wrote:
> I'm trying to figure out a better way of dividing up the effort
> involved in a merge amongst a group of people.  Right now, I
> basically describe the merge to each of them, and ask them to
> merge their part, and then 'git checkout HEAD' the other parts.
> They tell me about the commits, along with the files that they've
> merged correctly.  When everybody is done, I make a real merge
> commit, and pull in all of their files.  It's a lot for me to
> track, and confusing for each person.

What do you mean by describing a merge? git is designed to have all
the information needed for a merge inherent in the repository history.

> I'd like to create a branch we can all push to that we gradually
> work to become the result of a resolved merge.  Not only does git
> not want to help me do the merge, but seems to actively be
> fighting against me doing this.
>
> What I thought of was something like telling people to do:
>
>     $ git merge v2.6.30
>     resolve some files
>     $ git checkout HEAD ...rest of files...
>     $ git commit; git push
>
> but that 'rest of files' is fairly large and complicated.  I can
> think of two ideas:
>
>   - Something that basically does a partial 'git reset --hard
>     HEAD' to put many of the files back.
>
>   - The ability to specify subpaths on the 'git merge' to do the
>     merge work but limited to a directory or set of files.
>
> Obviously, either case will require someone to still track the
> overall effort and make sure the final state of the tree really
> represents the total merge.
>
> Is there anything that can parse the output of 'git merge-tree'?
> Even just splitting this up and then applying parts of it would
> be helpful.  Would it be useful to write something that can apply
> the results output of 'git merge-tree'?

I'm having a hard time understanding the situation here - why can't you just:

$ git checkout -b mergebranch v2.6.30
$ git merge developer1/topic
# Fix conflicts
$ git merge developer2/topic
# Fix conflicts
# etc

Why are there so many conflicts to make this an issue?

If the commits are isolated to small changes, rebasing the developer
topic branches instead of merging may help, by allowing you to take
conflicts one commit at a time. For example, if your problems are
primarily conflicts between developer branches and upstream:

$ git checkout -b mergebranch-dev1 developer1/topic
$ git rebase v2.6.30
# Fix conflicts on a commit-by-commit basis
$ git checkout -b mergebranch-dev2 developer2/topic
$ git rebase v2.6.30
# Fix conflicts on a commit-by-commit basis
$ git checkout -b mergebranch
$ git merge mergebranch-dev1
# Fix any remaining conflicts

If your problems are because of conflicts between developer branches
and each other:

$ git checkout -b mergebranch-dev1 developer1/topic
$ git rebase v2.6.30
# Fix conflicts on a commit-by-commit basis
$ git checkout -b mergebranch-dev2 developer2/topic
$ git rebase mergebranch-dev1
# Fix conflicts on a commit-by-commit basis

These rebasing approaches will change the commit IDs, so your
developers will need to rebase any further work upon these new commit
IDs, but if things are as bad as you say, a commit-by-commit merge
that rebase allows you may be much simpler.

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Dividing up a large merge.
  2009-07-15 0:16 ` Bryan Donlan
@ 2009-07-15 0:29 ` davidb
  2009-07-15 0:34 ` Avery Pennarun
  ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: davidb @ 2009-07-15 0:29 UTC (permalink / raw)
  To: Bryan Donlan; +Cc: Git Mailing List

On Tue, Jul 14, 2009 at 05:16:54PM -0700, Bryan Donlan wrote:

> What do you mean by describing a merge? git is designed to have all
> the information needed for a merge inherent in the repository history.

Yes, provided you can actually do the merge all at once.

> Why are there so many conflicts to make this an issue?

Because I have to work in the "real world".

> If the commits are isolated to small changes, rebasing the developer
> topic branches instead of merging may help, by allowing you to take
> conflicts one commit at a time. For example, if your problems are
> primarily conflicts between developer branches and upstream:

No real developer branches with conflicts (I make those be
fixed), but several upstreams.  We have many developers busily
doing work, and one or more other companies is also working on
the same code.  Meanwhile, the mainline kernel advances at its
own astounding rate.

Unfortunately, paying customers will always get priority of work,
even when that position is actually somewhat shortsighted and it
makes for a lot of merge effort later.

The real issue is that there isn't any single individual who
understands all of the code that conflicts.  It has to be divided
up somehow, I'm just trying to figure out a better way of doing
it.

Thanks,
David

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Dividing up a large merge. 2009-07-15 0:29 ` davidb @ 2009-07-15 0:34 ` Avery Pennarun 2009-07-15 1:19 ` davidb 2009-07-15 12:28 ` Theodore Tso 2009-07-15 18:57 ` Daniel Barkalow 2 siblings, 1 reply; 13+ messages in thread From: Avery Pennarun @ 2009-07-15 0:34 UTC (permalink / raw) To: davidb; +Cc: Bryan Donlan, Git Mailing List On Tue, Jul 14, 2009 at 8:29 PM, <davidb@quicinc.com> wrote: > The real issue is that there isn't any single individual who > understands all of the code that conflicts. It has to be divided > up somehow, I'm just trying to figure out a better way of doing > it. How about having one person do the merge, then commit it (including conflict markers), then have other people just make a series of commits removing the conflict markers? Avery ^ permalink raw reply [flat|nested] 13+ messages in thread
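If a merge is committed with its conflict markers left in, as suggested above, it helps to be able to see which files still need attention. The following is only a sketch under stated assumptions: `unresolved_files` is a hypothetical helper name, and it relies on a grep that supports `-r` and `--exclude-dir` (GNU and BSD grep do). It looks only for the `<<<<<<<` and `>>>>>>>` lines, since a bare `=======` line also turns up in ordinary text files.

```shell
#!/bin/sh
# unresolved_files [DIR]
# Hypothetical helper: list files under DIR (default: current directory)
# that still contain the opening or closing line of a conflict hunk.
unresolved_files() {
    # -r: recurse, -l: print file names only, -E: extended regex.
    # Conflict markers are seven characters followed by a space and a label.
    grep -rlE --exclude-dir=.git '^(<<<<<<<|>>>>>>>) ' "${1:-.}" | sort
}
```

Inside a repository, 'git grep -l -E' with the same pattern is an alternative that only looks at tracked files.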
* Re: Dividing up a large merge. 2009-07-15 0:34 ` Avery Pennarun @ 2009-07-15 1:19 ` davidb 2009-07-15 1:29 ` Douglas Campos 2009-07-15 1:32 ` Avery Pennarun 0 siblings, 2 replies; 13+ messages in thread From: davidb @ 2009-07-15 1:19 UTC (permalink / raw) To: Avery Pennarun; +Cc: Bryan Donlan, Git Mailing List On Tue, Jul 14, 2009 at 05:34:26PM -0700, Avery Pennarun wrote: > How about having one person do the merge, then commit it (including > conflict markers), then have other people just make a series of > commits removing the conflict markers? I guess this helps in some sense, but the intermediate result isn't going to build, and things like mergetool aren't going to work. It's helpful for the individuals to have the full merge conflict available, or at least the stages of the files in question. David ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Dividing up a large merge.
  2009-07-15 1:19 ` davidb
@ 2009-07-15 1:29 ` Douglas Campos
  2009-07-15 1:32 ` Avery Pennarun
  1 sibling, 0 replies; 13+ messages in thread
From: Douglas Campos @ 2009-07-15 1:29 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Avery Pennarun, Bryan Donlan

Wouldn't merging the peer branches first help?

On Tue, Jul 14, 2009 at 10:19 PM, <davidb@quicinc.com> wrote:
> On Tue, Jul 14, 2009 at 05:34:26PM -0700, Avery Pennarun wrote:
>
>> How about having one person do the merge, then commit it (including
>> conflict markers), then have other people just make a series of
>> commits removing the conflict markers?
>
> I guess this helps in some sense, but the intermediate result
> isn't going to build, and things like mergetool aren't going to
> work.  It's helpful for the individuals to have the full merge
> conflict available, or at least the stages of the files in
> question.
>
> David

--
Douglas Campos
Theros Consulting
+55 11 7626 5959
+55 11 3020 8168

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Dividing up a large merge. 2009-07-15 1:19 ` davidb 2009-07-15 1:29 ` Douglas Campos @ 2009-07-15 1:32 ` Avery Pennarun 1 sibling, 0 replies; 13+ messages in thread From: Avery Pennarun @ 2009-07-15 1:32 UTC (permalink / raw) To: davidb; +Cc: Bryan Donlan, Git Mailing List On Tue, Jul 14, 2009 at 9:19 PM, <davidb@quicinc.com> wrote: > On Tue, Jul 14, 2009 at 05:34:26PM -0700, Avery Pennarun wrote: >> How about having one person do the merge, then commit it (including >> conflict markers), then have other people just make a series of >> commits removing the conflict markers? > > I guess this helps in some sense, but the intermediate result > isn't going to build, and things like mergetool aren't going to > work. It's helpful for the individuals to have the full merge > conflict available, or at least the stages of the files in > question. It sounds like you're going in circles a bit here. You want the full merge conflict available - but you want it to be able to build. It sounds like the "git reset the unwanted subdirs" solution suggested earlier is the only option that will really work. You could simplify life for your co-workers by writing a script to automate the steps, I suppose. You probably want all the individuals to use merge --squash, so that you don't mark the history as merged until you're done. Then you combine all their work at the end and mark the commit as done using 'git merge -s ours'. Avery ^ permalink raw reply [flat|nested] 13+ messages in thread
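The final step described above (combine everyone's work, then mark the upstream as merged with 'git merge -s ours') might look roughly like the sketch below. It is an illustration, not a tested workflow from the thread. The function name `finish_merge` and the branch names in the usage note are hypothetical. It assumes each person's branch started from the same commit, contains only their own resolved paths, and that together the branches cover every path the real merge would have touched. Checking that last point is still a manual job.

```shell
#!/bin/sh
# finish_merge UPSTREAM PART_BRANCH...
# Hypothetical helper: merge each per-person resolution branch into the
# current branch, then record UPSTREAM as a parent without changing the
# tree, so history shows the merge as done.
finish_merge() {
    upstream=$1
    shift
    for part in "$@"; do
        # These merges should be clean if the parts touch disjoint paths;
        # stop at the first one that is not.
        git merge --no-edit "$part" || return 1
    done
    # The "ours" strategy keeps the current tree exactly as it is and
    # only adds UPSTREAM as a second parent of the new merge commit.
    git merge -s ours --no-edit "$upstream"
}
```

For example, on an integration branch created from the pre-merge tip: `finish_merge v2.6.30 part-net part-fs`. Because '-s ours' ignores the upstream tree entirely, anything the part branches missed is silently left at the old version.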
* Re: Dividing up a large merge.
  2009-07-15 0:29 ` davidb
  2009-07-15 0:34 ` Avery Pennarun
@ 2009-07-15 12:28 ` Theodore Tso
  2009-07-15 13:39 ` Jakub Narebski
  2009-07-15 14:47 ` Larry D'Anna
  2009-07-15 18:57 ` Daniel Barkalow
  2 siblings, 2 replies; 13+ messages in thread
From: Theodore Tso @ 2009-07-15 12:28 UTC (permalink / raw)
  To: davidb; +Cc: Bryan Donlan, Git Mailing List

On Tue, Jul 14, 2009 at 05:29:26PM -0700, davidb@quicinc.com wrote:
> No real developer branches with conflicts (I make those be
> fixed), but several upstreams.  We have many developers busily
> doing work, and one or more other companies is also working on
> the same code.  Meanwhile, the mainline kernel advances at its
> own astounding rate.

If you are maintaining a large number of changes over a long term
(which in the case of the kernel can be measured in a month or two),
it's often much easier to maintain things as a series of patches.
That way you can merge each patch one at a time.  If you already have
everything in a git tree, I'd suggest pulling it apart into separate
patches, by using "git format-patch".  Note that if you have multiple
merges into the tree, this will go much more smoothly if you can
separate things into a single linear stream.

This is also a good reason why if you have partial work that is
complete enough to be merged into mainline, it is ***much*** better to
try pushing patches to mainline earlier rather than later.  Waiting
until you are 100% done and the work is completely certified involves
a large number of risks; for example, what if people complain about
work that was done early on?  Or if the design was fundamentally
flawed from the get-go?  At the minimum, you will save a huge amount
of effort if you post a request-for-comment version of the patches up
front.

And, if you believe your release cycle is going to run for more than,
say, 2-3 months, I suggest that you keep things in a single linear
patch stream.  You can keep the patch series under git control, and
then rebase periodically; I'd suggest rebasing once a mainline release
happens (i.e., when 2.6.X is released), and then again after most of
the major changes have been merged in and the tree has settled down
(i.e., after 2.6.X-rc2 or 2.6.X-rc3).

> The real issue is that there isn't any single individual who
> understands all of the code that conflicts.  It has to be divided
> up somehow, I'm just trying to figure out a better way of doing
> it.

Yeah, that's another prime argument for maintaining your changes as a
patch queue.  I use a combination of quilt plus git.  So the rebasing
methodology becomes:

# pop all patches
guilt pop -a

# update the base of the patches
git pull origin

# start trying to apply each of the patches, one at a time
# next_patch:
guilt push -a

# when you get a failure, the push will stop and tell you it can't
# apply a patch; so force apply the patch:
guilt push -f

#
# this will leave some patch .rej files; resolve the patch failures
# for all of the files.  Use "git add" once the patches have been resolved
# also make sure that any files that were added by the patch that was
# force applied are also manually marked as needing to be added using
# "git add".

# Once you are sure the patch is properly merged, do this:
guilt refresh --diffstat

# Check the changes made to the patch; I normally create a symlink from
# .git/patches/<work-branch-for-quilt> to patches in the top level, i.e.
# "ln -s .git/patches/master patches"; if you can't remember the name of the
# patch, you can get it via the command "guilt applied | tail -1"
(cd patches; git diff name-of-patch)

# now repeat with the next set of patches by going back to next_patch, above

I normally keep an indication of the version that the patch series is
based upon via a comment in the first line of the series file, like
this: "# BASE v2.6.30-rc3" or sometimes like this "# BASE 6ab2792".
This can be useful when creating automated scripts to test the patch
series, since they know what version to apply the patches against.

In your case, the first person to start the rebase should change the
"# BASE" comment, and then apply those patches which he/she is most
familiar with.  When you hit a point where you need someone else's
expertise, you can do a "(cd patches; git commit -a)" to commit all of
the changes in the patch queue so far, and then let someone else take
over.  They would then do:

# Pop all of the patches off the next developer's work directory
guilt pop -a

# Update the patch queue (the patches directory is itself a git
# repository, so this is a plain git pull)
(cd patches; git pull)

# Now we need to make sure we have the latest kernel patches from mainline
git fetch

# Now update the work directory to the version specified by the patch
# series file (only the first line carries the BASE comment)
git merge $(head -n 1 patches/series | sed -e 's/# BASE //')

# Now resume trying to apply patches, one at a time...
# next_patch
guilt push -a

# if there is a failed patch, force apply it and resolve patch rejects
guilt push -f

# refresh the patch
guilt refresh --diffstat

# .... and so on

My biggest suggestion, though, is to try to merge partial work earlier
rather than later.  I'd try getting a partially functioning device
driver merged first, and then try to get the optimizations applied
earlier.  If you don't want people using it in production, that's what
the EXPERIMENTAL tag is for...

						- Ted

^ permalink raw reply [flat|nested] 13+ messages in thread
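For scripts that rely on the "# BASE" convention described above, a slightly more defensive way to read it could look like this. It is a sketch only: `series_base` is a hypothetical helper name, and `patches/series` is the path implied by the symlink layout in the message above. It prints the revision only when the first line really is a "# BASE" comment, and fails otherwise, so a script does not try to merge a patch file name by accident.

```shell
#!/bin/sh
# series_base [SERIES_FILE]
# Hypothetical helper: print the revision named in a "# BASE <rev>"
# comment on the first line of a guilt series file. Prints nothing and
# returns non-zero if the first line is not such a comment.
series_base() {
    # "1s/.../p" with -n prints line 1 only if the substitution matched.
    base=$(sed -n '1s/^# BASE[[:space:]]*//p' "${1:-patches/series}")
    [ -n "$base" ] && printf '%s\n' "$base"
}
```

A hand-off script could then run something like `rev=$(series_base) && git merge "$rev"` and stop if no base is recorded.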
* Re: Dividing up a large merge. 2009-07-15 12:28 ` Theodore Tso @ 2009-07-15 13:39 ` Jakub Narebski 2009-07-15 16:07 ` Theodore Tso 2009-07-15 14:47 ` Larry D'Anna 1 sibling, 1 reply; 13+ messages in thread From: Jakub Narebski @ 2009-07-15 13:39 UTC (permalink / raw) To: Theodore Tso; +Cc: davidb, Bryan Donlan, Git Mailing List Theodore Tso <tytso@mit.edu> writes: > Yeah, that's another prime argument for maintaining your changes as a > patch queue. I use a combination of quilt plus git. Why not StGit, or Guilt, or TopGit? -- Jakub Narebski Poland ShadeHawk on #git ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Dividing up a large merge. 2009-07-15 13:39 ` Jakub Narebski @ 2009-07-15 16:07 ` Theodore Tso 0 siblings, 0 replies; 13+ messages in thread From: Theodore Tso @ 2009-07-15 16:07 UTC (permalink / raw) To: Jakub Narebski; +Cc: davidb, Bryan Donlan, Git Mailing List On Wed, Jul 15, 2009 at 06:39:46AM -0700, Jakub Narebski wrote: > Theodore Tso <tytso@mit.edu> writes: > > > Yeah, that's another prime argument for maintaining your changes as a > > patch queue. I use a combination of quilt plus git. > > Why not StGit, or Guilt, or TopGit? Sorry, typo; that should have read "guilt". The example workflow I included used guilt commands. - Ted ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Dividing up a large merge. 2009-07-15 12:28 ` Theodore Tso 2009-07-15 13:39 ` Jakub Narebski @ 2009-07-15 14:47 ` Larry D'Anna 1 sibling, 0 replies; 13+ messages in thread From: Larry D'Anna @ 2009-07-15 14:47 UTC (permalink / raw) To: Theodore Tso; +Cc: davidb, Bryan Donlan, Git Mailing List * Theodore Tso (tytso@mit.edu) [090715 08:28]: > And, if you believe your release cycle is going to run for more than, > say, 2-3 months, I suggest that you keep things in a single linear > patch stream. You can keep the patch series under git control, and > then rebase periodically; I'd suggest rebasing once a mainline release > happens (i.e., when 2.6.X is released), and then again after most of > the major changes have been merged in and the tree has settled down > (i.e., after 2.6.X-rc2 or 2.6.X-rc3). or use TopGit http://repo.or.cz/w/topgit.git --larry ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Dividing up a large merge. 2009-07-15 0:29 ` davidb 2009-07-15 0:34 ` Avery Pennarun 2009-07-15 12:28 ` Theodore Tso @ 2009-07-15 18:57 ` Daniel Barkalow 2009-07-15 21:01 ` davidb 2 siblings, 1 reply; 13+ messages in thread From: Daniel Barkalow @ 2009-07-15 18:57 UTC (permalink / raw) To: davidb; +Cc: Bryan Donlan, Git Mailing List On Tue, 14 Jul 2009, davidb@quicinc.com wrote: > On Tue, Jul 14, 2009 at 05:16:54PM -0700, Bryan Donlan wrote: > > > What do you mean by describing a merge? git is designed to have all > > the information needed for a merge inherent in the repository history. > > Yes, provided you can actually do the merge all at once. > > > Why are there so many conflicts to make this an issue? > > Because I have to work in the "real world". > > > If the commits are isolated to small changes, rebasing the developer > > topic branches instead of merging may help, by allowing you to take > > conflicts one commit at a time. For example, if your problems are > > primarily conflicts between developer branches and upstream: > > No real developer branches with conflicts (I make those be > fixed), but several upstreams. We have many developers busily > doing work, and one or more other companies is also working on > the same code. Meanwhile, the mainline kernel advances at it's > own astounding rate. > > Unfortunately, paying customers will always get priority of work, > even when that position is actually somewhat shortsighted and it > makes for a lot of merge effort later. > > The real issue is that there isn't any single individual who > understands all of the code that conflicts. It has to be divided > up somehow, I'm just trying to figure out a better way of doing > it. It sounds to me like you're maintaining an internal version that everybody merges their stuff into, and you periodically merge that with the mainline kernel (generating a lot of conflicts which have to be resolved at the same time). 
Instead of merging the branch that contains a lot of merges, it would probably be easier to merge into a clone of mainline each of the things that was merged before. That is, instead of merging less than all of two trees, you'd merge commits which are not the newest commit on the branch, choosing ones that individuals can resolve. This also has the advantage where, if two of the changes affect an API that's used in various different places, one person will get the responsibility of resolving each of those conflicts, despite them being in the middle of code they don't really understand, because they understand what happened with the API and therefore what has to be done in that little spot. Dividing the merge up by parts of the content would split this work among people who aren't looking at the conflict in the definition of the API. -Daniel *This .sig left intentionally blank* ^ permalink raw reply [flat|nested] 13+ messages in thread
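One way to read the suggestion above is to replay the internal branch's own merges, one at a time, onto mainline, so that each conflict set can go to the person who knows that topic. The sketch below only lists the candidates. `list_merged_topics` is a hypothetical name, and it assumes the internal branch was built with true merge commits whose second parent is the topic that was merged in (fast-forwarded topics will not show up).

```shell
#!/bin/sh
# list_merged_topics BASE BRANCH
# Hypothetical helper: print, oldest first, the tip commit of every topic
# that was merged into BRANCH since BASE along BRANCH's first-parent line.
list_merged_topics() {
    git rev-list --first-parent --merges --reverse "$1..$2" |
    while read -r merge; do
        # Second parent of a merge commit: the side that was merged in.
        git rev-parse "$merge^2"
    done
}
```

Each printed commit could then be merged into a branch created from the new mainline release, stopping at the first conflict so it can be handed to the right person, for example: `for t in $(list_merged_topics v2.6.29 internal); do git merge "$t" || break; done`. The revision and branch names in that loop are placeholders.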
* Re: Dividing up a large merge. 2009-07-15 18:57 ` Daniel Barkalow @ 2009-07-15 21:01 ` davidb 0 siblings, 0 replies; 13+ messages in thread From: davidb @ 2009-07-15 21:01 UTC (permalink / raw) To: Daniel Barkalow; +Cc: Bryan Donlan, Git Mailing List On Wed, Jul 15, 2009 at 11:57:59AM -0700, Daniel Barkalow wrote: > It sounds to me like you're maintaining an internal version that everybody > merges their stuff into, and you periodically merge that with the mainline > kernel (generating a lot of conflicts which have to be resolved at the > same time). Instead of merging the branch that contains a lot of merges, > it would probably be easier to merge into a clone of mainline each of the > things that was merged before. That is, instead of merging less than all > of two trees, you'd merge commits which are not the newest commit on the > branch, choosing ones that individuals can resolve. That's part of it, although I have a pretty good handle on that part. The place where this comes up is that people in company X are working on an internal version and company Y are working on a similar internal version. We need to share back and forth between these more frequently than stuff gets into the mainline. We do rebase at various points, but it takes quite a bit of work, and it's fairly different work than the conflicts I'm concerned with here. Thanks, David ^ permalink raw reply [flat|nested] 13+ messages in thread