* Removing useless merge commit with "filter-branch"
From: Anatol Pomozov @ 2012-03-08 23:21 UTC
To: Git Mailing List

Hi,

I have a large project (~100K commits) and I need to split part of it
into a separate project. What I usually do in this case is

    git filter-branch --prune-empty --index-filter 'git rm -rfq --cached --ignore-unmatch UNNEEDED_DIRECTORIES' HEAD

which works more or less fine for me.

The original project has a lot of merge commits (don't ask me why).
Basically every non-merge commit is merged back into the master branch
instead of being rebased on top of master. In the command above I use
the --prune-empty parameter, which removes empty commits but not their
merge points. This leaves a lot of "useless commit points" like this:

    |
    o     - merge commit that previously merged feature X
    |\
    | \
    |  \
    o   |  - real commit
    |   |
    |  /
    |/
    |

To me, such merge leftovers are completely useless and I would like to
remove them. Actually this task can be split into 2 steps:

1) Remove useless parents. A useless parent is one that points to a
   commit that is *already* reachable from some other parent. This step
   converts useless merge points into regular empty commits.
2) Run filter-branch with --prune-empty, which removes such empty
   commits.

So my questions are:

1) What is the best way to remove "useless parents" as in the algorithm
   above?
2) Should such behavior (removing useless parents/merge commits) be
   enabled when the --prune-empty flag is used?
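
Step 1 above can be illustrated on a toy graph. The following is only a sketch of the "useless parent" rule as the message states it, not anything git itself runs; the `GRAPH` data and the helper names `ancestors` and `independent_parents` are hypothetical and exist purely for illustration.

```ruby
# Toy commit graph: each commit name maps to its list of parents.
# Hypothetical data mirroring a filtered history in which the side branch
# collapsed, so merge M points at both B and B's ancestor A.
GRAPH = {
  'A' => [],
  'B' => ['A'],
  'M' => ['B', 'A']
}.freeze

# All commits reachable from +start+ (including +start+ itself).
def ancestors(graph, start)
  seen = {}
  stack = [start]
  until stack.empty?
    commit = stack.pop
    next if seen[commit]
    seen[commit] = true
    stack.concat(graph.fetch(commit))
  end
  seen.keys
end

# Keep only the parents that are not reachable from any other parent.
def independent_parents(graph, parents)
  parents.uniq.reject do |candidate|
    (parents - [candidate]).any? do |other|
      ancestors(graph, other).include?(candidate)
    end
  end
end

puts independent_parents(GRAPH, GRAPH['M']).inspect
```

With this data, parent A is dropped because it is already reachable through B, so M is left with a single parent and becomes a candidate for step 2.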

* Re: Removing useless merge commit with "filter-branch"
From: Junio C Hamano @ 2012-03-08 23:30 UTC
To: Anatol Pomozov; +Cc: Git Mailing List

Anatol Pomozov <anatol.pomozov@gmail.com> writes:

> |
> o     - merge commit that previously merged feature X
> |\
> | \
> |  \
> o   |  - real commit
> |   |
> |  /
> |/
> |

It is unclear how many commits are drawn in the above picture and
what "feature X" is about. Care to redraw the commit DAG to explain
what you are trying to do a bit better?

The way I read it is that you start from a history like this (note
that when we draw an ascii art history we often write it sideways,
time flows from left to right):

    ---A-----B-----M---
        \         /
         C-------D

where a side branch to implement "feature X", consisting of C and D,
forked at A and was merged at M after somebody else committed B on the
mainline. When you filtered out some parts of the tree, it turned out
that C and D are totally uninteresting because their changes touch
parts outside of your interest, i.e. the history is:

    ---A-----B-----M---
        \         /
         o-------o

where the 'o' commits are now no-ops.

Is that what you are talking about?

I think "log --simplify-merges A..M -- path" may already have logic
that deals with this, so it may help if you study what it does and
how it does what it does.

* Re: Removing useless merge commit with "filter-branch"
From: Anatol Pomozov @ 2012-03-13 22:27 UTC
To: Junio C Hamano; +Cc: Git Mailing List

Hi

On Thu, Mar 8, 2012 at 3:30 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Anatol Pomozov <anatol.pomozov@gmail.com> writes:
>
>> |
>> o     - merge commit that previously merged feature X
>> |\
>> | \
>> |  \
>> o   |  - real commit
>> |   |
>> |  /
>> |/
>> |
>
> It is unclear how many commits are drawn in the above picture and
> what "feature X" is about. Care to redraw the commit DAG to explain
> what you are trying to do a bit better?
>
> The way I read it is that you start from a history like this (note
> that when we draw an ascii art history we often write it sideways,
> time flows from left to right):
>
>     ---A-----B-----M---
>         \         /
>          C-------D
>
> where a side branch to implement "feature X", consisting of C and D,
> forked at A and was merged at M after somebody else committed B on the
> mainline. When you filtered out some parts of the tree, it turned out
> that C and D are totally uninteresting because their changes touch
> parts outside of your interest, i.e. the history is:
>
>     ---A-----B-----M---
>         \         /
>          o-------o
>
> where the 'o' commits are now no-ops.
>
> Is that what you are talking about?

Yes. In fact the --prune-empty flag removes the empty commits, so the
history looks like

    -----A-------B-------M--------
          \             /
           -------------

So M is a merge that has 2 parents, A and B. I would like to remove
this merge M and leave the history as

    -----A-----B-----

as only these commits have changes in the library that I am trying to
extract.

I think some trickery with "git filter-branch --parent-filter" should
help here.

First one runs filter-branch with --parent-filter and removes useless
parents from merges (in this example it will be the parent link
A---M); this converts such merges into regular empty commits.

Then one runs filter-branch one more time with --prune-empty, which
removes the empty commits.

> I think "log --simplify-merges A..M -- path" may already have logic
> that deals with this, so it may help if you study what it does and
> how it does what it does.
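
The ordering of the two passes matters, and a small model shows why. The sketch below assumes, as the thread describes, that pruning only drops a commit with exactly one parent whose tree is identical to that parent's tree. The `Commit` struct, the tree ids `t1`/`t2`, and `prune_empty` are hypothetical stand-ins, not git internals.

```ruby
# Hypothetical commit record: an id, its parent ids, and an opaque tree id.
Commit = Struct.new(:id, :parents, :tree)

# Drop every single-parent commit whose tree equals its parent's tree and
# re-point later commits at the surviving ancestor. Commits must be listed
# parents-first (topological order).
def prune_empty(commits)
  by_id = {}     # surviving commits, keyed by id
  replaced = {}  # pruned id => id of the commit that stands in for it
  commits.each do |c|
    parents = c.parents.map { |p| replaced.fetch(p, p) }
    if parents.size == 1 && by_id.fetch(parents.first).tree == c.tree
      replaced[c.id] = parents.first
    else
      by_id[c.id] = Commit.new(c.id, parents, c.tree)
    end
  end
  by_id.values
end

# M still has two parents: it survives even though its tree equals B's.
as_merge = [
  Commit.new('A', [], 't1'),
  Commit.new('B', ['A'], 't2'),
  Commit.new('M', ['B', 'A'], 't2')
]

# After the redundant parent A is removed, M is an ordinary empty commit.
as_plain = [
  Commit.new('A', [], 't1'),
  Commit.new('B', ['A'], 't2'),
  Commit.new('M', ['B'], 't2')
]

puts prune_empty(as_merge).map(&:id).inspect
puts prune_empty(as_plain).map(&:id).inspect
```

In this model the first call keeps A, B and M, while the second keeps only A and B, which matches the `-----A-----B-----` history asked for above.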

* Re: Removing useless merge commit with "filter-branch"
From: Anatol Pomozov @ 2012-03-29 18:26 UTC
To: Junio C Hamano; +Cc: Git Mailing List

Hi,

I solved my issue by using "git filter-branch --parent-filter". The
idea is to visit all commits and remove all "dependent" parents. To
find the independent parents I used the "git show-branch --independent
PARENT_1..PARENT_N" command. In my case (we have a lot of short-term
development branches) this converted ~80% of all merges to "empty
non-merge commits", and later "git filter-branch --prune-empty" removed
them. This made the history of my project much simpler and more linear.

I think such functionality should be available in git and enabled when
the "--prune-empty" flag is used, so that "--prune-empty" removes not
only simple commits but also empty, useless merge commits. Or maybe add
a --prune-empty-merges flag?

Anyway, here is the script that I use; future readers might find it
useful:

    $ git filter-branch -f --prune-empty --parent-filter PATH_TO/rewrite_parent.rb master

    $ cat rewrite_parent.rb
    #!/usr/bin/ruby

    # filter-branch passes the current parents on stdin as
    # "-p <sha> -p <sha> ..." (empty for a root commit).
    old_parents = gets.chomp.gsub('-p ', ' ')

    if old_parents.empty? then
      new_parents = []
    else
      # Keep only parents that are not reachable from another parent.
      new_parents = `git show-branch --independent #{old_parents}`.split
    end

    # Print the new parent string in the same "-p <sha>" format.
    puts new_parents.map{|p| '-p ' + p}.join(' ')

Most likely the script can be rewritten as a one-line shell script.
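
For readers who want to try the parsing and formatting logic of rewrite_parent.rb without running filter-branch, here is a hedged variant that separates the string handling from the git call. The names `parse_parents`, `rewrite_parents` and the `resolver` parameter are hypothetical additions, not part of the original script. Two behaviors are assumptions of this sketch rather than something the thread specifies: it skips the git call when there are fewer than two parents, and it keeps the surviving parents in their original order.

```ruby
# Turn filter-branch's parent string ("-p <sha> -p <sha> ...") into a list.
def parse_parents(line)
  line.to_s.split.reject { |token| token == '-p' }
end

# Rebuild the parent string, keeping only the ids the resolver returns.
# +resolver+ is any callable that takes an array of ids and returns the
# independent subset; the default shells out to git as the script does.
def rewrite_parents(line, resolver = ->(ids) { `git show-branch --independent #{ids.join(' ')}`.split })
  ids = parse_parents(line)
  # Assumption: zero or one parent needs no independence check.
  kept = ids.size > 1 ? resolver.call(ids) : ids
  # Array intersection preserves the original parent order.
  (ids & kept).map { |id| "-p #{id}" }.join(' ')
end

# Stub resolver for illustration: pretends 'aaa' is an ancestor of 'bbb'.
stub = ->(ids) { ids - ['aaa'] }
puts rewrite_parents('-p bbb -p aaa', stub)
```

As a parent filter, the same function could be driven with `puts rewrite_parents($stdin.read)`, which falls back to the real git command.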