* My custom cccmd @ 2009-10-15 13:20 Felipe Contreras 2009-10-15 20:09 ` Junio C Hamano 0 siblings, 1 reply; 8+ messages in thread From: Felipe Contreras @ 2009-10-15 13:20 UTC (permalink / raw) To: git Hi, I love the new option to run a cccmd and how good it works on the linux kernel, but I couldn't find a generic script. So I decided to write my own. It's very simple, it just looks into the authors of the commits that modified the lines being overridden (git blame). It's not checking for s-o-b, or anything fancy. Comments? #!/usr/bin/env ruby @commits = {} # keeps a count of commits per author ARGV.each do |filename| File.open(filename) do |patch_file| patch_file.each_line do |patch_line| case patch_line when /^---\s+(\S+)/ @source = $1[2..-1] when /^@@\s-(\d+),(\d+)/ blame = `git blame -p -L #{$1},+#{$2} #{@source} | grep author` blame.each_line do |al| key, value = al.chomp.split(" ", 2) case key when "author" @name = value when "author-mail" @mail = value author = "\"#{@name}\" #{@mail}" @commits[author] ||= 0 @commits[author] += 1 end end end end end end @commits.each_key do |a| puts a end -- Felipe Contreras ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: My custom cccmd 2009-10-15 13:20 My custom cccmd Felipe Contreras @ 2009-10-15 20:09 ` Junio C Hamano 2009-10-15 21:37 ` Felipe Contreras 0 siblings, 1 reply; 8+ messages in thread From: Junio C Hamano @ 2009-10-15 20:09 UTC (permalink / raw) To: Felipe Contreras; +Cc: git Felipe Contreras <felipe.contreras@gmail.com> writes: > Hi, > > I love the new option to run a cccmd and how good it works on the > linux kernel, but I couldn't find a generic script. So I decided to > write my own. > > It's very simple, it just looks into the authors of the commits that > modified the lines being overridden (git blame). It's not checking for > s-o-b, or anything fancy. > > Comments? > #!/usr/bin/env ruby > > @commits = {} # keeps a count of commits per author > > ARGV.each do |filename| > File.open(filename) do |patch_file| > patch_file.each_line do |patch_line| > case patch_line > when /^---\s+(\S+)/ > @source = $1[2..-1] > when /^@@\s-(\d+),(\d+)/ > blame = `git blame -p -L #{$1},+#{$2} #{@source} | grep author` > blame.each_line do |al| > key, value = al.chomp.split(" ", 2) > case key > when "author" > @name = value > when "author-mail" > @mail = value > author = "\"#{@name}\" #{@mail}" > @commits[author] ||= 0 > @commits[author] += 1 > end > end > end > end > end > end Comments. #0. Gaahhh, my eyes, my eyes!! Can't you do this ugly run of infinite number of "end"s? #1. You are not making sure that you start blaming from the commit the patch is based on, so your -La,b line numbers can be off. If you can assume that you are always reading format-patch output, you can learn which commit to start from by reading the first "magic" line. #2. If you have two patch series that updates one file twice, some changes in your second patch could even be an update to the changes you introduced in your first patch. After you fix issue #1, you would probably want to fix this by excluding the commits you have already sent the blames for. #3. Does the number of commits you keep per author have any significance? I know it doesn't in the implementation you posted, but should it, and if so how? > @commits.each_key do |a| > puts a > end > > -- > Felipe Contreras > -- > To unsubscribe from this list: send the line "unsubscribe git" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: My custom cccmd 2009-10-15 20:09 ` Junio C Hamano @ 2009-10-15 21:37 ` Felipe Contreras 2009-10-25 15:04 ` Felipe Contreras 0 siblings, 1 reply; 8+ messages in thread From: Felipe Contreras @ 2009-10-15 21:37 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On Thu, Oct 15, 2009 at 11:09 PM, Junio C Hamano <gitster@pobox.com> wrote: > Felipe Contreras <felipe.contreras@gmail.com> writes: > >> Hi, >> >> I love the new option to run a cccmd and how good it works on the >> linux kernel, but I couldn't find a generic script. So I decided to >> write my own. >> >> It's very simple, it just looks into the authors of the commits that >> modified the lines being overridden (git blame). It's not checking for >> s-o-b, or anything fancy. >> <snip/> > Comments. > > #0. Gaahhh, my eyes, my eyes!! Can't you do this ugly run of infinite > number of "end"s? Hehe, sure. Will do. > #1. You are not making sure that you start blaming from the commit the > patch is based on, so your -La,b line numbers can be off. If you can > assume that you are always reading format-patch output, you can learn > which commit to start from by reading the first "magic" line. The 'From' magic line points to the actual commit the patch was generated from, so it would actually be @from^. This of course would only work if the patches have the corresponding commits in the current tree (which is the case most of the time). And makes sense only for the first patch, the rest of the patches would use a wrong commit as a base. See below. > #2. If you have two patch series that updates one file twice, some > changes in your second patch could even be an update to the changes > you introduced in your first patch. After you fix issue #1, you > would probably want to fix this by excluding the commits you have > already sent the blames for. How exactly? By looking at the commit from 'git blame' and discarding it? That would be a bit tricky since each instance of 'cccmd' is not aware of the previous ones. > #3. Does the number of commits you keep per author have any significance? > I know it doesn't in the implementation you posted, but should it, > and if so how? Not currently. Once I add support for s-o-b it might be useful. Currently I left it in order to order the CC's by the count, but it turned out to be a bit messier than I thought, and the advantage is almost nothing. I'll clean it up. Taking in consideration the previous comments, here is v2: #!/usr/bin/env ruby @authors = {} def parse_blame(line) key, value = line.split(" ", 2) case key when "author" @name = value when "author-mail" @mail = value author = "\"#{@name}\" #{@mail}" @authors[author] = true end end ARGV.each do |filename| patch_file = File.open(filename) patch_file.each_line do |patch_line| case patch_line when /^---\s+(\S+)/ @source = $1[2..-1] when /^@@\s-(\d+),(\d+)/ blame = `git blame -p -L #{$1},+#{$2} #{@source} | grep author` blame.each_line { |l| parse_blame(l.chomp) } end end end @authors.each_key do |a| puts a end -- Felipe Contreras ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: My custom cccmd 2009-10-15 21:37 ` Felipe Contreras @ 2009-10-25 15:04 ` Felipe Contreras 2009-10-27 21:53 ` Junio C Hamano 0 siblings, 1 reply; 8+ messages in thread From: Felipe Contreras @ 2009-10-25 15:04 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On Thu, Oct 15, 2009 at 11:37 PM, Felipe Contreras <felipe.contreras@gmail.com> wrote: > On Thu, Oct 15, 2009 at 11:09 PM, Junio C Hamano <gitster@pobox.com> wrote: >> #2. If you have two patch series that updates one file twice, some >> changes in your second patch could even be an update to the changes >> you introduced in your first patch. After you fix issue #1, you >> would probably want to fix this by excluding the commits you have >> already sent the blames for. > > How exactly? By looking at the commit from 'git blame' and discarding > it? That would be a bit tricky since each instance of 'cccmd' is not > aware of the previous ones. I explored this a bit more and I've come to the conclusion that we actually don't wand to discard the previous commits in the patch series. Let's think about this example: 0001 <- John 0002 <- Me overriding some changes from John In this case we want John to appear in the CC list of 0002, because we are changing his code. If it turns out that the whole patch series was written by me, it's still ok for me to appear on the CC list, which would essentially be "discarding" those commits. So all we need to do is pick @from^ as the base of the git blame: @@ -17,11 +17,12 @@ end ARGV.each do |filename| patch_file = File.open(filename) patch_file.each_line do |patch_line| + @from ||= patch_line[/From (\w+)/, 1] case patch_line when /^---\s+(\S+)/ @source = $1[2..-1] when /^@@\s-(\d+),(\d+)/ - blame = `git blame -p -L #{$1},+#{$2} #{@source} | grep author` + blame = `git blame --incremental -L #{$1},+#{$2} #{@source} #{@from}^ | grep author` blame.each_line { |l| parse_blame(l.chomp) } end end You can find the whole script here: http://gist.github.com/218093 Cheers. -- Felipe Contreras ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: My custom cccmd 2009-10-25 15:04 ` Felipe Contreras @ 2009-10-27 21:53 ` Junio C Hamano 2009-10-30 8:39 ` Felipe Contreras 0 siblings, 1 reply; 8+ messages in thread From: Junio C Hamano @ 2009-10-27 21:53 UTC (permalink / raw) To: Felipe Contreras; +Cc: git Felipe Contreras <felipe.contreras@gmail.com> writes: > I explored this a bit more and I've come to the conclusion that we > actually don't wand to discard the previous commits in the patch > series. Let's think about this example: > 0001 <- John > 0002 <- Me overriding some changes from John > > In this case we want John to appear in the CC list of 0002, because we > are changing his code. There are two cases: your patch partially overrides his code, and your patch completely rewrites/removes his code. Obviously in the former case you do want to Cc, but there are parts of his change that remain in the final result, so this case does not matter in this discussion. You will Cc him because of these remaining lines anyway. In the latter case, the only contribution that remains from him is a commit with his log message that does not help describing anything in the end result, cluttering the history. In such a case, I would imagine that the reviewers would want to see a cleaned up series that does not have his patch that is irrelevant for understanding the final result. John might want to know about it, if only to raise objection to the way how you arranged your series. For that reason, I think it makes sense to Cc him. But it is a rather a convoluted logic, if you ask me. You find John and Cc him, primarily so that he can point out that you should be redoing the series not to have his patch as an intermediate state in the series to begin with, because his commit does not contribute to the end result. It might make more sense if your tool told you about such a case directly, rather than helping you find John so that he can tell you ;-). ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: My custom cccmd 2009-10-27 21:53 ` Junio C Hamano @ 2009-10-30 8:39 ` Felipe Contreras 2009-10-30 21:52 ` Junio C Hamano 0 siblings, 1 reply; 8+ messages in thread From: Felipe Contreras @ 2009-10-30 8:39 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On Tue, Oct 27, 2009 at 11:53 PM, Junio C Hamano <gitster@pobox.com> wrote: > Felipe Contreras <felipe.contreras@gmail.com> writes: > >> I explored this a bit more and I've come to the conclusion that we >> actually don't wand to discard the previous commits in the patch >> series. Let's think about this example: >> 0001 <- John >> 0002 <- Me overriding some changes from John >> >> In this case we want John to appear in the CC list of 0002, because we >> are changing his code. > > There are two cases: your patch partially overrides his code, and your > patch completely rewrites/removes his code. > > Obviously in the former case you do want to Cc, but there are parts of his > change that remain in the final result, so this case does not matter in > this discussion. You will Cc him because of these remaining lines anyway. > > In the latter case, the only contribution that remains from him is a > commit with his log message that does not help describing anything in the > end result, cluttering the history. > > In such a case, I would imagine that the reviewers would want to see a > cleaned up series that does not have his patch that is irrelevant for > understanding the final result. John might want to know about it, if only > to raise objection to the way how you arranged your series. For that > reason, I think it makes sense to Cc him. > > But it is a rather a convoluted logic, if you ask me. You find John and > Cc him, primarily so that he can point out that you should be redoing the > series not to have his patch as an intermediate state in the series to > begin with, because his commit does not contribute to the end result. > > It might make more sense if your tool told you about such a case directly, > rather than helping you find John so that he can tell you ;-). But that's not the purpose of the cccmd tool. I agree that such "patch series simplificator" tool would be very useful, but that's out of scope for this. Isn't it? -- Felipe Contreras ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: My custom cccmd 2009-10-30 8:39 ` Felipe Contreras @ 2009-10-30 21:52 ` Junio C Hamano 2009-11-02 14:25 ` Felipe Contreras 0 siblings, 1 reply; 8+ messages in thread From: Junio C Hamano @ 2009-10-30 21:52 UTC (permalink / raw) To: Felipe Contreras; +Cc: Junio C Hamano, git Felipe Contreras <felipe.contreras@gmail.com> writes: > On Tue, Oct 27, 2009 at 11:53 PM, Junio C Hamano <gitster@pobox.com> wrote: >> Felipe Contreras <felipe.contreras@gmail.com> writes: >> >>> I explored this a bit more and I've come to the conclusion that we >>> actually don't wand to discard the previous commits in the patch >>> series. Let's think about this example: >>> 0001 <- John >>> 0002 <- Me overriding some changes from John >>> >>> In this case we want John to appear in the CC list of 0002, because we >>> are changing his code. >> ... >> In such a case, I would imagine that the reviewers would want to see a >> cleaned up series that does not have his patch that is irrelevant for >> understanding the final result. John might want to know about it, if only >> to raise objection to the way how you arranged your series. For that >> reason, I think it makes sense to Cc him. >> >> But it is a rather a convoluted logic, if you ask me. You find John and >> Cc him, primarily so that he can point out that you should be redoing the >> series not to have his patch as an intermediate state in the series to >> begin with, because his commit does not contribute to the end result. >> >> It might make more sense if your tool told you about such a case directly, >> rather than helping you find John so that he can tell you ;-). > > But that's not the purpose of the cccmd tool. > > I agree that such "patch series simplificator" tool would be very > useful, but that's out of scope for this. Isn't it? Exactly. So you agree that you _do_ want to "discard the previous commits in the patch series", because not doing so would mean the result would be a half-cooked "patch series simplificator" that tries to do something that is outside the scope of cccmd, right? The "discarding the previous commits" happens to match what I suggested earlier that lead to your "explored this a bit more": On Thu, Oct 15, 2009 at 11:37 PM, Felipe Contreras <felipe.contreras@gmail.com> wrote: > On Thu, Oct 15, 2009 at 11:09 PM, Junio C Hamano <gitster@pobox.com> wrote: >> #2. If you have two patch series that updates one file twice, some >> changes in your second patch could even be an update to the changes >> you introduced in your first patch. After you fix issue #1, you >> would probably want to fix this by excluding the commits you have >> already sent the blames for. so I think we are in agreement. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: My custom cccmd 2009-10-30 21:52 ` Junio C Hamano @ 2009-11-02 14:25 ` Felipe Contreras 0 siblings, 0 replies; 8+ messages in thread From: Felipe Contreras @ 2009-11-02 14:25 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On Fri, Oct 30, 2009 at 11:52 PM, Junio C Hamano <gitster@pobox.com> wrote: > Felipe Contreras <felipe.contreras@gmail.com> writes: > >> On Tue, Oct 27, 2009 at 11:53 PM, Junio C Hamano <gitster@pobox.com> wrote: >>> It might make more sense if your tool told you about such a case directly, >>> rather than helping you find John so that he can tell you ;-). >> >> But that's not the purpose of the cccmd tool. >> >> I agree that such "patch series simplificator" tool would be very >> useful, but that's out of scope for this. Isn't it? > > Exactly. > > So you agree that you _do_ want to "discard the previous commits in the > patch series", because not doing so would mean the result would be a > half-cooked "patch series simplificator" that tries to do something that > is outside the scope of cccmd, right? > > The "discarding the previous commits" happens to match what I suggested > earlier that lead to your "explored this a bit more": > > On Thu, Oct 15, 2009 at 11:37 PM, Felipe Contreras <felipe.contreras@gmail.com> wrote: >> On Thu, Oct 15, 2009 at 11:09 PM, Junio C Hamano <gitster@pobox.com> wrote: >>> #2. If you have two patch series that updates one file twice, some >>> changes in your second patch could even be an update to the changes >>> you introduced in your first patch. After you fix issue #1, you >>> would probably want to fix this by excluding the commits you have >>> already sent the blames for. > > so I think we are in agreement. Cool. Is this script something that would make sense in the contrib section? -- Felipe Contreras ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2009-11-02 14:26 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-10-15 13:20 My custom cccmd Felipe Contreras 2009-10-15 20:09 ` Junio C Hamano 2009-10-15 21:37 ` Felipe Contreras 2009-10-25 15:04 ` Felipe Contreras 2009-10-27 21:53 ` Junio C Hamano 2009-10-30 8:39 ` Felipe Contreras 2009-10-30 21:52 ` Junio C Hamano 2009-11-02 14:25 ` Felipe Contreras
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).