* My custom cccmd
@ 2009-10-15 13:20 Felipe Contreras
2009-10-15 20:09 ` Junio C Hamano
0 siblings, 1 reply; 8+ messages in thread
From: Felipe Contreras @ 2009-10-15 13:20 UTC (permalink / raw)
To: git
Hi,
I love the new option to run a cccmd and how good it works on the
linux kernel, but I couldn't find a generic script. So I decided to
write my own.
It's very simple, it just looks into the authors of the commits that
modified the lines being overridden (git blame). It's not checking for
s-o-b, or anything fancy.
Comments?
#!/usr/bin/env ruby
@commits = {} # keeps a count of commits per author
ARGV.each do |filename|
File.open(filename) do |patch_file|
patch_file.each_line do |patch_line|
case patch_line
when /^---\s+(\S+)/
@source = $1[2..-1]
when /^@@\s-(\d+),(\d+)/
blame = `git blame -p -L #{$1},+#{$2} #{@source} | grep author`
blame.each_line do |al|
key, value = al.chomp.split(" ", 2)
case key
when "author"
@name = value
when "author-mail"
@mail = value
author = "\"#{@name}\" #{@mail}"
@commits[author] ||= 0
@commits[author] += 1
end
end
end
end
end
end
@commits.each_key do |a|
puts a
end
--
Felipe Contreras
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: My custom cccmd
2009-10-15 13:20 My custom cccmd Felipe Contreras
@ 2009-10-15 20:09 ` Junio C Hamano
2009-10-15 21:37 ` Felipe Contreras
0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2009-10-15 20:09 UTC (permalink / raw)
To: Felipe Contreras; +Cc: git
Felipe Contreras <felipe.contreras@gmail.com> writes:
> Hi,
>
> I love the new option to run a cccmd and how good it works on the
> linux kernel, but I couldn't find a generic script. So I decided to
> write my own.
>
> It's very simple, it just looks into the authors of the commits that
> modified the lines being overridden (git blame). It's not checking for
> s-o-b, or anything fancy.
>
> Comments?
> #!/usr/bin/env ruby
>
> @commits = {} # keeps a count of commits per author
>
> ARGV.each do |filename|
> File.open(filename) do |patch_file|
> patch_file.each_line do |patch_line|
> case patch_line
> when /^---\s+(\S+)/
> @source = $1[2..-1]
> when /^@@\s-(\d+),(\d+)/
> blame = `git blame -p -L #{$1},+#{$2} #{@source} | grep author`
> blame.each_line do |al|
> key, value = al.chomp.split(" ", 2)
> case key
> when "author"
> @name = value
> when "author-mail"
> @mail = value
> author = "\"#{@name}\" #{@mail}"
> @commits[author] ||= 0
> @commits[author] += 1
> end
> end
> end
> end
> end
> end
Comments.
#0. Gaahhh, my eyes, my eyes!! Can't you do this ugly run of infinite
number of "end"s?
#1. You are not making sure that you start blaming from the commit the
patch is based on, so your -La,b line numbers can be off. If you can
assume that you are always reading format-patch output, you can learn
which commit to start from by reading the first "magic" line.
#2. If you have two patch series that updates one file twice, some
changes in your second patch could even be an update to the changes
you introduced in your first patch. After you fix issue #1, you
would probably want to fix this by excluding the commits you have
already sent the blames for.
#3. Does the number of commits you keep per author have any significance?
I know it doesn't in the implementation you posted, but should it,
and if so how?
> @commits.each_key do |a|
> puts a
> end
>
> --
> Felipe Contreras
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: My custom cccmd
2009-10-15 20:09 ` Junio C Hamano
@ 2009-10-15 21:37 ` Felipe Contreras
2009-10-25 15:04 ` Felipe Contreras
0 siblings, 1 reply; 8+ messages in thread
From: Felipe Contreras @ 2009-10-15 21:37 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On Thu, Oct 15, 2009 at 11:09 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Felipe Contreras <felipe.contreras@gmail.com> writes:
>
>> Hi,
>>
>> I love the new option to run a cccmd and how good it works on the
>> linux kernel, but I couldn't find a generic script. So I decided to
>> write my own.
>>
>> It's very simple, it just looks into the authors of the commits that
>> modified the lines being overridden (git blame). It's not checking for
>> s-o-b, or anything fancy.
>>
<snip/>
> Comments.
>
> #0. Gaahhh, my eyes, my eyes!! Can't you do this ugly run of infinite
> number of "end"s?
Hehe, sure. Will do.
> #1. You are not making sure that you start blaming from the commit the
> patch is based on, so your -La,b line numbers can be off. If you can
> assume that you are always reading format-patch output, you can learn
> which commit to start from by reading the first "magic" line.
The 'From' magic line points to the actual commit the patch was
generated from, so it would actually be @from^.
This of course would only work if the patches have the corresponding
commits in the current tree (which is the case most of the time).
And makes sense only for the first patch, the rest of the patches
would use a wrong commit as a base. See below.
> #2. If you have two patch series that updates one file twice, some
> changes in your second patch could even be an update to the changes
> you introduced in your first patch. After you fix issue #1, you
> would probably want to fix this by excluding the commits you have
> already sent the blames for.
How exactly? By looking at the commit from 'git blame' and discarding
it? That would be a bit tricky since each instance of 'cccmd' is not
aware of the previous ones.
> #3. Does the number of commits you keep per author have any significance?
> I know it doesn't in the implementation you posted, but should it,
> and if so how?
Not currently. Once I add support for s-o-b it might be useful.
Currently I left it in order to order the CC's by the count, but it
turned out to be a bit messier than I thought, and the advantage is
almost nothing.
I'll clean it up.
Taking in consideration the previous comments, here is v2:
#!/usr/bin/env ruby
@authors = {}
def parse_blame(line)
key, value = line.split(" ", 2)
case key
when "author"
@name = value
when "author-mail"
@mail = value
author = "\"#{@name}\" #{@mail}"
@authors[author] = true
end
end
ARGV.each do |filename|
patch_file = File.open(filename)
patch_file.each_line do |patch_line|
case patch_line
when /^---\s+(\S+)/
@source = $1[2..-1]
when /^@@\s-(\d+),(\d+)/
blame = `git blame -p -L #{$1},+#{$2} #{@source} | grep author`
blame.each_line { |l| parse_blame(l.chomp) }
end
end
end
@authors.each_key do |a|
puts a
end
--
Felipe Contreras
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: My custom cccmd
2009-10-15 21:37 ` Felipe Contreras
@ 2009-10-25 15:04 ` Felipe Contreras
2009-10-27 21:53 ` Junio C Hamano
0 siblings, 1 reply; 8+ messages in thread
From: Felipe Contreras @ 2009-10-25 15:04 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On Thu, Oct 15, 2009 at 11:37 PM, Felipe Contreras
<felipe.contreras@gmail.com> wrote:
> On Thu, Oct 15, 2009 at 11:09 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> #2. If you have two patch series that updates one file twice, some
>> changes in your second patch could even be an update to the changes
>> you introduced in your first patch. After you fix issue #1, you
>> would probably want to fix this by excluding the commits you have
>> already sent the blames for.
>
> How exactly? By looking at the commit from 'git blame' and discarding
> it? That would be a bit tricky since each instance of 'cccmd' is not
> aware of the previous ones.
I explored this a bit more and I've come to the conclusion that we
actually don't wand to discard the previous commits in the patch
series. Let's think about this example:
0001 <- John
0002 <- Me overriding some changes from John
In this case we want John to appear in the CC list of 0002, because we
are changing his code. If it turns out that the whole patch series was
written by me, it's still ok for me to appear on the CC list, which
would essentially be "discarding" those commits.
So all we need to do is pick @from^ as the base of the git blame:
@@ -17,11 +17,12 @@ end
ARGV.each do |filename|
patch_file = File.open(filename)
patch_file.each_line do |patch_line|
+ @from ||= patch_line[/From (\w+)/, 1]
case patch_line
when /^---\s+(\S+)/
@source = $1[2..-1]
when /^@@\s-(\d+),(\d+)/
- blame = `git blame -p -L #{$1},+#{$2} #{@source} | grep author`
+ blame = `git blame --incremental -L #{$1},+#{$2} #{@source}
#{@from}^ | grep author`
blame.each_line { |l| parse_blame(l.chomp) }
end
end
You can find the whole script here:
http://gist.github.com/218093
Cheers.
--
Felipe Contreras
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: My custom cccmd
2009-10-25 15:04 ` Felipe Contreras
@ 2009-10-27 21:53 ` Junio C Hamano
2009-10-30 8:39 ` Felipe Contreras
0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2009-10-27 21:53 UTC (permalink / raw)
To: Felipe Contreras; +Cc: git
Felipe Contreras <felipe.contreras@gmail.com> writes:
> I explored this a bit more and I've come to the conclusion that we
> actually don't wand to discard the previous commits in the patch
> series. Let's think about this example:
> 0001 <- John
> 0002 <- Me overriding some changes from John
>
> In this case we want John to appear in the CC list of 0002, because we
> are changing his code.
There are two cases: your patch partially overrides his code, and your
patch completely rewrites/removes his code.
Obviously in the former case you do want to Cc, but there are parts of his
change that remain in the final result, so this case does not matter in
this discussion. You will Cc him because of these remaining lines anyway.
In the latter case, the only contribution that remains from him is a
commit with his log message that does not help describing anything in the
end result, cluttering the history.
In such a case, I would imagine that the reviewers would want to see a
cleaned up series that does not have his patch that is irrelevant for
understanding the final result. John might want to know about it, if only
to raise objection to the way how you arranged your series. For that
reason, I think it makes sense to Cc him.
But it is a rather a convoluted logic, if you ask me. You find John and
Cc him, primarily so that he can point out that you should be redoing the
series not to have his patch as an intermediate state in the series to
begin with, because his commit does not contribute to the end result.
It might make more sense if your tool told you about such a case directly,
rather than helping you find John so that he can tell you ;-).
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: My custom cccmd
2009-10-27 21:53 ` Junio C Hamano
@ 2009-10-30 8:39 ` Felipe Contreras
2009-10-30 21:52 ` Junio C Hamano
0 siblings, 1 reply; 8+ messages in thread
From: Felipe Contreras @ 2009-10-30 8:39 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On Tue, Oct 27, 2009 at 11:53 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Felipe Contreras <felipe.contreras@gmail.com> writes:
>
>> I explored this a bit more and I've come to the conclusion that we
>> actually don't wand to discard the previous commits in the patch
>> series. Let's think about this example:
>> 0001 <- John
>> 0002 <- Me overriding some changes from John
>>
>> In this case we want John to appear in the CC list of 0002, because we
>> are changing his code.
>
> There are two cases: your patch partially overrides his code, and your
> patch completely rewrites/removes his code.
>
> Obviously in the former case you do want to Cc, but there are parts of his
> change that remain in the final result, so this case does not matter in
> this discussion. You will Cc him because of these remaining lines anyway.
>
> In the latter case, the only contribution that remains from him is a
> commit with his log message that does not help describing anything in the
> end result, cluttering the history.
>
> In such a case, I would imagine that the reviewers would want to see a
> cleaned up series that does not have his patch that is irrelevant for
> understanding the final result. John might want to know about it, if only
> to raise objection to the way how you arranged your series. For that
> reason, I think it makes sense to Cc him.
>
> But it is a rather a convoluted logic, if you ask me. You find John and
> Cc him, primarily so that he can point out that you should be redoing the
> series not to have his patch as an intermediate state in the series to
> begin with, because his commit does not contribute to the end result.
>
> It might make more sense if your tool told you about such a case directly,
> rather than helping you find John so that he can tell you ;-).
But that's not the purpose of the cccmd tool.
I agree that such "patch series simplificator" tool would be very
useful, but that's out of scope for this. Isn't it?
--
Felipe Contreras
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: My custom cccmd
2009-10-30 8:39 ` Felipe Contreras
@ 2009-10-30 21:52 ` Junio C Hamano
2009-11-02 14:25 ` Felipe Contreras
0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2009-10-30 21:52 UTC (permalink / raw)
To: Felipe Contreras; +Cc: Junio C Hamano, git
Felipe Contreras <felipe.contreras@gmail.com> writes:
> On Tue, Oct 27, 2009 at 11:53 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> Felipe Contreras <felipe.contreras@gmail.com> writes:
>>
>>> I explored this a bit more and I've come to the conclusion that we
>>> actually don't wand to discard the previous commits in the patch
>>> series. Let's think about this example:
>>> 0001 <- John
>>> 0002 <- Me overriding some changes from John
>>>
>>> In this case we want John to appear in the CC list of 0002, because we
>>> are changing his code.
>> ...
>> In such a case, I would imagine that the reviewers would want to see a
>> cleaned up series that does not have his patch that is irrelevant for
>> understanding the final result. John might want to know about it, if only
>> to raise objection to the way how you arranged your series. For that
>> reason, I think it makes sense to Cc him.
>>
>> But it is a rather a convoluted logic, if you ask me. You find John and
>> Cc him, primarily so that he can point out that you should be redoing the
>> series not to have his patch as an intermediate state in the series to
>> begin with, because his commit does not contribute to the end result.
>>
>> It might make more sense if your tool told you about such a case directly,
>> rather than helping you find John so that he can tell you ;-).
>
> But that's not the purpose of the cccmd tool.
>
> I agree that such "patch series simplificator" tool would be very
> useful, but that's out of scope for this. Isn't it?
Exactly.
So you agree that you _do_ want to "discard the previous commits in the
patch series", because not doing so would mean the result would be a
half-cooked "patch series simplificator" that tries to do something that
is outside the scope of cccmd, right?
The "discarding the previous commits" happens to match what I suggested
earlier that lead to your "explored this a bit more":
On Thu, Oct 15, 2009 at 11:37 PM, Felipe Contreras <felipe.contreras@gmail.com> wrote:
> On Thu, Oct 15, 2009 at 11:09 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> #2. If you have two patch series that updates one file twice, some
>> changes in your second patch could even be an update to the changes
>> you introduced in your first patch. After you fix issue #1, you
>> would probably want to fix this by excluding the commits you have
>> already sent the blames for.
so I think we are in agreement.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: My custom cccmd
2009-10-30 21:52 ` Junio C Hamano
@ 2009-11-02 14:25 ` Felipe Contreras
0 siblings, 0 replies; 8+ messages in thread
From: Felipe Contreras @ 2009-11-02 14:25 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On Fri, Oct 30, 2009 at 11:52 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Felipe Contreras <felipe.contreras@gmail.com> writes:
>
>> On Tue, Oct 27, 2009 at 11:53 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>> It might make more sense if your tool told you about such a case directly,
>>> rather than helping you find John so that he can tell you ;-).
>>
>> But that's not the purpose of the cccmd tool.
>>
>> I agree that such "patch series simplificator" tool would be very
>> useful, but that's out of scope for this. Isn't it?
>
> Exactly.
>
> So you agree that you _do_ want to "discard the previous commits in the
> patch series", because not doing so would mean the result would be a
> half-cooked "patch series simplificator" that tries to do something that
> is outside the scope of cccmd, right?
>
> The "discarding the previous commits" happens to match what I suggested
> earlier that lead to your "explored this a bit more":
>
> On Thu, Oct 15, 2009 at 11:37 PM, Felipe Contreras <felipe.contreras@gmail.com> wrote:
>> On Thu, Oct 15, 2009 at 11:09 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>> #2. If you have two patch series that updates one file twice, some
>>> changes in your second patch could even be an update to the changes
>>> you introduced in your first patch. After you fix issue #1, you
>>> would probably want to fix this by excluding the commits you have
>>> already sent the blames for.
>
> so I think we are in agreement.
Cool. Is this script something that would make sense in the contrib section?
--
Felipe Contreras
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2009-11-02 14:26 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-10-15 13:20 My custom cccmd Felipe Contreras
2009-10-15 20:09 ` Junio C Hamano
2009-10-15 21:37 ` Felipe Contreras
2009-10-25 15:04 ` Felipe Contreras
2009-10-27 21:53 ` Junio C Hamano
2009-10-30 8:39 ` Felipe Contreras
2009-10-30 21:52 ` Junio C Hamano
2009-11-02 14:25 ` Felipe Contreras
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).