My custom cccmd

git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* My custom cccmd
@ 2009-10-15 13:20 Felipe Contreras
  2009-10-15 20:09 ` Junio C Hamano
  0 siblings, 1 reply; 8+ messages in thread
From: Felipe Contreras @ 2009-10-15 13:20 UTC (permalink / raw)
  To: git

Hi,

I love the new option to run a cccmd and how good it works on the
linux kernel, but I couldn't find a generic script. So I decided to
write my own.

It's very simple, it just looks into the authors of the commits that
modified the lines being overridden (git blame). It's not checking for
s-o-b, or anything fancy.

Comments?

#!/usr/bin/env ruby

@commits = {} # keeps a count of commits per author

ARGV.each do |filename|
  File.open(filename) do |patch_file|
    patch_file.each_line do |patch_line|
      case patch_line
      when /^---\s+(\S+)/
        @source = $1[2..-1]
      when /^@@\s-(\d+),(\d+)/
        blame = `git blame -p -L #{$1},+#{$2} #{@source} | grep author`
        blame.each_line do |al|
          key, value = al.chomp.split(" ", 2)
          case key
          when "author"
            @name = value
          when "author-mail"
            @mail = value
            author = "\"#{@name}\" #{@mail}"
            @commits[author] ||= 0
            @commits[author] += 1
          end
        end
      end
    end
  end
end

@commits.each_key do |a|
  puts a
end

-- 
Felipe Contreras

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: My custom cccmd
  2009-10-15 13:20 My custom cccmd Felipe Contreras
@ 2009-10-15 20:09 ` Junio C Hamano
  2009-10-15 21:37   ` Felipe Contreras
  0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2009-10-15 20:09 UTC (permalink / raw)
  To: Felipe Contreras; +Cc: git

Felipe Contreras <felipe.contreras@gmail.com> writes:

> Hi,
>
> I love the new option to run a cccmd and how good it works on the
> linux kernel, but I couldn't find a generic script. So I decided to
> write my own.
>
> It's very simple, it just looks into the authors of the commits that
> modified the lines being overridden (git blame). It's not checking for
> s-o-b, or anything fancy.
>
> Comments?

> #!/usr/bin/env ruby
>
> @commits = {} # keeps a count of commits per author
>
> ARGV.each do |filename|
>   File.open(filename) do |patch_file|
>     patch_file.each_line do |patch_line|
>       case patch_line
>       when /^---\s+(\S+)/
>         @source = $1[2..-1]
>       when /^@@\s-(\d+),(\d+)/
>         blame = `git blame -p -L #{$1},+#{$2} #{@source} | grep author`
>         blame.each_line do |al|
>           key, value = al.chomp.split(" ", 2)
>           case key
>           when "author"
>             @name = value
>           when "author-mail"
>             @mail = value
>             author = "\"#{@name}\" #{@mail}"
>             @commits[author] ||= 0
>             @commits[author] += 1
>           end
>         end
>       end
>     end
>   end
> end

Comments.

 #0. Gaahhh, my eyes, my eyes!!  Can't you do this ugly run of infinite
     number of "end"s?

 #1. You are not making sure that you start blaming from the commit the
     patch is based on, so your -La,b line numbers can be off.  If you can
     assume that you are always reading format-patch output, you can learn
     which commit to start from by reading the first "magic" line.
     
 #2. If you have two patch series that updates one file twice, some
     changes in your second patch could even be an update to the changes
     you introduced in your first patch.  After you fix issue #1, you
     would probably want to fix this by excluding the commits you have
     already sent the blames for.

 #3. Does the number of commits you keep per author have any significance?
     I know it doesn't in the implementation you posted, but should it,
     and if so how?

> @commits.each_key do |a|
>   puts a
> end
>
> -- 
> Felipe Contreras
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: My custom cccmd
  2009-10-15 20:09 ` Junio C Hamano
@ 2009-10-15 21:37   ` Felipe Contreras
  2009-10-25 15:04     ` Felipe Contreras
  0 siblings, 1 reply; 8+ messages in thread
From: Felipe Contreras @ 2009-10-15 21:37 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Thu, Oct 15, 2009 at 11:09 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Felipe Contreras <felipe.contreras@gmail.com> writes:
>
>> Hi,
>>
>> I love the new option to run a cccmd and how good it works on the
>> linux kernel, but I couldn't find a generic script. So I decided to
>> write my own.
>>
>> It's very simple, it just looks into the authors of the commits that
>> modified the lines being overridden (git blame). It's not checking for
>> s-o-b, or anything fancy.
>>

<snip/>

> Comments.
>
>  #0. Gaahhh, my eyes, my eyes!!  Can't you do this ugly run of infinite
>     number of "end"s?

Hehe, sure. Will do.

>  #1. You are not making sure that you start blaming from the commit the
>     patch is based on, so your -La,b line numbers can be off.  If you can
>     assume that you are always reading format-patch output, you can learn
>     which commit to start from by reading the first "magic" line.

The 'From' magic line points to the actual commit the patch was
generated from, so it would actually be @from^.

This of course would only work if the patches have the corresponding
commits in the current tree (which is the case most of the time).

And makes sense only for the first patch, the rest of the patches
would use a wrong commit as a base. See below.

>  #2. If you have two patch series that updates one file twice, some
>     changes in your second patch could even be an update to the changes
>     you introduced in your first patch.  After you fix issue #1, you
>     would probably want to fix this by excluding the commits you have
>     already sent the blames for.

How exactly? By looking at the commit from 'git blame' and discarding
it? That would be a bit tricky since each instance of 'cccmd' is not
aware of the previous ones.

>  #3. Does the number of commits you keep per author have any significance?
>     I know it doesn't in the implementation you posted, but should it,
>     and if so how?

Not currently. Once I add support for s-o-b it might be useful.
Currently I left it in order to order the CC's by the count, but it
turned out to be a bit messier than I thought, and the advantage is
almost nothing.

I'll clean it up.

Taking in consideration the previous comments, here is v2:

#!/usr/bin/env ruby

@authors = {}

def parse_blame(line)
  key, value = line.split(" ", 2)
  case key
  when "author"
    @name = value
  when "author-mail"
    @mail = value
    author = "\"#{@name}\" #{@mail}"
    @authors[author] = true
  end
end

ARGV.each do |filename|
  patch_file = File.open(filename)
  patch_file.each_line do |patch_line|
    case patch_line
    when /^---\s+(\S+)/
      @source = $1[2..-1]
    when /^@@\s-(\d+),(\d+)/
      blame = `git blame -p -L #{$1},+#{$2} #{@source} | grep author`
      blame.each_line { |l| parse_blame(l.chomp) }
    end
  end
end

@authors.each_key do |a|
  puts a
end

-- 
Felipe Contreras

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: My custom cccmd
  2009-10-15 21:37   ` Felipe Contreras
@ 2009-10-25 15:04     ` Felipe Contreras
  2009-10-27 21:53       ` Junio C Hamano
  0 siblings, 1 reply; 8+ messages in thread
From: Felipe Contreras @ 2009-10-25 15:04 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Thu, Oct 15, 2009 at 11:37 PM, Felipe Contreras
<felipe.contreras@gmail.com> wrote:
> On Thu, Oct 15, 2009 at 11:09 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>  #2. If you have two patch series that updates one file twice, some
>>     changes in your second patch could even be an update to the changes
>>     you introduced in your first patch.  After you fix issue #1, you
>>     would probably want to fix this by excluding the commits you have
>>     already sent the blames for.
>
> How exactly? By looking at the commit from 'git blame' and discarding
> it? That would be a bit tricky since each instance of 'cccmd' is not
> aware of the previous ones.

I explored this a bit more and I've come to the conclusion that we
actually don't wand to discard the previous commits in the patch
series. Let's think about this example:
0001 <- John
0002 <- Me overriding some changes from John

In this case we want John to appear in the CC list of 0002, because we
are changing his code. If it turns out that the whole patch series was
written by me, it's still ok for me to appear on the CC list, which
would essentially be "discarding" those commits.

So all we need to do is pick @from^ as the base of the git blame:
@@ -17,11 +17,12 @@ end
 ARGV.each do |filename|
   patch_file = File.open(filename)
   patch_file.each_line do |patch_line|
+    @from ||= patch_line[/From (\w+)/, 1]
     case patch_line
     when /^---\s+(\S+)/
       @source = $1[2..-1]
     when /^@@\s-(\d+),(\d+)/
-      blame = `git blame -p -L #{$1},+#{$2} #{@source} | grep author`
+      blame = `git blame --incremental -L #{$1},+#{$2} #{@source}
#{@from}^ | grep author`
       blame.each_line { |l| parse_blame(l.chomp) }
     end
   end

You can find the whole script here:
http://gist.github.com/218093

Cheers.

-- 
Felipe Contreras

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: My custom cccmd
  2009-10-25 15:04     ` Felipe Contreras
@ 2009-10-27 21:53       ` Junio C Hamano
  2009-10-30  8:39         ` Felipe Contreras
  0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2009-10-27 21:53 UTC (permalink / raw)
  To: Felipe Contreras; +Cc: git

Felipe Contreras <felipe.contreras@gmail.com> writes:

> I explored this a bit more and I've come to the conclusion that we
> actually don't wand to discard the previous commits in the patch
> series. Let's think about this example:
> 0001 <- John
> 0002 <- Me overriding some changes from John
>
> In this case we want John to appear in the CC list of 0002, because we
> are changing his code.

There are two cases: your patch partially overrides his code, and your
patch completely rewrites/removes his code.

Obviously in the former case you do want to Cc, but there are parts of his
change that remain in the final result, so this case does not matter in
this discussion.  You will Cc him because of these remaining lines anyway.

In the latter case, the only contribution that remains from him is a
commit with his log message that does not help describing anything in the
end result, cluttering the history.

In such a case, I would imagine that the reviewers would want to see a
cleaned up series that does not have his patch that is irrelevant for
understanding the final result.  John might want to know about it, if only
to raise objection to the way how you arranged your series.  For that
reason, I think it makes sense to Cc him.

But it is a rather a convoluted logic, if you ask me.  You find John and
Cc him, primarily so that he can point out that you should be redoing the
series not to have his patch as an intermediate state in the series to
begin with, because his commit does not contribute to the end result.

It might make more sense if your tool told you about such a case directly,
rather than helping you find John so that he can tell you ;-).

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: My custom cccmd
  2009-10-27 21:53       ` Junio C Hamano
@ 2009-10-30  8:39         ` Felipe Contreras
  2009-10-30 21:52           ` Junio C Hamano
  0 siblings, 1 reply; 8+ messages in thread
From: Felipe Contreras @ 2009-10-30  8:39 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Tue, Oct 27, 2009 at 11:53 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Felipe Contreras <felipe.contreras@gmail.com> writes:
>
>> I explored this a bit more and I've come to the conclusion that we
>> actually don't wand to discard the previous commits in the patch
>> series. Let's think about this example:
>> 0001 <- John
>> 0002 <- Me overriding some changes from John
>>
>> In this case we want John to appear in the CC list of 0002, because we
>> are changing his code.
>
> There are two cases: your patch partially overrides his code, and your
> patch completely rewrites/removes his code.
>
> Obviously in the former case you do want to Cc, but there are parts of his
> change that remain in the final result, so this case does not matter in
> this discussion.  You will Cc him because of these remaining lines anyway.
>
> In the latter case, the only contribution that remains from him is a
> commit with his log message that does not help describing anything in the
> end result, cluttering the history.
>
> In such a case, I would imagine that the reviewers would want to see a
> cleaned up series that does not have his patch that is irrelevant for
> understanding the final result.  John might want to know about it, if only
> to raise objection to the way how you arranged your series.  For that
> reason, I think it makes sense to Cc him.
>
> But it is a rather a convoluted logic, if you ask me.  You find John and
> Cc him, primarily so that he can point out that you should be redoing the
> series not to have his patch as an intermediate state in the series to
> begin with, because his commit does not contribute to the end result.
>
> It might make more sense if your tool told you about such a case directly,
> rather than helping you find John so that he can tell you ;-).

But that's not the purpose of the cccmd tool.

I agree that such "patch series simplificator" tool would be very
useful, but that's out of scope for this. Isn't it?

-- 
Felipe Contreras

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: My custom cccmd
  2009-10-30  8:39         ` Felipe Contreras
@ 2009-10-30 21:52           ` Junio C Hamano
  2009-11-02 14:25             ` Felipe Contreras
  0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2009-10-30 21:52 UTC (permalink / raw)
  To: Felipe Contreras; +Cc: Junio C Hamano, git

Felipe Contreras <felipe.contreras@gmail.com> writes:

> On Tue, Oct 27, 2009 at 11:53 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> Felipe Contreras <felipe.contreras@gmail.com> writes:
>>
>>> I explored this a bit more and I've come to the conclusion that we
>>> actually don't wand to discard the previous commits in the patch
>>> series. Let's think about this example:
>>> 0001 <- John
>>> 0002 <- Me overriding some changes from John
>>>
>>> In this case we want John to appear in the CC list of 0002, because we
>>> are changing his code.
>> ...
>> In such a case, I would imagine that the reviewers would want to see a
>> cleaned up series that does not have his patch that is irrelevant for
>> understanding the final result.  John might want to know about it, if only
>> to raise objection to the way how you arranged your series.  For that
>> reason, I think it makes sense to Cc him.
>>
>> But it is a rather a convoluted logic, if you ask me.  You find John and
>> Cc him, primarily so that he can point out that you should be redoing the
>> series not to have his patch as an intermediate state in the series to
>> begin with, because his commit does not contribute to the end result.
>>
>> It might make more sense if your tool told you about such a case directly,
>> rather than helping you find John so that he can tell you ;-).
>
> But that's not the purpose of the cccmd tool.
>
> I agree that such "patch series simplificator" tool would be very
> useful, but that's out of scope for this. Isn't it?

Exactly.

So you agree that you _do_ want to "discard the previous commits in the
patch series", because not doing so would mean the result would be a
half-cooked "patch series simplificator" that tries to do something that
is outside the scope of cccmd, right?

The "discarding the previous commits" happens to match what I suggested
earlier that lead to your "explored this a bit more":

On Thu, Oct 15, 2009 at 11:37 PM, Felipe Contreras <felipe.contreras@gmail.com> wrote:
> On Thu, Oct 15, 2009 at 11:09 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>  #2. If you have two patch series that updates one file twice, some
>>     changes in your second patch could even be an update to the changes
>>     you introduced in your first patch.  After you fix issue #1, you
>>     would probably want to fix this by excluding the commits you have
>>     already sent the blames for.

so I think we are in agreement.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: My custom cccmd
  2009-10-30 21:52           ` Junio C Hamano
@ 2009-11-02 14:25             ` Felipe Contreras
  0 siblings, 0 replies; 8+ messages in thread
From: Felipe Contreras @ 2009-11-02 14:25 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Fri, Oct 30, 2009 at 11:52 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Felipe Contreras <felipe.contreras@gmail.com> writes:
>
>> On Tue, Oct 27, 2009 at 11:53 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>> It might make more sense if your tool told you about such a case directly,
>>> rather than helping you find John so that he can tell you ;-).
>>
>> But that's not the purpose of the cccmd tool.
>>
>> I agree that such "patch series simplificator" tool would be very
>> useful, but that's out of scope for this. Isn't it?
>
> Exactly.
>
> So you agree that you _do_ want to "discard the previous commits in the
> patch series", because not doing so would mean the result would be a
> half-cooked "patch series simplificator" that tries to do something that
> is outside the scope of cccmd, right?
>
> The "discarding the previous commits" happens to match what I suggested
> earlier that lead to your "explored this a bit more":
>
> On Thu, Oct 15, 2009 at 11:37 PM, Felipe Contreras <felipe.contreras@gmail.com> wrote:
>> On Thu, Oct 15, 2009 at 11:09 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>>  #2. If you have two patch series that updates one file twice, some
>>>     changes in your second patch could even be an update to the changes
>>>     you introduced in your first patch.  After you fix issue #1, you
>>>     would probably want to fix this by excluding the commits you have
>>>     already sent the blames for.
>
> so I think we are in agreement.

Cool. Is this script something that would make sense in the contrib section?

-- 
Felipe Contreras

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2009-11-02 14:26 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-10-15 13:20 My custom cccmd Felipe Contreras
2009-10-15 20:09 ` Junio C Hamano
2009-10-15 21:37   ` Felipe Contreras
2009-10-25 15:04     ` Felipe Contreras
2009-10-27 21:53       ` Junio C Hamano
2009-10-30  8:39         ` Felipe Contreras
2009-10-30 21:52           ` Junio C Hamano
2009-11-02 14:25             ` Felipe Contreras

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).