* Re: How best to handle multiple-authorship commits in GIT?
2012-02-02 12:25 How best to handle multiple-authorship commits in GIT? David Howells
@ 2012-02-02 13:41 ` Frans Klaver
2012-02-02 18:00 ` Valerie Aurora
` (2 subsequent siblings)
3 siblings, 0 replies; 11+ messages in thread
From: Frans Klaver @ 2012-02-02 13:41 UTC (permalink / raw)
To: David Howells; +Cc: git, valerie.aurora
Hi,
On Thu, Feb 2, 2012 at 1:25 PM, David Howells <dhowells@redhat.com> wrote:
>
> Hi,
>
> I've been assigned a stack of patches to maintain and try and get upstream by
> my employer. Most of the patches currently have the authorship set to Val,
> but since I'll be maintaining them if they go in upstream and I've changed
> them a lot, I feel I should reassign the author field to myself so people
> pester me rather than Val with questions about them. However, I don't want to
> deny Val or any other contributor credit for their work on the patches.
>
> I can see a number of ways of doing this, and am wondering which will be best:
>
> (1) Ascribe multiple authorship directly in the commit. I suspect this would
> require a change to GIT and its associated tools. That way I could put my
> name in the priority pestering spot, but doing a search on authorship
> would still credit Val and others.
>
> (2) Add an extra tag 'Originally-authored-by' (or maybe 'Coauthored-by' as I
> saw someone recommend) in amongst the 'Signed-off-by' list. But that
> doesn't give them credit in a gitweb search without changing gitweb.
>
> (3) Don't actually modify Val's commits to bring them up to date, but rather
> create a historical GIT tree with Val's commits committed as-are and then
> add my changes to the top in a number of large merge commits (there have
> been multiple major breakages due to different merge windows).
>
> I dislike this approach because it doesn't produce a nice set of patches I
> can give to someone to review (which is a must). Plus, for the most part,
> it's actually easier to port Val's patches individually.
>
> Can GIT be modified to do (1)? Gitweb's display need only show one of the
> authors in the single-row-per-patch list mode, but should find a patch by any
> of the authors in an author search and should display all the authors in the
> commit display.
I always thought of the author field as being an indication of who is
ultimately responsible for its implementation (the one in the
pestering spot). (1) may seem desirous, but doesn't (2) seem like a
cleaner and more maintainable solution? Gitweb will show the entire
log message if people are interested in the exact change, right?
Cheers,
Frans
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: How best to handle multiple-authorship commits in GIT?
2012-02-02 12:25 How best to handle multiple-authorship commits in GIT? David Howells
2012-02-02 13:41 ` Frans Klaver
@ 2012-02-02 18:00 ` Valerie Aurora
2012-02-02 19:57 ` Jakub Narebski
2012-02-02 18:36 ` David Howells
2012-02-02 20:33 ` David Howells
3 siblings, 1 reply; 11+ messages in thread
From: Valerie Aurora @ 2012-02-02 18:00 UTC (permalink / raw)
To: David Howells; +Cc: git@vger.kernel.org, dhowells@redhat.com
On Feb 2, 2012, at 4:25, David Howells <dhowells@redhat.com> wrote:
>
> Hi,
>
> I've been assigned a stack of patches to maintain and try and get upstream by
> my employer. Most of the patches currently have the authorship set to Val,
> but since I'll be maintaining them if they go in upstream and I've changed
> them a lot, I feel I should reassign the author field to myself so people
> pester me rather than Val with questions about them. However, I don't want to
> deny Val or any other contributor credit for their work on the patches.
>
> I can see a number of ways of doing this, and am wondering which will be best:
>
> (1) Ascribe multiple authorship directly in the commit. I suspect this would
> require a change to GIT and its associated tools. That way I could put my
> name in the priority pestering spot, but doing a search on authorship
> would still credit Val and others.
>
> (2) Add an extra tag 'Originally-authored-by' (or maybe 'Coauthored-by' as I
> saw someone recommend) in amongst the 'Signed-off-by' list. But that
> doesn't give them credit in a gitweb search without changing gitweb.
>
> (3) Don't actually modify Val's commits to bring them up to date, but rather
> create a historical GIT tree with Val's commits committed as-are and then
> add my changes to the top in a number of large merge commits (there have
> been multiple major breakages due to different merge windows).
>
> I dislike this approach because it doesn't produce a nice set of patches I
> can give to someone to review (which is a must). Plus, for the most part,
> it's actually easier to port Val's patches individually.
>
> Can GIT be modified to do (1)? Gitweb's display need only show one of the
> authors in the single-row-per-patch list mode, but should find a patch by any
> of the authors in an author search and should display all the authors in the
> commit display.
>
> David
Thanks, David! I had the same trouble with my set: while I entirely rewrote some patches, I still felt Jan Blunck deserved primary credit. I don't recall my solution, but I'm fine with mentioning my name in the commit message (and I think Jan should get credit too).
In general, this is a big problem for motivating contributors in other cases. Some maintainers have a habit of trivially rewriting patches so that, technically, no line is the same, then taking authorship and giving the actual author an ambiguous Signed-off-by. David hasn't done this here, of course - these are major rewrites - but when someone does all the hard work of finding and fixing a problem, the credit shouldn't go to the person who prettied it up. There is a line in the kernel doc saying how this should be handled, suggested by Rusty, but it's not being followed.
First class support for multiple authorship would be a big way to motivate contributors.
-VAL
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: How best to handle multiple-authorship commits in GIT?
2012-02-02 18:00 ` Valerie Aurora
@ 2012-02-02 19:57 ` Jakub Narebski
0 siblings, 0 replies; 11+ messages in thread
From: Jakub Narebski @ 2012-02-02 19:57 UTC (permalink / raw)
To: Valerie Aurora; +Cc: David Howells, git@vger.kernel.org
Valerie Aurora <valerie.aurora@gmail.com> writes:
> [...] I had the same trouble with my set: while I entirely
> rewrote some patches, I still felt Jan Blunck deserved primary
> credit. I don't recall my solution, but I'm fine with mentioning my
> name in the commit message (and I think Jan should get credit too).
That's what various *-by headers are for. Signed-off-by is for
provenance.
Nb. you can search the whole commit message in gitweb, not only author
or committer.
> In general, this is a big problem for motivating contributors in
> other cases. Some maintainers have a habit of trivially rewriting
> patches so that, technically, no line is the same, then taking
> authorship and giving the actual author an ambiguous Signed-off-by.
Maybe it was cause by tools accidentally stealing authorship? With
"git commit --amend --author=..." it is now easy to add authorship
back.
> David hasn't done this here, of course - these are major rewrites -
> but when someone does all the hard work of finding and fixing a
> problem, the credit shouldn't go to the person who prettied it up.
> There is a line in the kernel doc saying how this should be handled,
> suggested by Rusty, but it's not being followed.
Link?
> First class support for multiple authorship would be a big way to
> motivate contributors.
Well, multi-line commit headers were only recently added to git (when
adding signed pull / singed commit stuff), but I think in many places
git assumes single authorship, and it would be hard to change...
There was some workaround that people doing pair programming invented,
IIRC...
--
Jakub Narebski
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: How best to handle multiple-authorship commits in GIT?
2012-02-02 12:25 How best to handle multiple-authorship commits in GIT? David Howells
2012-02-02 13:41 ` Frans Klaver
2012-02-02 18:00 ` Valerie Aurora
@ 2012-02-02 18:36 ` David Howells
2012-02-03 2:18 ` Valerie Aurora
2012-02-03 13:47 ` David Howells
2012-02-02 20:33 ` David Howells
3 siblings, 2 replies; 11+ messages in thread
From: David Howells @ 2012-02-02 18:36 UTC (permalink / raw)
To: Valerie Aurora; +Cc: dhowells, git@vger.kernel.org
Valerie Aurora <valerie.aurora@gmail.com> wrote:
> There is a line in the kernel doc saying how this should be handled,
> suggested by Rusty, but it's not being followed.
Do you know where?
David
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: How best to handle multiple-authorship commits in GIT?
2012-02-02 18:36 ` David Howells
@ 2012-02-03 2:18 ` Valerie Aurora
2012-02-03 5:11 ` Junio C Hamano
2012-02-03 13:47 ` David Howells
1 sibling, 1 reply; 11+ messages in thread
From: Valerie Aurora @ 2012-02-03 2:18 UTC (permalink / raw)
To: David Howells; +Cc: git@vger.kernel.org, Rusty Russell
On Thu, Feb 2, 2012 at 10:36 AM, David Howells <dhowells@redhat.com> wrote:
> Valerie Aurora <valerie.aurora@gmail.com> wrote:
>
>> There is a line in the kernel doc saying how this should be handled,
>> suggested by Rusty, but it's not being followed.
>
> Do you know where?
A fault in my memory - I asked Rusty for advice on a related problem
and he gave me additional advice, he didn't suggest (so far as I know)
the changes to the kernel docs. Here's what SubmittingPatches has to
say:
"If you are a subsystem or branch maintainer, sometimes you need to slightly
modify patches you receive in order to merge them, because the code is not
exactly the same in your tree and the submitters'. If you stick strictly to
rule (c), you should ask the submitter to rediff, but this is a totally
counter-productive waste of time and energy. Rule (b) allows you to adjust
the code, but then it is very impolite to change one submitter's code and
make him endorse your bugs. To solve this problem, it is recommended that
you add a line between the last Signed-off-by header and yours, indicating
the nature of your changes. While there is nothing mandatory about this, it
seems like prepending the description with your mail and/or name, all
enclosed in square brackets, is noticeable enough to make it obvious that
you are responsible for last-minute changes. Example :
Signed-off-by: Random J Developer <random@developer.example.org>
[lucky@maintainer.example.org: struct foo moved from foo.c to foo.h]
Signed-off-by: Lucky K Maintainer <lucky@maintainer.example.org>
This practise is particularly helpful if you maintain a stable branch and
want at the same time to credit the author, track changes, merge the fix,
and protect the submitter from complaints. Note that under no circumstances
can you change the author's identity (the From header), as it is the one
which appears in the changelog."
And Rusty's practice as of a year or two ago is for "minor mods" to a
patch, to leave the authorship the same, and add a Signed-off-by:
Signed-off-by: Some Upstream Author
Signed-off-by: Maintainer or Merger (rewrote error handling)
And for a complete (meaningful) rewrite such as David has done, he
changes the commit authorship and adds a Signed-off-by for the
original author.
That's existing guidelines and practice.
-VAL
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: How best to handle multiple-authorship commits in GIT?
2012-02-03 2:18 ` Valerie Aurora
@ 2012-02-03 5:11 ` Junio C Hamano
0 siblings, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2012-02-03 5:11 UTC (permalink / raw)
To: Valerie Aurora; +Cc: David Howells, git@vger.kernel.org, Rusty Russell
Valerie Aurora <valerie.aurora@gmail.com> writes:
> And Rusty's practice as of a year or two ago is for "minor mods" to a
> patch, to leave the authorship the same, and add a Signed-off-by:
>
> Signed-off-by: Some Upstream Author
> Signed-off-by: Maintainer or Merger (rewrote error handling)
>
> And for a complete (meaningful) rewrite such as David has done, he
> changes the commit authorship and adds a Signed-off-by for the
> original author.
>
> That's existing guidelines and practice.
All sounds very sensible. Thanks for a summary.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: How best to handle multiple-authorship commits in GIT?
2012-02-02 18:36 ` David Howells
2012-02-03 2:18 ` Valerie Aurora
@ 2012-02-03 13:47 ` David Howells
2012-02-03 18:27 ` Junio C Hamano
2012-02-03 19:49 ` Valerie Aurora
1 sibling, 2 replies; 11+ messages in thread
From: David Howells @ 2012-02-03 13:47 UTC (permalink / raw)
To: Valerie Aurora; +Cc: dhowells, git@vger.kernel.org, Rusty Russell
Valerie Aurora <valerie.aurora@gmail.com> wrote:
> And for a complete (meaningful) rewrite such as David has done, he
> changes the commit authorship and adds a Signed-off-by for the
> original author.
Val[*] hasn't signed off all her patches, and indeed I've merged together some
patches that she has signed off and some she hasn't. I can't simply add
Signed-off-by her without her permission. However, if she's willing for me to
add such lines, then I can do so.
> Signed-off-by: Some Upstream Author
> Signed-off-by: Maintainer or Merger (rewrote error handling)
And if the changes are more than can be put in what's left of the line? I
would've thought it would make more sense to do something like:
Signed-off-by: Valerie Aurora <valerie.aurora@gmail.com> (Original author)
Signed-off-by: David Howells <dhowells@redhat.com> (Further development)
David
[*] Apologies for talking about/to you in the third person, Val.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: How best to handle multiple-authorship commits in GIT?
2012-02-03 13:47 ` David Howells
@ 2012-02-03 18:27 ` Junio C Hamano
2012-02-03 19:49 ` Valerie Aurora
1 sibling, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2012-02-03 18:27 UTC (permalink / raw)
To: David Howells; +Cc: Valerie Aurora, git@vger.kernel.org, Rusty Russell
David Howells <dhowells@redhat.com> writes:
> Valerie Aurora <valerie.aurora@gmail.com> wrote:
>
>> And for a complete (meaningful) rewrite such as David has done, he
>> changes the commit authorship and adds a Signed-off-by for the
>> original author.
>
> Val[*] hasn't signed off all her patches, and indeed I've merged together some
> patches that she has signed off and some she hasn't. I can't simply add
> Signed-off-by her without her permission. However, if she's willing for me to
> add such lines, then I can do so.
>
>> Signed-off-by: Some Upstream Author
>> Signed-off-by: Maintainer or Merger (rewrote error handling)
>
> And if the changes are more than can be put in what's left of the line? I
> would've thought it would make more sense to do something like:
>
> Signed-off-by: Valerie Aurora <valerie.aurora@gmail.com> (Original author)
> Signed-off-by: David Howells <dhowells@redhat.com> (Further development)
>
> David
That all sounds sensible.
I personally think the "recognition" factor Valerie alluded to in one of
her earlier message is a real and important issue, but I do not think
adding arbitrary number of "author" headers to the commit object would
help very much to solve it, for various reasons:
* While we made it easy to run "git shortlog -s -n --since=3.months" and
congratulate himself with "I now am the third most active person!" for
anybody, Git itself does not ship an equally easy way to analyze other
kinds of contributions to your project. I am merely a bystander, but
if I recall correctly, there were discussions on how to recognize
contributions by bug-reporters and testers using the history stored in
Git on the kernel list. The types of contribution you would want to
recognize however would be different from project to project. For that
kind of analysis, you would be better off doing something like what
lwn.net does, mining the text from the message part of the log.
* Even if we limit the issue to "who wrote X" (replace X with the name of
any piece of software), taking "author" field as anything more than an
approximation would be asking for a trouble. Not all patches are of
equal impact and importance.
* You would also have to think about how you would present "git shortlog"
output if you updated Git to record more than one "author" field in the
commit header. If Valerie wrote 27 patches by herself, 33 patches
together with you sitting next to each other, 17 patches with somebody
else, how would the entries for her, you and the third person look
like? Or would combinations of "Valerie & David", "Valerie & the
third person", etc. have separate entries in the output?
In short, I would say that you should take the name recorded in the
"author" field nothing more than the primary contact for a particular
commit to be used in case others have question on it later.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: How best to handle multiple-authorship commits in GIT?
2012-02-03 13:47 ` David Howells
2012-02-03 18:27 ` Junio C Hamano
@ 2012-02-03 19:49 ` Valerie Aurora
1 sibling, 0 replies; 11+ messages in thread
From: Valerie Aurora @ 2012-02-03 19:49 UTC (permalink / raw)
To: David Howells; +Cc: git@vger.kernel.org, Rusty Russell
On Fri, Feb 3, 2012 at 5:47 AM, David Howells <dhowells@redhat.com> wrote:
> Valerie Aurora <valerie.aurora@gmail.com> wrote:
>
>> And for a complete (meaningful) rewrite such as David has done, he
>> changes the commit authorship and adds a Signed-off-by for the
>> original author.
>
> Val[*] hasn't signed off all her patches, and indeed I've merged together some
> patches that she has signed off and some she hasn't. I can't simply add
> Signed-off-by her without her permission. However, if she's willing for me to
> add such lines, then I can do so.
Cc me on the next posting and I'll review and add my Signed-off-by if
it's missing anywhere it should be? Or send them privately, whichever
you prefer.
-VAL
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: How best to handle multiple-authorship commits in GIT?
2012-02-02 12:25 How best to handle multiple-authorship commits in GIT? David Howells
` (2 preceding siblings ...)
2012-02-02 18:36 ` David Howells
@ 2012-02-02 20:33 ` David Howells
3 siblings, 0 replies; 11+ messages in thread
From: David Howells @ 2012-02-02 20:33 UTC (permalink / raw)
To: Frans Klaver; +Cc: dhowells, git, valerie.aurora
Frans Klaver <fransklaver@gmail.com> wrote:
> I always thought of the author field as being an indication of who is
> ultimately responsible for its implementation (the one in the
> pestering spot).
Define 'ultimate responsibility for an implementation'. I'm further developing
patches that Val has (at least partially) implemented. By Val's admission some
of the patches she further developed beyond what Jan Blunck had implemented.
The chain may extend further. To that end all three of us are authors of some
of the patches.
However, if you meant 'maintenance' rather than 'implementation', then, yes,
that would be me (for the moment at least). But if that's the case, then
shouldn't it be 'Maintainer' and not 'Author'? And, besides, that's what the
MAINTAINERS file is for.
> (1) may seem desirous, but doesn't (2) seem like a cleaner and more
> maintainable solution?
No. I would say that properly supporting multiple authors in the commit object
is the cleaner solution. It's not the *easier* solution, however, and would
require an upgrade to the version of GIT used to parse these commits. That
I'll grant you.
> Gitweb will show the entire log message if people are interested in the
> exact change, right?
But if I say to Gitweb "show me the patches authored by Val" it will *not* turn
up these patches, and in that way will deny Val credit. Yes, you can see that
Val altered that patch if you look at that patch directly - but you have to
know where to go and look, in which case you already know or suspect that Val
is credited with patches in that area.
So to make (2) work, Gitweb needs to search for the additional authoring fields
when asked to credit people with the patches they've worked on.
Similarly gitk and possibly other tools would need to do the same.
*That* would be fine by me, I suppose. I don't think it's the correct way to
do it, but it might be the logical way since this wasn't build in from the
beginning - and the main thing would be to turn up the prior or joint
authorship to author-based searches.
David
^ permalink raw reply [flat|nested] 11+ messages in thread