* Cleaning up log messages
From: Jon Smirl @ 2008-07-27 17:50 UTC
To: Git Mailing List

I was playing around with git log for the kernel and observed that
there is a lot of noise when trying to do statistics on the number of
commits.

For example:

Author: Greg K-H <gregkh@suse.de>
Author: Greg KH <gregkh@suse.de>
Author: Greg KH <greg@kroah.com>
Author: Greg KH <greg@press.(none)>
Author: gregkh@suse.de <gregkh@suse.de>
Author: Greg Kroah-Hartman <gregkh@suse>
Author: Greg Kroah-Hartman <gregkh@suse.de>
Author: Greg Kroah-Hartman <greg@kroah.com>

I don't see an obvious way to do this with git, but it would be neat
to have a 'clean' option on git log that would take each email address
(author, signed-off, acked, etc) and map it through a table which
would convert old email addresses into the current one and also
standardize the formatting of the names. A cleaned log would be
altered on display, but just don't clean it if you want the original.

Of course this initial map would need to be built by hand. New commits
could be checked against the map and the map updated if the person
really has a new email address.

Checking new commits against the map would help clean things up going
forward. checkpatch.pl could also validate against the mapping file.

No pressing need for this, it would just be a nice toy.

--
Jon Smirl
jonsmirl@gmail.com

* Re: Cleaning up log messages
From: Johannes Schindelin @ 2008-07-27 18:01 UTC
To: Jon Smirl
Cc: Git Mailing List

Hi,

On Sun, 27 Jul 2008, Jon Smirl wrote:

> I was playing around with git log for the kernel and observed that there
> is a lot of noise when trying to do statistics on the number of commits.
>
> For example:
>
> Author: Greg K-H <gregkh@suse.de>
> Author: Greg KH <gregkh@suse.de>
> Author: Greg KH <greg@kroah.com>
> Author: Greg KH <greg@press.(none)>
> Author: gregkh@suse.de <gregkh@suse.de>
> Author: Greg Kroah-Hartman <gregkh@suse>
> Author: Greg Kroah-Hartman <gregkh@suse.de>
> Author: Greg Kroah-Hartman <greg@kroah.com>
>
> I don't see an obvious way to do this with git, but it would be neat
> to have a 'clean' option on git log that would take each email address
> (author, signed-off, acked, etc) and map it through a table which
> would convert old email addresses into the current one and also
> standardize the formatting of the names.

Something like .mailmap?

And to show the mapped author name instead of the committed one, you would
use "--pretty=format:%aN"? (Needs 1.6.0-rc0 at least, IIRC)

Ciao,
Dscho

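A minimal sketch of the suggestion above, for anyone following along. The
.mailmap lines in the comment use the simple "Proper Name <address>" form
with two of the addresses from the list; the helper name count_authors is
made up for illustration and is not part of git.

```shell
# Hypothetical .mailmap entries, kept in the top level of the work tree,
# one "Proper Name <address used in commits>" per line:
#
#   Greg Kroah-Hartman <gregkh@suse.de>
#   Greg Kroah-Hartman <greg@kroah.com>

# count_authors: read one author name per line on stdin and print
# "count<TAB>name", most commits first. (Helper name is an assumption.)
count_authors() {
    awk '{ n[$0]++ } END { for (a in n) printf "%d\t%s\n", n[a], a }' |
        sort -t "$(printf '\t')" -k1,1nr -k2,2
}

# Usage inside a repository (%aN needs git 1.6.0-rc0 or later):
#   git log --pretty=format:%aN | count_authors
```

With %an in place of %aN the same pipeline counts the names exactly as
they were committed, which makes the effect of the map easy to compare.
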
* Re: Cleaning up log messages
From: Jon Smirl @ 2008-07-27 18:16 UTC
To: Johannes Schindelin
Cc: Git Mailing List

On 7/27/08, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> Hi,
>
> On Sun, 27 Jul 2008, Jon Smirl wrote:
>
> > I was playing around with git log for the kernel and observed that there
> > is a lot of noise when trying to do statistics on the number of commits.
> >
> > For example:
> >
> > Author: Greg K-H <gregkh@suse.de>
> > Author: Greg KH <gregkh@suse.de>
> > Author: Greg KH <greg@kroah.com>
> > Author: Greg KH <greg@press.(none)>
> > Author: gregkh@suse.de <gregkh@suse.de>
> > Author: Greg Kroah-Hartman <gregkh@suse>
> > Author: Greg Kroah-Hartman <gregkh@suse.de>
> > Author: Greg Kroah-Hartman <greg@kroah.com>
> >
> > I don't see an obvious way to do this with git, but it would be neat
> > to have a 'clean' option on git log that would take each email address
> > (author, signed-off, acked, etc) and map it through a table which
> > would convert old email addresses into the current one and also
> > standardize the formatting of the names.
>
> Something like .mailmap?
>
> And to show the mapped author name instead of the committed one, you would
> use "--pretty=format:%aN"? (Needs 1.6.0-rc0 at least, IIRC)

So we can already do this? Where is a .mailmap for the kernel tree?

--
Jon Smirl
jonsmirl@gmail.com

* Re: Cleaning up log messages
From: Petr Baudis @ 2008-07-27 18:33 UTC
To: Jon Smirl
Cc: Johannes Schindelin, Git Mailing List

On Sun, Jul 27, 2008 at 02:16:30PM -0400, Jon Smirl wrote:
> On 7/27/08, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> > Something like .mailmap?
> >
> > And to show the mapped author name instead of the committed one, you would
> > use "--pretty=format:%aN"? (Needs 1.6.0-rc0 at least, IIRC)
>
> So we can already do this? Where is a .mailmap for the kernel tree?

http://repo.or.cz/w/linux-2.6.git?a=blob;f=.mailmap

...right there. :-)

				Petr "Pasky" Baudis

* Re: Cleaning up log messages
From: Jon Smirl @ 2008-07-27 19:07 UTC
To: Petr Baudis
Cc: Johannes Schindelin, Git Mailing List

On 7/27/08, Petr Baudis <pasky@suse.cz> wrote:
> On Sun, Jul 27, 2008 at 02:16:30PM -0400, Jon Smirl wrote:
> > On 7/27/08, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> > > Something like .mailmap?
> > >
> > > And to show the mapped author name instead of the committed one, you would
> > > use "--pretty=format:%aN"? (Needs 1.6.0-rc0 at least, IIRC)
> >
> > So we can already do this? Where is a .mailmap for the kernel tree?
>
> http://repo.or.cz/w/linux-2.6.git?a=blob;f=.mailmap
>
> ...right there. :-)

I updated to 1.6.0-rc0 and this is working. mailmap needs some
cleanup. Errors are still in the list, but this is a lot better than
it was. That made about 800 'contributors' disappear.

Is there a way to do short log and have it map the names?

What about replacing the emails with their current email address?

Random missing entries....

Greg KH
Greg Kroah-Hartman
Hans J Koch
Hans J. Koch
Jean-Christophe Dubois
Jean-Christophe DUBOIS
Miguel Boton
Miguel Botón

--
Jon Smirl
jonsmirl@gmail.com

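One way to measure how many 'contributors' the map folds together is to
count distinct names before and after mapping. This is only a sketch; the
helper name distinct_names is an assumption, and the numbers will depend
on the tree and on the state of its .mailmap.

```shell
# distinct_names: read one name per line on stdin and print how many
# different names there are. (Helper name is an assumption.)
distinct_names() {
    sort -u | awk 'END { print NR }'
}

# Usage inside a repository:
#   git log --pretty=format:%an | distinct_names   # names as committed
#   git log --pretty=format:%aN | distinct_names   # names after .mailmap
```

The difference between the two counts is the number of spellings the
current map has merged away.
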
* Re: Cleaning up log messages
From: Johannes Schindelin @ 2008-07-27 19:20 UTC
To: Jon Smirl
Cc: Petr Baudis, Git Mailing List

Hi,

On Sun, 27 Jul 2008, Jon Smirl wrote:

> On 7/27/08, Petr Baudis <pasky@suse.cz> wrote:
> > On Sun, Jul 27, 2008 at 02:16:30PM -0400, Jon Smirl wrote:
> > > On 7/27/08, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> > > > Something like .mailmap?
> > > >
> > > > And to show the mapped author name instead of the committed one, you would
> > > > use "--pretty=format:%aN"? (Needs 1.6.0-rc0 at least, IIRC)
> > >
> > > So we can already do this? Where is a .mailmap for the kernel tree?
> >
> > http://repo.or.cz/w/linux-2.6.git?a=blob;f=.mailmap
> >
> > ...right there. :-)
>
> I updated to 1.6.0-rc0 and this is working. mailmap needs some
> cleanup. Errors are still in the list, but this is a lot better than
> it was. That made about 800 'contributors' disappear.
>
> Is there a way to do short log and have it map the names?

Yes, as of v1.6.0-rc0~58 you can pass --pretty=format: to shortlog.

> What about replacing the emails with their current email address?

Nope, that was never meant to be done.

Ciao,
Dscho

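For a per-author summary, a sketch along these lines may be enough.
shortlog groups commits by author name and consults .mailmap when it does
so; the wrapper name mapped_shortlog is made up for illustration.

```shell
# mapped_shortlog: commits per author, most active first, with names
# folded through .mailmap. HEAD is given explicitly because shortlog
# reads a log from standard input when no revision is named and stdin
# is not a terminal. (Function name is an assumption.)
mapped_shortlog() {
    git shortlog -s -n HEAD
}

# Usage inside a repository:
#   mapped_shortlog | head -20
```
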
* Re: Cleaning up log messages
From: Jon Smirl @ 2008-07-27 19:31 UTC
To: Johannes Schindelin
Cc: Petr Baudis, Git Mailing List

On 7/27/08, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> Hi,
>
> On Sun, 27 Jul 2008, Jon Smirl wrote:
>
> > On 7/27/08, Petr Baudis <pasky@suse.cz> wrote:
> > > On Sun, Jul 27, 2008 at 02:16:30PM -0400, Jon Smirl wrote:
> > > > On 7/27/08, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> > > > > Something like .mailmap?
> > > > >
> > > > > And to show the mapped author name instead of the committed one, you would
> > > > > use "--pretty=format:%aN"? (Needs 1.6.0-rc0 at least, IIRC)
> > > >
> > > > So we can already do this? Where is a .mailmap for the kernel tree?
> > >
> > > http://repo.or.cz/w/linux-2.6.git?a=blob;f=.mailmap
> > >
> > > ...right there. :-)
> >
> > I updated to 1.6.0-rc0 and this is working. mailmap needs some
> > cleanup. Errors are still in the list, but this is a lot better than
> > it was. That made about 800 'contributors' disappear.
> >
> > Is there a way to do short log and have it map the names?
>
> Yes, as of v1.6.0-rc0~58 you can pass --pretty=format: to shortlog.

How do you do it with git log? --pretty overrides the default of medium

--pretty[=<format>]

    Pretty-print the contents of the commit logs in a given format,
    where <format> can be one of oneline, short, medium, full, fuller,
    email, raw and format:<string>. When omitted, the format defaults to
    medium.

> > What about replacing the emails with their current email address?
>
> Nope, that was never meant to be done.
>
> Ciao,
> Dscho

--
Jon Smirl
jonsmirl@gmail.com

* Re: Cleaning up log messages
From: Johannes Schindelin @ 2008-07-27 20:16 UTC
To: Jon Smirl
Cc: Petr Baudis, Git Mailing List

Hi,

On Sun, 27 Jul 2008, Jon Smirl wrote:

> How do you do it with git log? --pretty overrides the default of medium
>
> --pretty[=<format>]
>
>     Pretty-print the contents of the commit logs in a given format,
>     where <format> can be one of oneline, short, medium, full, fuller,
>     email, raw and format:<string>. When omitted, the format defaults to
>     medium.

You get it _almost_ with

	$ f='commit %H%nAuthor: %aN <%ae>%nDate: %ad%n%n%s%n%n%b'
	$ git log --pretty="format:$f"

The only difference being that the commit message is not indented. If you
really need that, it is easy to add. But I rather doubt that you need it,
as you want to make statistics, and therefore need to pipe the output into
a script anyway.

Ciao,
Dscho

* Re: Cleaning up log messages
From: Junio C Hamano @ 2008-07-27 18:47 UTC
To: Jon Smirl
Cc: Git Mailing List

"Jon Smirl" <jonsmirl@gmail.com> writes:

> I was playing around with git log for the kernel and observed that
> there is a lot of noise when trying to do statistics on the number of
> commits.
>
> For example:
>
> Author: Greg K-H <gregkh@suse.de>
> Author: Greg KH <gregkh@suse.de>
> ...
> Author: Greg Kroah-Hartman <greg@kroah.com>

We have had .mailmap since a24e658 (git-shortlog: make the mailmap
configurable., 2005-10-06); maybe the kernel tree wants a maintainer for
the .mailmap file?

* Re: Cleaning up log messages
From: Jon Smirl @ 2008-07-27 20:52 UTC
To: Junio C Hamano
Cc: Git Mailing List

On 7/27/08, Junio C Hamano <gitster@pobox.com> wrote:
> "Jon Smirl" <jonsmirl@gmail.com> writes:
>
> > I was playing around with git log for the kernel and observed that
> > there is a lot of noise when trying to do statistics on the number of
> > commits.
> >
> > For example:
> >
> > Author: Greg K-H <gregkh@suse.de>
> > Author: Greg KH <gregkh@suse.de>
> > ...
> > Author: Greg Kroah-Hartman <greg@kroah.com>
>
> We have had .mailmap since a24e658 (git-shortlog: make the mailmap
> configurable., 2005-10-06); maybe the kernel tree wants a maintainer for
> the .mailmap file?

This seems to be the main problem. There are so many missing entries
from the .mailmap file that I didn't think this feature was
implemented. I'd guesstimate that 300-400 needed entries are missing.

I've made a few attempts at writing a script to fix the easy ones but
I don't have a good solution yet.

--
Jon Smirl
jonsmirl@gmail.com

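A rough starting point for "the easy ones" might be to list every address
that was committed under more than one spelling of a name. This is a
sketch under that assumption only, not the script mentioned above; the
function name mailmap_candidates is hypothetical, and a person still has
to pick the proper name for each address.

```shell
# mailmap_candidates: read "Name <email>" lines on stdin and print each
# address that appears with more than one name, followed by the names.
# (Function name and output layout are assumptions.)
mailmap_candidates() {
    LC_ALL=C sort -u |
    awk '
        {
            # Locate the trailing "<email>" part of the line.
            i = match($0, /<[^<>]*>$/)
            if (i == 0) next
            name  = substr($0, 1, i - 2)      # text before " <"
            email = tolower(substr($0, i))    # "<email>", case-folded
            count[email]++
            names[email] = names[email] "\n    " name
        }
        END {
            # Report only addresses seen with two or more spellings.
            for (e in count)
                if (count[e] > 1)
                    print e names[e]
        }'
}

# Usage inside a repository (raw names, before any mapping):
#   git log --pretty=format:'%an <%ae>' | mailmap_candidates
```

It cannot tell that two different addresses belong to the same person, so
cases like gregkh@suse.de versus greg@kroah.com still need a human.
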