From: Linus Torvalds <torvalds@osdl.org>
To: Junio C Hamano <junkio@cox.net>
Cc: git@vger.kernel.org, Johannes Schindelin <Johannes.Schindelin@gmx.de>
Subject: Re: [PATCH] (experimental) per-topic shortlog.
Date: Sun, 26 Nov 2006 17:06:08 -0800 (PST)
Message-ID: <Pine.LNX.4.64.0611261652520.30076@woody.osdl.org>
In-Reply-To: <7v8xhxsopp.fsf@assigned-by-dhcp.cox.net>



On Sun, 26 Nov 2006, Junio C Hamano wrote:
>
> This implements an experimental "git log-fpc" command that shows
> short-log style output sorted by topics.
> 
> A "topic" is identified by going through the first-parent
> chains; this ignores the fast-forward case, but for a top-level
> integrator it often is good enough.

Umm. May I suggest that you try this with the kernel repo too?

There, the "first parent chain" tends to be less interesting than a lot of 
other heuristics:

 - committer

   If the committer changes, you should probably consider it a break, the 
   same way a second parent would be a break. You probably won't see this 
   in the git archive, because there tends to be a single committer, but 
   on something like the kernel, where we really merge other people's 
   repos, it's going to be as good as (or better than) looking at "other 
   parents".

 - subdirectory heuristics

   Again, with git it's not very interesting, but I bet that you'd be able 
   to use heuristics like "the bulk of the changes were contained within 
   this directory tree" for projects like the kernel, and automatically 
   decide on "topics" like drivers/scsi, fs/ext3 etc.
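
A rough sketch of those two heuristics, in Python rather than in git's 
own C, might look like the following. Everything here is an assumption 
made for illustration: the commit records are plain dicts with 
hypothetical "committer" and "files" keys, the helper names are made up, 
and the directory depth and the 50% cutoff for "the bulk of the changes" 
are arbitrary. It is not what the log-fpc patch does.

```python
from collections import Counter
from itertools import groupby


def split_on_committer(commits):
    """Break a commit sequence wherever the committer changes.

    `commits` is a list of dicts with a "committer" key (hypothetical
    record shape, e.g. parsed from `git log` output elsewhere).
    """
    return [list(run) for _, run in groupby(commits, key=lambda c: c["committer"])]


def dominant_dir(paths, depth=2, threshold=0.5):
    """Return the directory prefix holding the bulk of `paths`, or None.

    A prefix qualifies only if more than `threshold` of the changed
    paths live under it; `depth` limits how many directory levels are
    kept (both values are arbitrary choices for this sketch).
    """
    if not paths:
        return None
    prefixes = Counter(
        "/".join(p.split("/")[:-1][:depth]) or "." for p in paths
    )
    prefix, count = prefixes.most_common(1)[0]
    return prefix if count / len(paths) > threshold else None


# Made-up sample data, only to show the shape of the input.
commits = [
    {"committer": "A", "files": ["drivers/scsi/sd.c", "drivers/scsi/sr.c"]},
    {"committer": "A", "files": ["drivers/scsi/sg.c", "Makefile"]},
    {"committer": "B", "files": ["fs/ext3/inode.c"]},
]

for group in split_on_committer(commits):
    paths = [p for c in group for p in c["files"]]
    print(group[0]["committer"], dominant_dir(paths))
```

Treating a committer change as a hard break and the directory as a label 
keeps the two heuristics independent, so either can be dropped for a 
project (like git itself) where it carries no signal.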

In other words, I don't think the "fpc" decision is even very interesting. 
If you _really_ want to do a cool shortlogger, I bet it can be done, but I 
suspect that it would be a LOT cooler to do some automatic Bayesian 
clustering based on committer, author and list of filenames changed.
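
To make the idea concrete, here is a deliberately simple stand-in, 
assuming the same hypothetical dict-shaped commit records with 
"committer", "author" and "files" keys. It is NOT Bayesian clustering: 
it is a greedy single-pass clustering using Jaccard similarity over 
feature sets, and the 0.3 threshold is an arbitrary assumption. It only 
shows how the three signals could be turned into features that any 
clustering model might consume.

```python
def features(commit, depth=2):
    """Turn a commit into a set of tokens: committer, author, directories."""
    feats = {"committer:" + commit["committer"], "author:" + commit["author"]}
    for path in commit["files"]:
        parts = path.split("/")[:-1][:depth]
        feats.add("dir:" + ("/".join(parts) or "."))
    return feats


def jaccard(a, b):
    """Size of the intersection over size of the union of two sets."""
    return len(a & b) / len(a | b) if (a or b) else 0.0


def greedy_cluster(commits, threshold=0.3):
    """Assign each commit to the most similar existing cluster, or start one.

    Greedy and order-dependent; a stand-in for a real clustering model.
    """
    clusters = []  # each entry: {"features": set, "commits": list}
    for commit in commits:
        feats = features(commit)
        best = max(
            clusters,
            key=lambda c: jaccard(feats, c["features"]),
            default=None,
        )
        if best is not None and jaccard(feats, best["features"]) >= threshold:
            best["commits"].append(commit)
            best["features"] |= feats
        else:
            clusters.append({"features": set(feats), "commits": [commit]})
    return clusters


# Made-up sample data, only to show the shape of the input.
sample = [
    {"committer": "J", "author": "A", "files": ["drivers/scsi/sd.c"]},
    {"committer": "J", "author": "B", "files": ["drivers/scsi/sr.c"]},
    {"committer": "K", "author": "C", "files": ["fs/ext3/inode.c"]},
]

for cluster in greedy_cluster(sample):
    print(len(cluster["commits"]), sorted(cluster["features"]))
```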

Of course, such a thing done well would probably be worthy of a doctoral 
thesis or something. Maybe somebody on this list who is into Bayesian 
clustering and doesn't have a thesis subject...

(Of course, since I haven't been in a university setting for the last ten 
years, maybe Bayesian clustering isn't the cool thing to work on any 
more.)

Anyway, "topics" really should be something that is extremely open to 
various clustering models, Bayesian or not.



