git.vger.kernel.org archive mirror
* [RFC] gitweb wishlist and TODO list
@ 2006-10-09 12:49 Jakub Narebski
  2006-10-10  1:47 ` Luben Tuikov
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Jakub Narebski @ 2006-10-09 12:49 UTC (permalink / raw)
  To: git

There is a new part of gitweb TODO list and wishlist; planned features and
features it would be nice to have. If you have new ideas, if you want some
features to be implemented first, if you use some web interface to some SCM
you like, please do contribute.

I've tried to divide this TODO/wishlist into categories.


1. Cleanups and refactoring

 * HTML and CSS cleanup. All (or almost all) styling should be done using
   CSS, and not embedded style or presentation elements. All HTML elements
   except perhaps a few should have either class attribute (if such
   element can be multiple times on page) or id attribute (if there can be
   only one such element). Perhaps some class attributes should be changed
   to id attributes. Gitweb has improved much in this area since its
   incorporation into git.

 * CSS refactoring. Try to avoid repeating the same styling, using
   combination of descendant and _child_ selectors, perhaps also
   adjacent sibling selector and attribute/attribute value selector.
   Perhaps large reorganization patch (moving contents within CSS and adding
   comments) is to be done...

 * Code refactoring. Separate/refactor common parts and put them into
   separate subroutines OR collapse similar subroutines into one subroutine
   with an argument selecting the case. For example git_blob and
   git_blob_plain could be collapsed; git_shortlog, git_tags/git_heads,
   git_log, git_history do similar work. When the new
   --grep/--author/--committer options to git-rev-list hit a released
   version, perhaps git_search could also be put together with the previous
   set. git_rss does work similar to git_summary.

 * Refactor printing related links (like "blame | history | raw" for blob
   entries in tree view) into separate subroutine. The list depends both on
   the kind of object pointed, and on the current action/view.

 * Perhaps refactor reading and validation of parameters, except the ones
   used for dispatch i.e. project and action parameters, into separate
   subroutine
 
        my ($hash, $hash_base) = gitweb_params('hash', 'hash_base');    

   I'm not sure if it is/would be useful.


2. Git.pm-ish (subroutines which in generalized version are/could be
   in Git.pm)
 
 * Refactor calling a git command and reading its output into a separate
   subroutine git_command/git_pipe, so that for example someone who _has_
   to use gitweb with an ancient Perl version which does not understand the
   list form of magic "-|" open could do so by changing only one subroutine.
   Well, we can use Git.pm when it hits main release.

 * Add subroutine/subroutines, which given a full name of ref, returns
   either another ref if input ref was symlink/symref, or hash of the
   pointed object, and which work not only with ordinary loose refs, but
   also with symlinks, symrefs (up to some level of recursion) and packed
   refs. All without calling any git command. But I guess that currently
   it is not needed at all.

 * Add simplified git config file parser, which would _read_ only gitweb
   entries (and convert them to bool/int if necessary). With this we could
   move description, category, export_ok, .hide, cloneurl to config file,
   instead of cluttering $GIT_DIR. Or just make it an option (read file
   first, if it doesn't exist try config file).

 * Parsing of remotes/ files _and_ equivalent config entries, for adding
   information (tooltip?) about tracking branches in heads view, and for
   adding information about given subdirectory in refs/remotes/ (see below).


3. Optimizing gitweb

 * Use git-for-each-ref (when it hits a released version) to speed up
   generation of summary, heads and tags views. It would also enable the
   option of having most recent commit date in projects list view, and not
   most recent commit in current branch (in HEAD).
 
 * Add better support for mod_perl, e.g. $r->path_info(), via checking for
   MOD_PERL environment variable.

 * Better support for mod_perl/FastCGI, perhaps wrapping the changeable part
   into gitweb_handler subroutine, and calling it.


4. New features

 * Add support for other directories in refs/ besides "heads" and
   "tags" directories, for example refs/remotes/ generated when cloning with
   --use-separate-remote option. On short TODO list.

 * Add categories support a la gitweb-xmms2 to the projects list view (and
   perhaps also OPML); perhaps with option to use first part of path to
   repository as category.

 * Code highlighting (or generic filter) support in blob view, perhaps as
   a feature. Proposed tools for generating syntax highlighting include
   Highlight (http://www.andre-simon.de) and GNU Source Highlight
   (http://www.gnu.org/software/src-highlite) a la gitweb-xmms2.
   Gitweb-xmms2 uses Highlight, and due to the tags support uses temporary
   files. I think that CSS for code highlighting should be in separate file,
   and that selecting syntax to use should be done using mime.types like
   file rather than gitweb-xmms2 internal configuration (hash of
   extensions).

 * Committags support from gitweb-xmms2 in commit, commitdiff, log views and
   in the top commit summary/title link on most pages. There was preliminary
   patch on git mailing list for committags support (more general than
   the support in gitweb-xmms2), with current commitsha link (now in
   format_log_line_html) implemented as committag. Junio had quite
   a good idea how to avoid having to do committags _after_ HTML escaping,
   and how to stack committags. I'm not sure if it wouldn't be better to try
   to do all committags in one go, instead of stacking. Perhaps also commit
   message "syntax highlighting" (i.e. highlighting signoff lines) and empty
   lines simplification should be done using committags.

 * Crossreferencing in blob view. Gitweb-xmms2 uses, if I remember correctly,
   etags to generate anchors and hyperlinks to the definition of a
   function. GNU Highlight can use Exuberant Ctags IIRC. Both need, I think,
   temporary files for the index. Perhaps this should rather be done as a
   part of gitweb/git integration with LXR Cross Referencer.

   Do you know other projects that could be used instead of etags here?

   I'm not sure if it is worth pursuing this now.

 * Improve blame view, making use of --porcelain option to git-blame (for
   later). Perhaps change blame view from table based one to div based one.
   Use different colors for different commits (graph coloring problem).

 * Perhaps add some way of finding the closest preceding/following tag, and
   which branch we are on. Tempered of course by performance concerns:
   what is possible for a locally run history browser like gitk or qgit
   might not be feasible for a web interface run on a server.

 * Add information from remotes/ to heads view, for example the following
     tracks branch 'master' of git://git.kernel.org/pub/scm/git/git (origin)
   as a tooltip for 'origin' branch. But what if one branch tracks more than
   one remote? Needs to use also config file.

 * Support for tracking renames in history view. Simple rename tracking
   I think could be done directly in gitweb; more advanced would need
   --follow option (i.e. core git improvement).

 * log/shortlog should be a format, so we could have log-like history, tags,
   heads views.

 * add summary of number of lines changed for each file (a la darcsview)
   in the difftree part of commit and *diff* views, e.g.

        blame.c   +1 -0  diff | history | blame

   or something like that.

 * add extended header to the commitdiff and perhaps blobdiff views,
   hyperlinked. _This_ would add some patches to commitdiff view, which are
   IIRC visible only in the difftree part now.

 * enable sorting tags/heads view by name instead of sorting it by date.


5. New views

 * Reflog view (most probably limited to heads only). I'm not sure if it is
   worth the time spent calling git commands to mark unreachable commits,
   for example using strikethrough, and to hyperlink reachable ones. Any
   ideas what such a view should look like?

 * ViewVC-like tree-blame view. There was RFC patch adding tree_blame view
   some time ago here, on git mailing list. The main problem of course is
   performance. We could implement tree_blame purely in gitweb as it was
   done in mentioned patch (having --stdin option to git-ls-tree would
   help), or add new core command/extend git-blame for directories. There is
   also a question if we want to find blame for tree entries, or not.

 * "List of the files in given directory, touched by given commit"

 * Perhaps add Atom feed support as an alternative to RSS, and XOXO as an
   alternative to OPML.

 * Graph of number of changed files in given branch; probably should be
   cached.


X. Proposed improvements to core git commands
 * add --stdin option to git-ls-tree, a la --stdin option to git-diff-tree.
 * add --follow option to git-rev-list; allow providing a path limiter via
   stdin (with --stdin option) in git-diff-tree
 * add --numstat option to git-diff; currently only git-apply has it.


Thoughts? Comments?
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC] gitweb wishlist and TODO list
  2006-10-09 12:49 [RFC] gitweb wishlist and TODO list Jakub Narebski
@ 2006-10-10  1:47 ` Luben Tuikov
  2006-10-10  8:54   ` Jakub Narebski
  2006-10-11  5:52 ` Junio C Hamano
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 10+ messages in thread
From: Luben Tuikov @ 2006-10-10  1:47 UTC (permalink / raw)
  To: Jakub Narebski, git

--- Jakub Narebski <jnareb@gmail.com> wrote:
>  * Improve blame view, making use of --porcelain option to git-blame (for
>    later). Perhaps change blame view from table based one to div based one.

>    Use different colors for different commits (graph coloring problem).

Oh, no please no.

Why do you think I left the color list as a list?  I did try to use
more colors when I wrote it, and it was ugly as h3ll and very distracting
when doing real work.  So I ended up with the two color (shades) we have
now and this is what I submitted.

Also, any kind of "graph coloring problem" would make blame slow.

In any case, if you/someone does implement this "coloring" can you please
make it an option, because I'll never turn it on.  Thanks!

     Luben


* Re: [RFC] gitweb wishlist and TODO list
  2006-10-10  1:47 ` Luben Tuikov
@ 2006-10-10  8:54   ` Jakub Narebski
  0 siblings, 0 replies; 10+ messages in thread
From: Jakub Narebski @ 2006-10-10  8:54 UTC (permalink / raw)
  To: Luben Tuikov; +Cc: git

Luben Tuikov wrote:
> --- Jakub Narebski <jnareb@gmail.com> wrote:
> >  * Improve blame view, making use of --porcelain option to git-blame (for
> >    later). Perhaps change blame view from table based one to div based one.
> 
> >    Use different colors for different commits (graph coloring problem).
> 
> Oh, no please no.
> 
> Why do you think I left the color list as a list?  I did try to use
> more colors when I wrote it, and it was ugly as h3ll and very distracting
> when doing real work.  So I ended up with the two color (shades) we have
> now and this is what I submitted.
> 
> Also, any kind of "graph coloring problem" would make blame slow.

One of the ideas (lacking a nice _mathematical_ solution to the blame graph
coloring problem[*1*], i.e. one which calculates the coloring directly
instead of trying and checking different colorings) was to use a few
colors, 3, 6 or 8, based on some hash of the commit's sha1 (for example its
first character), plus alternating "darkness" of those colors to ensure
that neighbours would have different colors. Another was to use the first
6 characters of the sha1 as a color, then flatten that color into one
suitable for a background (perhaps also with some way of ensuring that
neighbour blames would have different colors).

Junio's idea of basing color/brightness (of some part of blame output at
least) on the _age_ of the region (perhaps using two alternating _colors_)
also has its merit.

Nevertheless, such change would be preceded by an RFC, and discussion.

> In any way, if you/someone does implement this "coloring" can you please
> make it an option, because I'll never turn it on.  Thanks!

Not a problem to make blame coloring a feature.


Footnotes:
[*1*] Blame graph coloring problem: 1) regions blamed on the same commit
should have the same color 2) neighbour blame regions should have different
colors.
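
To make the first idea above concrete, here is a minimal sketch in C (C only
because the core patches on this list are in C; gitweb itself would do this
in Perl). It takes the hue from the first hex digit of the commit sha1 and
alternates "darkness" from one blame region to the next. The names N_HUES,
hue_from_sha1 and color_regions are hypothetical, and the input is assumed
to be grouped already, so that adjacent regions belong to different commits.
Note that it always satisfies rule 2 above, but rule 1 only up to the shade:
a commit keeps its hue, not necessarily its darkness.

```c
#include <assert.h>
#include <stddef.h>

#define N_HUES 8	/* assumed palette size; 3, 6 or 8 were suggested */

struct blame_color {
	int hue;	/* index into a palette of N_HUES colors */
	int dark;	/* 0 = light shade, 1 = dark shade */
};

/* Hue index derived from the first hex digit of the commit sha1. */
static int hue_from_sha1(const char *sha1)
{
	char c = sha1[0];
	int v = 0;

	if (c >= '0' && c <= '9')
		v = c - '0';
	else if (c >= 'a' && c <= 'f')
		v = c - 'a' + 10;
	else if (c >= 'A' && c <= 'F')
		v = c - 'A' + 10;
	return v % N_HUES;
}

/*
 * sha1s[i] names the commit blamed for the i-th region, in file order.
 * Alternating the shade guarantees that neighbours never get the same
 * (hue, dark) pair, even if their hues collide.
 */
static void color_regions(const char *const *sha1s, size_t n,
			  struct blame_color *out)
{
	size_t i;

	assert(sha1s != NULL && out != NULL);
	for (i = 0; i < n; i++) {
		out[i].hue = hue_from_sha1(sha1s[i]);
		out[i].dark = (int)(i & 1);
	}
}
```

No trial-and-error coloring is involved, so the cost is linear in the number
of regions, which should address the worry about blame becoming slow.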
-- 
Jakub Narebski
Poland


* Re: [RFC] gitweb wishlist and TODO list
  2006-10-09 12:49 [RFC] gitweb wishlist and TODO list Jakub Narebski
  2006-10-10  1:47 ` Luben Tuikov
@ 2006-10-11  5:52 ` Junio C Hamano
  2006-10-11  9:20   ` Jakub Narebski
  2006-10-12 10:03   ` Junio C Hamano
  2006-10-11 15:09 ` Jakub Narebski
  2006-10-11 23:05 ` [RFC] gitweb wishlist and TODO list Jakub Narebski
  3 siblings, 2 replies; 10+ messages in thread
From: Junio C Hamano @ 2006-10-11  5:52 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Jakub Narebski <jnareb@gmail.com> writes:

> 1. Cleanups and refactoring
>
>  * HTML and CSS cleanup. All (or almost all) styling should be done using
>    CSS, and not embedded style or presentation elements. All HTML elements
>    except perhaps a few should have either class attribute (if such
>    element can be multiple times on page) or id attribute (if there can be
>    only one such element). Perhaps some class attributes should be changed
>    to id attributes. Gitweb has much improved from the incorporation in
>    this area.

It scares me when somebody says "all X should do Y".  Aiming for
consistency and cleanliness is good but taking it to extreme and
becoming dogmatic about it isn't.  Let's not repeat the crusade
against redundant links.

>    ... When the new
>    --grep/--author/--commiter options to git-rev-list hits released version,
>    perhaps also git_search could be put together with the previous set.

Sounds like a good idea, but I think people can (and should)
start preparing for it in "next"; after all that is what "next"
is for.

>  * Refactor calling a git command and reading it's output into separate
>    subroutine git_command/git_pipe, so for example if someone _has_ to use
>    gitweb with ancient Perl version which does not understand list version
>    of magic "-|" open could do it changing only one subroutine. Well, we can
>    use Git.pm when it hits main release.

I agree this is a good thing to do while refactoring.  There is
too much similar looking code sprinkled all over.  Git.pm is
already in the "master" and there is nothing cooking in "next".

>  * Add simplified git config file parser, which would _read_ only gitweb
>    entries (and convert them to bool/int if necessary). With this we could
>    move description, category, export_ok, .hide, cloneurl to config file,
>    instead of cluttering $GIT_DIR. Or just make it an option (read file
>    first, if it doesn't exist try config file).

I do not see why you would need anything "simplified"; I think
writing a .git/config parser purely in Perl is much easier than
waiting for libified interface that talks .xs and would run just
as efficiently -- after all Perl is the ideal tool for text file
processing like this.  And I do not particularly worry about
issues that could arise from two different configuration parsers
having different set of bugs.  The file format is simple enough.
It would be a very good addition to Git.pm suite.

>  * Add categories support a la gitweb-xmms2 to the projects list view (and
>    perhaps also OPML); perhaps with option to use first part of path to
>    repository as category.

Perhaps; hosting sites would want this.

>  * Code highlighting (or generic filter) support in blob view, perhaps as
>    a feature.

Not particularly interested myself but as long as it would not
add huge load on the server I would not much object either.

>  * Crossreferencing in blob view.

Lxr is certainly interesting, but I would rather use local "git grep".

>  * add summary of number of lines changed for each file (a la darcsview)
>    in the difftree part of commit and *diff* views, e.g.
>
>         blame.c   +1 -0  diff | history | blame
>
>    or something like that.

I'll place "diff --numstat" to the stack of "things to do on the
core side".  Should be trivial.

>  * Reflog view (most probably limited to heads only). I'm not sure if it is
>    worth time spend on calling git commands to mark unreachable commits for
>    example using strikethrough, and hyperlink reachable. Any ideas how such
>    a view should look like?

If the feature is useful, do not be afraid to add core side
support for it.  As long as the proposed core side support is
reasonable and not too specific to a niche task, that is.

>  * "List of the files in given directory, touched by given commit"

Have no idea what you mean.  "diff-tree -r --name-only $commit"?

> X. Proposed improvements to core git commands
>  * add --stdin option to git-ls-tree, a la --stdin option to git-diff-tree.

Not particularly interested, as it is unclear how the output
boundary should be marked, but should be trivial to add once we
know what the output should look like.

>  * add --follow option to git-rev-list, allow to provide path limiter via
>    stdin (with --stdin option) in git-diff-tree

The "path limiter via stdin" part is murky.  I would not object
to "rev-list --follow=$this_path_at_the_tip $start_at_this_commit"
which I can see clear semantics for. 


* Re: [RFC] gitweb wishlist and TODO list
  2006-10-11  5:52 ` Junio C Hamano
@ 2006-10-11  9:20   ` Jakub Narebski
  2006-10-12 10:03   ` Junio C Hamano
  1 sibling, 0 replies; 10+ messages in thread
From: Jakub Narebski @ 2006-10-11  9:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio C Hamano wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
> 
>> 1. Cleanups and refactoring
>>
>>  * HTML and CSS cleanup. All (or almost all) styling should be done using
>>    CSS, and not embedded style or presentation elements. All HTML elements
>>    except perhaps a few should have either class attribute (if such
>>    element can be multiple times on page) or id attribute (if there can be
>>    only one such element). Perhaps some class attributes should be changed
>>    to id attributes. Gitweb has much improved from the incorporation in
>>    this area.
> 
> It scares me when somebody says "all X should do Y".  Aiming for
> consistency and cleanliness is good but taking it to extreme and
> becoming dogmatic about it isn't.  Let's not repeat the crusade
> against redundant links.

Well, after writing this part I have checked that we don't use
"style" attribute in gitweb anymore. But we still do use presentational
elements, like <i>. IMHO we should use CSS for styling.

>>    ... When the new
>>    --grep/--author/--commiter options to git-rev-list hits released version,
>>    perhaps also git_search could be put together with the previous set.
> 
> Sounds like a good idea, but I think people can (and should)
> start preparing for it in "next"; after all that is what "next"
> is for.

Using --grep/--author/--committer would have the advantage of making it
easy to paginate "log search" in gitweb.

I forgot to add splitting git_search into "log search" and "pickaxe search",
and perhaps adding "file search" aka "grep search" to gitweb.
 
>>  * Refactor calling a git command and reading it's output into separate
>>    subroutine git_command/git_pipe, so for example if someone _has_ to use
>>    gitweb with ancient Perl version which does not understand list version
>>    of magic "-|" open could do it changing only one subroutine. Well, we can
>>    use Git.pm when it hits main release.
> 
> I agree this is a good thing to do while refactoring.  There are
> too many similar looking code sprinkled all over.  Git.pm is
> already in the "master" and there is nothing cooking in "next".

I'm not sure if I would like to use Git.pm repository abstraction.
But converting gitweb to use Git.pm would be a good idea, I agree.
Although I'd rather have any gitweb patches which need _unreleased_
features to be in 'next'.
 
BTW. the Git.pm-ish ideas (config parser, remotes parser, symrefs and
packed refs parser) should perhaps be added (at least in 'next')
via Git.pm.

>>  * Add simplified git config file parser, which would _read_ only gitweb
>>    entries (and convert them to bool/int if necessary). With this we could
>>    move description, category, export_ok, .hide, cloneurl to config file,
>>    instead of cluttering $GIT_DIR. Or just make it an option (read file
>>    first, if it doesn't exist try config file).
> 
> I do not see why you would need anything "simplified"; I think
> writing a .git/config parser purely in Perl is much easier than
> waiting for libified interface that talks .xs and would run just
> as efficient -- after all Perl is the ideal tool for text file
> processing like this.  And I do not particularly worry about
> issues that could arise from two different configuration parsers
> having different set of bugs.  The file format is simple enough.
> It would be a very good addition to Git.pm suite.

There are many INI file parsers on CPAN, but I guess that Git adds
its own config file syntax (e.g. branch and remote config:
  [branch "quoted branch name with funny characters"]
which is not yet documented if I remember correctly), so we want our
own parser; this would also reduce dependencies.

"Simplified" because of not implementing the "extended syntax" mentioned
above, and because of implementing only reading of the config file. It is
harder to make it write the config file, preserving comments etc.
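
What "simplified" could mean is perhaps easiest to show in code. Below is a
rough read-only sketch, in C for the sake of a self-contained example (the
real thing would be Perl, in gitweb or Git.pm). The function name
gitweb_config_get is hypothetical. It deliberately understands only a plain
"[gitweb]" header and "key = value" lines; the extended
[section "subsection"] headers, quoted values, continuation lines and
valueless boolean keys are all ignored, and key matching is case sensitive,
which the real config format is not.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Skip spaces and tabs, never going past end. */
static const char *skip_ws(const char *p, const char *end)
{
	while (p < end && (*p == ' ' || *p == '\t'))
		p++;
	return p;
}

/*
 * Look up gitweb.<key> in the config file contents `text` and copy the
 * value into out.  Returns 1 if found, 0 otherwise; the last assignment
 * in the file wins.
 */
static int gitweb_config_get(const char *text, const char *key,
			     char *out, size_t outlen)
{
	int in_gitweb = 0, found = 0;
	size_t keylen = strlen(key);
	const char *line = text;

	assert(outlen > 0);
	while (*line) {
		const char *eol = strchr(line, '\n');
		const char *end = eol ? eol : line + strlen(line);
		const char *p = skip_ws(line, end);

		if (p < end && *p == '[') {
			/* only the plain header; [section "sub"] never matches */
			in_gitweb = (size_t)(end - p) >= 8 &&
				    !strncmp(p, "[gitweb]", 8);
		} else if (in_gitweb && p < end && *p != '#' && *p != ';' &&
			   (size_t)(end - p) > keylen &&
			   !strncmp(p, key, keylen)) {
			const char *v = skip_ws(p + keylen, end);

			if (v < end && *v == '=') {
				size_t n;

				v = skip_ws(v + 1, end);
				/* trim trailing whitespace, including CR */
				while (end > v && isspace((unsigned char)end[-1]))
					end--;
				n = (size_t)(end - v);
				if (n >= outlen)
					n = outlen - 1;
				memcpy(out, v, n);
				out[n] = '\0';
				found = 1;
			}
		}
		line = eol ? eol + 1 : line + strlen(line);
	}
	return found;
}
```

With this, description, category, cloneurl and friends would each be one
call, falling back to the file in $GIT_DIR when the key is not found.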
 
By the way, would it be better to use CGI-like syntax, i.e.
$repo->config("gitweb.$key"), or a tied hash?

>>  * Code highlighting (or generic filter) support in blob view, perhaps as
>>    a feature.
> 
> Not particularly interested myself but as long as it would not
> add huge load on the server I would not much object either.

It should be: as a feature, not perhaps as a feature. Perhaps
make highlighting configurable (program to use, filename to mode
mapping, etc.)
 
>>  * Crossreferencing in blob view.
> 
> LXR is certainly interesting, but I would rather use local "git grep".

This is far, far in the queue at least for me. And I'm not sure
if crossreferencing can be done without creating temporary files,
something we tried to avoid (e.g. creating diffs on-the-fly now).
 
>>  * add summary of number of lines changed for each file (a la darcsview)
>>    in the difftree part of commit and *diff* views, e.g.
>>
>>         blame.c   +1 -0  diff | history | blame
>>
>>    or something like that.
> 
> I'll place "diff --numstat" to the stack of "things to do on the
> core side".  Should be trivial.

Thanks. I did wonder why git-apply (!) has "--numstat" but git-diff
has not... 

>>  * "List of the files in given directory, touched by given commit"
> 
> Have no idea what you mean.  "diff-tree -r --name-only $commit"?

I'm repeating someone's idea verbatim. IIRC it meant adding a list
of affected files (like the difftree part of "commit" and "commitdiff"
views) to the "log" and "search" views...
 
>> X. Proposed improvements to core git commands
>>  * add --stdin option to git-ls-tree, a la --stdin option to git-diff-tree.
> 
> Not particularly interested, as it is unclear how the output
> boundary should be marked, but should be trivial to add once we
> know what the output should look like.

The output format for git-ls-tree is
	<mode> SP <type> SP <object> TAB <file>
It is fairly easy to distinguish such line from the
	<sha1 of tree-ish>
line. 

The idea was that 
	echo "tree-ish 1" "tree-ish 2" | git ls-tree --stdin
output would be
	<sha1 of tree-ish 1>
	<mode> SP <type> SP <object> TAB <file>
	...
	<mode> SP <type> SP <object> TAB <file>
	<sha1 of tree-ish 2>
	<mode> SP <type> SP <object> TAB <file>
	...
	<mode> SP <type> SP <object> TAB <file>

We could add a record (tree) separator, for example a NUL
character as in git-rev-list --header.
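
To back up the claim that the two kinds of lines are easy to tell apart,
here is a small sketch of a consumer of the proposed output, in C. All
names (ls_line, ls_entry, parse_ls_line) are made up for illustration, the
--stdin option itself does not exist yet, and the line is assumed to have
its trailing newline already stripped.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

enum ls_line { LS_TREE_HEADER, LS_ENTRY, LS_UNKNOWN };

struct ls_entry {
	unsigned mode;
	char type[8];		/* "blob", "tree", ... */
	char object[41];	/* sha1 as 40 hex digits */
	const char *file;	/* points into the input line */
};

/* True if s consists of exactly 40 hex digits. */
static int is_sha1(const char *s)
{
	int i;

	for (i = 0; i < 40; i++)
		if (!isxdigit((unsigned char)s[i]))
			return 0;
	return s[40] == '\0';
}

/*
 * Classify one line of the proposed "git ls-tree --stdin" output:
 * either "<sha1 of tree-ish>" or "<mode> SP <type> SP <object> TAB <file>".
 */
static enum ls_line parse_ls_line(const char *line, struct ls_entry *e)
{
	const char *tab;

	assert(line != NULL && e != NULL);
	if (is_sha1(line))
		return LS_TREE_HEADER;
	tab = strchr(line, '\t');
	if (tab &&
	    sscanf(line, "%o %7s %40s", &e->mode, e->type, e->object) == 3 &&
	    is_sha1(e->object)) {
		e->file = tab + 1;
		return LS_ENTRY;
	}
	return LS_UNKNOWN;
}
```

An entry line always contains spaces and a TAB before the file name, so it
can never be mistaken for a bare 40-digit header line.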
 
>>  * add --follow option to git-rev-list, allow to provide path limiter via
>>    stdin (with --stdin option) in git-diff-tree
> 
> The "path limiter via stdin" part is murky.  I would not object
> to "rev-list --follow=$this_path_at_the_tip $start_at_this_commit"
> which I can see clear semantics for. 
 
You can provide <tree-ish> or pair of <tree-ish> from stdin for
git-diff-tree --stdin. You can provide path limiter _only_ as an
argument to git-diff-tree. Proposed extension is to be able to
use
	<tree-ish> [<tree-ish>] [<path>...]
from stdin, perhaps _forcing_ to use
	 <tree-ish> [<tree-ish>] ['--' <path>...]
syntax.
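
A sketch of how one such stdin line could be split up if the '--' is made
mandatory before paths. This is only an illustration of the proposed
syntax, not existing git-diff-tree code; dt_request, parse_request and
MAX_PATHS are hypothetical names, and paths containing whitespace are not
handled.

```c
#include <assert.h>
#include <string.h>

#define MAX_PATHS 16	/* arbitrary limit for the sketch */

struct dt_request {
	const char *tree[2];
	int ntrees;
	const char *path[MAX_PATHS];
	int npaths;
};

/*
 * Split "<tree-ish> [<tree-ish>] ['--' <path>...]" in place.
 * Returns 0 on success, -1 on a malformed line.
 */
static int parse_request(char *line, struct dt_request *rq)
{
	char *tok;
	int seen_dashdash = 0;

	assert(line != NULL && rq != NULL);
	rq->ntrees = rq->npaths = 0;
	for (tok = strtok(line, " \t\n"); tok; tok = strtok(NULL, " \t\n")) {
		if (!seen_dashdash && !strcmp(tok, "--")) {
			seen_dashdash = 1;
		} else if (!seen_dashdash) {
			if (rq->ntrees == 2)
				return -1;	/* at most two tree-ish */
			rq->tree[rq->ntrees++] = tok;
		} else {
			if (rq->npaths == MAX_PATHS)
				return -1;
			rq->path[rq->npaths++] = tok;
		}
	}
	return rq->ntrees ? 0 : -1;
}
```

Forcing the '--' removes the ambiguity between a second tree-ish and a
first path, at the cost of a slightly more verbose input.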

Alternatively, change the semantics of the path limiter if it
matches the --follow argument _exactly_.
-- 
Jakub Narebski
Poland


* Re: [RFC] gitweb wishlist and TODO list
  2006-10-09 12:49 [RFC] gitweb wishlist and TODO list Jakub Narebski
  2006-10-10  1:47 ` Luben Tuikov
  2006-10-11  5:52 ` Junio C Hamano
@ 2006-10-11 15:09 ` Jakub Narebski
  2006-10-15 15:17   ` [RFC] Ideas for new "stats" view in gitweb Jakub Narebski
  2006-10-11 23:05 ` [RFC] gitweb wishlist and TODO list Jakub Narebski
  3 siblings, 1 reply; 10+ messages in thread
From: Jakub Narebski @ 2006-10-11 15:09 UTC (permalink / raw)
  To: git

Jakub Narebski wrote:

>  * Graph of number of changed files in given branch; probably should be
>    cached.

See for example StatCVS and FishEye
  http://www-128.ibm.com/developerworks/java/library/j-statcvs/
  http://statcvs.sourceforge.net/statcvs-stats/

  http://fisheye.codehaus.org/browse/activecluster
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


* Re: [RFC] gitweb wishlist and TODO list
  2006-10-09 12:49 [RFC] gitweb wishlist and TODO list Jakub Narebski
                   ` (2 preceding siblings ...)
  2006-10-11 15:09 ` Jakub Narebski
@ 2006-10-11 23:05 ` Jakub Narebski
  3 siblings, 0 replies; 10+ messages in thread
From: Jakub Narebski @ 2006-10-11 23:05 UTC (permalink / raw)
  To: git

4. New features

 * Better support for symlinks in the "tree" view, perhaps in the
   "_filename_ -> _target_" form instead of simply "_filename_"
   if the symlink is relative, and the target is inside repository
   (not checking if it exists), "_filename_ -> target" otherwise.
   Needs some normalizing (removing of '/./' and '/../') of the symlink
   target.
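
The normalizing step could look roughly like the sketch below (in C for a
self-contained example; gitweb would do the same in Perl). The function
name normalize_link and the fixed buffer size are assumptions. It joins the
directory of the symlink with a relative target, drops '/./' and empty
components, resolves '/../', and reports failure when the target is
absolute or climbs above the top directory, i.e. when it points outside the
repository.

```c
#include <assert.h>
#include <string.h>

/*
 * dir:    directory containing the symlink, relative to the top of the
 *         tree ("" for the top directory itself)
 * target: contents of the symlink
 * Writes the normalized path, relative to the top of the tree, into out.
 * Returns 0 on success, -1 if the target is outside the tree or too long.
 */
static int normalize_link(const char *dir, const char *target,
			  char *out, size_t outlen)
{
	char buf[4096];
	size_t len = 0;
	char *comp, *next;

	assert(outlen > 0);
	if (target[0] == '/')
		return -1;		/* absolute target */
	if (strlen(dir) + strlen(target) + 2 > sizeof(buf))
		return -1;
	strcpy(buf, dir);
	strcat(buf, "/");
	strcat(buf, target);

	out[0] = '\0';
	for (comp = buf; comp; comp = next) {
		size_t clen;

		next = strchr(comp, '/');
		if (next)
			*next++ = '\0';
		clen = strlen(comp);
		if (clen == 0 || !strcmp(comp, "."))
			continue;	/* skip '//' and '/./' */
		if (!strcmp(comp, "..")) {
			char *slash;

			if (len == 0)
				return -1;	/* above the top directory */
			slash = strrchr(out, '/');
			len = slash ? (size_t)(slash - out) : 0;
			out[len] = '\0';
			continue;
		}
		if (len + clen + 2 > outlen)
			return -1;
		if (len)
			out[len++] = '/';
		memcpy(out + len, comp, clen + 1);
		len += clen;
	}
	return 0;
}
```

A return value of 0 would then select the hyperlinked "_filename_ ->
_target_" form, and -1 the plain "_filename_ -> target" form.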

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


* Re: [RFC] gitweb wishlist and TODO list
  2006-10-11  5:52 ` Junio C Hamano
  2006-10-11  9:20   ` Jakub Narebski
@ 2006-10-12 10:03   ` Junio C Hamano
  2006-10-13 19:55     ` Jakub Narebski
  1 sibling, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2006-10-12 10:03 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Junio C Hamano <junkio@cox.net> writes:

> Jakub Narebski <jnareb@gmail.com> writes:
>
>>  * add summary of number of lines changed for each file (a la darcsview)
>>    in the difftree part of commit and *diff* views, e.g.
>>
>>         blame.c   +1 -0  diff | history | blame
>>
>>    or something like that.
>
> I'll place "diff --numstat" to the stack of "things to do on the
> core side".  Should be trivial.

This is only lightly tested. I haven't done test suite nor
documentation, which the list should be able to take care of,
now my git day for this week is over ;-).

-- >8 --
[PATCH] diff --numstat

Signed-off-by: Junio C Hamano <junkio@cox.net>
---
 combine-diff.c |    9 ++++++---
 diff.c         |   29 +++++++++++++++++++++++++++--
 diff.h         |   15 ++++++++-------
 3 files changed, 41 insertions(+), 12 deletions(-)

diff --git a/combine-diff.c b/combine-diff.c
index 46d9121..65c7868 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -856,8 +856,10 @@ void diff_tree_combined(const unsigned c
 		/* show stat against the first parent even
 		 * when doing combined diff.
 		 */
-		if (i == 0 && opt->output_format & DIFF_FORMAT_DIFFSTAT)
-			diffopts.output_format = DIFF_FORMAT_DIFFSTAT;
+		int stat_opt = (opt->output_format &
+				(DIFF_FORMAT_NUMSTAT|DIFF_FORMAT_DIFFSTAT));
+		if (i == 0 && stat_opt)
+			diffopts.output_format = stat_opt;
 		else
 			diffopts.output_format = DIFF_FORMAT_NO_OUTPUT;
 		diff_tree_sha1(parent[i], sha1, "", &diffopts);
@@ -887,7 +889,8 @@ void diff_tree_combined(const unsigned c
 			}
 			needsep = 1;
 		}
-		else if (opt->output_format & DIFF_FORMAT_DIFFSTAT)
+		else if (opt->output_format &
+			 (DIFF_FORMAT_NUMSTAT|DIFF_FORMAT_DIFFSTAT))
 			needsep = 1;
 		if (opt->output_format & DIFF_FORMAT_PATCH) {
 			if (needsep)
diff --git a/diff.c b/diff.c
index fb82432..2dcad19 100644
--- a/diff.c
+++ b/diff.c
@@ -795,6 +795,23 @@ static void show_stats(struct diffstat_t
 	       set, total_files, adds, dels, reset);
 }
 
+static void show_numstat(struct diffstat_t* data, struct diff_options *options)
+{
+	int i;
+
+	for (i = 0; i < data->nr; i++) {
+		struct diffstat_file *file = data->files[i];
+
+		printf("%d\t%d\t", file->added, file->deleted);
+		if (options->line_termination &&
+		    quote_c_style(file->name, NULL, NULL, 0))
+			quote_c_style(file->name, NULL, stdout, 0);
+		else
+			fputs(file->name, stdout);
+		putchar(options->line_termination);
+	}
+}
+
 struct checkdiff_t {
 	struct xdiff_emit_state xm;
 	const char *filename;
@@ -1731,6 +1748,7 @@ int diff_setup_done(struct diff_options 
 				      DIFF_FORMAT_CHECKDIFF |
 				      DIFF_FORMAT_NO_OUTPUT))
 		options->output_format &= ~(DIFF_FORMAT_RAW |
+					    DIFF_FORMAT_NUMSTAT |
 					    DIFF_FORMAT_DIFFSTAT |
 					    DIFF_FORMAT_SUMMARY |
 					    DIFF_FORMAT_PATCH);
@@ -1740,6 +1758,7 @@ int diff_setup_done(struct diff_options 
 	 * recursive bits for other formats here.
 	 */
 	if (options->output_format & (DIFF_FORMAT_PATCH |
+				      DIFF_FORMAT_NUMSTAT |
 				      DIFF_FORMAT_DIFFSTAT |
 				      DIFF_FORMAT_CHECKDIFF))
 		options->recursive = 1;
@@ -1828,6 +1847,9 @@ int diff_opt_parse(struct diff_options *
 	else if (!strcmp(arg, "--patch-with-raw")) {
 		options->output_format |= DIFF_FORMAT_PATCH | DIFF_FORMAT_RAW;
 	}
+	else if (!strcmp(arg, "--numstat")) {
+		options->output_format |= DIFF_FORMAT_NUMSTAT;
+	}
 	else if (!strncmp(arg, "--stat", 6)) {
 		char *end;
 		int width = options->stat_width;
@@ -2602,7 +2624,7 @@ void diff_flush(struct diff_options *opt
 		separator++;
 	}
 
-	if (output_format & DIFF_FORMAT_DIFFSTAT) {
+	if (output_format & (DIFF_FORMAT_DIFFSTAT|DIFF_FORMAT_NUMSTAT)) {
 		struct diffstat_t diffstat;
 
 		memset(&diffstat, 0, sizeof(struct diffstat_t));
@@ -2612,7 +2634,10 @@ void diff_flush(struct diff_options *opt
 			if (check_pair_status(p))
 				diff_flush_stat(p, options, &diffstat);
 		}
-		show_stats(&diffstat, options);
+		if (output_format & DIFF_FORMAT_NUMSTAT)
+			show_numstat(&diffstat, options);
+		if (output_format & DIFF_FORMAT_DIFFSTAT)
+			show_stats(&diffstat, options);
 		separator++;
 	}
 
diff --git a/diff.h b/diff.h
index b48c991..435c70c 100644
--- a/diff.h
+++ b/diff.h
@@ -26,20 +26,21 @@ typedef void (*diff_format_fn_t)(struct 
 
 #define DIFF_FORMAT_RAW		0x0001
 #define DIFF_FORMAT_DIFFSTAT	0x0002
-#define DIFF_FORMAT_SUMMARY	0x0004
-#define DIFF_FORMAT_PATCH	0x0008
+#define DIFF_FORMAT_NUMSTAT	0x0004
+#define DIFF_FORMAT_SUMMARY	0x0008
+#define DIFF_FORMAT_PATCH	0x0010
 
 /* These override all above */
-#define DIFF_FORMAT_NAME	0x0010
-#define DIFF_FORMAT_NAME_STATUS	0x0020
-#define DIFF_FORMAT_CHECKDIFF	0x0040
+#define DIFF_FORMAT_NAME	0x0100
+#define DIFF_FORMAT_NAME_STATUS	0x0200
+#define DIFF_FORMAT_CHECKDIFF	0x0400
 
 /* Same as output_format = 0 but we know that -s flag was given
  * and we should not give default value to output_format.
  */
-#define DIFF_FORMAT_NO_OUTPUT	0x0080
+#define DIFF_FORMAT_NO_OUTPUT	0x0800
 
-#define DIFF_FORMAT_CALLBACK	0x0100
+#define DIFF_FORMAT_CALLBACK	0x1000
 
 struct diff_options {
 	const char *filter;
-- 
1.4.3.rc2.gdce3


* Re: [RFC] gitweb wishlist and TODO list
  2006-10-12 10:03   ` Junio C Hamano
@ 2006-10-13 19:55     ` Jakub Narebski
  0 siblings, 0 replies; 10+ messages in thread
From: Jakub Narebski @ 2006-10-13 19:55 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio C Hamano wrote:
> This is only lightly tested. I haven't done test suite nor
> documentation, which the list should be able to take care of,
> now my git day for this week is over ;-).
> 
> -- >8 --
> [PATCH] diff --numstat

Does for example

	git diff-tree --numstat --patch-with-stat <tree-ish>

or

	git diff-tree --numstat -p <tree-ish>

work as expected, i.e. prepend the diffstat in machine-friendly (numstat)
format? What happens if one uses both --stat and --numstat?
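
Reading the diff_flush() hunk of the patch, it looks as though both
outputs would be produced, numstat first. A minimal sketch of that
dispatch follows. The two flag values are copied from the diff.h hunk;
parse_format_arg() and describe_stat_outputs() are hypothetical stand-ins
(not functions in git), and the real option parser also accepts
--stat=<width>, which is ignored here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Values copied from the diff.h hunk of the patch. */
#define DIFF_FORMAT_DIFFSTAT	0x0002
#define DIFF_FORMAT_NUMSTAT	0x0004

/*
 * Hypothetical stand-in for the relevant part of diff_opt_parse():
 * each option ORs its own bit into the format mask.
 */
static int parse_format_arg(const char *arg, int output_format)
{
	assert(arg != NULL);
	if (!strcmp(arg, "--numstat"))
		output_format |= DIFF_FORMAT_NUMSTAT;
	else if (!strcmp(arg, "--stat"))
		output_format |= DIFF_FORMAT_DIFFSTAT;
	return output_format;
}

/*
 * Same order of checks as in the patched diff_flush(): numstat first,
 * then diffstat. Instead of printing statistics, this writes the names
 * of the outputs that would be shown into buf, comma separated.
 */
static void describe_stat_outputs(int output_format, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	if (output_format & DIFF_FORMAT_NUMSTAT)
		snprintf(buf, len, "numstat");
	if (output_format & DIFF_FORMAT_DIFFSTAT) {
		size_t used = strlen(buf);
		snprintf(buf + used, len - used, "%sdiffstat",
			 used ? "," : "");
	}
}
```

Because the two bits are independent, giving both options sets both of
them, and the two "if" statements in diff_flush() are not exclusive. Whether
that is the intended behaviour is still the question.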

-- >8 --
Documenting diff --numstat

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
Is this enough documentation? Should we also provide a description of the
numstat format in Documentation/diff-format.txt?
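
If a format description is wanted, a small parser may be the clearest way
to pin it down. A minimal sketch, assuming each numstat line has the form
"<added> TAB <deleted> TAB <path>" (that layout is my assumption; the
documentation text in this patch only promises decimal counts and an
unabbreviated pathname). struct numstat_entry and parse_numstat_line()
are hypothetical names, not part of git.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* One parsed line of numstat output (hypothetical helper type). */
struct numstat_entry {
	unsigned long added;
	unsigned long deleted;
	char path[4096];
};

/*
 * Parse "<added>\t<deleted>\t<path>" with an optional trailing newline.
 * Returns 0 on success, -1 if the line does not have that shape
 * (including the case where a count is not a decimal number).
 */
static int parse_numstat_line(const char *line, struct numstat_entry *out)
{
	char *end;
	size_t len;

	assert(line != NULL && out != NULL);
	if (!isdigit((unsigned char)line[0]))
		return -1;
	out->added = strtoul(line, &end, 10);
	if (*end != '\t' || !isdigit((unsigned char)end[1]))
		return -1;
	out->deleted = strtoul(end + 1, &end, 10);
	if (*end != '\t')
		return -1;
	end++;

	/* The rest of the line, up to the newline, is the pathname. */
	len = strcspn(end, "\n");
	if (len == 0 || len >= sizeof(out->path))
		return -1;
	memcpy(out->path, end, len);
	out->path[len] = '\0';
	return 0;
}
```

A consumer such as gitweb would call parse_numstat_line() once per output
line and skip lines for which it returns -1.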

 Documentation/diff-options.txt |    5 +++++
 diff.h                         |    1 +
 2 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 7b7b9e8..e112172 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -16,6 +16,11 @@
 	The width of the filename part can be controlled by
 	giving another width to it separated by a comma.
 
+--numstat::
+	Similar to \--stat, but shows number of added and
+	deleted lines in decimal notation and pathname without
+	abbreviation, to make it more machine friendly.
+
 --summary::
 	Output a condensed summary of extended header information
 	such as creations, renames and mode changes.
diff --git a/diff.h b/diff.h
index 435c70c..ce3058e 100644
--- a/diff.h
+++ b/diff.h
@@ -171,6 +171,7 @@ #define COMMON_DIFF_OPTIONS_HELP \
 "  --patch-with-raw\n" \
 "                output both a patch and the diff-raw format.\n" \
 "  --stat        show diffstat instead of patch.\n" \
+"  --numstat     show numeric diffstat instead of patch.\n" \
 "  --patch-with-stat\n" \
 "                output a patch and prepend its diffstat.\n" \
 "  --name-only   show only names of changed files.\n" \

-- 
Jakub Narebski
Poland


* [RFC] Ideas for new "stats" view in gitweb
  2006-10-11 15:09 ` Jakub Narebski
@ 2006-10-15 15:17   ` Jakub Narebski
  0 siblings, 0 replies; 10+ messages in thread
From: Jakub Narebski @ 2006-10-15 15:17 UTC (permalink / raw)
  To: git

Jakub Narebski wrote:

> Jakub Narebski wrote:
> 
>>  * Graph of number of changed files in given branch; probably should be
>>    cached.
> 
> See for example StatCVS and FishEye
>   http://www-128.ibm.com/developerworks/java/library/j-statcvs/
>   http://statcvs.sourceforge.net/statcvs-stats/
> 
>   http://fisheye.codehaus.org/browse/activecluster

See also StatCVS-XML (http://statcvs-xml.berlios.de) and SvnStat
(http://svnstat.sourceforge.net), both derivatives of StatCVS, with
a few plots/charts added. If you know other graphical SCM statistics
tools, please mention them.

Would an additional "stats" view help? Would any statistics help (well,
besides the diffstat in commitdiff; perhaps later a graphical diffstat in
gitweb)?

For example, all (I think) of the above projects include a plot of the
"size" (usually in lines of code) of the repository versus time, sometimes
split by a few top authors or a few top subdirectories, sometimes limited to
some subdirectory or even to a single file, together with a plot of "commit
volume" (usually the number of commits per unit of time, e.g. commits per
day, but it could be the number of files changed and/or the number of lines
added/deleted) versus time. Tags are marked on the time scale. This
supposedly helps to see whether a project, part of a project, or an
individual file is in the development, refactoring or maintenance stage.
A graph with the top authors plotted as separate lines shows who was active
at which point in the project history. I'm not sure what the "commit volume"
plot tells us.

Next there are various tables and plots gathering statistics about authors:
lines of code + percentage, number of changes (commits) + percentage,
average number of lines per change, ratio of modifications to newly added
code. Git has git-shortlog for creating a similar summary. That probably
helps to see who takes what part in the development of the project. And
there can be similar tables, charts and plots with module/subdirectory/file
instead of author. For example, top files with respect to size, changes,
or number of revisions in history.
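
To make the per-author table concrete, here is a minimal sketch of the
bookkeeping behind "commits + percentage" and "average number of lines per
change". All names (author_table, record_commit() and so on) are
hypothetical, the fixed-size table is only there to keep the example short,
and where the (author, lines changed) pairs come from is left open.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_AUTHORS 64	/* arbitrary limit for this sketch */

struct author_stats {
	const char *name;		/* not copied; caller keeps it alive */
	unsigned long commits;
	unsigned long lines_changed;	/* added + deleted */
};

struct author_table {
	struct author_stats rows[MAX_AUTHORS];
	size_t nr;
	unsigned long total_commits;
};

/* Account one commit to its author; returns -1 if the table is full. */
static int record_commit(struct author_table *t, const char *author,
			 unsigned long lines_changed)
{
	size_t i;

	assert(t != NULL && author != NULL);
	for (i = 0; i < t->nr; i++)
		if (!strcmp(t->rows[i].name, author))
			break;
	if (i == t->nr) {
		if (t->nr == MAX_AUTHORS)
			return -1;
		t->rows[i].name = author;
		t->rows[i].commits = 0;
		t->rows[i].lines_changed = 0;
		t->nr++;
	}
	t->rows[i].commits++;
	t->rows[i].lines_changed += lines_changed;
	t->total_commits++;
	return 0;
}

/* Share of all recorded commits made by this author, in percent. */
static double commit_share(const struct author_table *t,
			   const struct author_stats *row)
{
	if (!t->total_commits)
		return 0.0;
	return 100.0 * row->commits / t->total_commits;
}

/* Average number of changed lines per commit for this author. */
static double lines_per_change(const struct author_stats *row)
{
	if (!row->commits)
		return 0.0;
	return (double)row->lines_changed / row->commits;
}
```

The table has to be zeroed (e.g. with memset()) before the first
record_commit() call. The same structure would work with a subdirectory or
file name as the key instead of the author.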

Then there are IMVHO not very useful (except for satisfying idle curiosity)
histograms of activity (either number of commits, or number of changed
lines) per hour of day, or per day of week, or per month of year (in older
projects).
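
For what it is worth, such histograms are cheap to compute. A minimal
sketch, assuming commit times are available as Unix timestamps and that
bucketing in UTC is acceptable (a real view would probably want the
author's local time); activity_histogram and count_commit_times() are
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

struct activity_histogram {
	unsigned long per_hour[24];
	unsigned long per_weekday[7];	/* 0 = Sunday, as in struct tm */
};

/*
 * Add one count per timestamp to the hour-of-day and day-of-week
 * buckets. The histogram must be zeroed by the caller beforehand.
 */
static void count_commit_times(const time_t *timestamps, size_t nr,
			       struct activity_histogram *h)
{
	size_t i;

	assert(timestamps != NULL && h != NULL);
	for (i = 0; i < nr; i++) {
		const struct tm *tm = gmtime(&timestamps[i]);

		if (!tm)
			continue;	/* unrepresentable time; skip it */
		h->per_hour[tm->tm_hour]++;
		h->per_weekday[tm->tm_wday]++;
	}
}
```

A per-month histogram would use tm_mon in the same way.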


There are some other plots, charts, tables, graphs... Please do tell which
ones would be good to have in gitweb.

BTW, we would almost certainly have to use some kind of cache, I guess...
and we have only just removed the need for temporary files when creating
diffs...
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


end of thread, other threads:[~2006-10-15 15:17 UTC | newest]

Thread overview: 10+ messages
2006-10-09 12:49 [RFC] gitweb wishlist and TODO list Jakub Narebski
2006-10-10  1:47 ` Luben Tuikov
2006-10-10  8:54   ` Jakub Narebski
2006-10-11  5:52 ` Junio C Hamano
2006-10-11  9:20   ` Jakub Narebski
2006-10-12 10:03   ` Junio C Hamano
2006-10-13 19:55     ` Jakub Narebski
2006-10-11 15:09 ` Jakub Narebski
2006-10-15 15:17   ` [RFC] Ideas for new "stats" view in gitweb Jakub Narebski
2006-10-11 23:05 ` [RFC] gitweb wishlist and TODO list Jakub Narebski
