git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Jakub Narebski <jnareb@gmail.com>
Cc: Nanako Shiraishi <nanako3@lavabit.com>,
	git@vger.kernel.org, Luben Tuikov <ltuikov@yahoo.com>
Subject: Re: [PATCH 2/3] gitweb: Cache $parent_commit info in git_blame()
Date: Wed, 10 Dec 2008 12:27:24 -0800	[thread overview]
Message-ID: <7v7i67zsgj.fsf@gitster.siamese.dyndns.org> (raw)
In-Reply-To: 200812101439.18526.jnareb@gmail.com

Jakub Narebski <jnareb@gmail.com> writes:

> On Wed, 10 Dec 2008, Nanako Shiraishi wrote:
>> Quoting Jakub Narebski <jnareb@gmail.com>:
>> 
>> > Unfortunately the implementation in 244a70e used one call for
>> > git-rev-parse to find parent revision per line in file, instead of
>> > using long lived "git cat-file --batch-check" (which might not existed
>> > then), or changing validate_refname to validate_revision and made it
>> > accept <rev>^, <rev>^^, <rev>^^^ etc. syntax.
>> 
>> Could you substantiate why this is "Unfortunate"?
>
> Because it calls git-rev-parse once for _each line_, even if for lines
> in the group of neighbour lines blamed by same commit $parent_commit
> is the same, and even if you need to calculate $parent_commit only once
> per unique individual commit present in blame output.

It probably was obvious that this was meant as a patch for better
performance without changing the functionality.  I tend to think the
presentation wasn't so great, though.

    Luben Tuikov changed 'lineno' link from leading to commit which lead
    to current version of given block of lines, to leading to parent of
    this commit in 244a70e (Blame "linenr" link jumps to previous state at
    "orig_lineno").  This supposedly made data mining possible (or just
    better).

Other than "supposedly" which I do not think should be there, I think this
is a great opening paragraph to establish the context.

    Unfortunately the implementation in 244a70e used one call for
    git-rev-parse to find parent revision per line in file, instead of
    using long lived "git cat-file --batch-check" (which might not existed
    then), or changing validate_refname to validate_revision and made it
    accept <rev>^, <rev>^^, <rev>^^^ etc. syntax.

But I do not think this is such a great second paragraph that states what
problem it tries to solve.  I'd say something like this instead:

        The current implementation calls rev-parse once per line to find
        parent revision of blamed commit, even when the same commit
        appears more than once, which is inefficient.

And then the outline of the solution:

    This patch attempts to migitate issue a bit by caching $parent_commit
    info in %metainfo, which makes gitweb to call git-rev-parse only once
    per unique commit in blame output.

which is very good, except that I do not think you need to say "a bit".
And have your benchmark (two tables and footnotes) after this outline of
the solution.

I think the first part of "Unfortunately" paragraph can be dropped
(because that is already in the first half of problem description) and the
rest can come as the last paragraph as "Possible future enhancements".

> Appendix A:
> ~~~~~~~~~~~
> #!/bin/bash
>
> export GATEWAY_INTERFACE="CGI/1.1"
> export HTTP_ACCEPT="*/*"
> export REQUEST_METHOD="GET"
> export QUERY_STRING=""$1""
> export PATH_INFO=""$2""
>
> export GITWEB_CONFIG="/home/jnareb/git/gitweb/gitweb_config.perl"
>
> perl -- /home/jnareb/git/gitweb/gitweb.perl
>
> # end of gitweb-run.sh

I'd suggest making a separate patch to add "gitweb-run.sh" in contrib/ so
that other people can use it when checking their changes to gitweb.  The
script might need some more polishing, though.  For example, it is not
very obvious if you have *_config.perl only to customize for your
environment (e.g. where the test repositories are) or you need to have
some overrides in there when you are running gitweb as a standalone script.

To recap, I think the commit log for this patch would have been much
easier to read if it were presented in this order:

	a paragraph to establish the context;

	a paragraph to state what problem it tries to solve;

        a paragraph (or more) to explain the solution; and finally

	a paragraph to discuss possible future enhancements.

  reply	other threads:[~2008-12-10 20:28 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-12-09 22:43 [PATCH 0/3] gitweb: Improve git_blame in preparation for incremental blame Jakub Narebski
2008-12-09 22:46 ` [PATCH 1/3] gitweb: Move 'lineno' id from link to row element in git_blame Jakub Narebski
2008-12-10  5:55   ` Luben Tuikov
2008-12-17  8:13   ` Petr Baudis
2008-12-09 22:48 ` [PATCH 2/3] gitweb: Cache $parent_commit info in git_blame() Jakub Narebski
2008-12-10  3:49   ` Nanako Shiraishi
2008-12-10 13:39     ` Jakub Narebski
2008-12-10 20:27       ` Junio C Hamano [this message]
2008-12-11  0:33         ` [PATCH 2/3 (edit v2)] " Jakub Narebski
2008-12-11  4:08           ` Luben Tuikov
2008-12-11  4:18             ` Junio C Hamano
2008-12-12  3:05           ` Junio C Hamano
2008-12-12 17:20             ` Jakub Narebski
2008-12-17  8:19           ` Petr Baudis
2008-12-17  8:34             ` Junio C Hamano
2008-12-10  6:20   ` [PATCH 2/3] " Luben Tuikov
2008-12-10 15:15     ` Jakub Narebski
2008-12-10 20:05       ` Luben Tuikov
2008-12-10 21:03         ` Jakub Narebski
2008-12-10 21:15           ` Luben Tuikov
2008-12-09 22:48 ` [PATCH 3/3] gitweb: A bit of code cleanup " Jakub Narebski
2008-12-10  2:13   ` Jakub Narebski
2008-12-10  8:35     ` Junio C Hamano
2008-12-10  6:24   ` Luben Tuikov
2008-12-10 20:11 ` [RFC/PATCH 4/3] gitweb: Incremental blame (proof of concept) Jakub Narebski
2008-12-11  0:47   ` Junio C Hamano
2008-12-11  1:22     ` Jakub Narebski
2008-12-11 17:28   ` Jakub Narebski
2008-12-11 22:34     ` Jakub Narebski
2008-12-14  0:17   ` [RFC/PATCH v2] " Jakub Narebski
2008-12-14 16:11     ` [RFC] gitweb: Incremental blame - suggestions for improvements Jakub Narebski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7v7i67zsgj.fsf@gitster.siamese.dyndns.org \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=jnareb@gmail.com \
    --cc=ltuikov@yahoo.com \
    --cc=nanako3@lavabit.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).