From: Jakub Narebski <jnareb@gmail.com>
To: git@vger.kernel.org
Subject: Re: [RFC] Possible optimization for gitweb
Date: Tue, 19 Dec 2006 22:45:32 +0100
Message-ID: <em9mcs$moo$1@sea.gmane.org>
In-Reply-To: 20061219205422.GA17864@localhost

[Please send replies Cc: git mailing list]

Robert Fitzsimons wrote:

> While looking at the gitweb source yesterday, I noticed a number of
> similar expensive workflows used by a number of actions (summary,
> shortlog, log, rss, atom, and history).
> 
> The current workflows are:
>       get ~100 sha1's using rev-list
>       foreach sha1
>               get/parse 1 commit using rev-list
>               output commit
> 
> The new workflows I'm proposing would be:
>       get/parse ~100 commit's using rev-list
>       foreach commit
>               output commit

I have tried this approach too. Take a look at

  http://repo.or.cz/w/git/jnareb-git.git?a=log;h=Attic/gitweb/parse_rev_list

or at discussion started with
  Message-Id: <200609061504.40725.jnareb@gmail.com>
  http://mid.gmane.org/200609061504.40725.jnareb@gmail.com
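
As a rough sketch of the single-call approach: rev-list --header ends
each commit record with a NUL, so one invocation can be split into
records afterwards. The helper name record_ids is made up for this
example, and it assumes no commit message contains the 0x1F byte.

```shell
# Hypothetical helper: read `git rev-list --header --parents` output on
# stdin and print the first line of every NUL-terminated record, i.e.
# the commit id followed by its parents.
# Assumption: the 0x1F (unit separator) byte never occurs in the input.
record_ids() {
    # Swap newline and NUL so that each record becomes a single line,
    # then keep only the text before the first former newline.
    tr '\n\000' '\037\n' | cut -d "$(printf '\037')" -f 1
}

# Usage, inside a repository:
#   git rev-list --header --parents --max-count=100 HEAD -- | record_ids
```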

> The following simplified commands gives an idea of the git only overhead
> between these two workflows.
> 
> time \
> for r in `git-rev-list --max-count=100 HEAD --` ; \
> do git-rev-list --header --parents --max-count=1 $r -- ; \
> done > /dev/null
> 
> real    0m0.490s
> user    0m0.224s
> sys     0m0.228s
> 
> time \
> git-rev-list --header --parents --max-count=100 HEAD -- > /dev/null
> 
> real    0m0.058s
> user    0m0.008s
> sys     0m0.004s
> 
> There would seems to be a benefit from making the proposed change to
> these workflows, when run on my machine against a clone of Linus's tree.

The problem is that it works only for the "log" and "shortlog" views; it
doesn't work for the "history" view, and these now share the same
infrastructure. When there is a path limiter (be it a file or a directory),
the history is simplified, and parents are _rewritten_ according to the
simplified history. Whether this happens depends on a strange combination
of --header, --parents and --full-history. The details should be somewhere
in the archives.

And we don't want to use the parents recorded in the commit object,
because there might be grafts, or the repository might be a shallow clone.

On the other hand, we don't really need parents for log, shortlog and
history...
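
One way to see the rewriting is to compare the parents that rev-list
reports for the same commit with and without a path limiter. This is
only a sketch: listed_parents is a made-up helper, and the commit and
path in the usage lines are placeholders.

```shell
# Hypothetical helper: read `git rev-list --parents` output on stdin
# (one "commit parent..." line per commit) and print the parents
# listed for the commit given as $1.
listed_parents() {
    awk -v c="$1" '$1 == c { $1 = ""; sub(/^ /, ""); print }'
}

# Usage, inside a repository (commit and path are placeholders):
#   git rev-list --parents HEAD          | listed_parents "$commit"
#   git rev-list --parents HEAD -- path  | listed_parents "$commit"
# With the path limiter, the second command may print rewritten
# parents, i.e. the nearest ancestors that also touch the path.
```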

> One issue with this change is that, gitweb is page orientated.  Page 0
> shows the first 100 items from a given hash, page 1 uses the same given
> hash but show 100 to 199 items, etc.  Using 'git-rev-list --header
> --parents' and then throwing away most of the result is very wasteful.
> 
> So I'm suggesting we add a new option to git-rev-list which will only
> start show results once its has iterated past a given number of items.
> Using a caret or tilde doesn't seem to return the same result.
> 
> I've attached a discussion patch which adds a new option --start-count
> to git-rev-list and changed the summary and showlog actions of gitweb to
> use this new option.

Very nice idea.
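
For the paging itself, the arithmetic could look roughly like this.
It is a sketch only: page_args is a made-up helper, and it assumes the
option is spelled --skip and drops the first N commits before any
output, as proposed here.

```shell
# Hypothetical helper: turn a 0-based page number ($1) and an optional
# page size ($2, default 100) into rev-list arguments.
# Assumption: --skip=N discards the first N commits before output starts.
page_args() {
    page=$1
    per_page=${2:-100}
    echo "--skip=$((page * per_page)) --max-count=$per_page"
}

# Usage: page 1 covers items 100 to 199.
#   git rev-list --header --parents $(page_args 1) HEAD --
```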
 
> I'm sure there are many improvements to this patch, comments?

Perhaps this patch should be split in two? (Usually either the second mail
is a reply to the first, or both are replies to an introductory letter,
which usually carries a table of contents and a diffstat of the series.)

[...]

Documentation (of --start-count / --skip option), please?


P.S. Thanks for the patches.

P.P.S. Do you have any comments on the latest "[RFC] gitweb wishlist and
TODO list" series?
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git