From: Jon Seymour <jon.seymour@gmail.com>
To: Git Mailing List <git@vger.kernel.org>
Subject: [PATCH] Modify git-rev-list ... [ resend to fix whitespace mangle ]
Date: Sat, 4 Jun 2005 08:34:31 +1000 [thread overview]
Message-ID: <2cfc40320506031534456c8619@mail.gmail.com> (raw)
In-Reply-To: <2cfc4032050603153419b72444@mail.gmail.com>
On 6/4/05, Junio C Hamano <junkio@cox.net> wrote:
> I only fast-forwarded over maths on your website about the epoch
> ordering, but what the patch tries to do sounds right to me.
>
> One nitpick request I have is to name its test t6000 series, to
> match the naming convention Pasky came up with?
>
> First digit tells the family:
>
> 0 - the absolute basics and global stuff
> 1 - the basic commands concerning database
> 2 - the basic commands concerning the working tree
> 3 - the other basic commands (e.g. ls-files)
> 4 - the diff commands
> 5 - the pull and exporting commands
> 6 - the revision tree commands (even e.g. merge-base)
>
Sure. I'll modify the patch and repost.
> Also I wonder what the performance implication of this patch is.
>
It actually works quite well. I took some care to make sure the
algorithm didn't have to scan the entire tree if you only want part of
it and is roughly linear in the number of edges it actually processes.
It does this by borrowing the original heuristic of a latest date
first scan, and also by accumulating "work" to do prior to propagating
it through the network. This means that most nodes are only visited (a
small-multiple of) once.
One thing to be aware of is that it does have to scan to epoch
boundaries which, in the Linux kernel, are spaced an average of 100
commits apart. The first time you run it on a cold-cache, the output
pauses for a while after the first few commits. But run it again when
the cache is warm and the performance is, IMO, quite ok.
Here are the cold-cache stats on the Linux-2.6 tree which has 1300
commits and quite a lot of merges and forks
$ time git-rev-list --merge-order --show-breaks HEAD > out 2>err
real 0m32.283s
user 0m0.135s
sys 0m0.141s
And the warm cache stats:
$ time git-rev-list --merge-order --show-breaks HEAD > out 2>err
real 0m0.181s
user 0m0.119s
sys 0m0.040s
Compared with the standard algorithm:
$ time git-rev-list HEAD > out 2>err
real 0m0.112s
user 0m0.065s
sys 0m0.032s
So, it is perhaps 50-100% slower on a warm-cache than the standard algorithm.
jon.
--
homepage: http://www.zeta.org.au/~jon/
blog: http://orwelliantremors.blogspot.com/
prev parent reply other threads:[~2005-06-03 22:31 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2005-06-02 15:49 [PATCH] Modify git-rev-list ... [ resend to fix whitespace mangle ] jon
[not found] ` <7v8y1rkrxk.fsf@assigned-by-dhcp.cox.net>
[not found] ` <2cfc4032050603153419b72444@mail.gmail.com>
2005-06-03 22:34 ` Jon Seymour [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2cfc40320506031534456c8619@mail.gmail.com \
--to=jon.seymour@gmail.com \
--cc=git@vger.kernel.org \
--cc=jon@blackcubes.dyndns.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).