From: Linus Torvalds <torvalds@osdl.org>
To: Marco Costalba <mcostalba@gmail.com>
Cc: Junio C Hamano <junkio@cox.net>, git@vger.kernel.org
Subject: Re: [WISH] Store also tag dereferences in packed-refs
Date: Sun, 19 Nov 2006 12:36:46 -0800 (PST)
Message-ID: <Pine.LNX.4.64.0611191219350.3692@woody.osdl.org>
In-Reply-To: <e5bfff550611191209s63982818vd3999b543e68e8df@mail.gmail.com>
On Sun, 19 Nov 2006, Marco Costalba wrote:
>
> It does not seem there are strange delays, but the total time is high
> (very I/O bound)
This looks more normal. No truly horrid IO times. With your disk, having
an uncached "stat64()" taking ~50ms is not at all impossible, if you just
end up having to do a few seeks for directory/inode information.
> $ time strace -o tracefile -Ttt git show-ref -d >> /dev/null
> 0.02user 0.01system 0:02.39elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (127major+894minor)pagefaults 0swaps
So in addition to the "stat()" calls on all the objects you have
referenced, you also had 127 page faults that needed to do IO (probably a
combination of executable and pack-file accesses).
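To see where the elapsed time in such a trace actually goes, the per-call durations that "strace -T" appends in angle brackets can be summed per syscall. The following is a minimal sketch, not part of the original message: the function name and the sample lines (paths, timings) are invented for illustration, and it assumes the usual "-T -tt" line layout of timestamp, call, result, then "<seconds>".

```python
import re
from collections import defaultdict

# Matches lines such as:
#   12:36:46.000100 stat64("path", {...}) = 0 <0.050000>
LINE_RE = re.compile(r"^\d+:\d+:\d+\.\d+\s+(\w+)\(.*<(\d+\.\d+)>\s*$")


def syscall_totals(lines):
    """Return {syscall: (count, total_seconds)} for `strace -T -tt` lines."""
    totals = defaultdict(lambda: [0, 0.0])
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip signals, exit notices, unfinished calls
        name, secs = m.group(1), float(m.group(2))
        totals[name][0] += 1
        totals[name][1] += secs
    return {name: (count, total) for name, (count, total) in totals.items()}


# Made-up sample lines, only to show the expected input shape.
sample = [
    '12:36:46.000100 stat64(".git/refs/tags/v1.0", {st_mode=S_IFREG|0644, ...}) = 0 <0.050000>',
    '12:36:46.050200 stat64(".git/refs/tags/v1.1", {st_mode=S_IFREG|0644, ...}) = 0 <0.020000>',
    '12:36:46.070300 open(".git/packed-refs", O_RDONLY) = 3 <0.000040>',
]

if __name__ == "__main__":
    ranked = sorted(syscall_totals(sample).items(), key=lambda kv: -kv[1][1])
    for name, (count, total) in ranked:
        print(f"{name:10s} calls={count:4d} total={total:.6f}s")
```

Run against the real tracefile (for example by passing open("tracefile") as the lines), this would show whether the uncached stat calls dominate the wall-clock time.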
I think the only way to avoid this is likely to try to either not do the
object lookups at all (which you really cannot currently avoid with "-d",
since the whole point is to dereference the objects if they are tags), or
to do some silly optimizations like fsck does.
For example, it's often (but not always) faster to do all the readdir's
separately, and then sort the thing by inode number, and try to avoid
back-and-forth movement. But quite frankly, that kind of stuff probably
isn't sane to do in "git show-ref".
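The readdir-then-sort idea can be sketched as follows. This is an illustration only, not what fsck or show-ref actually does; the function name is hypothetical. It collects all directory entries first, sorts them by inode number, and only then stats them, on the assumption that inode order roughly follows on-disk order on the filesystem in use.

```python
import os


def stat_in_inode_order(directory):
    """List a directory first, then lstat the entries sorted by inode number.

    Returns a list of (path, stat_result) pairs in inode order.
    """
    with os.scandir(directory) as it:
        # DirEntry.inode() comes from the directory listing itself on
        # POSIX systems, so this pass does not stat each file.
        entries = [(entry.inode(), entry.path) for entry in it]
    entries.sort()  # ascending inode number
    return [(path, os.lstat(path)) for _, path in entries]
```

Whether this helps depends on the filesystem and on how cold the cache is, which is why the message calls it "often (but not always)" faster.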
So the optimizations that _can_ be done are:
- add dereference info to .git/packed-refs
This would allow us to simply not do the expensive object lookup for
every single tag. We'd still have to do it for non-packed objects, of
course, but the cost here tends to be that over time you might have
hundreds of tags, and even if each tag only takes 0.02s to look up,
you're going to be slow.
- avoid the object lookups for "heads/" (which we know are supposed to be
commits, and cannot be tags) and when "-d" is not specified. This won't
help your case very much, though. If you want "-d", you want it, and
the _big_ number of refs tends to be in tags, not branches, anyway.
- using a filesystem with nicer locality behaviour for directory entries
and inodes. This can cut down the cost of the cold-cache case by a factor of
two, but right now there are no good filesystems that do this (but see
for example "spadfs" that Mikulas Patocka announced a few weeks ago on
linux-kernel - it would seem to have the possibility of being better in
this area. I looked at the code and it looked like it could become
very reasonable, but I've not actually _tested_ it, soo...)
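The second item in the list above, together with the per-tag cost from the first, can be sketched as a small predicate and a back-of-the-envelope estimate. The function names and the ref names are made up for illustration; the 0.02s default is the per-lookup figure used in the message, not a measurement.

```python
def needs_object_lookup(refname, dereference):
    """Decide whether a ref needs an object lookup to find out if it is a tag."""
    if not dereference:
        return False  # no "-d": nothing to peel
    if refname.startswith("refs/heads/"):
        return False  # branches are supposed to be commits, never tags
    return True


def estimated_cold_seconds(refnames, dereference, per_lookup=0.02):
    """Rough cold-cache cost: number of lookups times the per-lookup time."""
    lookups = sum(1 for r in refnames if needs_object_lookup(r, dereference))
    return lookups * per_lookup


# Hypothetical repository: 300 tags and 5 branches.
refs = [f"refs/tags/v{i}" for i in range(300)]
refs += [f"refs/heads/topic{i}" for i in range(5)]

if __name__ == "__main__":
    print(estimated_cold_seconds(refs, dereference=True))
    print(estimated_cold_seconds(refs, dereference=False))
```

With "-d" the branches drop out but all 300 tags remain, so the estimate stays at roughly six seconds. Skipping "heads/" alone does little when tags dominate.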
Anyway, I think that if we really want to make "git show-ref" go fast
when things are cold in the cache, and with lots of tags and "-d" (which
is a reasonable case to optimize for: it's probably exactly what we end up
doing both for gitweb _and_ for "git-send-pack"), we'd need to expand the
packed-refs file with the deref cache.
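One possible shape for such a deref cache is sketched below. The message does not specify a format, so the layout here is an assumption for illustration: each ref is a "<sha> <refname>" line, and a tag may be followed by a line starting with "^" that holds the object the tag dereferences to. The function name and the object names in the sample are invented.

```python
def parse_packed_refs(text):
    """Parse packed-refs text into {refname: (sha, peeled_sha_or_None)}.

    Assumed layout (hypothetical): "<sha> <refname>" per ref, optionally
    followed by "^<sha>" giving the dereferenced object for that ref.
    """
    refs = {}
    last = None
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank line or comment/header line
        if line.startswith("^"):
            if last is not None:
                sha, _ = refs[last]
                refs[last] = (sha, line[1:].strip())
            continue
        sha, name = line.split(" ", 1)
        refs[name] = (sha, None)
        last = name
    return refs


# Invented 40-character object names, only to show the input shape.
sample = "\n".join([
    "a" * 40 + " refs/tags/v1.0",
    "^" + "b" * 40,
    "c" * 40 + " refs/heads/master",
])

if __name__ == "__main__":
    for name, (sha, peeled) in parse_packed_refs(sample).items():
        print(name, sha, peeled or "(no deref entry)")
```

With the dereferenced value stored next to the ref, "-d" output for packed tags could be produced from this one file, without opening the tag objects themselves.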
Junio?