git.vger.kernel.org archive mirror
From: Linus Torvalds <torvalds@osdl.org>
To: Marco Costalba <mcostalba@gmail.com>
Cc: Junio C Hamano <junkio@cox.net>, git@vger.kernel.org
Subject: Re: [WISH] Store also tag dereferences in packed-refs
Date: Sun, 19 Nov 2006 12:36:46 -0800 (PST)
Message-ID: <Pine.LNX.4.64.0611191219350.3692@woody.osdl.org>
In-Reply-To: <e5bfff550611191209s63982818vd3999b543e68e8df@mail.gmail.com>



On Sun, 19 Nov 2006, Marco Costalba wrote:
> 
> It does not seems there are strange delays, but total time it's high
> (very I/O bound)

This looks more normal. No truly horrid IO times. With your disk, having 
an uncached "stat64()" taking ~50ms is not at all impossible, if you just 
end up having to do a few seeks for directory/inode information.

> $ time strace -o tracefile -Ttt git show-ref -d >> /dev/null
> 0.02user 0.01system 0:02.39elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (127major+894minor)pagefaults 0swaps

So in addition to the "stat()" calls on all the objects you have 
referenced, you also had 127 page faults that needed to do IO (probably a 
combination of executable and pack-file accesses). 
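
To see where the elapsed time in such a trace actually goes, the per-call 
times that "strace -T" appends in angle brackets can be summed per syscall. 
The following is only a minimal sketch: the helper name `summarize` is 
made up for illustration, and the regular expression assumes the usual 
"-tt -T" line layout (timestamp, call, result, `<seconds>`).

```python
import re
from collections import defaultdict

# Assumed layout of a line written by `strace -tt -T`, for example:
#   12:36:44.123456 stat64(".git/refs/tags/v1.0", {...}) = 0 <0.051234>
LINE_RE = re.compile(r'^\d{2}:\d{2}:\d{2}\.\d+\s+(\w+)\(.*<(\d+\.\d+)>\s*$')

def summarize(lines):
    """Return {syscall: (call_count, total_seconds)} for a strace log."""
    totals = defaultdict(lambda: [0, 0.0])
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip exit markers, signals, unfinished calls
        name, elapsed = m.group(1), float(m.group(2))
        totals[name][0] += 1
        totals[name][1] += elapsed
    return {name: (count, secs) for name, (count, secs) in totals.items()}

def print_summary(path):
    """Print syscalls from the log at `path`, slowest total first."""
    with open(path) as f:
        stats = summarize(f)
    for name, (count, secs) in sorted(stats.items(), key=lambda kv: -kv[1][1]):
        print(f"{name:12s} {count:6d} calls {secs:9.3f}s")
```

Calling `print_summary("tracefile")` on the file written by the command 
quoted above would list the syscalls by total time spent in them.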

I think the only way to avoid this is likely to try to either not do the 
object lookups at all (which you really cannot currently avoid with "-d", 
since the whole point is to dereference the objects if they are tags), or 
to do some silly optimizations like fsck does.

For example, it's often (but not always) faster to do all the readdir's 
separately, and then sort the thing by inode number, and try to avoid 
back-and-forth movement. But quite frankly, that kind of stuff probably 
isn't sane to do in "git show-ref".
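
The readdir-then-sort idea can be sketched as follows. This is an 
illustration only, not what fsck does internally: the function names are 
hypothetical, and it assumes a POSIX system where the inode number is 
available from the directory entry itself without a separate stat().

```python
import os

def inode_sorted_paths(directory):
    """Read a directory once and return (inode, path) pairs sorted by inode."""
    # DirEntry.inode() is filled from the readdir data on POSIX systems,
    # so this pass should not need a stat() call per file.
    with os.scandir(directory) as it:
        entries = [(entry.inode(), entry.path) for entry in it]
    entries.sort()
    return entries

def stat_in_inode_order(directory):
    """lstat() every entry, visiting the entries in inode-number order."""
    # Visiting in inode order is a heuristic: on many filesystems nearby
    # inode numbers live close together on disk, which may reduce seeking.
    return [(path, os.lstat(path)) for _, path in inode_sorted_paths(directory)]
```

Whether this helps depends on the filesystem; it is a locality heuristic 
and not a guarantee.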

So the optimizations that _can_ be done are:

 - add dereference info to .git/packed-refs

   This would allow us to simply not do the expensive object lookup for 
   every single tag. We'd still have to do it for non-packed refs, of 
   course, but the cost here tends to be that over time you might have 
   hundreds of tags, and even if each tag only takes 0.02s to look up, 
   you're going to be slow.

 - avoid the dereferences for "heads/" (which we know are supposed to be 
   commits, and cannot be tags) and when "-d" is not specified. This won't 
   help your case very much, though. If you want "-d", you want it, and 
   the _big_ number of refs tends to be in tags, not branches, anyway.

 - using a filesystem with nicer locality behaviour for directory entries 
   and inodes. This can cut down the cost of the cold-cache case by a 
   factor of two, but right now there are no good filesystems that do 
   this (but see 
   for example "spadfs" that Mikulas Patocka announced a few weeks ago on 
   linux-kernel - it would seem to have the possibility of being better in 
   this area. I looked at the code and it looked like it could become 
   very reasonable, but I've not actually _tested_ it, soo...)
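
The first option above can be sketched as a reader for a packed-refs file 
that carries the dereferenced object next to each tag. The layout used 
here is an assumption for illustration (this message does not specify 
one): each ref is a "<sha> <refname>" line, and a following line that 
starts with "^" holds the object the tag dereferences to. The function 
names are hypothetical.

```python
def parse_packed_refs(text):
    """Parse packed-refs text into {refname: (sha, deref_sha_or_None)}."""
    refs = {}
    last = None
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank lines and comment/header lines
        if line.startswith("^"):
            # Assumed deref line: belongs to the ref on the previous line.
            if last is None:
                raise ValueError("deref line without a preceding ref")
            sha, _ = refs[last]
            refs[last] = (sha, line[1:].strip())
            continue
        sha, name = line.split(" ", 1)
        refs[name] = (sha, None)
        last = name
    return refs

def show_ref_d(refs):
    """Yield "show-ref -d"-style lines using only the cached deref info."""
    for name in sorted(refs):
        sha, deref = refs[name]
        yield f"{sha} {name}"
        if deref is not None:
            # No object lookup is needed: the answer was stored in the file.
            yield f"{deref} {name}^{{}}"
```

With such a cache, listing hundreds of packed tags costs one file read 
instead of one object lookup per tag; refs that are not in the packed 
file would still need the lookup.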

Anyway, I think that if we really want to make "git show-ref" go fast 
when things are cold in the cache, and with lots of tags and "-d" (which 
is a reasonable case to optimize for: it's probably exactly what we end up 
doing both for gitweb _and_ for "git-send-pack"), we'd need to expand the 
packed-refs file with the deref cache.

Junio?


