From: Craig Boston <craig@olyun.gank.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Junio C Hamano <junkio@cox.net>, Git Mailing List <git@vger.kernel.org>
Subject: Re: Efficient way to import snapshots?
Date: Mon, 30 Jul 2007 20:17:07 -0500
Message-ID: <20070731011707.GA91930@nowhere>
In-Reply-To: <alpine.LFD.0.999.0707301629230.4161@woody.linux-foundation.org>

On Mon, Jul 30, 2007 at 04:30:22PM -0700, Linus Torvalds wrote:
> > # On branch cvs_RELENG_4
> > nothing to commit (working directory clean)
> > git: 67.65 seconds
> 
> So I _seriously_ hope that about 65 of those 67 seconds was the "cvs 
> update -d" or something like that. 

No, the only thing included in that is:

git ls-files -o | git update-index --add --stdin
git commit -a -m "${COMMITMSG}"

> Anything that takes a minute in git is way way *way* too slow. Any 
> half-way normal git operations should take less than a second.

That said, I don't think it's git's fault.  I think most of the time is
spent calling stat() on all the files.  The machine that took 60 seconds
isn't what I'd call top-of-the-line:

- 1st or maybe 2nd-gen Willamette CPU
- 512MB memory (stupid motherboard that won't accept more)
- Slow disks in RAID-5 configuration
- Running ZFS with less than half of the recommended minimum memory, to
  the point where I had to reduce the number of vnodes that the kernel is
  allowed to cache to avoid running out of KVA

A simple find(1) over the CVS checkout directory takes almost as long.
I don't think it has enough memory to cache the whole thing.  Actually I
know it can't, since maxvnodes is set to 25,000 and there are 37,000 files
in the cvs checkout, so it will have to pull some directory entries from
disk regardless.
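
To check the same thing on another tree, a rough sketch along these lines
may help. It walks a directory, calls lstat() on every file (roughly what
find(1) or 'git status' must do), and compares the file count against a
cache limit. The names stat_walk and fits_in_cache are made up for
illustration, and the comparison is optimistic because directories need
vnodes too:

```python
import os
import sys
import time


def stat_walk(root):
    """lstat() every file under root; return (file_count, elapsed_seconds)."""
    count = 0
    start = time.perf_counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # One metadata lookup per file, as a status check would need.
            os.lstat(os.path.join(dirpath, name))
            count += 1
    return count, time.perf_counter() - start


def fits_in_cache(file_count, max_cached):
    """True if every file could have a cached entry at the same time.

    Assumption: one cache entry per file. Directories are ignored here,
    so the real requirement is somewhat higher.
    """
    return file_count <= max_cached


if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    # 25000 is the maxvnodes value quoted in this message.
    limit = int(sys.argv[2]) if len(sys.argv) > 2 else 25000
    n, secs = stat_walk(root)
    print(f"{n} files stat'ed in {secs:.2f}s; fits in cache: {fits_in_cache(n, limit)}")
```

Running it twice in a row gives a crude cold-versus-warm comparison. With
the figures above, fits_in_cache(37000, 25000) is False.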

Just to be sure, I copied the cvs checkout directory and git repository
to a newer, faster dual-core machine with plenty of memory available for
caching.

The first run of 'git status' (cold cache):
git status  1.08s user 3.68s system 13% cpu 34.043 total

The second run:
git status  1.05s user 2.68s system 85% cpu 4.373 total

Based on that I'm fairly confident that most of the 60 seconds is being
spent waiting on data from the disks.  On a tmpfs filesystem I can get
it even faster (1.897 seconds).
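
The arithmetic behind that conclusion can be sketched as follows. The
snippet parses the two timing lines above (the format looks like the
shell's time summary) and subtracts user plus system CPU time from the
wall-clock total. The helper name wait_breakdown is hypothetical, and
treating the remainder as time spent waiting on I/O is an approximation,
not a measurement:

```python
import re

# Matches lines such as:
#   git status  1.08s user 3.68s system 13% cpu 34.043 total
TIME_RE = re.compile(
    r"(?P<user>[\d.]+)s user\s+"
    r"(?P<system>[\d.]+)s system\s+"
    r"(?P<pct>\d+)% cpu\s+"
    r"(?P<total>[\d.]+) total"
)


def wait_breakdown(line):
    """Return (cpu_seconds, non_cpu_seconds, cpu_fraction) for a timing line."""
    m = TIME_RE.search(line)
    if m is None:
        raise ValueError(f"unrecognized timing line: {line!r}")
    cpu = float(m["user"]) + float(m["system"])
    total = float(m["total"])
    # Wall-clock time not accounted for by CPU work; assumed here to be
    # mostly time blocked on disk reads.
    return cpu, total - cpu, cpu / total


if __name__ == "__main__":
    runs = {
        "cold": "git status  1.08s user 3.68s system 13% cpu 34.043 total",
        "warm": "git status  1.05s user 2.68s system 85% cpu 4.373 total",
    }
    for label, line in runs.items():
        cpu, waiting, frac = wait_breakdown(line)
        print(f"{label}: cpu={cpu:.2f}s non-cpu={waiting:.3f}s cpu share={frac:.0%}")
```

For the cold run that leaves about 29.3 of 34.0 seconds unaccounted for by
CPU work, against roughly 0.64 seconds for the warm run, which is
consistent with the disks being the bottleneck.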

As it's a file server for which network is the usual bottleneck, and all
the git operations will be running out of cron, I'm not too worried
about it.

Craig
