From: Nicolas Pitre <nico@cam.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>,
    Git Mailing List <git@vger.kernel.org>, Chris Lee <clee@kde.org>
Subject: Re: kde.git is now online
Date: Thu, 05 Apr 2007 18:00:11 -0400 (EDT)
Message-ID: <alpine.LFD.0.98.0704051703140.28181@xanadu.home>
In-Reply-To: <Pine.LNX.4.64.0704051338290.6730@woody.linux-foundation.org>

On Thu, 5 Apr 2007, Linus Torvalds wrote:
> Without "--full", it doesn't actually really do anything much, since it
> will basically ignore objects that are in the pack.
>
> With --full, there are certainly things that we could improve upon. We
> currently tend to walk things a few times for pack contents:
> - first we do the SHA1 of the full pack
> - then we go back, and unpack and fsck each entry in the pack.
>
> So if the pack-file is too big to fit in memory, we'll basically always
> read it at least twice (and that's ignoring the fact that delta lookup
> will obviously seek back and forth, which makes access patterns worse).
>
> On the other hand, there's a perfectly good reason why we don't actually
> fsck pack-files by default. They're "stable storage". You don't normally
> need to. So I'd not worry too much about fsck performance.

Well.... still it certainly can be helped a bit. I wouldn't mind it
spending half an hour of CPU if it needs to. But I just interrupted it
with ^C with the following result so far:

    real 75m44.374s
    user 2m5.318s
    sys 0m54.059s

(I should have used /usr/bin/time to see the number of page faults).
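
A minimal sketch of how the page-fault count mentioned above could be captured alongside the timings, assuming a Unix system and Python's standard `resource` module. The helper names `measure` and `cpu_fraction` are hypothetical and not part of git. The second helper just divides CPU time by wall-clock time, which for the figures above (4544.374 s real, 125.318 s user, 54.059 s sys) gives roughly 4%, consistent with a run that is almost entirely waiting on IO.

```python
import resource
import subprocess
import sys
import time


def measure(cmd):
    """Run cmd and report wall time, CPU time and page faults of the child.

    Hypothetical helper, Unix only: relies on getrusage(RUSAGE_CHILDREN),
    which accumulates the usage of all waited-for children of this process.
    """
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.monotonic()
    proc = subprocess.run(cmd)
    wall = time.monotonic() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)

    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return {
        "returncode": proc.returncode,
        "wall_seconds": wall,
        "cpu_seconds": cpu,
        "cpu_fraction": cpu / wall if wall > 0 else 0.0,
        # Major faults required disk IO; minor faults did not.
        "major_faults": after.ru_majflt - before.ru_majflt,
        "minor_faults": after.ru_minflt - before.ru_minflt,
    }


def cpu_fraction(real, user, system):
    """Share of wall-clock time actually spent on the CPU (all in seconds)."""
    return (user + system) / real


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python measure.py git fsck --full
    print(measure(sys.argv[1:]))
```

A high major-fault count together with a low CPU fraction would point at the pack not fitting in memory rather than at fsck's own computation.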

> I suspect you'll find that with 1GB of RAM you'll have other
> performance problems that are more pressing ("git clone" comes to mind
> ;)

Well... same issue actually. git-pack-objects spent about 40 secs
firmly at 100% CPU usage counting objects.

Then it got stuck on:

    remote: Done counting 4111366 objects.

again spending 3% CPU and the rest waiting for IO with the disk
definitely thrashing. It didn't allocate more than 47% of memory during
that phase which lasted a few minutes.

Then, the "Indexing 4111366 objects." message appeared and CPU usage
went up to 6% CPU with 67% memory for pack-objects and 30% CPU and 7%
memory for index-pack while the rest was spent waiting for IO. This
also took maybe two minutes.

And now it reached the "Resolving 3305158 deltas." phase with only
index-pack on the radar with approx 10% CPU and 19% memory, and the rest
of the time waiting for IO again.

It has been probably half an hour now and the thing is at:

    21% (710502/3305158) done
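
As a rough illustration of what that progress line implies, here is a small sketch that extrapolates the remaining time linearly. It assumes the delta-resolution rate stays constant, which an IO-bound run may well not do, and it takes the "probably half an hour" above as exactly 30 minutes. The function name `remaining_minutes` is hypothetical.

```python
def remaining_minutes(done, total, elapsed_minutes):
    """Linear extrapolation: assumes the rate observed so far stays constant."""
    if done <= 0:
        raise ValueError("no progress yet, cannot extrapolate")
    fraction = done / total
    # Projected total duration minus the time already spent.
    return elapsed_minutes / fraction - elapsed_minutes


# Figures from the progress line; 30 minutes is an approximation.
fraction = 710502 / 3305158
eta = remaining_minutes(710502, 3305158, 30)
print(f"{fraction:.1%} done, roughly {eta:.0f} more minutes at the same rate")
```

Under those assumptions the phase would need somewhere around 110 more minutes.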

So it will work and eventually complete. And the good news is that the
worst part performance wise is on the client side. But it looks like
we're definitely thrashing the kernel buffer cache.


Nicolas