From: Nicolas Pitre <nico@cam.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Shawn O. Pearce" <spearce@spearce.org>,
	Geert Bosch <bosch@adacore.com>, Andi Kleen <andi@firstfloor.org>,
	Ken Pratt <ken@kenpratt.net>,
	git@vger.kernel.org
Subject: Re: pack operation is thrashing my server
Date: Thu, 14 Aug 2008 17:50:18 -0400 (EDT)
Message-ID: <alpine.LFD.1.10.0808141633080.4352@xanadu.home>
In-Reply-To: <alpine.LFD.1.10.0808141215520.3324@nehalem.linux-foundation.org>

On Thu, 14 Aug 2008, Linus Torvalds wrote:

> Here's a hint: the cost of a cache miss is generally about a hundred times 
> the cost of just about anything else. 
> 
> So to make a convincing argument, you'd have to show that the actual 
> memory access patterns are also much better.
> 
> No, zlib isn't perfect, and nope, inflate_fast() is no "memcpy()". And 
> yes, I'm sure a pure memcpy would be much faster. But I seriously suspect 
> that a lot of the cost is literally in bringing in the source data to the 
> CPU. Because we just mmap() the whole pack-file, the first access to the 
> data is going to see the cost of the cache misses.

Possible.  However, the fact that both the "Compressing objects" and the 
"Writing objects" phases during a repack (without -f) together are 
_faster_ than the "Counting objects" phase is a sign that something is 
more significant than cache misses here, especially when tree 
information is a small portion of the total pack data size.

Of course we can do further profiling, say with core.compression set to
0 and a full repack, or even by hacking the pack-objects code to force a
compression level of 0 for tree objects (and possibly commits too, since
pack v4 intends to deflate only the log text).  Tree objects delta very
well, but they don't deflate well at all.
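
The first of those experiments could be set up roughly as follows.  This
is only a sketch under assumptions, not the exact commands used for the
numbers in this message: REPO is a hypothetical variable (a small scratch
repository is created when it is unset), and -F is what current git
versions need in order to pass --no-reuse-object to pack-objects so that
existing object data is really rewritten at the new level.

```shell
#!/bin/sh
# Sketch: repack a repository with zlib compression level 0.
# REPO is a hypothetical variable; when unset, a tiny scratch
# repository is created so the script can run anywhere.
set -e

REPO=${REPO:-$(mktemp -d)}
if [ ! -d "$REPO/.git" ]; then
    git init -q "$REPO"
    echo hello > "$REPO/file"
    git -C "$REPO" add file
    git -C "$REPO" -c user.name=test -c user.email=test@example.com \
        commit -q -m "initial commit"
fi

# Level 0 makes zlib store the data without compressing it.
git -C "$REPO" config core.compression 0

# -a -d: rewrite everything into one pack and drop the old ones.
# -F: do not reuse existing object data, forcing it to be rewritten
#     at the compression level configured above.
git -C "$REPO" repack -a -d -F -q

# Count the objects that a full walk visits.
NUM_OBJECTS=$(git -C "$REPO" rev-list --all --objects | wc -l | tr -d ' ')
echo "objects: $NUM_OBJECTS"
```

Timing the same rev-list walk before and after the repack should then
give the kind of comparison discussed here.  Note that a pack.compression
setting, if present, takes precedence over core.compression for packs.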

OK, so I did, and the quick test for the kernel is:

|nico@xanadu:linux-2.6> time git rev-list --all --objects > /dev/null
|
|real    0m14.737s
|user    0m14.432s
|sys     0m0.296s

That's for 1031404 objects, hence we're now talking around 70k
objects/sec instead of 48k objects/sec.  That is _only_ from taking zlib
out of the equation, despite the fact that the pack is now larger.  So I
bet that additional improvements from pack v4 could improve things even
more, including the object lookup avoidance optimization I mentioned
previously.
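
The rate quoted above can be checked with a bit of arithmetic.  The
sketch below assumes the "real" time is the one to divide by (the user
time gives a slightly higher figure), and takes the earlier 48k
objects/sec as given since that measurement is not part of this message.

```shell
#!/bin/sh
# Sketch: recompute the objects/sec figure from the numbers above.
set -e

objects=1031404        # object count from the message
real_seconds=14.737    # "real" time of the rev-list run
old_rate=48000         # earlier rate quoted in the message

# awk handles the floating-point division that plain sh lacks.
new_rate=$(awk -v n="$objects" -v t="$real_seconds" \
    'BEGIN { printf "%.0f", n / t }')
speedup=$(awk -v a="$new_rate" -v b="$old_rate" \
    'BEGIN { printf "%.2f", a / b }')

echo "new rate: $new_rate objects/sec"
echo "speedup:  ${speedup}x"
```

This works out to a little under 70000 objects/sec, i.e. somewhat less
than one and a half times the earlier rate.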


Nicolas

