All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andreas Ericsson <ae@op5.se>
To: Nicolas Pitre <nico@cam.org>
Cc: "Shawn O. Pearce" <spearce@spearce.org>,
	Geert Bosch <bosch@adacore.com>, Andi Kleen <andi@firstfloor.org>,
	Ken Pratt <ken@kenpratt.net>,
	git@vger.kernel.org
Subject: Re: pack operation is thrashing my server
Date: Thu, 14 Aug 2008 08:33:59 +0200	[thread overview]
Message-ID: <48A3D1D7.5030805@op5.se> (raw)
In-Reply-To: <alpine.LFD.1.10.0808131228270.4352@xanadu.home>

Nicolas Pitre wrote:
> On Wed, 13 Aug 2008, Shawn O. Pearce wrote:
> 
>> Nicolas Pitre <nico@cam.org> wrote:
>>> Well, we are talking about 50MB which is not that bad.
>> I think we're closer to 100MB here due to the extra overheads
>> I just alluded to above, and which weren't in your 104 byte
>> per object figure.
> 
> Sure.  That should still be workable on a machine with 256MB of RAM.
> 
>>> However there is a point where we should be realistic and just admit 
>>> that you need a sufficiently big machine if you have huge repositories 
>>> to deal with.  Git should be fine serving pull requests with relatively 
>>> little memory usage, but anything else such as the initial repack simply 
>>> require enough RAM to be effective.
>> Yea.  But it would also be nice to be able to just concat packs
>> together.  Especially if the repository in question is an open source
>> one and everything published is already known to be in the wild,
>> as say it is also available over dumb HTTP.  Yea, I know people
>> like the 'security feature' of the packer not including objects
>> which aren't reachable.
> 
> It is not only that, even if it is a point I consider important.  If you 
> end up with 10 packs, it is likely that a base object in each of those 
> packs could simply be a delta against a single common base object, and 
> therefore the amount of data to transfer might be up to 10 times higher 
> than necessary.
> 

[cut]

>> This is also true for many internal corporate repositories.
>> Users probably have full read access to the object database anyway,
>> and maybe even have direct write access to it.  Doing the object
>> enumeration there is pointless as a security measure.
> 
> It is good for network bandwidth efficiency as I mentioned.
> 

As a corporate git user, I can say that I'm very rarely worried
about how much data gets sent over our in-office gigabit network.
My primary concern wrt server side git is cpu- and IO-heavy
operations, as we run the entire machine in a vmware guest os
which just plain sucks at such things.

With that in mind, a config variable in /etc/gitconfig would
work wonderfully for that situation, as our central watering
hole only ever serves locally.

>> I'm too busy to write a pack concat implementation proposal, so
>> I'll just shutup now.  But it wouldn't be hard if someone wanted
>> to improve at least the initial clone serving case.
> 
> A much better solution would consist of finding just _why_ object 
> enumeration is so slow.  This is indeed my biggest grip with git 
> performance at the moment.
> 
> |nico@xanadu:linux-2.6> time git rev-list --objects --all > /dev/null
> |
> |real    0m21.742s
> |user    0m21.379s
> |sys     0m0.360s
> 
> That's way too long for 1030198 objects (roughly 48k objects/sec).  And 
> it gets even worse with the gcc repository:
> 
> |nico@xanadu:gcc> time git rev-list --objects --all > /dev/null
> |
> |real    1m51.591s
> |user    1m50.757s
> |sys     0m0.810s
> 
> That's for 1267993 objects, or about 11400 objects/sec.
> 
> Clearly something is not scaling here.
> 

What are the different packing options for the two repositories?
A longer deltachain and larger packwindow would increase the
enumeration time, wouldn't it?

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

  parent reply	other threads:[~2008-08-14  6:35 UTC|newest]

Thread overview: 80+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-08-10 19:47 pack operation is thrashing my server Ken Pratt
2008-08-10 23:06 ` Martin Langhoff
2008-08-10 23:12   ` Ken Pratt
2008-08-10 23:30     ` Martin Langhoff
2008-08-10 23:34       ` Ken Pratt
2008-08-11  3:04 ` Shawn O. Pearce
2008-08-11  7:43   ` Ken Pratt
2008-08-11 15:01     ` Shawn O. Pearce
2008-08-11 15:40       ` Avery Pennarun
2008-08-11 15:59         ` Shawn O. Pearce
2008-08-11 19:13       ` Ken Pratt
2008-08-11 19:10     ` Andi Kleen
2008-08-11 19:15       ` Ken Pratt
2008-08-13  2:38         ` Nicolas Pitre
2008-08-13  2:50           ` Andi Kleen
2008-08-13  2:57             ` Shawn O. Pearce
2008-08-11 19:22       ` Shawn O. Pearce
2008-08-11 19:29         ` Ken Pratt
2008-08-11 19:34           ` Shawn O. Pearce
2008-08-11 20:10             ` Andi Kleen
2008-08-13  3:12       ` Geert Bosch
2008-08-13  3:15         ` Shawn O. Pearce
2008-08-13  3:58           ` Geert Bosch
2008-08-13 14:37             ` Nicolas Pitre
2008-08-13 14:56               ` Jakub Narebski
2008-08-13 15:04                 ` Shawn O. Pearce
2008-08-13 15:26                   ` David Tweed
2008-08-13 23:54                     ` Martin Langhoff
2008-08-14  9:04                       ` David Tweed
2008-08-13 16:10                   ` Johan Herland
2008-08-13 17:38                     ` Ken Pratt
2008-08-13 17:57                       ` Nicolas Pitre
2008-08-13 14:35         ` Nicolas Pitre
2008-08-13 14:59           ` Shawn O. Pearce
2008-08-13 15:43             ` Nicolas Pitre
2008-08-13 15:50               ` Shawn O. Pearce
2008-08-13 17:04                 ` Nicolas Pitre
2008-08-13 17:19                   ` Shawn O. Pearce
2008-08-14  6:33                   ` Andreas Ericsson [this message]
2008-08-14 10:04                     ` Thomas Rast
2008-08-14 10:15                       ` Andreas Ericsson
2008-08-14 22:33                         ` Shawn O. Pearce
2008-08-15  1:46                           ` Nicolas Pitre
2008-08-14 14:01                     ` Nicolas Pitre
2008-08-14 17:21                   ` Linus Torvalds
2008-08-14 17:58                     ` Linus Torvalds
2008-08-14 19:04                       ` Nicolas Pitre
2008-08-14 19:44                         ` Linus Torvalds
2008-08-14 21:30                           ` Andi Kleen
2008-08-15 16:15                             ` Linus Torvalds
2008-08-14 21:50                           ` Nicolas Pitre
2008-08-14 23:14                             ` Linus Torvalds
2008-08-14 23:39                               ` Björn Steinbrink
2008-08-15  0:06                                 ` Linus Torvalds
2008-08-15  0:25                                   ` Linus Torvalds
2008-08-16 12:47                                   ` Björn Steinbrink
2008-08-16  0:34                               ` Linus Torvalds
2008-09-07  1:03                                 ` Junio C Hamano
2008-09-07  1:46                                   ` Linus Torvalds
2008-09-07  2:33                                     ` Junio C Hamano
2008-09-07 17:11                                       ` Nicolas Pitre
2008-09-07 17:41                                         ` Junio C Hamano
2008-09-07  2:50                                     ` Jon Smirl
2008-09-07  3:07                                       ` Linus Torvalds
2008-09-07  3:43                                         ` Jon Smirl
2008-09-07  4:50                                           ` Linus Torvalds
2008-09-07 13:58                                             ` Jon Smirl
2008-09-07 17:08                                               ` Nicolas Pitre
2008-09-07 20:33                                                 ` Jon Smirl
2008-09-08 14:17                                                   ` Nicolas Pitre
2008-09-08 15:12                                                     ` Jon Smirl
2008-09-08 16:01                                                       ` Jon Smirl
2008-09-07  8:18                                         ` Andreas Ericsson
2008-09-07  7:45                                     ` Mike Hommey
2008-08-14 18:38                     ` Nicolas Pitre
2008-08-14 18:55                       ` Linus Torvalds
2008-08-13 16:01           ` Geert Bosch
2008-08-13 17:13             ` Dana How
2008-08-13 17:26             ` Nicolas Pitre
2008-08-13 12:43 ` Jakub Narebski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=48A3D1D7.5030805@op5.se \
    --to=ae@op5.se \
    --cc=andi@firstfloor.org \
    --cc=bosch@adacore.com \
    --cc=git@vger.kernel.org \
    --cc=ken@kenpratt.net \
    --cc=nico@cam.org \
    --cc=spearce@spearce.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.