* gc getting called on each git command ... what's wrong?
From: Geoff Russell @ 2011-06-08 1:33 UTC
To: git

Hi all,

I'm running git version 1.7.0.4 on Ubuntu 10.04 LTS

As of today, almost every time I do a git command, gc is getting invoked.
This is a multi-gigabyte repository with over half a million objects, so
this takes a while ... and I'm guessing that it shouldn't be happening anyway!

I've run an fsck (which doesn't do a gc!) and the repository looks
clean ... no output.

I have packSizeLimit set to 30M ... not sure why I did this, was
investigating something I didn't understand.

There are 96 pack files.

Any help greatly appreciated, many thanks,

Cheers,
Geoff
* Re: gc getting called on each git command ... what's wrong?
From: Peter Harris @ 2011-06-08 1:48 UTC
To: geoffrey.russell; +Cc: git

On Tue, Jun 7, 2011 at 9:33 PM, Geoff Russell wrote:
>
> As of today, almost every time I do a git command, gc is getting
> invoked.

> There are 96 pack files.

That's why. See gc.autopacklimit in "git help config" -- by default,
git will gc if there are more than 50 pack files.

Peter Harris
* Re: gc getting called on each git command ... what's wrong?
From: Drew Northup @ 2011-06-08 16:02 UTC
To: Peter Harris; +Cc: geoffrey.russell, git

On Tue, 2011-06-07 at 21:48 -0400, Peter Harris wrote:
> On Tue, Jun 7, 2011 at 9:33 PM, Geoff Russell wrote:
> >
> > As of today, almost every time I do a git command, gc is getting
> > invoked.
<re-added>
> > I have packSizeLimit set to 30M
</re-added>
> > There are 96 pack files.
>
> That's why. See gc.autopacklimit in "git help config" -- by default,
> git will gc if there are more than 50 pack files.

Do we want to consider ignoring (or automatically doubling, or something
like that) gc.autopacklimit if that number of packs meet or exceed
pack.packSizeLimit? I have no idea what the patch for this might look
like, but it seems to make more sense than this situation.

Just a random brain fart...

--
-Drew Northup
________________________________________________
"As opposed to vegetable or mineral error?"
-John Pescatore, SANS NewsBites Vol. 12 Num. 59
* Re: gc getting called on each git command ... what's wrong?
From: Brandon Casey @ 2011-06-08 16:29 UTC
To: Drew Northup; +Cc: Peter Harris, geoffrey.russell, git

On 06/08/2011 11:02 AM, Drew Northup wrote:
>
> On Tue, 2011-06-07 at 21:48 -0400, Peter Harris wrote:
>> On Tue, Jun 7, 2011 at 9:33 PM, Geoff Russell wrote:
>>>
>>> As of today, almost every time I do a git command, gc is getting
>>> invoked.
> <re-added>
>>> I have packSizeLimit set to 30M
> </re-added>
>>> There are 96 pack files.
>>
>> That's why. See gc.autopacklimit in "git help config" -- by default,
>> git will gc if there are more than 50 pack files.
>
> Do we want to consider ignoring (or automatically doubling, or something
> like that) gc.autopacklimit if that number of packs meet or exceed
> pack.packSizeLimit? I have no idea what the patch for this might look
> like, but it seems to make more sense than this situation.
>
> Just a random brain fart...
>

Or just ignore the packs that exceed pack.packSizeLimit...

diff --git a/builtin/gc.c b/builtin/gc.c
index ff5f73b..7be14ab 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -26,6 +26,7 @@ static int pack_refs = 1;
 static int aggressive_window = 250;
 static int gc_auto_threshold = 6700;
 static int gc_auto_pack_limit = 50;
+static off_t pack_size_limit;
 static const char *prune_expire = "2.weeks.ago";
 
 #define MAX_ADD 10
@@ -64,6 +65,10 @@ static int gc_config(const char *var, const char *value, void *cb)
 		}
 		return git_config_string(&prune_expire, var, value);
 	}
+	if (!strcmp(var, "pack.packsizelimit")) {
+		pack_size_limit = git_config_ulong(var, value);
+		return 0;
+	}
 	return git_default_config(var, value, cb);
 }
 
@@ -135,10 +140,8 @@ static int too_many_packs(void)
 			continue;
 		if (p->pack_keep)
 			continue;
-		/*
-		 * Perhaps check the size of the pack and count only
-		 * very small ones here?
-		 */
+		if (pack_size_limit && p->pack_size >= pack_size_limit)
+			continue;
 		cnt++;
 	}
 	return gc_auto_pack_limit <= cnt;
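The counting rule in the patch above can be tried outside git with a minimal sketch. Everything here is a stand-in: `struct pack_info` is a hypothetical replacement for git's internal pack list, and the limits are passed as parameters instead of being read from config. Treating a non-positive `auto_pack_limit` as "disabled" is an assumption based on the documented behaviour of gc.autopacklimit, not something stated in the thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for git's internal pack list entry. */
struct pack_info {
	int pack_local;      /* pack lives in this repository's object store */
	int pack_keep;       /* pack is marked to be kept and is never repacked */
	uint64_t pack_size;  /* size of the pack file in bytes */
};

/*
 * Return 1 when enough countable packs exist to trigger auto-gc.
 * Packs at or above pack_size_limit are skipped, as in the patch;
 * a pack_size_limit of 0 means no limit is configured.
 */
int too_many_packs(const struct pack_info *packs, size_t n,
		   int auto_pack_limit, uint64_t pack_size_limit)
{
	int cnt = 0;
	size_t i;

	if (auto_pack_limit <= 0)
		return 0;  /* assumption: non-positive limit disables the check */

	for (i = 0; i < n; i++) {
		if (!packs[i].pack_local)
			continue;
		if (packs[i].pack_keep)
			continue;
		if (pack_size_limit && packs[i].pack_size >= pack_size_limit)
			continue;
		cnt++;
	}
	return auto_pack_limit <= cnt;
}
```

With 96 local packs that each reach a 30M limit, this returns 1 when no size limit is passed and 0 when the limit is passed, which is the behaviour change the patch is after.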
* Re: gc getting called on each git command ... what's wrong?
From: Junio C Hamano @ 2011-06-08 16:50 UTC
To: Drew Northup; +Cc: Peter Harris, geoffrey.russell, git

Drew Northup <drew.northup@maine.edu> writes:

>> That's why. See gc.autopacklimit in "git help config" -- by default,
>> git will gc if there are more than 50 pack files.
>
> Do we want to consider ignoring (or automatically doubling, or something
> like that) gc.autopacklimit if that number of packs meet or exceed
> pack.packSizeLimit? I have no idea what the patch for this might look
> like, but it seems to make more sense than this situation.

This is unrelated to the auto-gc, but it also would be fruitful to
question if it is a sane setting to limit packfiles to 30M, when the
repository needs 100 of them (total around 3G??).

Just like having too many loose object files degrades performance (and
that is one of the reasons we pack them in the first place), having many
packs will degrade performance unnecessarily and to a worse degree, as
the "check which pack has this particular object" code has to examine all
packs, unlike the loose object case where we let the .git/objects/??
fan-out give us some hashing and the filesystem do the heavy lifting
for us.
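The lookup cost described above can be illustrated with a toy model. This is not git's pack index code: `struct toy_pack`, the 32-bit ids, and `find_object` are all hypothetical names chosen for the sketch. It only shows that a lookup which walks the packs one by one examines every pack in the worst case, so the cost grows with the number of packs.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy pack: a sorted array of object ids (hypothetical, not git's format). */
struct toy_pack {
	const uint32_t *ids;
	size_t nr;
};

/* Binary search within one pack's sorted ids; returns 1 if found. */
static int pack_has_object(const struct toy_pack *p, uint32_t id)
{
	size_t lo = 0, hi = p->nr;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (p->ids[mid] == id)
			return 1;
		if (p->ids[mid] < id)
			lo = mid + 1;
		else
			hi = mid;
	}
	return 0;
}

/*
 * Look for id in each pack in turn. Returns the index of the pack
 * holding it, or -1 if no pack has it, and stores in *packs_examined
 * how many packs had to be searched.
 */
int find_object(const struct toy_pack *packs, size_t nr_packs,
		uint32_t id, size_t *packs_examined)
{
	size_t i;

	for (i = 0; i < nr_packs; i++) {
		if (pack_has_object(&packs[i], id)) {
			*packs_examined = i + 1;
			return (int)i;
		}
	}
	*packs_examined = nr_packs;
	return -1;
}
```

In this model a missing object, or one stored in the last pack, costs one search per pack, so 100 small packs mean up to 100 searches where a single large pack would need one.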
* Re: gc getting called on each git command ... what's wrong?
From: Jakub Narebski @ 2011-06-08 17:09 UTC
To: Peter Harris; +Cc: geoffrey.russell, git

Peter Harris <git@peter.is-a-geek.org> writes:

> On Tue, Jun 7, 2011 at 9:33 PM, Geoff Russell wrote:
> >
> > As of today, almost every time I do a git command, gc is getting
> > invoked.
>
> > There are 96 pack files.
>
> That's why. See gc.autopacklimit in "git help config" -- by default,
> git will gc if there are more than 50 pack files.

Actually it looks like it is a combination of this and packSizeLimit set
to 30M. Git notices that it has too many packfiles, and tries to
repack them, but packSizeLimit forces Git to split the result into small
packfiles... and it ends up with more packfiles than the limit anyway.

Perhaps git should notice that it has a nonsensical combination of
options...

--
Jakub Narebski
Poland
ShadeHawk on #git
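The loop described above comes down to simple arithmetic, sketched here under stated assumptions: a repack cannot leave fewer than ceil(total / limit) packs, the repository total is taken as 96 packs of 30M each (a hypothetical figure derived from the numbers in the thread), and M is read as MiB. The function names are invented for this example.

```c
#include <stdint.h>

/* Lower bound on the packs left after a full repack: ceil(total / limit). */
uint64_t min_packs_after_repack(uint64_t total_bytes, uint64_t pack_size_limit)
{
	if (pack_size_limit == 0)
		return total_bytes ? 1 : 0;  /* no limit: one pack can hold everything */
	return (total_bytes + pack_size_limit - 1) / pack_size_limit;
}

/*
 * Return 1 when even a full repack must leave at least auto_pack_limit
 * packs, so the next git command would trigger auto-gc again.
 */
int repack_cannot_satisfy_limit(uint64_t total_bytes,
				uint64_t pack_size_limit,
				int auto_pack_limit)
{
	if (auto_pack_limit <= 0)
		return 0;  /* assumption: non-positive limit disables auto-gc by pack count */
	return min_packs_after_repack(total_bytes, pack_size_limit)
		>= (uint64_t)auto_pack_limit;
}
```

With those figures, a 30M limit gives a lower bound of 96 packs against an auto-gc limit of 50, so gc can never settle; a 3000M limit gives a lower bound of 1 pack.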
* Re: gc getting called on each git command ... what's wrong?
From: Geoff Russell @ 2011-06-15 1:28 UTC
To: git

On Thu, Jun 9, 2011 at 2:39 AM, Jakub Narebski <jnareb@gmail.com> wrote:
>
> Peter Harris <git@peter.is-a-geek.org> writes:
>
> > On Tue, Jun 7, 2011 at 9:33 PM, Geoff Russell wrote:
> > >
> > > As of today, almost every time I do a git command, gc is getting
> > > invoked.
> >
> > > There are 96 pack files.
> >
> > That's why. See gc.autopacklimit in "git help config" -- by default,
> > git will gc if there are more than 50 pack files.

Thanks to everybody. This is exactly what was happening and the problems
went away when I set the packSizeLimit higher ... 3000M

>
> Actually it looks like it is a combination of this and packSizeLimit set
> to 30M. Git notices that it has too many packfiles, and tries to
> repack them, but packSizeLimit forces Git to split the result into small
> packfiles... and it ends up with more packfiles than the limit anyway.
>
> Perhaps git should notice that it has a nonsensical combination of
> options...

That would be nice. It should be reasonably easy to work out that the
packSizeLimit will guarantee too many pack files after the gc.
Disobeying a user's wishes shouldn't be undertaken lightly, but sometimes
we stuff up :)

Cheers,
Geoff.

--
6 Fifth Ave, St Morris, S.A. 5068 Australia
Ph: 041 8805 184 / 08 8332 5069
http://perfidy.com.au
[parent not found: <BANLkTi=w10KQ3MSd5YuYR+S=eMgywNTY-A@mail.gmail.com>]
* Re: gc getting called on each git command ... what's wrong?
From: Jakub Narebski @ 2011-06-15 15:35 UTC
To: Geoff Russell; +Cc: Peter Harris, git

On Wed, 15 Jun 2011, Geoff Russell wrote:
> On Thu, Jun 9, 2011 at 2:39 AM, Jakub Narebski <jnareb@gmail.com> wrote:
> > Peter Harris <git@peter.is-a-geek.org> writes:
> > > On Tue, Jun 7, 2011 at 9:33 PM, Geoff Russell wrote:
> > > >
> > > > As of today, almost every time I do a git command, gc is getting
> > > > invoked.
> > >
> > > > There are 96 pack files.
> > >
> > > That's why. See gc.autopacklimit in "git help config" -- by default,
> > > git will gc if there are more than 50 pack files.
> >
> > Actually it looks like it is a combination of this and packSizeLimit set
> > to 30M. Git notices that it has too many packfiles, and tries to
> > repack them, but packSizeLimit forces Git to split the result into small
> > packfiles... and it ends up with more packfiles than the limit anyway.
>
> Thanks to everybody. This is exactly what was happening and the problems
> went away when I set the packSizeLimit higher ... 3000M

Why did you set packSizeLimit at all?

> >
> > Perhaps git should notice that it has a nonsensical combination of
> > options...
>
> That would be nice. It should be reasonably easy to work out that the
> packSizeLimit will guarantee too many pack files after the gc.
> Disobeying a user's wishes shouldn't be undertaken lightly, but sometimes
> we stuff up :)

Well, git can simply notice that each pack, except perhaps one, has a
size greater than or equal to pack.packSizeLimit, and then ignore the
gc.autopacklimit hint, because repacking would not reduce the number of
packs or bring it below gc.autopacklimit.

If `git gc` is called interactively, we can warn the user about this
situation...

--
Jakub Narebski
Poland
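The check proposed above can be sketched as a small predicate. This is only an illustration of the idea as stated in the message, not code from git: the function name and the plain array of pack sizes are hypothetical, and a `pack_size_limit` of 0 is taken to mean that no limit is configured.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return 1 when every pack, except perhaps one, is already at or above
 * pack_size_limit. In that case repacking cannot reduce the number of
 * packs, so the auto-gc pack count hint could be ignored.
 */
int repack_would_not_help(const uint64_t *pack_sizes, size_t n,
			  uint64_t pack_size_limit)
{
	size_t below = 0;
	size_t i;

	if (pack_size_limit == 0)
		return 0;  /* no size limit: a repack can always merge packs */

	for (i = 0; i < n; i++) {
		if (pack_sizes[i] < pack_size_limit)
			below++;
	}
	return below <= 1;
}
```

A caller would consult this before acting on the pack count, and an interactive `git gc` could print a warning when it returns 1.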
* Re: gc getting called on each git command ... what's wrong?
From: Geoff Russell @ 2011-06-16 1:46 UTC
To: Jakub Narebski; +Cc: Peter Harris, git

2011/6/16 Jakub Narebski <jnareb@gmail.com>
>
>
> Why did you set packSizeLimit at all?
>
>

Some time ago (31/8/2010) I had a problem which seemed to be caused by
large packs (>4GB), you can find it in the git list with a subject of
"Large pack causes git clone failures ... what to do?"

Anyway, I set packSizeLimit and fiddled around for a bit ... eventually
the problem went away when I moved the central repository to another
machine with less load and more memory. At which point I gave a sigh of
relief and forgot to remove the packSizeLimit until recently bitten.
But the original problem was probably nothing to do with large packs
and hasn't recurred.

Cheers,
Geoff.
* Re: gc getting called on each git command ... what's wrong?
From: Jakub Narebski @ 2011-06-16 14:14 UTC
To: geoffrey.russell; +Cc: Peter Harris, git

On Thu, 16 Jun 2011, Geoff Russell wrote:
> 2011/6/16 Jakub Narebski <jnareb@gmail.com>
> >
> > Why did you set packSizeLimit at all?
>
> Some time ago (31/8/2010) I had a problem which seemed to be caused by
> large packs (>4GB), you can find it in the git list with a subject of
> "Large pack causes git clone failures ... what to do?"

So why did you set packSizeLimit to such a ridiculously low value,
instead of 2g (2 GB) or something?

--
Jakub Narebski
Poland
Thread overview: 10+ messages
2011-06-08  1:33 gc getting called on each git command ... what's wrong? Geoff Russell
2011-06-08  1:48 ` Peter Harris
2011-06-08 16:02   ` Drew Northup
2011-06-08 16:29     ` Brandon Casey
2011-06-08 16:50     ` Junio C Hamano
2011-06-08 17:09   ` Jakub Narebski
2011-06-15  1:28     ` Geoff Russell
     [not found]     ` <BANLkTi=w10KQ3MSd5YuYR+S=eMgywNTY-A@mail.gmail.com>
2011-06-15 15:35       ` Jakub Narebski
2011-06-16  1:46         ` Geoff Russell
2011-06-16 14:14           ` Jakub Narebski