* auto gc again
@ 2008-03-18 18:01 Jens Axboe
2008-03-18 18:14 ` Linus Torvalds
2008-03-19 21:27 ` Junio C Hamano
0 siblings, 2 replies; 30+ messages in thread
From: Jens Axboe @ 2008-03-18 18:01 UTC (permalink / raw)
To: git; +Cc: Linus Torvalds
Hi,
Could we please PLEASE kill this auto gc thing? I've complained about
this in the past and disabled it through the gc.auto config entry,
however now git seems to be happily auto running gc even with gc.auto=0.
So there's probably some new magic I need to know.
But the new magic is really beside the point. Doing this 'for you' is
extremely annoying behaviour. I often work on my notebook, so disk is
both slow and battery is precious. I DON'T want gc to run automatically,
EVER. Not on repos I have had going for ages, not on ones I just cloned.
Please bury this silly policy and replace it with a printf() telling me
that I may increase my performance by running git gc. Don't just do it.
git does not know better.
--
Jens Axboe
* Re: auto gc again
2008-03-18 18:01 auto gc again Jens Axboe
@ 2008-03-18 18:14 ` Linus Torvalds
2008-03-18 18:19 ` Jens Axboe
2008-03-19 21:27 ` Junio C Hamano
1 sibling, 1 reply; 30+ messages in thread
From: Linus Torvalds @ 2008-03-18 18:14 UTC (permalink / raw)
To: Jens Axboe; +Cc: git
On Tue, 18 Mar 2008, Jens Axboe wrote:
>
> Could we please PLEASE kill this auto gc thing? I've complained about
> this in the past and disabled it through the gc.auto config entry,
> however now git seems to be happily auto running gc even with gc.auto=0.
> So there's probably some new magic I need to know.
Do you do something odd with your repositories? I don't even touch autogc
on my systems, but I have never had that thing trigger, even when I apply
series of patches from Andrew with hundreds of messages.
So what is it that you do to even get this behaviour in the first place?
Linus
* Re: auto gc again
2008-03-18 18:14 ` Linus Torvalds
@ 2008-03-18 18:19 ` Jens Axboe
2008-03-18 18:24 ` Jens Axboe
2008-03-19 20:37 ` Nicolas Pitre
0 siblings, 2 replies; 30+ messages in thread
From: Jens Axboe @ 2008-03-18 18:19 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git
On Tue, Mar 18 2008, Linus Torvalds wrote:
>
>
> On Tue, 18 Mar 2008, Jens Axboe wrote:
> >
> > Could we please PLEASE kill this auto gc thing? I've complained about
> > this in the past and disabled it through the gc.auto config entry,
> > however now git seems to be happily auto running gc even with gc.auto=0.
> > So there's probably some new magic I need to know.
>
> Do you do something odd with your repositories? I don't even touch autogc
> on my systems, but I have never had that thing trigger, even when I apply
> series of patches from Andrew with hundreds of messages.
Not to my knowledge, I haven't changed anything in my setup or behaviour
in ages.
> So what is it that you do to even get this behaviour in the first place?
The last few times it was:
$ git checkout master
$ git branch some-test-branch
$ git checkout some-test-branch
$ git pull . some-devel-branch
and after that pull, I get to sit around waiting for git gc. Well I don't
since I ctrl-c it because it's inconvenient.
But freshly pulled repo, git auto gc is enabled. And that is my main
annoyance, I just don't think that type of policy should be in there.
Print the warning, include info on how to run git gc or even how to turn
it on automatically. But I'll bet you that most users will NOT want auto
gc. Ever.
--
Jens Axboe
* Re: auto gc again
2008-03-18 18:19 ` Jens Axboe
@ 2008-03-18 18:24 ` Jens Axboe
2008-03-18 18:33 ` Linus Torvalds
2008-03-19 20:37 ` Nicolas Pitre
1 sibling, 1 reply; 30+ messages in thread
From: Jens Axboe @ 2008-03-18 18:24 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git
On Tue, Mar 18 2008, Jens Axboe wrote:
> On Tue, Mar 18 2008, Linus Torvalds wrote:
> >
> >
> > On Tue, 18 Mar 2008, Jens Axboe wrote:
> > >
> > > Could we please PLEASE kill this auto gc thing? I've complained about
> > > this in the past and disabled it through the gc.auto config entry,
> > > however now git seems to be happily auto running gc even with gc.auto=0.
> > > So there's probably some new magic I need to know.
> >
> > Do you do something odd with your repositories? I don't even touch autogc
> > on my systems, but I have never had that thing trigger, even when I apply
> > series of patches from Andrew with hundreds of messages.
>
> Not to my knowledge, I haven't changed anything in my setup or behaviour
> in ages.
>
> > So what is it that you do to even get this behaviour in the first place?
>
> The last few times it was:
>
> $ git checkout master
> $ git branch some-test-branch
> $ git checkout some-test-branch
> $ git pull . some-devel-branch
axboe@carl:~/git/linux-2.6-block> git count-objects
901 objects, 6448 kilobytes
axboe@carl:~/git/linux-2.6-block> git pull
remote: Counting objects: 320, done.
remote: Compressing objects: 100% (43/43), done.
remote: Total 214 (delta 171), reused 214 (delta 171)
Receiving objects: 100% (214/214), 31.78 KiB, done.
Resolving deltas: 100% (171/171), completed with 68 local objects.
From ssh://git.kernel.dk/data/git/linux-2.6-block
bde4f8f..f920bb6 master -> origin/master
Updating bde4f8f..f920bb6
Fast forward
Auto packing your repository for optimum performance. You may also
run "git gc" manually. See "git help gc" for more information.
^C
So 901 objects, pulled 68 objects. And auto gc kicks in. WTF? The git
before was from probably a week ago, this above run was done with git
just updated.
git version 1.5.5.rc0.6.gdeda
--
Jens Axboe
* Re: auto gc again
2008-03-18 18:24 ` Jens Axboe
@ 2008-03-18 18:33 ` Linus Torvalds
2008-03-18 18:39 ` Jens Axboe
0 siblings, 1 reply; 30+ messages in thread
From: Linus Torvalds @ 2008-03-18 18:33 UTC (permalink / raw)
To: Jens Axboe; +Cc: git
On Tue, 18 Mar 2008, Jens Axboe wrote:
>
> axboe@carl:~/git/linux-2.6-block> git count-objects
> 901 objects, 6448 kilobytes
The default auto-gc threshold is 6700 objects. You should *not* be even
close to hitting it.
But there's a 20-pack pack-limit. Do you have lots of pack-files? But you
can disable that one with
[gc]
autopacklimit = 0
and I do think the default might be a bit low.
Linus
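For readers following along, the interaction between the two limits described above can be sketched in C. This is a simplified model, not the code in builtin-gc.c: the struct, its field names, and the function name are hypothetical, and whether each limit triggers on "more than" or "at least" is an assumption. It only illustrates why gc.auto=0 on its own does not stop auto gc while gc.autopacklimit is still positive.

```c
#include <stdbool.h>

/* Hypothetical inputs for illustration. Real git reads the two settings
 * from its config and inspects the object store itself. */
struct gc_state {
    int auto_threshold;   /* gc.auto; 6700 by default per the message above */
    int auto_pack_limit;  /* gc.autopacklimit; 20 by default per the message above */
    int loose_objects;    /* e.g. the 901 reported by git count-objects */
    int pack_count;       /* number of pack files in the repository */
};

/* Simplified sketch: either limit can trigger auto gc on its own. */
static bool auto_gc_would_run(const struct gc_state *s)
{
    /* Auto gc is fully off only when BOTH settings are 0 or negative. */
    if (s->auto_threshold <= 0 && s->auto_pack_limit <= 0)
        return false;
    /* Too many loose objects (assumed strict comparison). */
    if (s->auto_threshold > 0 && s->loose_objects > s->auto_threshold)
        return true;
    /* Too many packs (assumed strict comparison). */
    if (s->auto_pack_limit > 0 && s->pack_count > s->auto_pack_limit)
        return true;
    return false;
}
```

Under this model, a repository with gc.auto set to 0, 901 loose objects, and a hypothetical 25 packs still triggers auto gc through the pack limit, which matches the behaviour reported earlier in the thread.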
* Re: auto gc again
2008-03-18 18:33 ` Linus Torvalds
@ 2008-03-18 18:39 ` Jens Axboe
2008-03-19 20:22 ` Johannes Schindelin
0 siblings, 1 reply; 30+ messages in thread
From: Jens Axboe @ 2008-03-18 18:39 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git
On Tue, Mar 18 2008, Linus Torvalds wrote:
>
>
> On Tue, 18 Mar 2008, Jens Axboe wrote:
> >
> > axboe@carl:~/git/linux-2.6-block> git count-objects
> > 901 objects, 6448 kilobytes
>
> The default auto-gc threshold is 6700 objects. You should *not* be even
> close to hitting it.
>
> But there's a 20-pack pack-limit. Do you have lots of pack-files? But you
> can disable that one with
>
> [gc]
> autopacklimit = 0
>
> and I do think the default might be a bit low.
I let gc run last time to get rid of the complaint, so I cannot answer
that question. It's probably the pack limit if that is newer, since the
object count was so low.
But you never answer the question on whether you really consider any
form of autopacking or auto gc sane? Next time some other limit is added
for auto gc, it'll be annoying once more.
--
Jens Axboe
* Re: auto gc again
2008-03-18 18:39 ` Jens Axboe
@ 2008-03-19 20:22 ` Johannes Schindelin
2008-03-19 21:14 ` Jens Axboe
0 siblings, 1 reply; 30+ messages in thread
From: Johannes Schindelin @ 2008-03-19 20:22 UTC (permalink / raw)
To: Jens Axboe; +Cc: Linus Torvalds, git
Hi,
On Tue, 18 Mar 2008, Jens Axboe wrote:
> But you never answer the question on whether you really consider any
> form of autopacking or auto gc sane? Next time some other limit is added
> for auto gc, it'll be annoying once more.
The problem is: if people do not bother to "git gc" their repositories,
git operations get slow. We just had enough of that, and decided to "git
gc" automatically for people who did not know about it, or were too lazy
and then complained about git for being slow.
Hth,
Dscho
* Re: auto gc again
2008-03-18 18:19 ` Jens Axboe
2008-03-18 18:24 ` Jens Axboe
@ 2008-03-19 20:37 ` Nicolas Pitre
2008-03-19 21:17 ` Jens Axboe
2008-03-19 21:27 ` Brandon Casey
1 sibling, 2 replies; 30+ messages in thread
From: Nicolas Pitre @ 2008-03-19 20:37 UTC (permalink / raw)
To: Jens Axboe; +Cc: Linus Torvalds, git
On Tue, 18 Mar 2008, Jens Axboe wrote:
> But freshly pulled repo, git auto gc is enabled. And that is my main
> annoyance, I just don't think that type of policy should be in there.
Just do this once:
git config --global gc.auto 0
git config --global gc.autopacklimit 0
and be happy.
> Print the warning, include info on how to run git gc or even how to turn
> it on automatically. But I'll bet you that most users will NOT want auto
> gc. Ever.
Unfortunately, the harshest complaints about this whole issue were the
opposite.
Nicolas
* Re: auto gc again
2008-03-19 20:22 ` Johannes Schindelin
@ 2008-03-19 21:14 ` Jens Axboe
2008-03-19 21:44 ` Johannes Schindelin
0 siblings, 1 reply; 30+ messages in thread
From: Jens Axboe @ 2008-03-19 21:14 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Linus Torvalds, git
On Wed, Mar 19 2008, Johannes Schindelin wrote:
> Hi,
>
> On Tue, 18 Mar 2008, Jens Axboe wrote:
>
> > But you never answer the question on whether you really consider any
> > form of autopacking or auto gc sane? Next time some other limit is added
> > for auto gc, it'll be annoying once more.
>
> The problem is: if people do not bother to "git gc" their repositories,
> git operations get slow. We just had enough of that, and decided to "git
> gc" automatically for people who did not know about it, or were too lazy
> and then complained about git for being slow.
Sorry I disagree, it's policy and that is usually a bad thing. In this
case it definitely is.
--
Jens Axboe
* Re: auto gc again
2008-03-19 20:37 ` Nicolas Pitre
@ 2008-03-19 21:17 ` Jens Axboe
2008-03-19 23:05 ` Nicolas Pitre
2008-03-19 21:27 ` Brandon Casey
1 sibling, 1 reply; 30+ messages in thread
From: Jens Axboe @ 2008-03-19 21:17 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Linus Torvalds, git
On Wed, Mar 19 2008, Nicolas Pitre wrote:
> On Tue, 18 Mar 2008, Jens Axboe wrote:
>
> > But freshly pulled repo, git auto gc is enabled. And that is my main
> > annoyance, I just don't think that type of policy should be in there.
>
> Just do this once:
>
> git config --global gc.auto 0
> git config --global gc.autopacklimit 0
>
> and be happy.
You don't get it. I did gc.auto 0. And now some other limit crops up, I
have to do gc.autopacklimit 0. I have LOTS of git trees. On many
machines. It's just annoying, period.
> > Print the warning, include info on how to run git gc or even how to turn
> > it on automatically. But I'll bet you that most users will NOT want auto
> > gc. Ever.
>
> Unfortunately, the harshest complaints about this whole issue were the
> opposite.
I just don't buy that, I have more faith in users. If they come around
and complain it's slow, heck you told them it would be.
But it's not a big deal, I'll just carry a local patch that disables
this crap and forget the whole deal. I just worry that if this is where
git 'usability' is heading, it won't be a good thing in the long run.
--
Jens Axboe
* Re: auto gc again
2008-03-18 18:01 auto gc again Jens Axboe
2008-03-18 18:14 ` Linus Torvalds
@ 2008-03-19 21:27 ` Junio C Hamano
2008-03-19 21:52 ` Linus Torvalds
1 sibling, 1 reply; 30+ messages in thread
From: Junio C Hamano @ 2008-03-19 21:27 UTC (permalink / raw)
To: Jens Axboe; +Cc: git, Linus Torvalds
Jens Axboe <jens.axboe@oracle.com> writes:
> But the new magic is really beside the point. Doing this 'for you' is
> extremely annoying behaviour. I often work on my notebook, so disk is
> both slow and battery is precious. I DON'T want gc to run automatically,
> EVER. Not on repos I have had going for ages, not on ones I just cloned.
> Please bury this silly policy and replace it with a printf() telling me
> that I may increase my performance by running git gc. Don't just do it.
> git does not know better.
Well, earlier, git used to be "kick-ass fast, flexible and powerful if you
knew what you are doing, and if you don't, then you are forever lost" type
of a system, and I think early adopters even took pride in saying so.
Being in the scene myself from early on, I certainly sympathise with that
feeling, and sometimes when a newcomer starts making noises about dumbing
git down without understanding implications (e.g. hiding or removing the
index), I have to resist the urge to say "you need to learn certain new
concepts that do not even have counterparts in earlier crap systems you
are used to. If you feel you are confused, that's your problem. Get
enlightened first." I rarely say that out loud, to be more diplomatic,
though.
But judging from the fact that some kernel folks are talking about having a
7GB kernel repository, I think supposedly early adopters may not really
know what they are doing, and some automation, if done correctly would be
a good thing.
Having said that, I am not sure how the auto gc is triggering for your
(presumably reasonably well maintained) repository that has only small
number of loose objects. I haven't seen auto-gc annoyance myself (and
git.git is not the only project I have my git experience with), and Linus
also said he hasn't seen breakages.
I think we did have a few patches to the area recently and we should not
rule out the possibility that we broke the criteria "gc --auto" kicks in.
* Re: auto gc again
2008-03-19 20:37 ` Nicolas Pitre
2008-03-19 21:17 ` Jens Axboe
@ 2008-03-19 21:27 ` Brandon Casey
2008-03-19 21:53 ` [PATCH] builtin-gc.c: allow disabling all auto-gc'ing by assigning 0 to gc.auto Brandon Casey
` (2 more replies)
1 sibling, 3 replies; 30+ messages in thread
From: Brandon Casey @ 2008-03-19 21:27 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Jens Axboe, Linus Torvalds, git
Nicolas Pitre wrote:
> On Tue, 18 Mar 2008, Jens Axboe wrote:
>
>> But freshly pulled repo, git auto gc is enabled. And that is my main
>> annoyance, I just don't think that type of policy should be in there.
>
> Just do this once:
>
> git config --global gc.auto 0
> git config --global gc.autopacklimit 0
Is there any reason why gc.auto=0 couldn't be used to disable auto
packing entirely?
Said differently, are there valid use cases where one might want automatic
repacking based on the number of packs but _not_ based on the number of
loose objects?
If the answer is "no", then "gc.auto=0 means completely disable auto-gc"
seems intuitive and would have protected Jens in this case.
-brandon
* Re: auto gc again
2008-03-19 21:14 ` Jens Axboe
@ 2008-03-19 21:44 ` Johannes Schindelin
2008-03-20 6:00 ` Jens Axboe
0 siblings, 1 reply; 30+ messages in thread
From: Johannes Schindelin @ 2008-03-19 21:44 UTC (permalink / raw)
To: Jens Axboe; +Cc: Linus Torvalds, git
Hi,
On Wed, 19 Mar 2008, Jens Axboe wrote:
> On Wed, Mar 19 2008, Johannes Schindelin wrote:
>
> > On Tue, 18 Mar 2008, Jens Axboe wrote:
> >
> > > But you never answer the question on whether you really consider any
> > > form of autopacking or auto gc sane? Next time some other limit is
> > > added for auto gc, it'll be annoying once more.
> >
> > The problem is: if people do not bother to "git gc" their
> > repositories, git operations get slow. We just had enough of that,
> > and decided to "git gc" automatically for people who did not know
> > about it, or were too lazy and then complained about git for being
> > slow.
>
> Sorry I disagree,
In this case, you can disagree as much as you want and you are still
wrong.
The problem is that you are more intelligent than most others, and now you
experience the downsides of it.
Ciao,
Dscho
* Re: auto gc again
2008-03-19 21:27 ` Junio C Hamano
@ 2008-03-19 21:52 ` Linus Torvalds
2008-03-19 22:28 ` Junio C Hamano
0 siblings, 1 reply; 30+ messages in thread
From: Linus Torvalds @ 2008-03-19 21:52 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Jens Axboe, git
On Wed, 19 Mar 2008, Junio C Hamano wrote:
>
> Having said that, I am not sure how the auto gc is triggering for your
> (presumably reasonably well maintained) repository that has only small
> number of loose objects. I haven't seen auto-gc annoyance myself (and
> git.git is not the only project I have my git experience with), and Linus
> also said he hasn't seen breakages.
I think it was 'autopacklimit'.
I think the correct solution is along the following lines:
- disable "git gc --auto" entirely when "gc.auto <= 0" (ie we don't even
care about 'autopacklimit' unless automatic packing is on at all)
Rationale: I do think that if you set gc.auto to zero, you should
expect git gc --auto to be disabled.
- make the default for autopacklimit rather higher (pick number at
random: 50 instead of 20).
Rationale: the reason for "git gc --auto" wasn't to keep things
perfectly packed, but to avoid the _really_ bad cases. The old default
of 20 may be fine if you want to always keep the repo very tight, but
that wasn't why "git gc --auto" was done, was it?
Suggested patch appended. Comments?
Linus
---
builtin-gc.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/builtin-gc.c b/builtin-gc.c
index 95917d7..16a912a 100644
--- a/builtin-gc.c
+++ b/builtin-gc.c
@@ -25,7 +25,7 @@ static const char * const builtin_gc_usage[] = {
static int pack_refs = 1;
static int aggressive_window = -1;
static int gc_auto_threshold = 6700;
-static int gc_auto_pack_limit = 20;
+static int gc_auto_pack_limit = 50;
static char *prune_expire = "2.weeks.ago";
#define MAX_ADD 10
@@ -163,7 +163,7 @@ static int need_to_gc(void)
* Setting gc.auto and gc.autopacklimit to 0 or negative can
* disable the automatic gc.
*/
- if (gc_auto_threshold <= 0 && gc_auto_pack_limit <= 0)
+ if (gc_auto_threshold <= 0)
return 0;
/*
* [PATCH] builtin-gc.c: allow disabling all auto-gc'ing by assigning 0 to gc.auto
2008-03-19 21:27 ` Brandon Casey
@ 2008-03-19 21:53 ` Brandon Casey
2008-03-20 7:08 ` Teemu Likonen
2008-03-19 22:56 ` auto gc again Nicolas Pitre
2008-03-20 6:01 ` Jens Axboe
2 siblings, 1 reply; 30+ messages in thread
From: Brandon Casey @ 2008-03-19 21:53 UTC (permalink / raw)
To: Nicolas Pitre
Cc: Jens Axboe, Linus Torvalds, Git Mailing List, Junio C Hamano
The gc.auto configuration variable is somewhat ambiguous now that there
is also a gc.autopacklimit setting. Some users may assume that it controls
all auto-gc'ing. Also, now users must set two configuration variables to
zero when they want to disable autopacking. Since it is unlikely that users
will want to autopack based on some threshold of pack files when they have
disabled autopacking based on the number of loose objects, be nice and allow
a setting of zero for gc.auto to disable all autopacking.
Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
---
builtin-gc.c | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/builtin-gc.c b/builtin-gc.c
index 95917d7..509bb9c 100644
--- a/builtin-gc.c
+++ b/builtin-gc.c
@@ -160,10 +160,10 @@ static int too_many_packs(void)
static int need_to_gc(void)
{
/*
- * Setting gc.auto and gc.autopacklimit to 0 or negative can
- * disable the automatic gc.
+ * Setting gc.auto to 0 or negative can disable the
+ * automatic gc.
*/
- if (gc_auto_threshold <= 0 && gc_auto_pack_limit <= 0)
+ if (gc_auto_threshold <= 0)
return 0;
/*
--
1.5.4.4.481.g5075
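The effect of this change on the early return in need_to_gc() can be shown with a small side-by-side sketch. The two function names below are hypothetical and model only the first check (whether the auto-gc test proceeds at all), not the loose-object or pack counting that follows it in builtin-gc.c.

```c
#include <stdbool.h>

/* Before the patch: the check is skipped only when BOTH settings
 * are 0 or negative. */
static bool auto_gc_check_proceeds_before(int gc_auto, int gc_auto_pack_limit)
{
    return !(gc_auto <= 0 && gc_auto_pack_limit <= 0);
}

/* After the patch: gc.auto <= 0 alone disables auto gc, whatever
 * gc.autopacklimit is set to. */
static bool auto_gc_check_proceeds_after(int gc_auto, int gc_auto_pack_limit)
{
    (void)gc_auto_pack_limit; /* no longer consulted at this point */
    return gc_auto > 0;
}
```

With gc.auto = 0 and gc.autopacklimit left at 20, the "before" version still proceeds to the pack-count test, while the "after" version returns immediately. That is the case reported at the start of the thread.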
* Re: auto gc again
2008-03-19 21:52 ` Linus Torvalds
@ 2008-03-19 22:28 ` Junio C Hamano
2008-03-19 23:16 ` Nicolas Pitre
0 siblings, 1 reply; 30+ messages in thread
From: Junio C Hamano @ 2008-03-19 22:28 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Jens Axboe, git
Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Wed, 19 Mar 2008, Junio C Hamano wrote:
>>
>> Having said that, I am not sure how the auto gc is triggering for your
>> (presumably reasonably well maintained) repository that has only small
>> number of loose objects. I haven't seen auto-gc annoyance myself (and
>> git.git is not the only project I have my git experience with), and Linus
>> also said he hasn't seen breakages.
>
> I think it was 'autopacklimit'.
>
> I think the correct solution is along the following lines:
>
> - disable "git gc --auto" entirely when "gc.auto <= 0" (ie we don't even
> care about 'autopacklimit' unless automatic packing is on at all)
>
> Rationale: I do think that if you set gc.auto to zero, you should
> expect git gc --auto to be disabled.
Sensible, I would say.
> - make the default for autopacklimit rather higher (pick number at
> random: 50 instead of 20).
>
> Rationale: the reason for "git gc --auto" wasn't to keep things
> perfectly packed, but to avoid the _really_ bad cases. The old default
> of 20 may be fine if you want to always keep the repo very tight, but
> that wasn't why "git gc --auto" was done, was it?
I do not think "very tight" was the reason, but on the other hand, my
personal feeling is that 20 was already 10 too many pack idx files we have
to walk linearly while looking for objects at runtime.
Each auto gc that sees too many loose objects will add a new packfile (we
do not do "repack -a" for obvious reasons) that would hopefully contain
6-7k objects, so you would need to generate 120-140k objects before
hitting the existing 20 limit.
And then auto gc will notice you have too many packs, and "repack -A" to
pack them down in a single new pack, and you are back to "single pack with
less than 6-7k loose objects" situation for the cycle to continue.
At least, that is the theory.
The kernel history with 87k commits has 720k objects, which roughly
translates to 8 objects per commit on average. You would need to perform
13k commits to generate 100k new loose objects. I am sensing that Jens is
mightily annoyed, rightfully so, by observing much shorter cycle than that
for "gc --auto" to kick in ("rev-list --author=Jens --since=8.month master"
tells me there are 145 commits in the last 8 months, far smaller than
13k). So there is something else going on.
Perhaps fetching with dumb transports should run "gc --auto" (or even an
unconditional "repack -a -d") at the end?
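The back-of-the-envelope numbers in this message can be reproduced with a few integer helpers. This is only a sketch of the arithmetic; the function names are hypothetical, and 6700 objects per pack is an assumption taken from the default gc.auto threshold mentioned earlier in the thread.

```c
/* Average objects per commit, rounded down. */
static long objects_per_commit(long total_objects, long total_commits)
{
    return total_objects / total_commits;
}

/* Commits needed to create at least target_objects, rounded up. */
static long commits_needed(long target_objects, long per_commit)
{
    return (target_objects + per_commit - 1) / per_commit;
}

/* Objects that must accumulate before the pack limit is reached, if each
 * auto gc run adds one pack of roughly objects_per_pack objects. */
static long objects_before_pack_limit(long pack_limit, long objects_per_pack)
{
    return pack_limit * objects_per_pack;
}
```

With 720000 objects over 87000 commits the average is 8 objects per commit, so 100000 new objects take 12500 commits, in line with the "13k" figure above. Twenty packs of 6700 objects each come to 134000 objects, inside the quoted 120-140k range. By the same average, 145 commits would produce only about 1160 objects.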
* Re: auto gc again
2008-03-19 21:27 ` Brandon Casey
2008-03-19 21:53 ` [PATCH] builtin-gc.c: allow disabling all auto-gc'ing by assigning 0 to gc.auto Brandon Casey
@ 2008-03-19 22:56 ` Nicolas Pitre
2008-03-20 6:01 ` Jens Axboe
2 siblings, 0 replies; 30+ messages in thread
From: Nicolas Pitre @ 2008-03-19 22:56 UTC (permalink / raw)
To: Brandon Casey; +Cc: Jens Axboe, Linus Torvalds, git
On Wed, 19 Mar 2008, Brandon Casey wrote:
> Nicolas Pitre wrote:
> > On Tue, 18 Mar 2008, Jens Axboe wrote:
> >
> >> But freshly pulled repo, git auto gc is enabled. And that is my main
> >> annoyance, I just don't think that type of policy should be in there.
> >
> > Just do this once:
> >
> > git config --global gc.auto 0
> > git config --global gc.autopacklimit 0
>
> Is there any reason why gc.auto=0 couldn't be used to disable auto
> packing entirely?
I think that would be a good thing to do indeed.
> Said differently, are there valid use cases where one might want automatic
> repacking based on the number of packs but _not_ based on the number of
> loose objects?
>
> If the answer is "no", then "gc.auto=0 means completely disable auto-gc"
> seems intuitive and would have protected Jens in this case.
Agreed.
Nicolas
* Re: auto gc again
2008-03-19 21:17 ` Jens Axboe
@ 2008-03-19 23:05 ` Nicolas Pitre
2008-03-20 7:40 ` Jens Axboe
0 siblings, 1 reply; 30+ messages in thread
From: Nicolas Pitre @ 2008-03-19 23:05 UTC (permalink / raw)
To: Jens Axboe; +Cc: Linus Torvalds, git
On Wed, 19 Mar 2008, Jens Axboe wrote:
> On Wed, Mar 19 2008, Nicolas Pitre wrote:
> > On Tue, 18 Mar 2008, Jens Axboe wrote:
> >
> > > But freshly pulled repo, git auto gc is enabled. And that is my main
> > > annoyance, I just don't think that type of policy should be in there.
> >
> > Just do this once:
> >
> > git config --global gc.auto 0
> > git config --global gc.autopacklimit 0
> >
> > and be happy.
>
> You don't get it. I did gc.auto 0. And now some other limit crops up, I
> have to do gc.autopacklimit 0. I have LOTS of git trees. On many
> machines. It's just annoying, period.
As suggested, gc.auto = 0 should probably be made to disable it
entirely, regardless of any other parameters that might exist.
> > > Print the warning, include info on how to run git gc or even how to turn
> > > it on automatically. But I'll bet you that most users will NOT want auto
> > > gc. Ever.
> >
> > Unfortunately, the harshest complaints about this whole issue were the
> > opposite.
>
> I just don't buy that, I have more faith in users.
We also did in the past... even for a long period...
Alas, it is the users who made us (and actually made Linus, who was the
last to resist) change our minds.
> If they come around and complain it's slow, heck you told them it
> would be.
But they don't. They just presume that Git is crap and move on.
> But it's not a big deal, I'll just carry a local patch that disables
> this crap and forget the whole deal. I just worry that if this is where
> git 'usability' is heading, it won't be a good thing in the long run.
I wish the majority of users was thinking like you. I, too, have some
conceptual problems with this auto gc thing. With the experience
we've gathered, the current state appears to be the
lesser of all evils though.
Nicolas
* Re: auto gc again
2008-03-19 22:28 ` Junio C Hamano
@ 2008-03-19 23:16 ` Nicolas Pitre
2008-03-19 23:25 ` Junio C Hamano
0 siblings, 1 reply; 30+ messages in thread
From: Nicolas Pitre @ 2008-03-19 23:16 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Linus Torvalds, Jens Axboe, git
On Wed, 19 Mar 2008, Junio C Hamano wrote:
> Linus Torvalds <torvalds@linux-foundation.org> writes:
>
> > On Wed, 19 Mar 2008, Junio C Hamano wrote:
> >>
> >> Having said that, I am not sure how the auto gc is triggering for your
> >> (presumably reasonably well maintained) repository that has only small
> >> number of loose objects. I haven't seen auto-gc annoyance myself (and
> >> git.git is not the only project I have my git experience with), and Linus
> >> also said he hasn't seen breakages.
> >
> > I think it was 'autopacklimit'.
> >
> > I think the correct solution is along the following lines:
> >
> > - disable "git gc --auto" entirely when "gc.auto <= 0" (ie we don't even
> > care about 'autopacklimit' unless automatic packing is on at all)
> >
> > Rationale: I do think that if you set gc.auto to zero, you should
> > expect git gc --auto to be disabled.
>
> Sensible, I would say.
Seconded.
> > - make the default for autopacklimit rather higher (pick number at
> > random: 50 instead of 20).
> >
> > Rationale: the reason for "git gc --auto" wasn't to keep things
> > perfectly packed, but to avoid the _really_ bad cases. The old default
> > of 20 may be fine if you want to always keep the repo very tight, but
> > that wasn't why "git gc --auto" was done, was it?
>
> I do not think "very tight" was the reason, but on the other hand, my
> personal feeling is that 20 was already 10 too many pack idx files we have
> to walk linearly while looking for objects at runtime.
Since commit f7c22cc68ccb this is no longer such an issue.
> Each auto gc that sees too many loose objects will add a new packfile (we
> do not do "repack -a" for obvious reasons) that would hopefully contain
> 6-7k objects, so you would need to generate 120-140k objects before
> hitting the existing 20 limit.
>
> And then auto gc will notice you have too many packs, and "repack -A" to
> pack them down in a single new pack, and you are back to "single pack with
> less than 6-7k loose objects" situation for the cycle to continue.
>
> At least, that is the theory.
Note that the current fetch.unpackLimit might play a role as well,
especially if you fetch often (often meaning that you're more likely to
have the received pack exploded into loose objects, or you're
accumulating many small packs).
Nicolas
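The role of fetch.unpackLimit can be illustrated with a small model. It assumes the behaviour described in git's configuration documentation: a native fetch that brings in fewer objects than the limit is exploded into loose objects, and anything else is kept as a new pack. The enum, the function names, and the limit of 100 used in the usage note are assumptions made for illustration, not code from git.

```c
enum fetch_outcome { EXPLODED_TO_LOOSE, KEPT_AS_PACK };

/* Decide what happens to one fetched pack under the assumed rule. */
static enum fetch_outcome classify_fetch(int received_objects, int unpack_limit)
{
    return received_objects < unpack_limit ? EXPLODED_TO_LOOSE : KEPT_AS_PACK;
}

/* Count the packs present after a series of fetches, starting from
 * start_packs and adding one pack for every fetch that is kept. */
static int packs_after_fetches(int start_packs, const int *fetch_sizes,
                               int nr_fetches, int unpack_limit)
{
    int packs = start_packs;
    for (int i = 0; i < nr_fetches; i++) {
        if (classify_fetch(fetch_sizes[i], unpack_limit) == KEPT_AS_PACK)
            packs++;
    }
    return packs;
}
```

Taking a limit of 100 as an example value, the 214-object fetch shown earlier in the thread would be kept as a pack, so a repository that is fetched into often can reach the pack limit without creating many loose objects.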
* Re: auto gc again
2008-03-19 23:16 ` Nicolas Pitre
@ 2008-03-19 23:25 ` Junio C Hamano
2008-03-20 3:13 ` Nicolas Pitre
0 siblings, 1 reply; 30+ messages in thread
From: Junio C Hamano @ 2008-03-19 23:25 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Linus Torvalds, Jens Axboe, git
Nicolas Pitre <nico@cam.org> writes:
> On Wed, 19 Mar 2008, Junio C Hamano wrote:
>
>> Linus Torvalds <torvalds@linux-foundation.org> writes:
>>
>> > On Wed, 19 Mar 2008, Junio C Hamano wrote:
>> ...
>> > - make the default for autopacklimit rather higher (pick number at
>> > random: 50 instead of 20).
>> >
>> > Rationale: the reason for "git gc --auto" wasn't to keep things
>> > perfectly packed, but to avoid the _really_ bad cases. The old default
>> > of 20 may be fine if you want to always keep the repo very tight, but
>> > that wasn't why "git gc --auto" was done, was it?
>>
>> I do not think "very tight" was the reason, but on the other hand, my
>> personal feeling is that 20 was already 10 too many pack idx files we have
>> to walk linearly while looking for objects at runtime.
>
> Since commit f7c22cc68ccb this is no longer such an issue.
Notice that I did not say "19 too many". I know f7c22cc (always start
looking up objects in the last used pack first, 2007-05-30) was meant to
alleviate the situation, but isn't "no longer" a gross exaggeration?
> Note that the current fetch.unpackLimit might play a role as well,
> especially if you fetch often (often meaning that you're more likely to
> have the received pack exploded into loose objects, or you're
> accumulating many small packs).
Ah, yes, native fetch will also result in a new pack, so even if you do
not do anything else, if you fetch once a day, you will accumulate 20
packs in that many days.
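The idea behind f7c22cc, as summarised above (start each lookup in the pack that satisfied the previous one), can be sketched as follows. This is an illustration of the technique only, not git's implementation: the structs, the linear scan inside each pack, and the probe counter are hypothetical simplifications.

```c
#include <stddef.h>

/* A toy pack: just a list of object ids. */
struct pack {
    const unsigned *ids;
    size_t nr;
};

/* A set of packs plus the index of the pack that answered the last lookup. */
struct pack_set {
    const struct pack *packs;
    size_t nr;
    size_t last_found;
};

static int pack_contains(const struct pack *p, unsigned id)
{
    for (size_t i = 0; i < p->nr; i++)
        if (p->ids[i] == id)
            return 1;
    return 0;
}

/* Returns the index of the pack holding id, or -1 if no pack has it.
 * *probes receives the number of packs that had to be examined. */
static int find_pack(struct pack_set *set, unsigned id, int *probes)
{
    *probes = 0;
    if (set->nr == 0)
        return -1;

    /* Try the most recently successful pack first. */
    (*probes)++;
    if (pack_contains(&set->packs[set->last_found], id))
        return (int)set->last_found;

    /* Fall back to scanning the remaining packs in order. */
    for (size_t i = 0; i < set->nr; i++) {
        if (i == set->last_found)
            continue;
        (*probes)++;
        if (pack_contains(&set->packs[i], id)) {
            set->last_found = i;
            return (int)i;
        }
    }
    return -1;
}
```

When consecutive lookups tend to land in the same pack, most of them cost a single probe regardless of how many packs exist, which is the reason the number of packs matters less with this approach. A miss still costs one probe per pack.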
* Re: auto gc again
2008-03-19 23:25 ` Junio C Hamano
@ 2008-03-20 3:13 ` Nicolas Pitre
2008-03-20 4:09 ` Junio C Hamano
0 siblings, 1 reply; 30+ messages in thread
From: Nicolas Pitre @ 2008-03-20 3:13 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Linus Torvalds, Jens Axboe, git
On Wed, 19 Mar 2008, Junio C Hamano wrote:
> Nicolas Pitre <nico@cam.org> writes:
>
> >On Wed, 19 Mar 2008, Junio C Hamano wrote:
> >
> >> I do not think "very tight" was the reason, but on the other hand, my
> >> personal feeling is that 20 was already 10 too many pack idx files we have
> >> to walk linearly while looking for objects at runtime.
> >
> > Since commit f7c22cc68ccb this is no longer such an issue.
>
> Notice that I did not say "19 too many". I know f7c22cc (always start
> looking up objects in the last used pack first, 2007-05-30) was meant to
> alleviate the situation, but isn't "no longer" a gross exaggeration?
Not at all. Please have a second look at the performance numbers in
that commit log, and take into account the most important metric that I
unfortunately failed to mention there (although I subsequently posted it
to the list: http://marc.info/?l=git&m=118058197921642&w=2), which is the
time to perform the same operation with a single pack.
So you have 17.1 seconds for a single pack vs 18.4 seconds for 66 packs.
Compare that to 24.9s without that patch.
And I still have some further optimizations to implement eventually
(http://marc.info/?l=git&m=118062793413099&w=2), but which would
probably make a significant difference only in the hundreds-of-packs
case anyway.
So I really think that the default gc.autopacklimit could be raised.
Nicolas
* Re: auto gc again
2008-03-20 3:13 ` Nicolas Pitre
@ 2008-03-20 4:09 ` Junio C Hamano
2008-03-20 4:40 ` Nicolas Pitre
0 siblings, 1 reply; 30+ messages in thread
From: Junio C Hamano @ 2008-03-20 4:09 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Linus Torvalds, Jens Axboe, git
Nicolas Pitre <nico@cam.org> writes:
> So you have 17.1 seconds for a single pack vs 18.4 seconds for 66 packs.
>
> Compare that to 24.9s without that patch.
Very interesting --- why should it affect a single pack case at all?
> And I still have some further optimizations to implement eventually
> (http://marc.info/?l=git&m=118062793413099&w=2), but which would
> probably make a significant difference only in the hundreds-of-packs
> case anyway.
>
> So I really think that the default gc.autopacklimit could be raised.
Thanks, let's raise it to 50 then.
But I am still puzzled...
* Re: auto gc again
2008-03-20 4:09 ` Junio C Hamano
@ 2008-03-20 4:40 ` Nicolas Pitre
2008-03-20 4:49 ` Junio C Hamano
0 siblings, 1 reply; 30+ messages in thread
From: Nicolas Pitre @ 2008-03-20 4:40 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Linus Torvalds, Jens Axboe, git
On Wed, 19 Mar 2008, Junio C Hamano wrote:
> Nicolas Pitre <nico@cam.org> writes:
>
> > So you have 17.1 seconds for a single pack vs 18.4 seconds for 66 packs.
> >
> > Compare that to 24.9s without that patch.
>
> Very interesting --- why should it affect a single pack case at all?
It is not:
Single pack = 17.1s
66 packs with commit f7c22cc6 = 18.4s
66 packs without commit f7c22cc6 = 24.9s
The point is that having many packs doesn't impose a significant
overhead anymore when comparing to the single pack case.
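To put the three timings above in relative terms, here is a small arithmetic sketch. The function name `overhead_pct` and the constant names are made up for illustration; the only inputs are the figures quoted in this message.

```python
def overhead_pct(baseline_s: float, measured_s: float) -> float:
    """Return how much slower measured_s is than baseline_s, in percent."""
    return (measured_s - baseline_s) / baseline_s * 100.0


# Timings quoted above, in seconds.
SINGLE_PACK_S = 17.1
MANY_PACKS_WITH_PATCH_S = 18.4      # 66 packs, with commit f7c22cc6
MANY_PACKS_WITHOUT_PATCH_S = 24.9   # 66 packs, without commit f7c22cc6

with_patch = overhead_pct(SINGLE_PACK_S, MANY_PACKS_WITH_PATCH_S)
without_patch = overhead_pct(SINGLE_PACK_S, MANY_PACKS_WITHOUT_PATCH_S)

print(f"66 packs with f7c22cc6:    +{with_patch:.1f}% over a single pack")
print(f"66 packs without f7c22cc6: +{without_patch:.1f}% over a single pack")
```

That works out to roughly 8% extra for 66 packs with the commit, against roughly 46% without it.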
> Thanks, let's raise it to 50 then.
Having only to set gc.auto=0 to disable it entirely is also a good
thing.
> But I am still puzzled...
Please tell me why if this is still the case.
Nicolas
* Re: auto gc again
2008-03-20 4:40 ` Nicolas Pitre
@ 2008-03-20 4:49 ` Junio C Hamano
0 siblings, 0 replies; 30+ messages in thread
From: Junio C Hamano @ 2008-03-20 4:49 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Linus Torvalds, Jens Axboe, git
Nicolas Pitre <nico@cam.org> writes:
> On Wed, 19 Mar 2008, Junio C Hamano wrote:
>
>> Nicolas Pitre <nico@cam.org> writes:
>>
>> > So you have 17.1 seconds for a single pack vs 18.4 seconds for 66 packs.
>> >
>> > Compare that to 24.9s without that patch.
>>
>> Very interesting --- why should it affect a single pack case at all?
>
> It is not:
>
> Single pack = 17.1s
> 66 packs with commit f7c22cc6 = 18.4s
> 66 packs without commit f7c22cc6 = 24.9s
> ...
>> But I am still puzzled...
>
> Please tell me why if this is still the case.
Not anymore. Your "It is not" above cleared things for me. Somehow I
misread "with patch single pack is 17.1s and even with 66 packs it is only
18.4s, compare these great numbers with horrible 24.9s with single pack
without the patch".
* Re: auto gc again
2008-03-19 21:44 ` Johannes Schindelin
@ 2008-03-20 6:00 ` Jens Axboe
0 siblings, 0 replies; 30+ messages in thread
From: Jens Axboe @ 2008-03-20 6:00 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Linus Torvalds, git
On Wed, Mar 19 2008, Johannes Schindelin wrote:
> Hi,
>
> On Wed, 19 Mar 2008, Jens Axboe wrote:
>
> > On Wed, Mar 19 2008, Johannes Schindelin wrote:
> >
> > > On Tue, 18 Mar 2008, Jens Axboe wrote:
> > >
> > > > But you never answer the question on whether you really consider any
> > > > form of autopacking or auto gc sane? Next time some other limit is
> > > > added for auto gc, it'll be annoying once more.
> > >
> > > The problem is: if people do not bother to "git gc" their
> > > repositories, git operations get slow. We just had enough of that,
> > > and decided to "git gc" automatically for people who did not know
> > > about it, or were too lazy and then complained about git for being
> > > slow.
> >
> > Sorry I disagree,
>
> In this case, you can disagree as much as you want and you are still
> wrong.
You are catering to the wrong end of the scale, nothing good comes of
that in the longer run.
> The problem is that you are more intelligent than most others, and now you
> experience the downsides of it.
Whatever. If you don't have something worthwhile to say, please don't
respond.
--
Jens Axboe
* Re: auto gc again
2008-03-19 21:27 ` Brandon Casey
2008-03-19 21:53 ` [PATCH] builtin-gc.c: allow disabling all auto-gc'ing by assigning 0 to gc.auto Brandon Casey
2008-03-19 22:56 ` auto gc again Nicolas Pitre
@ 2008-03-20 6:01 ` Jens Axboe
2 siblings, 0 replies; 30+ messages in thread
From: Jens Axboe @ 2008-03-20 6:01 UTC (permalink / raw)
To: Brandon Casey; +Cc: Nicolas Pitre, Linus Torvalds, git
On Wed, Mar 19 2008, Brandon Casey wrote:
> Nicolas Pitre wrote:
> > On Tue, 18 Mar 2008, Jens Axboe wrote:
> >
> >> But freshly pulled repo, git auto gc is enabled. And that is my main
> >> annoyance, I just don't think that type of policy should be in there.
> >
> > Just do this once:
> >
> > git config --global gc.auto 0
> > git config --global gc.autopacklimit 0
>
> Is there any reason why gc.auto=0 couldn't be used to disable auto
> packing entirely?
>
> Said differently, are there valid use cases where one might want automatic
> repacking based on the number of packs but _not_ based on the number of
> loose objects?
>
> If the answer is "no", then "gc.auto=0 means completely disable auto-gc"
> seems intuitive and would have protected Jens in this case.
Totally agree, makes a lot more sense.
--
Jens Axboe
* Re: [PATCH] builtin-gc.c: allow disabling all auto-gc'ing by assigning 0 to gc.auto
2008-03-19 21:53 ` [PATCH] builtin-gc.c: allow disabling all auto-gc'ing by assigning 0 to gc.auto Brandon Casey
@ 2008-03-20 7:08 ` Teemu Likonen
0 siblings, 0 replies; 30+ messages in thread
From: Teemu Likonen @ 2008-03-20 7:08 UTC (permalink / raw)
To: git
Cc: Brandon Casey, Nicolas Pitre, Jens Axboe, Linus Torvalds,
Junio C Hamano
Brandon Casey kirjoitti:
> The gc.auto configuration variable is somewhat ambiguous now that
> there is also a gc.autopacklimit setting. Some users may assume that
> it controls all auto-gc'ing. Also, now users must set two
> configuration variables to zero when they want to disable
> autopacking. Since it is unlikely that users will want to autopack
> based on some threshold of pack files when they have disabled
> autopacking based on the number of loose objects, be nice and allow a
> setting of zero for gc.auto to disable all autopacking.
>
> Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
> ---
> builtin-gc.c | 6 +++---
> 1 files changed, 3 insertions(+), 3 deletions(-)
This change should be documented in the git-gc.txt and config.txt. For
example, the former currently says:
"Housekeeping is required if there are too many loose objects or too
many packs in the repository. [...] Setting the value of `gc.auto` to 0
disables automatic packing of loose objects."
So, from the git-gc.txt's (current) point of view, gc.auto=0 does not
touch the "or too many packs" part.
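The difference between the documented behaviour and the proposed one can be sketched as two small decision functions. This is only a model of the semantics described in this thread, not the code in builtin-gc.c: the function names are invented, the object and pack counts are passed in directly (no repository is inspected), and the exact comparisons (`>` rather than `>=`) are an assumption.

```python
DEFAULT_GC_AUTO = 6700       # documented default for gc.auto
PACK_LIMIT_IN_THREAD = 20    # pack limit discussed in this thread


def needs_gc_before(loose_objects, packs, gc_auto, pack_limit):
    """Documented reading: each limit is switched off independently."""
    too_many_loose = gc_auto > 0 and loose_objects > gc_auto
    too_many_packs = pack_limit > 0 and packs > pack_limit
    return too_many_loose or too_many_packs


def needs_gc_after(loose_objects, packs, gc_auto, pack_limit):
    """Proposed reading: gc.auto=0 disables every automatic trigger."""
    if gc_auto <= 0:
        return False
    return needs_gc_before(loose_objects, packs, gc_auto, pack_limit)


# gc.auto=0 with no loose objects, but 25 packs accumulated from fetches.
print(needs_gc_before(0, 25, 0, PACK_LIMIT_IN_THREAD))  # True
print(needs_gc_after(0, 25, 0, PACK_LIMIT_IN_THREAD))   # False
```

Under the first reading the pack limit still fires even though gc.auto is 0, which matches the "or too many packs" wording quoted from git-gc.txt.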
* Re: auto gc again
2008-03-19 23:05 ` Nicolas Pitre
@ 2008-03-20 7:40 ` Jens Axboe
2008-03-20 7:55 ` Junio C Hamano
0 siblings, 1 reply; 30+ messages in thread
From: Jens Axboe @ 2008-03-20 7:40 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Linus Torvalds, git
On Wed, Mar 19 2008, Nicolas Pitre wrote:
> On Wed, 19 Mar 2008, Jens Axboe wrote:
>
> > On Wed, Mar 19 2008, Nicolas Pitre wrote:
> > > On Tue, 18 Mar 2008, Jens Axboe wrote:
> > >
> > > > But freshly pulled repo, git auto gc is enabled. And that is my main
> > > > annoyance, I just don't think that type of policy should be in there.
> > >
> > > Just do this once:
> > >
> > > git config --global gc.auto 0
> > > git config --global gc.autopacklimit 0
> > >
> > > and be happy.
> >
> > You don't get it. I did gc.auto 0. And now some other limit crops up, I
> > have to do gc.autopacklimit 0. I have LOTS of git trees. On many
> > machines. It's just annoying, period.
>
> As suggested, gc.auto = 0 should probably be made to disable it
> entirely, regardless of any other parameters that might exist.
Yes, agree.
> > > > Print the warning, include info on how to run git gc or even how to turn
> > > > it on automatically. But I'll bet you that most users will NOT want auto
> > > > gc. Ever.
> > >
> > > Unfortunately, the harshest complaints about this whole issue were the
> > > opposite.
> >
> > I just don't buy that, I have more faith in users.
>
> We also did in the past... even for a long period...
>
> Alas, it is the users who made us (and actually made Linus, who was the
> last to resist) change our minds.
OK, that's at least reassuring :-)
> > If they come around and complain it's slow, heck you told them it
> > would be.
>
> But they don't. They just presume that Git is crap and move on.
That's pretty sad, I like to have high hopes for users.
> > But it's not a big deal, I'll just carry a local patch that disables
> > this crap and forget the whole deal. I just worry that if this is where
> > git 'usability' is heading, it won't be a good thing in the long run.
>
> I wish the majority of users thought like you. I, too, have some
> conceptual problems with this auto gc thing. With the experience
> we've gathered, the current state appears to be the
> lesser of all evils though.
Alright, I must bow down to empirical evidence... The conceptual policy
problem is indeed what is bothering me so much, even more so than the
actual gc running on my machine.
gc.auto covering everything is good enough for me, GIT_GC_AUTO
environment variable would be better because of the way that I work. But
I can get by knowing that the gc.auto thing will at least only bite me
once per tree. And perhaps just wrap git clone in one of my scripts
that'll then do the gc.auto thing automatically.
--
Jens Axboe
* Re: auto gc again
2008-03-20 7:40 ` Jens Axboe
@ 2008-03-20 7:55 ` Junio C Hamano
2008-03-20 17:31 ` Jens Axboe
0 siblings, 1 reply; 30+ messages in thread
From: Junio C Hamano @ 2008-03-20 7:55 UTC (permalink / raw)
To: Jens Axboe; +Cc: Nicolas Pitre, Linus Torvalds, git
Jens Axboe <jens.axboe@oracle.com> writes:
> gc.auto covering everything is good enough for me, GIT_GC_AUTO
> environment variable would be better because of the way that I work. But
> I can get by knowing that the gc.auto thing will at least only bite me
> once per tree. And perhaps just wrap git clone in one of my scripts
> that'll then do the gc.auto thing automatically.
You missed --global part of the suggestion, perhaps?
* Re: auto gc again
2008-03-20 7:55 ` Junio C Hamano
@ 2008-03-20 17:31 ` Jens Axboe
0 siblings, 0 replies; 30+ messages in thread
From: Jens Axboe @ 2008-03-20 17:31 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Nicolas Pitre, Linus Torvalds, git
On Thu, Mar 20 2008, Junio C Hamano wrote:
> Jens Axboe <jens.axboe@oracle.com> writes:
>
> > gc.auto covering everything is good enough for me, GIT_GC_AUTO
> > environment variable would be better because of the way that I work. But
> > I can get by knowing that the gc.auto thing will at least only bite me
> > once per tree. And perhaps just wrap git clone in one of my scripts
> > that'll then do the gc.auto thing automatically.
>
> You missed --global part of the suggestion, perhaps?
I did indeed, --global does exactly what I meant with GIT_GC_AUTO.
Thanks!
--
Jens Axboe
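Why --global covers every tree can be illustrated with a toy lookup. This is a simplified model, assuming only three configuration scopes read in the order system, global, repository, with the more specific scope winning; the function `effective_setting` and the dictionaries are hypothetical and do not read any real config files.

```python
def effective_setting(key, system=None, global_=None, local=None):
    """Return the value from the most specific scope that defines key."""
    value = None
    # Later scopes override earlier ones: system < global < repository.
    for scope in (system, global_, local):
        if scope and key in scope:
            value = scope[key]
    return value


# Set once with: git config --global gc.auto 0
user_config = {"gc.auto": "0"}

# A freshly cloned tree has no gc.auto of its own, so the global value applies.
print(effective_setting("gc.auto", global_=user_config, local={}))

# A tree that sets its own value still overrides the global one.
print(effective_setting("gc.auto", global_=user_config,
                        local={"gc.auto": "6700"}))
```

The first lookup yields "0" for any new clone, which is the per-user effect a GIT_GC_AUTO environment variable was meant to provide.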
Thread overview: 30+ messages
2008-03-18 18:01 auto gc again Jens Axboe
2008-03-18 18:14 ` Linus Torvalds
2008-03-18 18:19 ` Jens Axboe
2008-03-18 18:24 ` Jens Axboe
2008-03-18 18:33 ` Linus Torvalds
2008-03-18 18:39 ` Jens Axboe
2008-03-19 20:22 ` Johannes Schindelin
2008-03-19 21:14 ` Jens Axboe
2008-03-19 21:44 ` Johannes Schindelin
2008-03-20 6:00 ` Jens Axboe
2008-03-19 20:37 ` Nicolas Pitre
2008-03-19 21:17 ` Jens Axboe
2008-03-19 23:05 ` Nicolas Pitre
2008-03-20 7:40 ` Jens Axboe
2008-03-20 7:55 ` Junio C Hamano
2008-03-20 17:31 ` Jens Axboe
2008-03-19 21:27 ` Brandon Casey
2008-03-19 21:53 ` [PATCH] builtin-gc.c: allow disabling all auto-gc'ing by assigning 0 to gc.auto Brandon Casey
2008-03-20 7:08 ` Teemu Likonen
2008-03-19 22:56 ` auto gc again Nicolas Pitre
2008-03-20 6:01 ` Jens Axboe
2008-03-19 21:27 ` Junio C Hamano
2008-03-19 21:52 ` Linus Torvalds
2008-03-19 22:28 ` Junio C Hamano
2008-03-19 23:16 ` Nicolas Pitre
2008-03-19 23:25 ` Junio C Hamano
2008-03-20 3:13 ` Nicolas Pitre
2008-03-20 4:09 ` Junio C Hamano
2008-03-20 4:40 ` Nicolas Pitre
2008-03-20 4:49 ` Junio C Hamano