git.vger.kernel.org archive mirror
* git-reflog 70 minutes at 100% cpu and counting
@ 2009-12-14 20:28 Eric Paris
  2009-12-14 20:41 ` Sverre Rabbelier
  2009-12-14 21:11 ` Jeff King
  0 siblings, 2 replies; 30+ messages in thread
From: Eric Paris @ 2009-12-14 20:28 UTC (permalink / raw)
  To: git

So I have no idea what is interesting or relevant what I can collect,
what you want to know or anything like that, so this is a bit of a dump
of info and I'm sorry to whoever tries to pick anything useful out of
it.  Someone who understands git might glean some interesting information
(or tell me what a fool I am).  I'm going to lay out my whole working
process here and maybe people will even point out how to improve what I
do....

git-1.5.5.6-4.el5 (git in extras for RHEL5)

I have about 5 local trees; that way I can work on different things
without having to rebuild quite as much, especially as I go back and
change history so often with stgit.  Each local tree has a .git/config
file that has about 5 different kernel trees set up as remotes.  They
look something like this.

[core]
	repositoryformatversion = 0
	filemode = true
	bare = false
	logallrefupdates = true
[remote "origin"]
	url = git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
	fetch = +refs/heads/*:refs/remotes/origin/*
[remote "linus"]
	url = git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
	fetch = +refs/heads/master:refs/remotes/linus/master
[remote "linux-next"]
	url = git://git.kernel.org/pub/scm/linux/kernel/git/sfr/linux-next.git
	fetch = +refs/heads/*:refs/remotes/linux-next/*
{snipped}

Each of these trees also has a .git/objects/info/alternates file which
looks like so (am I using alternates right?)

/export/kernel/kernel-1/.git/objects
/storage/kernel/kernel-1/.git/objects

On this particular machine (a very beefy dual quad core
Nehalem) /storage is a bind mount of /export.  On other machines I will
mount these using NFS in which case /export doesn't exist and /storage
is the mount point.  Typically (but not always) the only thing I do over
the NFS mount point is 'make install'.  
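
Since the same alternates file lists both /export and /storage, and only one of them exists on some machines, a quick check of which entries resolve locally can be useful. The following is a minimal sketch, not part of the original report; the function name `missing_alternates` is hypothetical, and it only tests that each listed directory exists, not that the objects inside it are complete.

```python
import os


def missing_alternates(git_dir):
    """Return alternate object directories listed for git_dir that do not exist here."""
    objects_dir = os.path.join(git_dir, "objects")
    alt_file = os.path.join(objects_dir, "info", "alternates")
    if not os.path.exists(alt_file):
        return []  # no alternates configured
    missing = []
    with open(alt_file) as fh:
        for line in fh:
            path = line.strip()
            if not path:
                continue  # skip blank lines
            # Relative entries are resolved against the object directory.
            if not os.path.isabs(path):
                path = os.path.join(objects_dir, path)
            if not os.path.isdir(path):
                missing.append(path)
    return missing
```

Calling `missing_alternates(".git")` from the top of a work tree would list the entries that do not resolve on the current machine; an empty list means every entry points at an existing directory.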

In this particular tree I use stgit and on a daily basis will update my
remotes and rebase my stgit patch series on top of linux-next.  I don't
know the details of the git commands going on under the covers, I just
do git remote update; stg rebase remotes/linux-next/master;  I don't
know if that's relevant, but it might leave me lots of crap in the tree?

Today I decided to make a clean branch to ask Linus to pull.  I exported
a patch series (about 80 patches) to an mbox file from one of my other 5
trees and I did the following

git remote update 
git checkout -b working remotes/linus/master
git-am -3 -k mbox.file

all 80 or so patches in the mbox file applied and then I got

Auto packing your repository for optimum performance. You may also
run "git gc" manually. See "git help gc" for more information.

I waited for a while, but it still hasn't come back.

#ps -ef | grep git
paris    24134 22057  0 14:09 pts/12   00:00:00 /bin/sh /usr/bin/git-am -3 -k /tmp/fanotify.mbox
paris    25638 24134  0 14:09 pts/12   00:00:00 git gc --auto
paris    25640 25638 99 14:09 pts/12   00:58:07 git-reflog expire --all

#top
25640 paris     25   0  920m 211m 149m R 99.8  2.1  69:09.55 git-reflog

#ls -ld /proc/pid/cwd
lrwxrwxrwx 1 paris paris 0 Dec 14 15:02 /proc/25640/cwd -> /storage/kernel/kernel-2

#strace -p -T -ttt
1260821793.751746 stat(".git/objects/b2/ad3c1470e751c53bf7a4d3d53514e0debab1fc", {st_mode=S_IFREG|0444, st_size=291, ...}) = 0 <0.000043>
1260821793.751917 open(".git/objects/b2/ad3c1470e751c53bf7a4d3d53514e0debab1fc", O_RDONLY|O_NOATIME) = 40 <0.000041>
1260821793.752032 mmap(NULL, 291, PROT_READ, MAP_PRIVATE, 40, 0) = 0x2b4d6a90e000 <0.000041>
1260821793.752148 close(40)             = 0 <0.000026>
1260821793.752286 munmap(0x2b4d6a90e000, 291) = 0 <0.000035>
1260821793.752538 stat(".git/objects/85/7d99d3a4f9780402fbff3d59b6b3de8d614cc7", {st_mode=S_IFREG|0444, st_size=330, ...}) = 0 <0.000138>
1260821793.752743 open(".git/objects/85/7d99d3a4f9780402fbff3d59b6b3de8d614cc7", O_RDONLY|O_NOATIME) = 40 <0.000027>
1260821793.752942 mmap(NULL, 330, PROT_READ, MAP_PRIVATE, 40, 0) = 0x2b4d6a90e000 <0.000024>
1260821793.753018 close(40)             = 0 <0.000038>
1260821793.753289 munmap(0x2b4d6a90e000, 330) = 0 <0.000040>
1260821796.796243 stat(".git/objects/85/7d99d3a4f9780402fbff3d59b6b3de8d614cc7", {st_mode=S_IFREG|0444, st_size=330, ...}) = 0 <0.000076>
1260821796.796440 open(".git/objects/85/7d99d3a4f9780402fbff3d59b6b3de8d614cc7", O_RDONLY|O_NOATIME) = 40 <0.000036>
1260821796.796553 mmap(NULL, 330, PROT_READ, MAP_PRIVATE, 40, 0) = 0x2b4d6a90e000 <0.000031>
1260821796.796624 close(40)             = 0 <0.000017>
1260821796.796828 munmap(0x2b4d6a90e000, 330) = 0 <0.000042>
1260821796.797124 stat(".git/objects/40/c92d2149426ea7fd8c70bf7c7727af15eed75d", {st_mode=S_IFREG|0444, st_size=293, ...}) = 0 <0.008584>
1260821796.805844 open(".git/objects/40/c92d2149426ea7fd8c70bf7c7727af15eed75d", O_RDONLY|O_NOATIME) = 40 <0.000114>
1260821796.806062 mmap(NULL, 293, PROT_READ, MAP_PRIVATE, 40, 0) = 0x2b4d6a90e000 <0.000041>
1260821796.806144 close(40)             = 0 <0.000018>
1260821796.806341 munmap(0x2b4d6a90e000, 293) = 0 <0.000023>
1260821799.863480 stat(".git/objects/40/c92d2149426ea7fd8c70bf7c7727af15eed75d", {st_mode=S_IFREG|0444, st_size=293, ...}) = 0 <0.000118>
1260821799.863737 open(".git/objects/40/c92d2149426ea7fd8c70bf7c7727af15eed75d", O_RDONLY|O_NOATIME) = 40 <0.000042>
1260821799.863855 mmap(NULL, 293, PROT_READ, MAP_PRIVATE, 40, 0) = 0x2b4d6a90e000 <0.000075>
1260821799.863973 close(40)             = 0 <0.000021>
1260821799.864101 munmap(0x2b4d6a90e000, 293) = 0 <0.000033>
1260821799.864306 stat(".git/objects/43/77e6fe8ac62e7b3a1b65a83665f172550440b6", {st_mode=S_IFREG|0444, st_size=272, ...}) = 0 <0.000177>
1260821799.864551 open(".git/objects/43/77e6fe8ac62e7b3a1b65a83665f172550440b6", O_RDONLY|O_NOATIME) = 40 <0.000025>
1260821799.864635 mmap(NULL, 272, PROT_READ, MAP_PRIVATE, 40, 0) = 0x2b4d6a90e000 <0.000041>
1260821799.864729 close(40)             = 0 <0.000058>
1260821799.865064 munmap(0x2b4d6a90e000, 272) = 0 <0.000031>

First things I notice in the strace is that git is opening the same
objects multiple times, and there are seconds between the munmap of the
last object and the second stat of that same object....
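
The gap between repeated accesses can be measured directly from the trace. The sketch below is an illustration added here, not something from the original report: it assumes the `strace -ttt` line format shown above, uses hypothetical helper names, and the sample holds only the `stat` lines copied from that trace.

```python
import re
from collections import defaultdict

# Matches "<epoch seconds> <syscall>("<path>"" at the start of a strace -ttt line.
LINE_RE = re.compile(r'^(\d+\.\d+) (\w+)\("([^"]+)"')


def stat_times(trace):
    """Map each path to the timestamps at which it was stat()ed."""
    times = defaultdict(list)
    for line in trace.splitlines():
        m = LINE_RE.match(line.strip())
        if m and m.group(2) == "stat":
            times[m.group(3)].append(float(m.group(1)))
    return times


def repeated_gaps(trace):
    """For paths stat()ed more than once, return the gaps (seconds) between visits."""
    gaps = {}
    for path, ts in stat_times(trace).items():
        if len(ts) > 1:
            gaps[path] = [round(b - a, 3) for a, b in zip(ts, ts[1:])]
    return gaps


SAMPLE = r"""
1260821793.751746 stat(".git/objects/b2/ad3c1470e751c53bf7a4d3d53514e0debab1fc", {st_mode=S_IFREG|0444, st_size=291, ...}) = 0 <0.000043>
1260821793.752538 stat(".git/objects/85/7d99d3a4f9780402fbff3d59b6b3de8d614cc7", {st_mode=S_IFREG|0444, st_size=330, ...}) = 0 <0.000138>
1260821796.796243 stat(".git/objects/85/7d99d3a4f9780402fbff3d59b6b3de8d614cc7", {st_mode=S_IFREG|0444, st_size=330, ...}) = 0 <0.000076>
1260821796.797124 stat(".git/objects/40/c92d2149426ea7fd8c70bf7c7727af15eed75d", {st_mode=S_IFREG|0444, st_size=293, ...}) = 0 <0.008584>
1260821799.863480 stat(".git/objects/40/c92d2149426ea7fd8c70bf7c7727af15eed75d", {st_mode=S_IFREG|0444, st_size=293, ...}) = 0 <0.000118>
1260821799.864306 stat(".git/objects/43/77e6fe8ac62e7b3a1b65a83665f172550440b6", {st_mode=S_IFREG|0444, st_size=272, ...}) = 0 <0.000177>
"""

gaps = repeated_gaps(SAMPLE)
for path, deltas in gaps.items():
    print(path, deltas)
```

On this sample, two objects are visited twice, each time with roughly three seconds between visits. The syscalls themselves take microseconds, so the time is being spent in user space between them.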

What can I collect, do, whatever?

-Eric

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: git-reflog 70 minutes at 100% cpu and counting
  2009-12-14 20:28 git-reflog 70 minutes at 100% cpu and counting Eric Paris
@ 2009-12-14 20:41 ` Sverre Rabbelier
  2009-12-14 21:11 ` Jeff King
  1 sibling, 0 replies; 30+ messages in thread
From: Sverre Rabbelier @ 2009-12-14 20:41 UTC (permalink / raw)
  To: Eric Paris; +Cc: git

Heya,

On Mon, Dec 14, 2009 at 21:28, Eric Paris <eparis@redhat.com> wrote:
> What can I collect, do, whatever?

If this really is a case that we end up wanting to optimize somehow,
it would probably be very helpful to make a copy of the repository
state _before_ the gc is done.

Also, 1.5.5 is really really old in git terms, consider compiling your
own. Something post 1.6.4 might be a good idea :).

-- 
Cheers,

Sverre Rabbelier

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: git-reflog 70 minutes at 100% cpu and counting
  2009-12-14 20:28 git-reflog 70 minutes at 100% cpu and counting Eric Paris
  2009-12-14 20:41 ` Sverre Rabbelier
@ 2009-12-14 21:11 ` Jeff King
  2009-12-14 21:20   ` Eric Paris
  1 sibling, 1 reply; 30+ messages in thread
From: Jeff King @ 2009-12-14 21:11 UTC (permalink / raw)
  To: Eric Paris; +Cc: git

On Mon, Dec 14, 2009 at 03:28:04PM -0500, Eric Paris wrote:

> So I have no idea what is interesting or relevant what I can collect,
> what you want to know or anything like that, so this is a bit of a dump
> of info and I'm sorry to whoever tries to pick anything useful out of

It sounds like you might have found an infinite loop, as reflog should
never really need a lot of CPU. Is it possible to tar the whole
repository and make it available publicly for us to look at?

-Peff

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: git-reflog 70 minutes at 100% cpu and counting
  2009-12-14 21:11 ` Jeff King
@ 2009-12-14 21:20   ` Eric Paris
  2009-12-14 21:23     ` Jeff King
  2009-12-15  2:39     ` Jeff King
  0 siblings, 2 replies; 30+ messages in thread
From: Eric Paris @ 2009-12-14 21:20 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Mon, 2009-12-14 at 16:11 -0500, Jeff King wrote:
> On Mon, Dec 14, 2009 at 03:28:04PM -0500, Eric Paris wrote:
> 
> > So I have no idea what is interesting or relevant what I can collect,
> > what you want to know or anything like that, so this is a bit of a dump
> > of info and I'm sorry to whoever tries to pick anything useful out of
> 
> It sounds like you might have found an infinite loop, as reflog should
> never really need a lot of CPU. Is it possible to tar the whole
> repository and make it available publicly for us to look at?

Updated to git-1.6.5.3-1 from Fedora rawhide and still git reflog ran
for >5 minutes at 100% cpu (I killed it, it didn't finish)

I'm pushing a copy of the whole repo (all 1.9G after bzip compression)
to

http://people.redhat.com/~eparis/git-tar/

But it's going to take a couple hours.

-Eric

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: git-reflog 70 minutes at 100% cpu and counting
  2009-12-14 21:20   ` Eric Paris
@ 2009-12-14 21:23     ` Jeff King
  2009-12-14 21:56       ` Eric Paris
  2009-12-15  2:39     ` Jeff King
  1 sibling, 1 reply; 30+ messages in thread
From: Jeff King @ 2009-12-14 21:23 UTC (permalink / raw)
  To: Eric Paris; +Cc: git

On Mon, Dec 14, 2009 at 04:20:29PM -0500, Eric Paris wrote:

> Updated to git-1.6.5.3-1 from Fedora rawhide and still git reflog ran
> for >5 minutes at 100% cpu (I killed it, it didn't finish)
> 
> I'm pushing a copy of the whole repo (all 1.9G after bzip compression)
> to
> 
> http://people.redhat.com/~eparis/git-tar/

Wowzers, that's big. Can you send just what's in .git?

-Peff

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: git-reflog 70 minutes at 100% cpu and counting
  2009-12-14 21:23     ` Jeff King
@ 2009-12-14 21:56       ` Eric Paris
  2009-12-14 22:03         ` Sverre Rabbelier
                           ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Eric Paris @ 2009-12-14 21:56 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Mon, 2009-12-14 at 16:23 -0500, Jeff King wrote:
> On Mon, Dec 14, 2009 at 04:20:29PM -0500, Eric Paris wrote:
> 
> > Updated to git-1.6.5.3-1 from Fedora rawhide and still git reflog ran
> > for >5 minutes at 100% cpu (I killed it, it didn't finish)
> > 
> > I'm pushing a copy of the whole repo (all 1.9G after bzip compression)
> > to
> > 
> > http://people.redhat.com/~eparis/git-tar/
> 
> Wowzers, that's big. Can you send just what's in .git?

So I zipped up just .git   1.2G.  I did a make clean and zipped up the
whole repo  1.3G.

Just started pushing the 1.3G file.

Maybe having a .git directory that large is the problem?

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: git-reflog 70 minutes at 100% cpu and counting
  2009-12-14 21:56       ` Eric Paris
@ 2009-12-14 22:03         ` Sverre Rabbelier
  2009-12-15  0:29           ` Nicolas Pitre
  2009-12-14 22:14         ` Jeff King
  2009-12-15  0:26         ` Nicolas Pitre
  2 siblings, 1 reply; 30+ messages in thread
From: Sverre Rabbelier @ 2009-12-14 22:03 UTC (permalink / raw)
  To: Eric Paris; +Cc: Jeff King, git

Heya,

On Mon, Dec 14, 2009 at 22:56, Eric Paris <eparis@redhat.com> wrote:
> Just started pushing the 1.3G file.
>
> Maybe having a .git directory that large is the problem?

What did you say this repository contained again? Your home videos?
Ah, well that explains ;).

-- 
Cheers,

Sverre Rabbelier

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: git-reflog 70 minutes at 100% cpu and counting
  2009-12-14 21:56       ` Eric Paris
  2009-12-14 22:03         ` Sverre Rabbelier
@ 2009-12-14 22:14         ` Jeff King
  2009-12-15  0:26         ` Nicolas Pitre
  2 siblings, 0 replies; 30+ messages in thread
From: Jeff King @ 2009-12-14 22:14 UTC (permalink / raw)
  To: Eric Paris; +Cc: git

On Mon, Dec 14, 2009 at 04:56:30PM -0500, Eric Paris wrote:

> So I zipped up just .git   1.2G.  I did a make clean and zipped up the
> whole repo  1.3G.
> 
> Just started pushing the 1.3G file.
> 
> Maybe having a .git directory that large is the problem?

It could be, but I doubt it. If you have a lot of loose objects that
could make things slow due to the disk access, but it is not likely to
use that much CPU time (we do have to zlib uncompress more, but
still...70 minutes is a lot).

-Peff

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: git-reflog 70 minutes at 100% cpu and counting
  2009-12-14 21:56       ` Eric Paris
  2009-12-14 22:03         ` Sverre Rabbelier
  2009-12-14 22:14         ` Jeff King
@ 2009-12-15  0:26         ` Nicolas Pitre
  2009-12-15  0:36           ` Junio C Hamano
  2009-12-15  2:11           ` Eric Paris
  2 siblings, 2 replies; 30+ messages in thread
From: Nicolas Pitre @ 2009-12-15  0:26 UTC (permalink / raw)
  To: Eric Paris; +Cc: Jeff King, git

On Mon, 14 Dec 2009, Eric Paris wrote:

> On Mon, 2009-12-14 at 16:23 -0500, Jeff King wrote:
> > On Mon, Dec 14, 2009 at 04:20:29PM -0500, Eric Paris wrote:
> > 
> > > Updated to git-1.6.5.3-1 from Fedora rawhide and still git reflog ran
> > > for >5 minutes at 100% cpu (I killed it, it didn't finish)
> > > 
> > > I'm pushing a copy of the whole repo (all 1.9G after bzip compression)
> > > to
> > > 
> > > http://people.redhat.com/~eparis/git-tar/
> > 
> > Wowzers, that's big. Can you send just what's in .git?
> 
> So I zipped up just .git   1.2G.  I did a make clean and zipped up the
> whole repo  1.3G.
> 
> Just started pushing the 1.3G file.
> 
> Maybe having a .git directory that large is the problem?

Shouldn't be, unless your repo is really badly packed.

What's the output of 'git count-objects -v' ?


Nicolas

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: git-reflog 70 minutes at 100% cpu and counting
  2009-12-14 22:03         ` Sverre Rabbelier
@ 2009-12-15  0:29           ` Nicolas Pitre
  0 siblings, 0 replies; 30+ messages in thread
From: Nicolas Pitre @ 2009-12-15  0:29 UTC (permalink / raw)
  To: Sverre Rabbelier; +Cc: Eric Paris, Jeff King, git

On Mon, 14 Dec 2009, Sverre Rabbelier wrote:

> Heya,
> 
> On Mon, Dec 14, 2009 at 22:56, Eric Paris <eparis@redhat.com> wrote:
> > Just started pushing the 1.3G file.
> >
> > Maybe having a .git directory that large is the problem?
> 
> What did you say this repository contained again? Your home videos?
> Ah, well that explains ;).

That would explain the size, but not the reflog CPU time.


Nicolas

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: git-reflog 70 minutes at 100% cpu and counting
  2009-12-15  0:26         ` Nicolas Pitre
@ 2009-12-15  0:36           ` Junio C Hamano
  2009-12-15  3:58             ` Nicolas Pitre
  2009-12-15  2:11           ` Eric Paris
  1 sibling, 1 reply; 30+ messages in thread
From: Junio C Hamano @ 2009-12-15  0:36 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Eric Paris, Jeff King, git

Nicolas Pitre <nico@fluxnic.net> writes:

> On Mon, 14 Dec 2009, Eric Paris wrote:
>
>> On Mon, 2009-12-14 at 16:23 -0500, Jeff King wrote:
>> > On Mon, Dec 14, 2009 at 04:20:29PM -0500, Eric Paris wrote:
>> > 
>> > > Updated to git-1.6.5.3-1 from Fedora rawhide and still git reflog ran
>> > > for >5 minutes at 100% cpu (I killed it, it didn't finish)
>> > > 
>> > > I'm pushing a copy of the whole repo (all 1.9G after bzip compression)
>> > > to
>> > > 
>> > > http://people.redhat.com/~eparis/git-tar/
>> > 
>> > Wowzers, that's big. Can you send just what's in .git?
>> 
>> So I zipped up just .git   1.2G.  I did a make clean and zipped up the
>> whole repo  1.3G.
>> 
>> Just started pushing the 1.3G file.
>> 
>> Maybe having a .git directory that large is the problem?
>
> Shouldn't be, unless your repo is really badly packed.
>
> What's the output of 'git count-objects -v' ?

Didn't somebody say that the trace hints an infinite loop not "slow
because of bad packing"?

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: git-reflog 70 minutes at 100% cpu and counting
  2009-12-15  0:26         ` Nicolas Pitre
  2009-12-15  0:36           ` Junio C Hamano
@ 2009-12-15  2:11           ` Eric Paris
  2009-12-15  3:44             ` Nicolas Pitre
  1 sibling, 1 reply; 30+ messages in thread
From: Eric Paris @ 2009-12-15  2:11 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Jeff King, git

On Mon, 2009-12-14 at 19:26 -0500, Nicolas Pitre wrote:
> On Mon, 14 Dec 2009, Eric Paris wrote:
> 
> > On Mon, 2009-12-14 at 16:23 -0500, Jeff King wrote:
> > > On Mon, Dec 14, 2009 at 04:20:29PM -0500, Eric Paris wrote:
> > > 
> > > > Updated to git-1.6.5.3-1 from Fedora rawhide and still git reflog ran
> > > > for >5 minutes at 100% cpu (I killed it, it didn't finish)
> > > > 
> > > > I'm pushing a copy of the whole repo (all 1.9G after bzip compression)
> > > > to
> > > > 
> > > > http://people.redhat.com/~eparis/git-tar/
> > > 
> > > Wowzers, that's big. Can you send just what's in .git?
> > 
> > So I zipped up just .git   1.2G.  I did a make clean and zipped up the
> > whole repo  1.3G.
> > 
> > Just started pushing the 1.3G file.
> > 
> > Maybe having a .git directory that large is the problem?
> 
> Shouldn't be, unless your repo is really badly packed.
> 
> What's the output of 'git count-objects -v' ?

count: 87065
size: 866744
in-pack: 1203497
packs: 148
size-pack: 976474
prune-packable: 1611
garbage: 0


It's not home movies   :)  .  It's a kernel tree with about 5
'upstream' trees as remotes, which I update daily.  One of the
remotes is rebased every day, starting with Linus' tree and
pulling in about 150+ branches of work from others, all of which might
rebase.  I have (needlessly) the tags he keeps of that repo every day.
I daily rebase my work on top of that constantly rebasing tree
(linux-next) using stgit.

I noticed, just blindly poking at sizes in my .git/objects/pack, that the
largest pack is a lot larger than the second and third largest....

-r--r--r-- 1 paris paris 108031039 Feb 12  2009 pack-71a9c0f08c76b8ffd1cf0a14d7cfe991fbc9db80.pack
-r--r--r-- 1 paris paris  32670479 Apr  7  2009 pack-5c8333301012d9b70d70648b287cf540afcc63ed.pack
-r--r--r-- 1 paris paris  26728958 Dec 30  2008 pack-fb8ceb5a33d9881fe771860c6006f55f73ecdf65.pack

And in total there is almost 1G of data in .git/objects/pack

If the answer really is that I just have too much data and it can't be
handled, I'm fine exporting my patches getting some clean trees and
starting over till I get in this situation again, but if it really is a
problem/bug that can be solved, the full tar ball of my repo is at

http://people.redhat.com/~eparis/git-tar/

-Eric

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: git-reflog 70 minutes at 100% cpu and counting
  2009-12-14 21:20   ` Eric Paris
  2009-12-14 21:23     ` Jeff King
@ 2009-12-15  2:39     ` Jeff King
  2009-12-15  3:50       ` Nicolas Pitre
  1 sibling, 1 reply; 30+ messages in thread
From: Jeff King @ 2009-12-15  2:39 UTC (permalink / raw)
  To: Eric Paris; +Cc: git

On Mon, Dec 14, 2009 at 04:20:29PM -0500, Eric Paris wrote:

> I'm pushing a copy of the whole repo (all 1.9G after bzip compression)
> to
> 
> http://people.redhat.com/~eparis/git-tar/
> 
> But it's going to take a couple hours.

Holy cow. Almost 150 packs, and that's not even everything. The tarball
is missing a bunch of objects, because it points to your kernel-1 as an
alternate. So I suspect we would need that, as well, to recreate.

-Peff

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: git-reflog 70 minutes at 100% cpu and counting
  2009-12-15  2:11           ` Eric Paris
@ 2009-12-15  3:44             ` Nicolas Pitre
  0 siblings, 0 replies; 30+ messages in thread
From: Nicolas Pitre @ 2009-12-15  3:44 UTC (permalink / raw)
  To: Eric Paris; +Cc: Jeff King, git

On Mon, 14 Dec 2009, Eric Paris wrote:

> On Mon, 2009-12-14 at 19:26 -0500, Nicolas Pitre wrote:
> > On Mon, 14 Dec 2009, Eric Paris wrote:
> > 
> > > Maybe having a .git directory that large is the problem?
> > 
> > Shouldn't be, unless your repo is really badly packed.
> > 
> > What's the output of 'git count-objects -v' ?
> 
> count: 87065
> size: 866744
> in-pack: 1203497
> packs: 148
> size-pack: 976474

So basically 87K loose objects occupying 846 MB and 1.2M packed objects 
occupying 954 MB across 148 packs.  That's a horrible repository 
layout which would definitely gain by being repacked.
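
The conversion behind those figures can be checked with a short script. This is a sketch added for illustration, assuming the sizes reported by 'git count-objects -v' are in KiB (as its documentation states); the helper names are made up, and the report text is the one pasted earlier in this thread.

```python
def parse_count_objects(text):
    """Parse `git count-objects -v` output into a dict of integers."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            stats[key.strip()] = int(value)
    return stats


def kib_to_mib(kib):
    """Convert KiB to MiB."""
    return kib / 1024


REPORT = """\
count: 87065
size: 866744
in-pack: 1203497
packs: 148
size-pack: 976474
prune-packable: 1611
garbage: 0
"""

stats = parse_count_objects(REPORT)
loose_mib = kib_to_mib(stats["size"])
packed_mib = kib_to_mib(stats["size-pack"])
print(f"{stats['count']} loose objects, {loose_mib:.0f} MiB")
print(f"{stats['in-pack']} packed objects in {stats['packs']} packs, {packed_mib:.0f} MiB")
```

Rounded to whole numbers, this gives the 846 and 954 quoted above.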

> I noticed, just blindly poking at sizes in my .git/objects/pack, that the
> largest pack is a lot larger than the second and third largest....

That's expected.

> And in total there is almost 1G of data in .git/objects/pack
> 
> If the answer really is that I just have too much data and it can't be
> handled,

Nope.  git should handle that kind of data set perfectly fine.  And once 
repacked, you should end up with a single pack containing everything and 
the total size of your .git/objects directory will probably shrink by 
50% or more.

But to be able to repack, your 'git reflog' needs to work correctly, and 
the problem is unlikely to be related to the repository size.


Nicolas

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: git-reflog 70 minutes at 100% cpu and counting
  2009-12-15  2:39     ` Jeff King
@ 2009-12-15  3:50       ` Nicolas Pitre
  2009-12-15  4:26         ` Eric Paris
  0 siblings, 1 reply; 30+ messages in thread
From: Nicolas Pitre @ 2009-12-15  3:50 UTC (permalink / raw)
  To: Jeff King; +Cc: Eric Paris, git

On Mon, 14 Dec 2009, Jeff King wrote:

> On Mon, Dec 14, 2009 at 04:20:29PM -0500, Eric Paris wrote:
> 
> > I'm pushing a copy of the whole repo (all 1.9G after bzip compression)
> > to
> > 
> > http://people.redhat.com/~eparis/git-tar/
> > 
> > But it's going to take a couple hours.
> 
> Holy cow. Almost 150 packs, and that's not even everything. The tarball
> is missing a bunch of objects, because it points to your kernel-1 as an
> alternate. So I suspect we would need that, as well, to recreate.

Hmmm... Rebasing repositories mixed with alternates...  I wonder if the 
infinite loop might not actually be due to a delta cycle, especially if 
the alternate is also rebasing.

So having the alternate, too, would certainly be interesting.


Nicolas

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: git-reflog 70 minutes at 100% cpu and counting
  2009-12-15  0:36           ` Junio C Hamano
@ 2009-12-15  3:58             ` Nicolas Pitre
  0 siblings, 0 replies; 30+ messages in thread
From: Nicolas Pitre @ 2009-12-15  3:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Eric Paris, Jeff King, git

On Mon, 14 Dec 2009, Junio C Hamano wrote:

> Nicolas Pitre <nico@fluxnic.net> writes:
> 
> > On Mon, 14 Dec 2009, Eric Paris wrote:
> >
> >> On Mon, 2009-12-14 at 16:23 -0500, Jeff King wrote:
> >> > On Mon, Dec 14, 2009 at 04:20:29PM -0500, Eric Paris wrote:
> >> > 
> >> > > Updated to git-1.6.5.3-1 from Fedora rawhide and still git reflog ran
> >> > > for >5 minutes at 100% cpu (I killed it, it didn't finish)
> >> > > 
> >> > > I'm pushing a copy of the whole repo (all 1.9G after bzip compression)
> >> > > to
> >> > > 
> >> > > http://people.redhat.com/~eparis/git-tar/
> >> > 
> >> > Wowzers, that's big. Can you send just what's in .git?
> >> 
> >> So I zipped up just .git   1.2G.  I did a make clean and zipped up the
> >> whole repo  1.3G.
> >> 
> >> Just started pushing the 1.3G file.
> >> 
> >> Maybe having a .git directory that large is the problem?
> >
> > Shouldn't be, unless your repo is really badly packed.
> >
> > What's the output of 'git count-objects -v' ?
> 
> Didn't somebody say that the trace hints an infinite loop not "slow
> because of bad packing"?

Maybe.  But I was curious about the size too, which turns out to be 
really bad packing.  Of course bad packing shouldn't affect the 
correctness of the repository.


Nicolas

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: git-reflog 70 minutes at 100% cpu and counting
  2009-12-15  3:50       ` Nicolas Pitre
@ 2009-12-15  4:26         ` Eric Paris
  2009-12-16  3:03           ` Nicolas Pitre
  0 siblings, 1 reply; 30+ messages in thread
From: Eric Paris @ 2009-12-15  4:26 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Jeff King, git

On Mon, 2009-12-14 at 22:50 -0500, Nicolas Pitre wrote:
> On Mon, 14 Dec 2009, Jeff King wrote:
> 
> > On Mon, Dec 14, 2009 at 04:20:29PM -0500, Eric Paris wrote:
> > 
> > > I'm pushing a copy of the whole repo (all 1.9G after bzip compression)
> > > to
> > > 
> > > http://people.redhat.com/~eparis/git-tar/
> > > 
> > > But it's going to take a couple hours.
> > 
> > Holy cow. Almost 150 packs, and that's not even everything. The tarball
> > is missing a bunch of objects, because it points to your kernel-1 as an
> > alternate. So I suspect we would need that, as well, to recreate.
> 
> Hmmm... Rebasing repositories mixed with alternates...  I wonder if the 
> infinite loop might not actually be due to a delta cycle, especially if 
> the alternate is also rebasing.
> 
> So having the alternate, too, would certainly be interesting.

The alternative repo is slowly pushing up to that same location.  That
tar is 855838982, so just a tad bit smaller.

-Eric

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: git-reflog 70 minutes at 100% cpu and counting
  2009-12-15  4:26         ` Eric Paris
@ 2009-12-16  3:03           ` Nicolas Pitre
  2009-12-16  3:31             ` Eric Paris
  2009-12-16 13:41             ` Eric Paris
  0 siblings, 2 replies; 30+ messages in thread
From: Nicolas Pitre @ 2009-12-16  3:03 UTC (permalink / raw)
  To: Eric Paris; +Cc: Jeff King, git

On Mon, 14 Dec 2009, Eric Paris wrote:

> The alternative repo is slowly pushing up to that same location.  That
> tar is 855838982, so just a tad bit smaller.

It doesn't appear to be complete yet, and not progressing either.


Nicolas

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: git-reflog 70 minutes at 100% cpu and counting
  2009-12-16  3:03           ` Nicolas Pitre
@ 2009-12-16  3:31             ` Eric Paris
  2009-12-16 13:41             ` Eric Paris
  1 sibling, 0 replies; 30+ messages in thread
From: Eric Paris @ 2009-12-16  3:31 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Jeff King, git

On Tue, 2009-12-15 at 22:03 -0500, Nicolas Pitre wrote:
> On Mon, 14 Dec 2009, Eric Paris wrote:
> 
> > The alternative repo is slowly pushing up to that same location.  That
> > tar is 855838982, so just a tad bit smaller.
> 
> It doesn't appear to be complete yet, and not progressing either.

I ran out of quota and asked for more, but IT departments move at the
speed of IT departments.  I'll delete the first one and just push the
alternative repo.  Once I get more space I'll try to get them both up at
once....

-Eric

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: git-reflog 70 minutes at 100% cpu and counting
  2009-12-16  3:03           ` Nicolas Pitre
  2009-12-16  3:31             ` Eric Paris
@ 2009-12-16 13:41             ` Eric Paris
  2009-12-16 21:06               ` Nicolas Pitre
  1 sibling, 1 reply; 30+ messages in thread
From: Eric Paris @ 2009-12-16 13:41 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Jeff King, git

On Tue, 2009-12-15 at 22:03 -0500, Nicolas Pitre wrote:
> On Mon, 14 Dec 2009, Eric Paris wrote:
> 
> > The alternative repo is slowly pushing up to that same location.  That
> > tar is 855838982, so just a tad bit smaller.
> 
> It doesn't appear to be complete yet, and not progressing either.

The alternative repo is now available (but the original is down)

I tried to run git gc --aggressive last night while I slept and got this
as output, maybe it helps point to a solution/problem?  The git reflog
portion ran for 5 hours and 36 minutes and appears to have finished.

$ git gc --aggressive
error: Could not read d936ff8a7b0841b51ddf96afa24a30b016824cb2
error: Could not read d936ff8a7b0841b51ddf96afa24a30b016824cb2
error: Could not read d936ff8a7b0841b51ddf96afa24a30b016824cb2
error: Could not read d936ff8a7b0841b51ddf96afa24a30b016824cb2
error: Could not read d936ff8a7b0841b51ddf96afa24a30b016824cb2
error: Could not read d936ff8a7b0841b51ddf96afa24a30b016824cb2
error: Could not read d936ff8a7b0841b51ddf96afa24a30b016824cb2
error: Could not read d936ff8a7b0841b51ddf96afa24a30b016824cb2
error: Could not read d936ff8a7b0841b51ddf96afa24a30b016824cb2
error: Could not read d936ff8a7b0841b51ddf96afa24a30b016824cb2
error: Could not read d936ff8a7b0841b51ddf96afa24a30b016824cb2
error: Could not read d936ff8a7b0841b51ddf96afa24a30b016824cb2
error: Could not read d936ff8a7b0841b51ddf96afa24a30b016824cb2
error: Could not read d936ff8a7b0841b51ddf96afa24a30b016824cb2
error: Could not read d936ff8a7b0841b51ddf96afa24a30b016824cb2
error: Could not read d936ff8a7b0841b51ddf96afa24a30b016824cb2
error: Could not read d936ff8a7b0841b51ddf96afa24a30b016824cb2
error: Could not read d936ff8a7b0841b51ddf96afa24a30b016824cb2
error: Could not read d936ff8a7b0841b51ddf96afa24a30b016824cb2
error: Could not read d936ff8a7b0841b51ddf96afa24a30b016824cb2
error: Could not read d936ff8a7b0841b51ddf96afa24a30b016824cb2
error: Could not read d936ff8a7b0841b51ddf96afa24a30b016824cb2
warning: reflog of 'refs/remotes/audit/master' references pruned commits
warning: reflog of 'refs/remotes/btrfs/enospc' references pruned commits
warning: reflog of 'refs/remotes/btrfs/merge' references pruned commits
warning: reflog of 'refs/remotes/btrfs/for-linus' references pruned commits
warning: reflog of 'refs/remotes/security-testing/for-linus' references pruned commits
error: Could not read 29b6c2fb1390b4fd350a5ecc78f1156fc5d91e9f
fatal: bad tree object 29b6c2fb1390b4fd350a5ecc78f1156fc5d91e9f
error: failed to run repack

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: git-reflog 70 minutes at 100% cpu and counting
  2009-12-16 13:41             ` Eric Paris
@ 2009-12-16 21:06               ` Nicolas Pitre
  2009-12-16 22:37                 ` Eric Paris
  0 siblings, 1 reply; 30+ messages in thread
From: Nicolas Pitre @ 2009-12-16 21:06 UTC (permalink / raw)
  To: Eric Paris; +Cc: Jeff King, git

On Wed, 16 Dec 2009, Eric Paris wrote:

> On Tue, 2009-12-15 at 22:03 -0500, Nicolas Pitre wrote:
> > On Mon, 14 Dec 2009, Eric Paris wrote:
> > 
> > > The alternative repo is slowly pushing up to that same location.  That
> > > tar is 855838982, so just a tad bit smaller.
> > 
> > It doesn't appear to be complete yet, and not progressing either.
> 
> The alternative repo is now available (but the original is down)
> 
> I tried to run git gc --aggressive last night while I slept and got this
> as output, maybe it helps point to a solution/problem?  The git reflog
> portion ran for 5 hours and 36 minutes and appears to have finished.

Yes.  I was able to reproduce your issue.  And because of the *horrible* 
repository packing, the reflog expiration process is taking ages when 
determining object reachability at a rate of one reflog entry every 2 
seconds or so.  With 4214 entries for the fsnotify-syscall branch, and 
1352 entries for the fsnotify branch, this already takes up a significant 
portion of the actual run time.  I'm sure if your repository was 
properly packed this would take less than a minute.
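
As a rough worked estimate of that share (a sketch only: the 2-second rate is the approximate figure above, and comparing it against the 5 hours 36 minutes reported for the overnight run assumes both machines behave similarly):

```python
SECONDS_PER_ENTRY = 2  # "one reflog entry every 2 seconds or so" (approximate)

# Reflog entry counts for the two branches named above.
entries = {"fsnotify-syscall": 4214, "fsnotify": 1352}

estimated_s = sum(entries.values()) * SECONDS_PER_ENTRY
observed_s = 5 * 3600 + 36 * 60  # reported reflog run time: 5 h 36 min

share = estimated_s / observed_s
print(f"estimated: {estimated_s} s ({estimated_s / 3600:.1f} h)")
print(f"share of observed run time: {share:.0%}")
```

Under those assumptions the two branches alone account for a little over three hours, or somewhat more than half of the observed run.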

Now, repacking doesn't work because...

> $ git gc --aggressive
> error: Could not read d936ff8a7b0841b51ddf96afa24a30b016824cb2
> error: Could not read 29b6c2fb1390b4fd350a5ecc78f1156fc5d91e9f

Those objects are indeed missing from the repository.  Without them your 
repository is "broken".  Either you can find them somewhere else and 
copy them over, or salvage as much as you can by fetching the 
interesting branches into another freshly made repository.  This is 
unfortunate because I would have liked to see by how much this 
repository would have shrunk after a successful repack.

Of course, usage of alternates is recommended _only_ with repositories 
that are stable, i.e. don't ever add repositories to 
.git/objects/info/alternates if those repositories are rewound/rebased 
and/or branches in them are deleted/replaced.  That could be a reason 
why some objects are now missing from the repository using alternates.
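
As a rough aid for the setup discussed here, the sketch below lists each path in an alternates file and reports whether the directory still exists.  The function name check_alternates is made up for illustration; the default path is .git/objects/info/alternates, the file git reads.  It only checks that the directories are present, not that they still hold the objects this repository borrows, and it does not resolve relative entries.

```shell
# check_alternates [FILE]: hypothetical helper, not part of git.
# Prints "ok <path>" or "MISSING <path>" for each entry in the alternates file.
# Assumption: entries are absolute paths (relative ones are not resolved here).
check_alternates() {
    file=${1:-.git/objects/info/alternates}
    [ -f "$file" ] || { echo "no alternates file: $file" >&2; return 1; }
    status=0
    while IFS= read -r dir; do
        # Skip blank lines and comment lines.
        case $dir in ''|'#'*) continue ;; esac
        if [ -d "$dir" ]; then
            echo "ok $dir"
        else
            echo "MISSING $dir"
            status=1
        fi
    done < "$file"
    return $status
}
```

Run it from the top of a working tree as `check_alternates`; it returns non-zero if any listed directory is gone.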


Nicolas

* Re: git-reflog 70 minutes at 100% cpu and counting
  2009-12-16 21:06               ` Nicolas Pitre
@ 2009-12-16 22:37                 ` Eric Paris
  2009-12-17  5:38                   ` Nicolas Pitre
  0 siblings, 1 reply; 30+ messages in thread
From: Eric Paris @ 2009-12-16 22:37 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Jeff King, git

On Wed, 2009-12-16 at 16:06 -0500, Nicolas Pitre wrote:
> On Wed, 16 Dec 2009, Eric Paris wrote:
> 
> > On Tue, 2009-12-15 at 22:03 -0500, Nicolas Pitre wrote:
> > > On Mon, 14 Dec 2009, Eric Paris wrote:
> > > 
> > > > The alternative repo is slowly pushing up to that same location.  That
> > > > tar is 855838982, so just a tad bit smaller.
> > > 
> > > It doesn't appear to be complete yet, and not progressing either.
> > 
> > The alternative repo is now available (but the original is down)
> > 
> > I tried to run git gc --aggressive last night while I slept and got this
> > as output, maybe it helps point to a solution/problem?  The git reflog
> > portion ran for 5 hours and 36 minutes and appears to have finished.
> 
> Yes.  I was able to reproduce your issue.  And because of the *horrible* 
> repository packing, the reflog expiration process is taking ages when 
> determining object reachability at a rate of one reflog entry every 2 
> seconds or so.  With 4214 entries for the fsnotify-syscall branch, and 
> 1352 entries for the fsnotify branch, this already takes up a significant 
> portion of the actual run time.  I'm sure if your repository was 
> properly packed this would take less than a minute.

I'm guessing this is a result of stgit?  These branches really should
be just a branch from a tag (which exists in kernel-1) and about 30-50
patches linearly applied on top.  I don't know how I get that many
objects.  I'm guessing many/most of them are crap that should be able to
be cleaned/deleted entirely as the rebasing/pushing/popping/updating that
stgit does under the covers should have rendered them pointless.  Not
really sure when/how that should/could have happened.

Should I be running git-gc every night?

> Now, repacking doesn't work because...
> 
> > $ git gc --aggressive
> > error: Could not read d936ff8a7b0841b51ddf96afa24a30b016824cb2
> > error: Could not read 29b6c2fb1390b4fd350a5ecc78f1156fc5d91e9f

/me is pretty git dumb, but is there some way to figure out the parents
or children of these?  I just trolled through all of my directories
doing git show and didn't get any hits.  I guess I'll just clean up and
start over....

> Those objects are indeed missing from the repository.  Without them your 
> repository is "broken".  Either you can find them somewhere else and 
> copy them over, or salvage as much as you can by fetching the 
> interesting branches into another freshly made repository.  This is 
> unfortunate because I would have liked to see by how much this 
> repository would have shrunk after a successful repack.
> 
> Of course, usage of alternates is recommended _only_ with repositories 
> that are stable, i.e. don't ever add repositories to 
> .git/objects/info/alternates if those repositories are rewound/rebased 
> and/or branches in them are deleted/replaced.  That could be a reason 
> why some objects are now missing from the repository using alternates.

So I'm not sure how I did things wrong.  My kernel-1 has that bunch of
remotes.  The linux-next remote, like I said, basically rebases to
linus' tree, then merges 150 random branches.  It tags that tree every
day and I pull those tags.  So I would never expect any objects from
those remote trees to ever disappear.

Now I created branches in kernel-1 and I certainly have done lots of
things like so

git checkout -b testing remotes/linux-next/master
[edit]
git commit -a
git checkout -b testing1 remotes/linux-next/master
git branch -D testing

My assumption though was that this wouldn't ever affect my other
repositories.  My other repository branches always started by checking
out a branch with remotes/*/* as the base.

My understanding was that I would only run into problems if I used
something on a branch I created myself in the alternatives repo in other
repos (and I didn't remove remotes)

I guess it's not impossible to believe that at some point in time I
would have exported patches to an mbox from kernel-1 and applied them
to kernel-2 or vice versa.  I guess this would end up with the same
objects, right?  Then if I deleted the branch in kernel-1 I would have
problems in kernel-2?

I guess I'll rebuild my setup

new kernel-alt has just the remotes, and my kernel-1,2,3 all alt to it
I'll never have local branches in my kernel-alt
I'll run git-gc every night
I'll hope to never have a problem again.

Sound good?

* Re: git-reflog 70 minutes at 100% cpu and counting
  2009-12-16 22:37                 ` Eric Paris
@ 2009-12-17  5:38                   ` Nicolas Pitre
  2009-12-17 16:29                     ` Eric Paris
  0 siblings, 1 reply; 30+ messages in thread
From: Nicolas Pitre @ 2009-12-17  5:38 UTC (permalink / raw)
  To: Eric Paris; +Cc: Jeff King, git

On Wed, 16 Dec 2009, Eric Paris wrote:

> On Wed, 2009-12-16 at 16:06 -0500, Nicolas Pitre wrote:
> > On Wed, 16 Dec 2009, Eric Paris wrote:
> > 
> > > On Tue, 2009-12-15 at 22:03 -0500, Nicolas Pitre wrote:
> > > > On Mon, 14 Dec 2009, Eric Paris wrote:
> > > > 
> > > > > The alternative repo is slowly pushing up to that same location.  That
> > > > > tar is 855838982, so just a tad bit smaller.
> > > > 
> > > > It doesn't appear to be complete yet, and not progressing either.
> > > 
> > > The alternative repo is now available (but the original is down)
> > > 
> > > I tried to run git gc --aggressive last night while I slept and got this
> > > as output, maybe it helps point to a solution/problem?  The git reflog
> > > portion ran for 5 hours and 36 minutes and appears to have finished.
> > 
> > Yes.  I was able to reproduce your issue.  And because of the *horrible* 
> > repository packing, the reflog expiration process is taking ages when 
> > determining object reachability at a rate of one reflog entry every 2 
> > seconds or so.  With 4214 entries for the fsnotify-syscall branch, and 
> > 1352 entries for the fsnotify branch, this already takes up a significant 
> > portion of the actual run time.  I'm sure if your repository was 
> > properly packed this would take less than a minute.
> 
> I'm guessing this is a result of stgit?  These branches really should
> be just a branch from a tag (which exists in kernel-1) and about 30-50
> patches linearly applied on top.  I don't know how I get that many
> objects.  I'm guessing many/most of them are crap that should be able to
> be cleaned/deleted entirely as the rebasing/pushing/popping/updating that
> stgit does under the covers should have rendered them pointless.  Not
> really sure when/how that should/could have happened.

Possible.  Commit operations (including patch applications) always 
create loose objects because this is fast, with the expectation that 
they get collected in a pack later.
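
To see how many loose objects have piled up, `git count-objects -v` prints a `count:` line for loose objects and an `in-pack:` line for packed ones.  The helper below, whose name is invented here, just pulls the loose count out of that output.

```shell
# loose_count: hypothetical helper; reads `git count-objects -v` output on
# stdin and prints the number of loose objects (the "count:" line).
loose_count() {
    awk -F': ' '$1 == "count" { print $2 }'
}
```

Typical use would be `git count-objects -v | loose_count`, run before and after a repack to watch the number drop.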

> Should I be running git-gc every night?

This is certainly a good thing to do given your heavy stgit usage.
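
A nightly run could be as simple as the sketch below, called from cron.  The function name and the idea of passing working-tree paths as arguments are assumptions for illustration.  It does nothing to make gc in a shared alternate any safer than what is discussed in this thread.

```shell
# gc_all REPO...: hypothetical wrapper; runs `git gc` in each working tree given.
# Returns non-zero if any repository failed, so cron can report the output.
gc_all() {
    rc=0
    for repo in "$@"; do
        # Run in a subshell so the caller's working directory is untouched.
        if ! (cd "$repo" && git gc --quiet); then
            echo "git gc failed in $repo" >&2
            rc=1
        fi
    done
    return $rc
}
```

For example: `gc_all /export/kernel/kernel-1`, with any further repositories appended as extra arguments.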

> > Now, repacking doesn't work because...
> > 
> > > $ git gc --aggressive
> > > error: Could not read d936ff8a7b0841b51ddf96afa24a30b016824cb2
> > > error: Could not read 29b6c2fb1390b4fd350a5ecc78f1156fc5d91e9f
> 
> /me is pretty git dumb, but is there some way to figure out the parents
> or children of these?  I just trolled through all of my directories
> doing git show and didn't get any hits.  I guess I'll just clean up and
> start over....

Moving the reflog data aside (i.e. mv .git/logs .git/logs.bak) it seems 
that d936ff8 is not referenced anymore.

I found the other one as follows:

First I tried

$ git rev-list --all --objects

This resulted in:

[...]
4f7911b0b0dbd187131a109cf00161a0c6a9d727 arch/x86
ea868257c1eabc31e0ea7941efa42b543978b3fa arch/x86/kvm
a0c11ead723956c667172a9f3fb6787684fe7ff5 arch/x86/kvm/paging_tmpl.h
b556b6aad8b1aacfecb1dd4a56dbd389674687b5 arch/x86/kvm/x86.c
68a9733ae3315d7e2bfec2037dfeee4db8a6f6a1 drivers
error: Could not read 29b6c2fb1390b4fd350a5ecc78f1156fc5d91e9f
fatal: bad tree object 29b6c2fb1390b4fd350a5ecc78f1156fc5d91e9f

Because of the way objects are enumerated, we can be pretty sure that 
the bad tree object is referenced by the tree object 68a9733a 
corresponding to drivers/.  Let's verify that:

$ git ls-tree 68a9733a
100644 blob 00cf9553f74065291612b0971337f79995933a06    Kconfig
100644 blob c1bf41737936ab00be4a87563a0bb0638074785d    Makefile
040000 tree d4e847de9bf2450842936582ea7cc6778413825b    accessibility
040000 tree 29b6c2fb1390b4fd350a5ecc78f1156fc5d91e9f    acpi
[...]

Yep, we found it there.  So the missing tree object corresponds to 
drivers/acpi/.  To find the latest commit that references this 
particular tree object, we just need to look at the same rev-list 
output above (piping it into less is handy here) and scroll up until an 
object with no name is found.  This would usually be the first root tree 
object referencing the named objects that follow.  Here I get aafb68eb.  
To be sure, let's list it to confirm it really contains a reference 
to the 68a9733a drivers tree:

$ git ls-tree aafb68eb
[...]
040000 tree 68a9733ae3315d7e2bfec2037dfeee4db8a6f6a1    drivers
[...]

So yes, we've got the right root tree object.  Now finding the 
corresponding commit should be easy:

$ git log --all --pretty=raw

Then within less, a search for aafb68eb brings us to this:

commit 2e765e9c87a337131aad3014f9a7e5e878c7d0a0
tree aafb68eb84f96c9ab5697c6e8d10d5006d1e7209
parent a2c2de42295b3ac29758f454a7072338e5555ca3
author Eric Paris <eparis@redhat.com> 1237233261 -0400
committer Eric Paris <eparis@redhat.com> 1237233261 -0400

    refresh     64d34c511b1125d9efd2926e683e019f15dec5b4

So this is referenced by a commit that you made on the 1237233261th 
second since January 1, 1970 i.e. 2009-03-16 19:54:21 +0000 which is 
quite a while ago.  Or given the nature of the commit log, this is 
probably some stgit branch.
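
The timestamp conversion above can be checked from the shell.  This is a small sketch that assumes GNU date (the "@seconds" form is not portable to every date implementation); the function name is made up.

```shell
# epoch_to_utc SECONDS: hypothetical helper; relies on GNU date's "@" syntax.
epoch_to_utc() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S %z'
}

epoch_to_utc 1237233261   # prints 2009-03-16 19:54:21 +0000
```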

Note that the missing tree didn't necessarily appear with that commit.  
Because of the recency ordering from rev-list, all we can say is that 
this is the last commit on that particular branch to reference that 
tree, but it might have been introduced in the repository way before 
that point in time.
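
The "scroll up until an object with no name is found" step can also be scripted.  The sketch below is only an illustration: the function name is invented, and like the manual approach it assumes the nameless line closest above the target is the root tree you are after.

```shell
# root_tree_before TARGET: hypothetical helper; reads `git rev-list --all
# --objects` output on stdin and prints the last nameless object seen before
# the first line whose hash starts with TARGET.  Exits non-zero if TARGET
# never shows up.
root_tree_before() {
    awk -v target="$1" '
        index($1, target) == 1 { print last; found = 1; exit }
        NF == 1 { last = $1 }
        END { if (!found) exit 1 }
    '
}
```

Used as `git rev-list --all --objects 2>/dev/null | root_tree_before 68a9733a`, it still works when rev-list dies on the missing tree, because the lines printed before the failure are already in the pipe.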

Now let's try to find out what branch(es) actually link(s) to this 
commit:

$ git branch -a --contains 2e765e9c

This comes up empty.  This is because 'git branch' looks only in the 
refs/heads/ and refs/remotes/ namespaces (or only one of them without -a).  

Scripting something around 'git for-each-ref' and 'git merge-base' could 
be done, such as:

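	TARGET=2e765e9c87a337131aad3014f9a7e5e878c7d0a0
	# No pattern: list every ref.  An unquoted refs/* is expanded by the
	# shell inside .git and matches only single-level refs otherwise.
	git for-each-ref |
	while read sha1 type ref; do
		# merge-base fails on refs that are not commits; ignore those.
		if [ "$(git merge-base "$sha1" "$TARGET" 2>/dev/null)" = "$TARGET" ]; then
			echo "referenced by $type $ref"
		fi
	done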

But this is slow, for the same reason as 'git reflog expire' above.  But 
letting it run for a while should give you at least one answer.
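
On git releases newer than the 1.5.5 used in this thread (version 2.7 and later), `git for-each-ref` accepts `--contains`, which looks at every ref namespace, including ones such as refs/patches/ that `git branch` ignores.  A sketch, with a made-up function name:

```shell
# refs_containing COMMIT: hypothetical helper; needs a git whose
# for-each-ref supports --contains.  Prints one ref name per line.
refs_containing() {
    git for-each-ref --contains "$1" --format='%(refname)'
}
```

For the commit found above that would be `refs_containing 2e765e9c87a337131aad3014f9a7e5e878c7d0a0`.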

> > Of course, usage of alternates is recommended _only_ with repositories 
> > that are stable, i.e. don't ever add repositories to 
> > .git/objects/info/alternates if those repositories are rewound/rebased 
> > and/or branches in them are deleted/replaced.  That could be a reason 
> > why some objects are now missing from the repository using alternates.
> 
> So I'm not sure how I did things wrong.  My kernel-1 has that bunch of
> remotes.  The linux-next remote, like I said, basically rebases to
> linus' tree, then merges 150 random branches.  It tags that tree every
> day and I pull those tags.  So I would never expect any objects from
> those remote trees to ever disappear.

Right.

> Now I created branches in kernel-1 and I certainly have done lots of
> things like so
> 
> git checkout -b testing remotes/linux-next/master
> [edit]
> git commit -a
> git checkout -b testing1 remotes/linux-next/master
> git branch -D testing
> 
> My assumption though was that this wouldn't ever affect my other
> repositories.  My other repository branches always started by checking
> out a branch with remotes/*/* as the base.
> 
> My understanding was that I would only run into problems if I used
> something on a branch I created myself in the alternatives repo in other
> repos (and I didn't remove remotes)
> 
> I guess it's not impossible to believe that at some point in time I
> would have exported patches to an mbox from kernel-1 and applied them
> to kernel-2 or vice versa.  I guess this would end up with the same
> objects, right?  Then if I deleted the branch in kernel-1 I would have
> problems in kernel-2?

Eventually, yes.  After a while the auto repack in kernel-2 would notice 
that some objects are in kernel-1 already and purge them from kernel-2.  
And if those objects were part of a deleted branch then kernel-1 would 
get rid of those objects too once the reflog with a reference to that 
deleted branch expires.  The unsuspecting kernel-2 repo then gets broken.

> I guess I'll rebuild my setup
> 
> new kernel-alt has just the remotes, and my kernel-1,2,3 all alt to it
> I'll never have local branches in my kernel-alt
> I'll run git-gc every night
> I'll hope to never have a problem again.
> 
> Sound good?

Yes.  And make sure not to fetch rebasing repositories, such as 
linux-next, into kernel-alt without keeping a tag for each fetched state 
otherwise you'll accumulate unreferenced objects which the other 
repositories might rely upon.
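
One way to keep a tag for each fetched state is sketched below.  The function name and the keep/... tag naming scheme are assumptions made for illustration, not a git convention.

```shell
# tag_fetched_state REMOTE BRANCH: hypothetical helper; fetches REMOTE and
# pins the fetched tip under a timestamped tag so it stays referenced.
# Calling it twice within the same second fails, since the tag already exists.
tag_fetched_state() {
    remote=$1
    branch=$2
    stamp=$(date -u +%Y%m%d-%H%M%S)
    git fetch --quiet "$remote" &&
        git tag "keep/$remote/$branch/$stamp" "refs/remotes/$remote/$branch"
}
```

In kernel-alt this would be run as `tag_fetched_state linux-next master` in place of a plain fetch.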


Nicolas

* Re: git-reflog 70 minutes at 100% cpu and counting
  2009-12-17  5:38                   ` Nicolas Pitre
@ 2009-12-17 16:29                     ` Eric Paris
  2009-12-18  3:33                       ` Nicolas Pitre
  0 siblings, 1 reply; 30+ messages in thread
From: Eric Paris @ 2009-12-17 16:29 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Jeff King, git

On Thu, 2009-12-17 at 00:38 -0500, Nicolas Pitre wrote:

> Moving the reflog data aside (i.e. mv .git/logs .git/logs.bak) it seems 
> that d936ff8 is not referenced anymore.
> 
> I found the other one as follows:
> 
> First I tried
> 
> $ git rev-list --all --objects
> 
> This resulted in:
> 
> [...]
> 4f7911b0b0dbd187131a109cf00161a0c6a9d727 arch/x86
> ea868257c1eabc31e0ea7941efa42b543978b3fa arch/x86/kvm
> a0c11ead723956c667172a9f3fb6787684fe7ff5 arch/x86/kvm/paging_tmpl.h
> b556b6aad8b1aacfecb1dd4a56dbd389674687b5 arch/x86/kvm/x86.c
> 68a9733ae3315d7e2bfec2037dfeee4db8a6f6a1 drivers
> error: Could not read 29b6c2fb1390b4fd350a5ecc78f1156fc5d91e9f
> fatal: bad tree object 29b6c2fb1390b4fd350a5ecc78f1156fc5d91e9f
> 
> Because of the way objects are enumerated, we can be pretty sure that 
> the bad tree object is referenced by the tree object 68a9733a 
> corresponding to drivers/.  Let's verify that:
> 
> $ git ls-tree 68a9733a
> 100644 blob 00cf9553f74065291612b0971337f79995933a06    Kconfig
> 100644 blob c1bf41737936ab00be4a87563a0bb0638074785d    Makefile
> 040000 tree d4e847de9bf2450842936582ea7cc6778413825b    accessibility
> 040000 tree 29b6c2fb1390b4fd350a5ecc78f1156fc5d91e9f    acpi

This alone almost certainly tells me how I broke it.

For quite some time (a period of months) linux-next was broken and I had
to carry a patch to ACPI to make it boot.  I dropped that patch at the
head of my stgit trees in all of my repositories.  So I wouldn't be at
all surprised to learn that eventually kernel-2 found that object in
kernel-1.  Sometime when I dropped that patch from kernel-1 (because it
finally got fixed upstream) I can see how it broke.

But now that patch shouldn't be needed by any tree since I have long
since dropped it from the stgit stack.  So if we cleaned up all of the
useless objects in this tree I bet this object wouldn't be needed.  Not
exactly a situation that I'd expect git to be able to dig out of itself
though.

I'm creating clean repos and going to do no work in my -alt    :)

Thanks everyone!

-Eric

* Re: git-reflog 70 minutes at 100% cpu and counting
  2009-12-17 16:29                     ` Eric Paris
@ 2009-12-18  3:33                       ` Nicolas Pitre
  2009-12-18  3:44                         ` Steven Noonan
  2009-12-18  3:55                         ` Eric Paris
  0 siblings, 2 replies; 30+ messages in thread
From: Nicolas Pitre @ 2009-12-18  3:33 UTC (permalink / raw)
  To: Eric Paris; +Cc: Jeff King, git

On Thu, 17 Dec 2009, Eric Paris wrote:

> This alone almost certainly tells me how I broke it.
> 
> For quite some time (a period of months) linux-next was broken and I had
> to carry a patch to ACPI to make it boot.  I dropped that patch at the
> head of my stgit trees in all of my repositories.  So I wouldn't be at
> all surprised to learn that eventually kernel-2 found that object in
> kernel-1.  Sometime when I dropped that patch from kernel-1 (because it
> finally got fixed upstream) I can see how it broke.
> 
> But now that patch shouldn't be needed by any tree since I have long
> since dropped it from the stgit stack.  So if we cleaned up all of the
> useless objects in this tree I bet this object wouldn't be needed.  Not
> exactly a situation that I'd expect git to be able to dig out of itself
> though.

I let the script I provided previously run for a while.  And the commit 
I found to contain the missing object belongs to 
refs/patches/fsnotify/fsnotify-group-priorities.log.  So I simply 
deleted that branch entirely and now the repack can proceed.  And with a 
'git gc --aggressive' the 1.2GB repository shrank to a mere 5.2 MB.  :-) 
Of course I didn't bring back all the reflogs though.  But I would 
have expected a repository reduction of the same magnitude even with 
them.


Nicolas

* Re: git-reflog 70 minutes at 100% cpu and counting
  2009-12-18  3:33                       ` Nicolas Pitre
@ 2009-12-18  3:44                         ` Steven Noonan
  2009-12-18  3:52                           ` Eric Paris
  2009-12-18  3:57                           ` Nicolas Pitre
  2009-12-18  3:55                         ` Eric Paris
  1 sibling, 2 replies; 30+ messages in thread
From: Steven Noonan @ 2009-12-18  3:44 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Eric Paris, Jeff King, git

On Thu, Dec 17, 2009 at 7:33 PM, Nicolas Pitre <nico@fluxnic.net> wrote:
> On Thu, 17 Dec 2009, Eric Paris wrote:
>
>> This alone almost certainly tells me how I broke it.
>>
>> For quite some time (a period of months) linux-next was broken and I had
>> to carry a patch to ACPI to make it boot.  I dropped that patch at the
>> head of my stgit trees in all of my repositories.  So I wouldn't be at
>> all surprised to learn that eventually kernel-2 found that object in
>> kernel-1.  Sometime when I dropped that patch from kernel-1 (because it
>> finally got fixed upstream) I can see how it broke.
>>
>> But now that patch shouldn't be needed by any tree since I have long
>> since dropped it from the stgit stack.  So if we cleaned up all of the
>> useless objects in this tree I bet this object wouldn't be needed.  Not
>> exactly a situation that I'd expect git to be able to dig out of itself
>> though.
>
> I let the script I provided previously run for a while.  And the commit
> I found to contain the missing object belongs to
> refs/patches/fsnotify/fsnotify-group-priorities.log.  So I simply
> deleted that branch entirely and now the repack can proceed.  And with a
> 'git gc --aggressive' the 1.2GB repository shrank to a mere 5.2 MB.  :-)
> Of course I didn't bring back all the reflogs though.  But I would
> have expected a repository reduction of the same magnitude even with
> them.
>

Are we talking about the same Linux kernel repository as before?
Because if so, that reduction in size doesn't make any sense to me.
The smallest size I've seen for the Linux kernel repository (in the
past year) is 250MB.

- Steven

* Re: git-reflog 70 minutes at 100% cpu and counting
  2009-12-18  3:44                         ` Steven Noonan
@ 2009-12-18  3:52                           ` Eric Paris
  2009-12-18  3:57                           ` Nicolas Pitre
  1 sibling, 0 replies; 30+ messages in thread
From: Eric Paris @ 2009-12-18  3:52 UTC (permalink / raw)
  To: Steven Noonan; +Cc: Nicolas Pitre, Jeff King, git

On Thu, 2009-12-17 at 19:44 -0800, Steven Noonan wrote:
> On Thu, Dec 17, 2009 at 7:33 PM, Nicolas Pitre <nico@fluxnic.net> wrote:
> > On Thu, 17 Dec 2009, Eric Paris wrote:
> >
> >> This alone almost certainly tells me how I broke it.
> >>
> >> For quite some time (a period of months) linux-next was broken and I had
> >> to carry a patch to ACPI to make it boot.  I dropped that patch at the
> >> head of my stgit trees in all of my repositories.  So I wouldn't be at
> >> all surprised to learn that eventually kernel-2 found that object in
> >> kernel-1.  Sometime when I dropped that patch from kernel-1 (because it
> >> finally got fixed upstream) I can see how it broke.
> >>
> >> But now that patch shouldn't be needed by any tree since I have long
> >> since dropped it from the stgit stack.  So if we cleaned up all of the
> >> useless objects in this tree I bet this object wouldn't be needed.  Not
> >> exactly a situation that I'd expect git to be able to dig out of itself
> >> though.
> >
> > I let the script I provided previously run for a while.  And the commit
> > I found to contain the missing object belongs to
> > refs/patches/fsnotify/fsnotify-group-priorities.log.  So I simply
> > deleted that branch entirely and now the repack can proceed.  And with a
> > 'git gc --aggressive' the 1.2GB repository shrank to a mere 5.2 MB.  :-)
> > Of course I didn't bring back all the reflogs though.  But I would
> > have expected a repository reduction of the same magnitude even with
> > them.
> >
> 
> Are we talking about the same Linux kernel repository as before?
> Because if so, that reduction in size doesn't make any sense to me.
> The smallest size I've seen for the Linux kernel repository (in the
> past year) is 250MB.

Remember that the real code objects are in an alternative repository
which isn't going to shrink like this.  (A nicely packed repo with the
majority of the objects in question is around 500M)

* Re: git-reflog 70 minutes at 100% cpu and counting
  2009-12-18  3:33                       ` Nicolas Pitre
  2009-12-18  3:44                         ` Steven Noonan
@ 2009-12-18  3:55                         ` Eric Paris
  1 sibling, 0 replies; 30+ messages in thread
From: Eric Paris @ 2009-12-18  3:55 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Jeff King, git

On Thu, 2009-12-17 at 22:33 -0500, Nicolas Pitre wrote:
> On Thu, 17 Dec 2009, Eric Paris wrote:
> 
> > This alone almost certainly tells me how I broke it.
> > 
> > For quite some time (a period of months) linux-next was broken and I had
> > to carry a patch to ACPI to make it boot.  I dropped that patch at the
> > head of my stgit trees in all of my repositories.  So I wouldn't be at
> > all surprised to learn that eventually kernel-2 found that object in
> > kernel-1.  Sometime when I dropped that patch from kernel-1 (because it
> > finally got fixed upstream) I can see how it broke.
> > 
> > But now that patch shouldn't be needed by any tree since I have long
> > since dropped it from the stgit stack.  So if we cleaned up all of the
> > useless objects in this tree I bet this object wouldn't be needed.  Not
> > exactly a situation that I'd expect git to be able to dig out of itself
> > though.
> 
> I let the script I provided previously run for a while.  And the commit 
> I found to contain the missing object belongs to 
> refs/patches/fsnotify/fsnotify-group-priorities.log.

At least when I thought it was in ACPI I could imagine what I had done
wrong.  Now I'm not so sure.

In any case, I've redesigned with a clear alternative repo that I never
work in and a cron job to clean up garbage every night.  So hopefully
no one will hear from me again.

Nicolas, thanks so much for hunting this down!

-Eric

>   So I simply 
> deleted that branch entirely and now the repack can proceed.  And with a 
> 'git gc --aggressive' the 1.2GB repository shrank to a mere 5.2 MB.  :-) 
> Of course I didn't bring back all the reflogs though.  But I would 
> have expected a repository reduction of the same magnitude even with 
> them.
> 
> 
> Nicolas

* Re: git-reflog 70 minutes at 100% cpu and counting
  2009-12-18  3:44                         ` Steven Noonan
  2009-12-18  3:52                           ` Eric Paris
@ 2009-12-18  3:57                           ` Nicolas Pitre
  2009-12-18  4:26                             ` Steven Noonan
  1 sibling, 1 reply; 30+ messages in thread
From: Nicolas Pitre @ 2009-12-18  3:57 UTC (permalink / raw)
  To: Steven Noonan; +Cc: Eric Paris, Jeff King, git

On Thu, 17 Dec 2009, Steven Noonan wrote:

> On Thu, Dec 17, 2009 at 7:33 PM, Nicolas Pitre <nico@fluxnic.net> wrote:
> > On Thu, 17 Dec 2009, Eric Paris wrote:
> >
> >> This alone almost certainly tells me how I broke it.
> >>
> >> For quite some time (a period of months) linux-next was broken and I had
> >> to carry a patch to ACPI to make it boot.  I dropped that patch at the
> >> head of my stgit trees in all of my repositories.  So I wouldn't be at
> >> all surprised to learn that eventually kernel-2 found that object in
> >> kernel-1.  Sometime when I dropped that patch from kernel-1 (because it
> >> finally got fixed upstream) I can see how it broke.
> >>
> >> But now that patch shouldn't be needed by any tree since I have long
> >> since dropped it from the stgit stack.  So if we cleaned up all of the
> >> useless objects in this tree I bet this object wouldn't be needed.  Not
> >> exactly a situation that I'd expect git to be able to dig out of itself
> >> though.
> >
> > I let the script I provided previously run for a while.  And the commit
> > I found to contain the missing object belongs to
> > refs/patches/fsnotify/fsnotify-group-priorities.log.  So I simply
> > deleted that branch entirely and now the repack can proceed.  And with a
> > 'git gc --aggressive' the 1.2GB repository shrank to a mere 5.2 MB.  :-)
> > Of course I didn't bring back all the reflogs though.  But I would
> > have expected a repository reduction of the same magnitude even with
> > them.
> >
> 
> Are we talking about the same Linux kernel repository as before?

As before in this thread.

> Because if so, that reduction in size doesn't make any sense to me.

Sure it does.

> The smallest size I've seen for the Linux kernel repository (in the
> past year) is 250MB.

Depends if you have an alternate repository from which you may borrow 
objects, which was the case here.  In that context, 1.2 GB of disk 
space was completely insane.


Nicolas

* Re: git-reflog 70 minutes at 100% cpu and counting
  2009-12-18  3:57                           ` Nicolas Pitre
@ 2009-12-18  4:26                             ` Steven Noonan
  0 siblings, 0 replies; 30+ messages in thread
From: Steven Noonan @ 2009-12-18  4:26 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Eric Paris, Jeff King, git

On Thu, Dec 17, 2009 at 7:57 PM, Nicolas Pitre <nico@fluxnic.net> wrote:
> On Thu, 17 Dec 2009, Steven Noonan wrote:
>
>> On Thu, Dec 17, 2009 at 7:33 PM, Nicolas Pitre <nico@fluxnic.net> wrote:
>> > On Thu, 17 Dec 2009, Eric Paris wrote:
>> >
>> >> This alone almost certainly tells me how I broke it.
>> >>
>> >> For quite some time (a period of months) linux-next was broken and I had
>> >> to carry a patch to ACPI to make it boot.  I dropped that patch at the
>> >> head of my stgit trees in all of my repositories.  So I wouldn't be at
>> >> all surprised to learn that eventually kernel-2 found that object in
>> >> kernel-1.  Sometime when I dropped that patch from kernel-1 (because it
>> >> finally got fixed upstream) I can see how it broke.
>> >>
>> >> But now that patch shouldn't be needed by any tree since I have long
>> >> since dropped it from the stgit stack.  So if we cleaned up all of the
>> >> useless objects in this tree I bet this object wouldn't be needed.  Not
>> >> exactly a situation that I'd expect git to be able to dig out of itself
>> >> though.
>> >
>> > I let the script I provided previously run for a while.  And the commit
>> > I found to contain the missing object belongs to
>> > refs/patches/fsnotify/fsnotify-group-priorities.log.  So I simply
>> > deleted that branch entirely and now the repack can proceed.  And with a
>> > 'git gc --aggressive' the 1.2GB repository shrank to a mere 5.2 MB.  :-)
>> > Of course I didn't bring back all the reflogs though.  But I would
>> > have expected a repository reduction of the same magnitude even with
>> > them.
>> >
>>
>> Are we talking about the same Linux kernel repository as before?
>
> As before in this thread.
>
>> Because if so, that reduction in size doesn't make any sense to me.
>
> Sure it does.
>
>> The smallest size I've seen for the Linux kernel repository (in the
>> past year) is 250MB.
>
> Depends if you have an alternate repository from which you may borrow
> objects, which was the case here.  In that context, 1.2 GB of disk
> space was completely insane.
>

Ahh. That makes sense. I should really read up on alternates then.

- Steven

Thread overview: 30+ messages
2009-12-14 20:28 git-reflog 70 minutes at 100% cpu and counting Eric Paris
2009-12-14 20:41 ` Sverre Rabbelier
2009-12-14 21:11 ` Jeff King
2009-12-14 21:20   ` Eric Paris
2009-12-14 21:23     ` Jeff King
2009-12-14 21:56       ` Eric Paris
2009-12-14 22:03         ` Sverre Rabbelier
2009-12-15  0:29           ` Nicolas Pitre
2009-12-14 22:14         ` Jeff King
2009-12-15  0:26         ` Nicolas Pitre
2009-12-15  0:36           ` Junio C Hamano
2009-12-15  3:58             ` Nicolas Pitre
2009-12-15  2:11           ` Eric Paris
2009-12-15  3:44             ` Nicolas Pitre
2009-12-15  2:39     ` Jeff King
2009-12-15  3:50       ` Nicolas Pitre
2009-12-15  4:26         ` Eric Paris
2009-12-16  3:03           ` Nicolas Pitre
2009-12-16  3:31             ` Eric Paris
2009-12-16 13:41             ` Eric Paris
2009-12-16 21:06               ` Nicolas Pitre
2009-12-16 22:37                 ` Eric Paris
2009-12-17  5:38                   ` Nicolas Pitre
2009-12-17 16:29                     ` Eric Paris
2009-12-18  3:33                       ` Nicolas Pitre
2009-12-18  3:44                         ` Steven Noonan
2009-12-18  3:52                           ` Eric Paris
2009-12-18  3:57                           ` Nicolas Pitre
2009-12-18  4:26                             ` Steven Noonan
2009-12-18  3:55                         ` Eric Paris
