* Recovering from epic fail (deleted .git/objects/pack)
@ 2008-12-10 0:11 R. Tyler Ballance
2008-12-10 0:19 ` Junio C Hamano
0 siblings, 1 reply; 22+ messages in thread
From: R. Tyler Ballance @ 2008-12-10 0:11 UTC (permalink / raw)
To: git
I really wish I didn't have to ask this question, as we discussed in
#git early this morning, whiskey is the likely answer.
For unexplainable reasons one of our sysadmins got trigger-happy when he
tried to prune a temp_pack file generated and left in a
developer's .git/ directory after a git operation aborted (disk quota
exceeded)
As a result, the sysadmin killed the developer's
entire .git/objects/pack/ directory. (insert copious amounts of whiskey
here)
He did not however delete all the other contents of .git/objects (00/,
01/, etc)
Is there a feasible way that I can properly recover
the .git/objects/pack directory such that the developer who had their
last two weeks of local work thrashed can get it back?
Anything that can help (other than pummeling the sysadmin) would be
appreciated
Cheers
--
-R. Tyler Ballance
Slide, Inc.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Recovering from epic fail (deleted .git/objects/pack)
2008-12-10 0:11 Recovering from epic fail (deleted .git/objects/pack) R. Tyler Ballance
@ 2008-12-10 0:19 ` Junio C Hamano
2008-12-10 10:06 ` R. Tyler Ballance
0 siblings, 1 reply; 22+ messages in thread
From: Junio C Hamano @ 2008-12-10 0:19 UTC (permalink / raw)
To: R. Tyler Ballance; +Cc: git
"R. Tyler Ballance" <tyler@slide.com> writes:
> I really wish I didn't have to ask this question, as we discussed in
> #git early this morning, whiskey is the likely answer.
>
> For unexplainable reasons one of our sysadmins got trigger-happy when he
> tried to prune a temp_pack file generated and left in a
> developer's .git/ directory after a git operation aborted (disk quota
> exceeded)
>
> As a result, the sysadmin killed the developer's
> entire .git/objects/pack/ directory. (insert copious amounts of whiskey
> here)
>
> He did not however delete all the other contents of .git/objects (00/,
> 01/, etc)
>
> Is there a feasible way that I can properly recover
> the .git/objects/pack directory such that the developer who had their
> last two weeks of local work thrashed can get it back?
I do not know about "feasible" and "properly", but ...
(0) take backup of the repository of this unfortunate developer.
(1) make a fresh clone of the central repository that this unfortunate
developer's work started out from.
(2) copy the contents of the .git/objects/pack/ of that clone to the
developer's .git/objects/pack/.
See if "fsck --full" complains after that. If the repository was not
repacked during that period, all objects created by the activity by the
unfortunate developer would be loose, so ...
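Steps (0) through (2) can be sketched as a small shell function. This is only an illustration under stated assumptions: the name `restore_packs` and both arguments (an absolute path to the damaged, non-bare repository and the URL of the central repository) are hypothetical and do not come from the thread.

```shell
# restore_packs BROKEN_REPO CENTRAL_URL
# Hypothetical helper: refill a wiped .git/objects/pack/ from a fresh clone.
restore_packs() (
    broken=$1      # damaged working copy (assumed absolute path, no trailing slash)
    central=$2     # central repository the work started from (assumed URL)

    # (0) Back up the damaged repository before touching it.
    cp -a "$broken" "$broken.bak" || exit 1

    # (1) Make a fresh clone of the central repository.
    scratch=$(mktemp -d) || exit 1
    git clone --quiet "$central" "$scratch/fresh" || exit 1

    # (2) Copy the clone's pack files into the damaged repository.
    mkdir -p "$broken/.git/objects/pack" || exit 1
    cp "$scratch/fresh/.git/objects/pack/"* "$broken/.git/objects/pack/" || exit 1
    rm -rf "$scratch"

    # Check the result: loose objects plus restored packs should be consistent.
    git -C "$broken" fsck --full
)
```

The copy in step (2) fails if the fresh clone holds no pack files, which can happen when cloning an unpacked repository by plain local path; cloning over a transport such as `ssh://` or `file://` delivers a pack.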
* Re: Recovering from epic fail (deleted .git/objects/pack)
2008-12-10 0:19 ` Junio C Hamano
@ 2008-12-10 10:06 ` R. Tyler Ballance
2008-12-10 11:39 ` Johannes Sixt
0 siblings, 1 reply; 22+ messages in thread
From: R. Tyler Ballance @ 2008-12-10 10:06 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On Tue, 2008-12-09 at 16:19 -0800, Junio C Hamano wrote:
> I do not know about "feasible" and "properly", but ...
>
> (0) take backup of the repository of this unfortunate developer.
>
> (1) make a fresh clone of the central repository that this unfortunate
> developer's work started out from.
>
> (2) copy the contents of the .git/objects/pack/ of that clone to the
> developer's .git/objects/pack/.
This approach "sort of" worked, i.e. it worked insofar as I was able
to use the repository enough to generate a series of patch files for the
developer's work from the last two weeks to be applied to their new
clone of the central repository. Why I did this is answered below ;)
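The thread does not show the commands used for this patch-based rescue, so the following is only a sketch of how it might look with `git format-patch` and `git am`. The function names, the base ref (for example `origin/master`) and the directory arguments are assumptions; paths are expected to be absolute.

```shell
# export_local_work REPO BASE OUTDIR
# Write one patch file per commit that is on HEAD but not on BASE.
export_local_work() (
    repo=$1       # repository holding the local commits (assumed path)
    base=$2       # last commit shared with the central repository (assumed ref)
    outdir=$3     # directory that receives the numbered patch files
    git -C "$repo" format-patch --quiet -o "$outdir" "$base..HEAD"
)

# import_patches REPO PATCHDIR
# Replay the exported patches, in order, on top of a fresh clone.
import_patches() (
    repo=$1       # fresh clone of the central repository (assumed path)
    patchdir=$2   # directory produced by export_local_work
    git -C "$repo" am --quiet "$patchdir"/*.patch
)
```

Going through patch files means the new clone never has to trust the damaged object store; only the commits that could still be read are carried over.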
>
> See if "fsck --full" complains after that. If the repository was not
> repacked during that period, all objects created by the activity by the
> unfortunate developer would be loose, so ...
tyler@ccnet:~/source/slide/brian_main> time git fsck --full
Segmentation fault
real 27m2.187s
user 10m3.238s
sys 0m16.609s
tyler@ccnet:~/source/slide/brian_main>
Oh well, your approach worked *enough* to get the important data out,
and that's what's most important.
Moving forward we're likely going to implement an automated process of
walking through developers' repositories and pushing any unpushed refs
to a backup repository just to make sure something like this doesn't
happen again.
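A rough sketch of such a backup pass is below. Everything specific in it is an assumption: the helper name `backup_refs`, a layout with one repository per subdirectory, the `refs/backup/<name>/` namespace, and an absolute path or URL for the backup repository. For simplicity it pushes every local branch rather than working out which ones are unpushed.

```shell
# backup_refs ROOT BACKUP
# For every repository directly under ROOT, push all local branches to
# BACKUP under refs/backup/<repository-name>/.
backup_refs() (
    root=$1       # directory holding developers' repositories (assumed layout)
    backup=$2     # absolute path or URL of the backup repository (assumed)
    status=0
    for repo in "$root"/*/; do
        # Skip anything that is not a non-bare git repository.
        [ -d "$repo/.git" ] || continue
        name=$(basename "$repo")
        # The leading "+" lets rewritten (rebased) branches replace old backups.
        git -C "$repo" push --quiet "$backup" \
            "+refs/heads/*:refs/backup/$name/*" || status=1
    done
    exit "$status"
)
```

Run from cron, for example as `backup_refs /home/devs /srv/git/backup.git` (both paths hypothetical), this keeps a second copy of local branch tips outside each developer's `.git` directory.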
Appreciate the help :)
Cheers
--
-R. Tyler Ballance
Slide, Inc.
* Re: Recovering from epic fail (deleted .git/objects/pack)
2008-12-10 10:06 ` R. Tyler Ballance
@ 2008-12-10 11:39 ` Johannes Sixt
2008-12-10 22:52 ` epic fsck SIGSEGV! (was Recovering from epic fail (deleted .git/objects/pack)) R. Tyler Ballance
0 siblings, 1 reply; 22+ messages in thread
From: Johannes Sixt @ 2008-12-10 11:39 UTC (permalink / raw)
To: R. Tyler Ballance; +Cc: Junio C Hamano, git
R. Tyler Ballance schrieb:
> On Tue, 2008-12-09 at 16:19 -0800, Junio C Hamano wrote:
>> See if "fsck --full" complains after that. If the repository was not
>> repacked during that period, all objects created by the activity by the
>> unfortunate developer would be loose, so ...
>
> tyler@ccnet:~/source/slide/brian_main> time git fsck --full
> Segmentation fault
Please make a backup (tarball) of the repository that shows this segfault.
'git fsck' is not supposed to segfault, no matter what garbage is thrown
at it.
Can you make a backtrace of this failing 'git fsck --full' invocation?
-- Hannes
* epic fsck SIGSEGV! (was Recovering from epic fail (deleted .git/objects/pack))
2008-12-10 11:39 ` Johannes Sixt
@ 2008-12-10 22:52 ` R. Tyler Ballance
2008-12-10 23:40 ` Linus Torvalds
0 siblings, 1 reply; 22+ messages in thread
From: R. Tyler Ballance @ 2008-12-10 22:52 UTC (permalink / raw)
To: Johannes Sixt; +Cc: Junio C Hamano, git
On Wed, 2008-12-10 at 12:39 +0100, Johannes Sixt wrote:
> R. Tyler Ballance schrieb:
> > On Tue, 2008-12-09 at 16:19 -0800, Junio C Hamano wrote:
> >> See if "fsck --full" complains after that. If the repository was not
> >> repacked during that period, all objects created by the activity by the
> >> unfortunate developer would be loose, so ...
> >
> > tyler@ccnet:~/source/slide/brian_main> time git fsck --full
> > Segmentation fault
>
> Please make a backup (tarball) of the repository that shows this segfault.
> 'git fsck' is not supposed to segfault, no matter what garbage is thrown
> at it.
>
> Can you make a backtrace of this failing 'git fsck --full' invocation?
I decided to endure the 30 minutes this took on my machine, and ran
the operation in gdb. As a result, I got the SIGSEGV again, and a 13MB
stacktrace.
In fact, the stack trace was probably longer, but this happened while I
printed out `bt full`:
#80496 0x00000000004244bc in fsck_handle_ref (refname=0x162aa61
"refs/heads/master", sha1=0x162aa39 "S\236\024(f\210��V\027�'�E
\025�u�g",
flag=<value optimized out>, cb_data=<value optimized out>)
at builtin-fsck.c:118
obj = <value optimized out>
#80497 0x000000000047f61c in do_for_each_ref (base=0x4a7022
"refs/", fn=0x424450 <fsck_handle_ref>, trim=0, cb_data=0x0) at
refs.c:582
[1] 29388 segmentation fault (core dumped) gdb git
tyler@ccnet:~/source/slide/brian_main>
The "full" trace is here:
http://pineapple.monkeypox.org/scratch/git-1.6.0.2-fsck-sigsegv.trace
I think I'm going to need to have a drink :-/
--
-R. Tyler Ballance
Slide, Inc.
* Re: epic fsck SIGSEGV! (was Recovering from epic fail (deleted .git/objects/pack))
2008-12-10 22:52 ` epic fsck SIGSEGV! (was Recovering from epic fail (deleted .git/objects/pack)) R. Tyler Ballance
@ 2008-12-10 23:40 ` Linus Torvalds
2008-12-11 0:24 ` R. Tyler Ballance
` (3 more replies)
0 siblings, 4 replies; 22+ messages in thread
From: Linus Torvalds @ 2008-12-10 23:40 UTC (permalink / raw)
To: R. Tyler Ballance; +Cc: Johannes Sixt, Junio C Hamano, git
On Wed, 10 Dec 2008, R. Tyler Ballance wrote:
>
> I decided to endure the 30 minutes this took on my machine, and ran
> the operation in gdb. As a result, I got the SIGSEGV again, and a 13MB
> stacktrace.
>
> In fact, the stack trace was probably longer, but this happened while I
> printed out `bt full`:
Wow. You even got _gdb_ to segfault.
You're my hero. If it can break, you will do it.
> I think I'm going to need to have a drink :-/
Have one for me too.
Anyway, that's a really annoying problem, and it's a bug in git.
Admittedly it's probably brought on by you having a fairly small stack
ulimit, which is also what likely brought gdb to its knees.
That stupid fsck commit walker walks the parents recursively. That's
horribly bogus. So you have a recursion that goes from the top-level
commit all the way to the root, doing
fsck_walk_commit -> walk(parent) -> fsck_walk_commit -> ..
and you have a fairly deep commit tree.
When it hits parent number 80,000+, you run out of stack space, and
SIGSEGV. And judging by the fact that gdb also SIGSEGV's for you when
doing the backtrace, it looks like the gdb backtrace tracer is _also_
recursive, and _also_ hits the same issue ;)
Anyway, with a 8M stack-size I can fsck the kernel repo without any
problem, but while the kernel repo has something like 120k commits in it,
it's a very "bushy" repository (lots of parallelism and merges), and the
path from the top parent to the root is actually much shorter, at just 27k
commits.
I take it that your project has a very long and linear history, which is
why you have a long path from your HEAD to your root.
(You can do something like
git rev-list --first-parent HEAD | wc -l
to get the depth of your history when just walking the first parent, and
if I'm right you'll have a number that is bigger than 80k.)
So you have definitely found a real bug. Right now, you should be able to
work around it by just making your stack depth rather bigger. The
recursion is not very complicated, so even though it's 80,000 deep, each
entry probably is about a hundred bytes on the stack.
In fact, if you're on Linux, most default stack depths would be 8 MB,
which would roughly match that "80k entries of 100 bytes each".
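The back-of-the-envelope estimate above can be reproduced in the shell. This is just the arithmetic: the 100-byte frame size is the rough guess from the text, not a measurement, and `stack_needed_kb` is a made-up helper name.

```shell
# stack_needed_kb FRAMES BYTES_PER_FRAME
# Rough stack demand in KiB (the unit "ulimit -s" reports), rounded down.
stack_needed_kb() {
    echo $(( $1 * $2 / 1024 ))
}

# 80,000 frames at roughly 100 bytes each, the figures quoted above:
#   stack_needed_kb 80000 100    # prints 7812
# Compare against the current stack limit, also in KiB:
#   ulimit -s
```

With an 8 MB (8192 KiB) limit that leaves only a few hundred KiB of headroom, so somewhat larger real frames or a slightly deeper history is enough to run out.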
But we should definitely fix this braindamage in fsck. Rather than
recursively walk the commits, we should add them to a commit list and just
walk the list iteratively.
Junio?
Linus
* Re: epic fsck SIGSEGV! (was Recovering from epic fail (deleted .git/objects/pack))
2008-12-10 23:40 ` Linus Torvalds
@ 2008-12-11 0:24 ` R. Tyler Ballance
2008-12-11 0:45 ` Linus Torvalds
2008-12-11 0:51 ` epic fsck SIGSEGV! Junio C Hamano
` (2 subsequent siblings)
3 siblings, 1 reply; 22+ messages in thread
From: R. Tyler Ballance @ 2008-12-11 0:24 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Johannes Sixt, Junio C Hamano, git
On Wed, 2008-12-10 at 15:40 -0800, Linus Torvalds wrote:
>
> Wow. You even got _gdb_ to segfault.
>
> You're my hero. If it can break, you will do it.
You have no idea :) So much so that a coworker got me a "FAIL" stamp for
my birthday:
http://agentdero.cachefly.net/pictotweet.com//saved/6f217a5ababb06185d5e4ca1398e743c/PIC-012835841677481.jpg
Anyways..
>
> That stupid fsck commit walker walks the parents recursively. That's
> horribly bogus. So you have a recursion that goes from the top-level
> commit all the way to the root, doing
>
> fsck_walk_commit -> walk(parent) -> fsck_walk_commit -> ..
>
> and you have a fairly deep commit tree.
This repository is ~3 years old and ~7.1GB small, when we finally cut
over from Subversion we were in the 130,000 revision range.
> Anyway, with a 8M stack-size I can fsck the kernel repo without any
> problem, but while the kernel repo has something like 120k commits in it,
> it's a very "bushy" repository (lots of parallelism and merges), and the
> path from the top parent to the root is actually much shorter, at just 27k
> commits.
The stack size is 8M as you assumed, I'm curious as to how the kernel
handles a process that exceeds the ulimit(2) stacksize. I know from our
experience with this repository that when Git runs up against the
address space (ulimit -v) that an ENOMEM or something similar is
returned. Is there an E_NOSTACK? :) (figured I'd ask, given your
apparent knowledge on the subject ;))
>
> I take it that your project has a very long and linear history, which is
> why you have a long path from your HEAD to your root.
>
> (You can do something like
>
> git rev-list --first-parent HEAD | wc -l
tyler@ccnet:~/source/slide/brian_main> git rev-list --first-parent HEAD | wc -l
46751
tyler@ccnet:~/source/slide/brian_main> uname -a
Linux ccnet 2.6.25.18-0.2-default #1 SMP 2008-10-21 16:30:26 +0200 x86_64 x86_64 x86_64 GNU/Linux
tyler@ccnet:~/source/slide/brian_main> git --version
git version 1.6.0.2
>
> But we should definitely fix this braindamage in fsck. Rather than
> recursively walk the commits, we should add them to a commit list and just
> walk the list iteratively.
Given that this issue affects our internal (proprietary) repository, I
can't very well give access to it or publish a clone, but I'm willing to
help in any way I can. We maintain an internal fork of the Git tree, so
I can apply any changes you'd like to an internal 1.6.0.4 or 1.6.0.5
build. For obvious reasons I ran the fsck against an upstream maintained
(stable) build of Git.
Cheers
p.s. If you find yourself in downtown San Francisco, we'd be honored to
buy you a drink here at Slide :)
--
-R. Tyler Ballance
Slide, Inc.
* Re: epic fsck SIGSEGV! (was Recovering from epic fail (deleted .git/objects/pack))
2008-12-11 0:24 ` R. Tyler Ballance
@ 2008-12-11 0:45 ` Linus Torvalds
2008-12-11 1:21 ` R. Tyler Ballance
0 siblings, 1 reply; 22+ messages in thread
From: Linus Torvalds @ 2008-12-11 0:45 UTC (permalink / raw)
To: R. Tyler Ballance; +Cc: Johannes Sixt, Junio C Hamano, git
On Wed, 10 Dec 2008, R. Tyler Ballance wrote:
>
> The stack size is 8M as you assumed, I'm curious as to how the kernel
> handles a process that exceeds the ulimit(2) stacksize. I know from our
> experience with this repository that when Git runs up against the
> address space (ulimit -v) that an ENOMEM or something similar is
> returned. Is there an E_NOSTACK? :) (figured I'd ask, given your
> apparent knowledge on the subject ;))
Since stack expansion doesn't involve any system calls, and since there is
no way to recover from it anyway, the kernel has no choice: it just sends
a SIGSEGV.
An application that wants to _can_ handle this case by installing a signal
handler, but since signal handling needs some stack-space too, a regular
"sigaction(SIGSEGV..)" isn't sufficient. You also need to set up a
separate signal stack ..
Nobody really ever does that, except for some _really_ special programs.
But it's a way to handle errors in stack allocation if you really need to.
Git certainly does not do it.
> > (You can do something like
> >
> > git rev-list --first-parent HEAD | wc -l
>
> tyler@ccnet:~/source/slide/brian_main> git rev-list --first-parent HEAD | wc -l
> 46751
Ahh. yes. The 80k number is because the callchain was that deep, but since
each recursion involves _two_ functions, it really only needed a 40k
commit depth to the root to get there.
> > But we should definitely fix this braindamage in fsck. Rather than
> > recursively walk the commits, we should add them to a commit list and just
> > walk the list iteratively.
>
> Given that this issue affects our internal (proprietary) repository, I
> can't very well give access to it or publish a clone, but I'm willing to
> help in any way I can. We maintain an internal fork of the Git tree, so
> I can apply any changes you'd like to an internal 1.6.0.4 or 1.6.0.5
> build. For obvious reasons I ran the fsck against an upstream maintained
> (stable) build of Git.
Can you try with a bigger stack? Just do
ulimit -s 16384
and then re-try the fsck. Just to verify that this is it. If nothing else,
it will at least give you a working fsck, even if it's obviously not the
"correct" solution.
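Typed at the prompt, that `ulimit` call changes the limit for the rest of the shell session. A minimal sketch that confines the bigger stack to a single command follows; `with_stack_kb` is a made-up helper name, and raising the value only works while the hard limit permits it.

```shell
# with_stack_kb KB COMMAND [ARGS...]
# Run COMMAND with the stack limit set to KB KiB. The body runs in a
# subshell, so the calling shell keeps its own limit.
with_stack_kb() (
    kb=$1
    shift
    ulimit -s "$kb" || exit 1
    exec "$@"
)

# Example (run inside the affected repository):
#   with_stack_kb 16384 git fsck --full
```

Because the limit is inherited by child processes, the `git fsck` started this way sees the larger stack while everything else in the session is unaffected.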
Linus
* Re: epic fsck SIGSEGV!
2008-12-10 23:40 ` Linus Torvalds
2008-12-11 0:24 ` R. Tyler Ballance
@ 2008-12-11 0:51 ` Junio C Hamano
2008-12-11 1:03 ` epic fsck SIGSEGV! (was Recovering from epic fail (deleted .git/objects/pack)) Boyd Stephen Smith Jr.
2008-12-11 1:33 ` Nicolas Pitre
3 siblings, 0 replies; 22+ messages in thread
From: Junio C Hamano @ 2008-12-11 0:51 UTC (permalink / raw)
To: Linus Torvalds; +Cc: R. Tyler Ballance, Johannes Sixt, git
Linus Torvalds <torvalds@linux-foundation.org> writes:
> But we should definitely fix this braindamage in fsck. Rather than
> recursively walk the commits, we should add them to a commit list and just
> walk the list iteratively.
>
> Junio?
I think that is a sensible thing to do. I may take a look at it myself
later in the week, unless somebody else (wants to do / does) this first.
* Re: epic fsck SIGSEGV! (was Recovering from epic fail (deleted .git/objects/pack))
2008-12-10 23:40 ` Linus Torvalds
2008-12-11 0:24 ` R. Tyler Ballance
2008-12-11 0:51 ` epic fsck SIGSEGV! Junio C Hamano
@ 2008-12-11 1:03 ` Boyd Stephen Smith Jr.
2008-12-11 1:16 ` Shawn O. Pearce
2008-12-11 1:33 ` Nicolas Pitre
3 siblings, 1 reply; 22+ messages in thread
From: Boyd Stephen Smith Jr. @ 2008-12-11 1:03 UTC (permalink / raw)
To: Linus Torvalds; +Cc: R. Tyler Ballance, Johannes Sixt, Junio C Hamano, git
On Wednesday 2008 December 10 17:40:28 Linus Torvalds wrote:
>On Wed, 10 Dec 2008, R. Tyler Ballance wrote:
>Anyway, that's a really annoying problem, and it's a bug in git.
>
>That stupid fsck commit walker walks the parents recursively.
>
>And judging by the fact that gdb also SIGSEGV's for you when
>doing the backtrace, it looks like the gdb backtrace tracer is _also_
>recursive, and _also_ hits the same issue ;)
>
>So you have definitely found a real bug.
>
>But we should definitely fix this braindamage in fsck. Rather than
>recursively walk the commits, we should add them to a commit list and just
>walk the list iteratively.
Suppose I fixed this tonight. Would you need anything other than a patch
(series) from me? (E.g. copyright assignment or something else legal [vs.
technical])
--
Boyd Stephen Smith Jr. ,= ,-_-. =.
bss03@volumehost.net ((_/)o o(\_))
ICQ: 514984 YM/AIM: DaTwinkDaddy `-'(. .)`-'
http://iguanasuicide.org/ \_/
* Re: epic fsck SIGSEGV! (was Recovering from epic fail (deleted .git/objects/pack))
2008-12-11 1:03 ` epic fsck SIGSEGV! (was Recovering from epic fail (deleted .git/objects/pack)) Boyd Stephen Smith Jr.
@ 2008-12-11 1:16 ` Shawn O. Pearce
0 siblings, 0 replies; 22+ messages in thread
From: Shawn O. Pearce @ 2008-12-11 1:16 UTC (permalink / raw)
To: Boyd Stephen Smith Jr.
Cc: Linus Torvalds, R. Tyler Ballance, Johannes Sixt, Junio C Hamano,
git
"Boyd Stephen Smith Jr." <bss03@volumehost.net> wrote:
>
> Suppose I fixed this tonight. Would you need anything other than a patch
> (series) from me? (E.g. copyright assignment or something else legal [vs.
> technical])
No, just consent under the "Developer's Certificate of Origin 1.1"
in SubmittingPatches.
--
Shawn.
* Re: epic fsck SIGSEGV! (was Recovering from epic fail (deleted .git/objects/pack))
2008-12-11 0:45 ` Linus Torvalds
@ 2008-12-11 1:21 ` R. Tyler Ballance
0 siblings, 0 replies; 22+ messages in thread
From: R. Tyler Ballance @ 2008-12-11 1:21 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Johannes Sixt, Junio C Hamano, git
On Wed, 2008-12-10 at 16:45 -0800, Linus Torvalds wrote:
>
> On Wed, 10 Dec 2008, R. Tyler Ballance wrote:
> >
> > The stack size is 8M as you assumed, I'm curious as to how the kernel
> > handles a process that exceeds the ulimit(2) stacksize. I know from our
> > experience with this repository that when Git runs up against the
> > address space (ulimit -v) that an ENOMEM or something similar is
> > returned. Is there an E_NOSTACK? :) (figured I'd ask, given your
> > apparent knowledge on the subject ;))
>
> Since stack expansion doesn't involve any system calls, and since there is
> no way to recover from it anyway, the kernel has no choice: it just sends
> a SIGSEGV.
>
> An application that wants to _can_ handle this case by installing a signal
> handler, but since signal handling needs some stack-space too, a regular
> "sigaction(SIGSEGV..)" isn't sufficient. You also need to set up a
> separate signal stack ..
Interesting, thanks for the explanation :)
> Can you try with a bigger stack? Just do
>
> ulimit -s 16384
Looks like that'll do it :) Transcript below. I'll lower the limit with
a build with Boyd's impending patch, though I assume you can probably
recreate this with a stacksize that's less than 2x your commit count.
>
> and then re-try the fsck. Just to verify that this is it. If nothing else,
> it will at least give you a working fsck, even if it's obviously not the
> "correct" solution.
tyler@ccnet:~/source/slide/brian_main> gdb git
GNU gdb 6.8
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-suse-linux"...
(gdb) run fsck --full
Starting program: /usr/local/bin/git fsck --full
error: refs/remotes/origin/master-team-test does not point to a valid object!
error: refs/remotes/origin/wip-test does not point to a valid object!
error: refs/tags/cooltag does not point to a valid object!
dangling commit 743db07961c5076511a6d04664536863da91920c
dangling commit 525660ad1268b208d440467cd3c083aa2375ee8f
dangling commit 6587b04ff81aaa43721f32f0e443bb3b0ef2be78
dangling commit 0498400f510c0bf3dc17d533d356e46ce19f0f6b
dangling commit 61af71c12f7e608d0e68b52c4d118d7fe4be9690
dangling commit aae5318412d1ca51912c71dca8f181d605928cfc
dangling commit df6522a65a3e40f50da695303cd146852cb13c3d
dangling commit fe6643635b89700f384f0d28acd247a411d52e1a
dangling commit 7e81536bfc7fb12e1f4576cb64f55e46e7e8042a
dangling commit 92ab13d2779f7a7d46736ce041e0d5a5cde16dfb
dangling commit 00b8a3ea6d294c43140f9277bf48bbe734ded10f
dangling commit 3883d47c6dffb6163989d7c54784edb08c5a8e42
dangling commit e39fb41aa5b7ce327d64938a241d055524d0425b
dangling blob 19ccb407e4f7629880e484d08bcfd805157820ee
dangling commit a135f5ae5c4cf5e3d87e63bb102c1f59c9bf2d98
dangling blob 2fa995576ce9cb7f04a4d302d0defb24468a78da
dangling commit 06d125abdf5dd002664a3b39f372713049495db7
dangling commit 26d50600da2954a71f1e985a24497be6f9ccd9bb
dangling commit 53da563adeefee480e2230bc01fedb703185e659
dangling commit 9c06e7fefb0bdfcf096439549e7f9bba4c1b5f1e
dangling commit 8d4a571f179f04a243367615e6e04a9d7437de8a
dangling commit 734b47a3329618deeb556150e161e040bc055e5e
dangling commit 038e08581164006168b38ae3b3632592ff243346
dangling commit 92f3286443b737fb2787a157479eff93b4ec1949
dangling commit 6700599d0bd0bb20b1eb611e377a9f9628272f93
dangling commit b668393e07e4c0b3cff47484084c6dad0fc6c67c
dangling commit 9777594bd3e5e9e66b22827266ee7c0d672e63d8
dangling commit a84a1a40bfebaced5be4160a37a754841ec6839d
dangling commit 41c29a41daa556b073be46401148b71864122f10
dangling commit e4caeabd7e0bdc28bacc14f5fc3f9b7f00678e9f
dangling commit 33ce1af2009bd9ccff27950af1e4faead0dcbaa9
dangling commit 2ded0a58779e02e7e07c861e541b5d75911b9ef6
dangling commit 2148fb15b79c3bab79859e80bba35ff8e9343e4d
dangling commit cfc3fb2e13a3f7b5e53ce77db26faa4badb42c06
dangling commit e3f8bbd1a0993f080355e297d4204bfd5a079d4c
dangling commit ed11cc08822d005d6f70ea9c059ee1b1ee28b5cb
dangling commit f61e3c6094df2ca7bd421853dad108b6cf0a6be7
dangling commit a730acaa6454e76bf033b3962d13b64fb0b03ca0
dangling commit 25689c64420b7e062931045919e452afa11940bc
dangling commit 8eda5c081ddf1ba5c926f47d8bd1b3c9643d8adf
dangling commit 76dbbca6603e2a630c2cac8b65ed5ec4c9f45abd
dangling commit 41e93c5564f2e06b61baf67d34da8774d84f463d
dangling commit f2038d93f67b95034c97a5895457062a6b0c96c4
dangling commit f54a0d30f3e62234941c80487b9dcbfaa10927ad
dangling commit 02a51d79b8ab6e7f396e8a0ee5f8768bf538d112
dangling commit 80a71dd3cd4b5a301931d44bee5ef4584fa1f2e9
dangling commit 2c05de9b6db8c9f392a0ee90b796efaf862dbcfe
dangling commit b465eebed0cf5538c124393df6b0cb35f98f7d3a
dangling commit 02246f3eb943a5b0868d386e39ed5719ab0d2ca9
dangling commit e6b97f34bc6f27f4ad48041b1eb3a88e18b87f18
dangling commit 7bc80fb7f429219310e5671f7191a4d6476a4bd9
Program exited normally.
(gdb)
--
-R. Tyler Ballance
Slide, Inc.
* Re: epic fsck SIGSEGV! (was Recovering from epic fail (deleted .git/objects/pack))
2008-12-10 23:40 ` Linus Torvalds
` (2 preceding siblings ...)
2008-12-11 1:03 ` epic fsck SIGSEGV! (was Recovering from epic fail (deleted .git/objects/pack)) Boyd Stephen Smith Jr.
@ 2008-12-11 1:33 ` Nicolas Pitre
2008-12-11 1:52 ` epic fsck SIGSEGV! Junio C Hamano
2008-12-11 3:28 ` epic fsck SIGSEGV! (was Recovering from epic fail (deleted .git/objects/pack)) Linus Torvalds
3 siblings, 2 replies; 22+ messages in thread
From: Nicolas Pitre @ 2008-12-11 1:33 UTC (permalink / raw)
To: Linus Torvalds; +Cc: R. Tyler Ballance, Johannes Sixt, Junio C Hamano, git
On Wed, 10 Dec 2008, Linus Torvalds wrote:
> But we should definitely fix this braindamage in fsck. Rather than
> recursively walk the commits, we should add them to a commit list and just
> walk the list iteratively.
What about:
http://marc.info/?l=git&m=122889563424786&w=2
Nicolas
* Re: epic fsck SIGSEGV!
2008-12-11 1:33 ` Nicolas Pitre
@ 2008-12-11 1:52 ` Junio C Hamano
2008-12-11 2:16 ` Nicolas Pitre
2008-12-11 3:28 ` epic fsck SIGSEGV! (was Recovering from epic fail (deleted .git/objects/pack)) Linus Torvalds
1 sibling, 1 reply; 22+ messages in thread
From: Junio C Hamano @ 2008-12-11 1:52 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Linus Torvalds, R. Tyler Ballance, Johannes Sixt, git
Nicolas Pitre <nico@cam.org> writes:
> On Wed, 10 Dec 2008, Linus Torvalds wrote:
>
>> But we should definitely fix this braindamage in fsck. Rather than
>> recursively walk the commits, we should add them to a commit list and just
>> walk the list iteratively.
>
> What about:
>
> http://marc.info/?l=git&m=122889563424786&w=2
I have to dig that out of the mail archive (quoting message-id or $gmane
article number would have been easier for me), but should I take it as an
Ack from you?
* Re: epic fsck SIGSEGV!
2008-12-11 1:52 ` epic fsck SIGSEGV! Junio C Hamano
@ 2008-12-11 2:16 ` Nicolas Pitre
0 siblings, 0 replies; 22+ messages in thread
From: Nicolas Pitre @ 2008-12-11 2:16 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Linus Torvalds, R. Tyler Ballance, Johannes Sixt, git
On Wed, 10 Dec 2008, Junio C Hamano wrote:
> Nicolas Pitre <nico@cam.org> writes:
>
> > On Wed, 10 Dec 2008, Linus Torvalds wrote:
> >
> >> But we should definitely fix this braindamage in fsck. Rather than
> >> recursively walk the commits, we should add them to a commit list and just
> >> walk the list iteratively.
> >
> > What about:
> >
> > http://marc.info/?l=git&m=122889563424786&w=2
>
> I have to dig that out of the mail archive (quoting message-id or $gmane
> article number would have been easier for me),
Message-ID: <20081210075338.GA7776@auto.tuwien.ac.at>
> but should I take it as an Ack from you?
I was involved in that thread initially, until bisection showed commit
271b8d25b25e49b367087440e093e755e5f35aa9 as the culprit. This might be
the same issue but I have not experienced it myself.
So I'm merely connecting email threads here.
Nicolas
* Re: epic fsck SIGSEGV! (was Recovering from epic fail (deleted .git/objects/pack))
2008-12-11 1:33 ` Nicolas Pitre
2008-12-11 1:52 ` epic fsck SIGSEGV! Junio C Hamano
@ 2008-12-11 3:28 ` Linus Torvalds
2008-12-11 3:44 ` Linus Torvalds
2008-12-11 4:00 ` epic fsck SIGSEGV! (was Recovering from epic fail (deleted .git/objects/pack)) Boyd Stephen Smith Jr.
1 sibling, 2 replies; 22+ messages in thread
From: Linus Torvalds @ 2008-12-11 3:28 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: R. Tyler Ballance, Johannes Sixt, Junio C Hamano, git
On Wed, 10 Dec 2008, Nicolas Pitre wrote:
> On Wed, 10 Dec 2008, Linus Torvalds wrote:
>
> > But we should definitely fix this braindamage in fsck. Rather than
> > recursively walk the commits, we should add them to a commit list and just
> > walk the list iteratively.
>
> What about:
>
> http://marc.info/?l=git&m=122889563424786&w=2
Not very pretty. The basic notion is ok, but wouldn't it be nicer to at
least use a "struct object_array" instead?
Let me try to cook something up.
Linus
* Re: epic fsck SIGSEGV! (was Recovering from epic fail (deleted .git/objects/pack))
2008-12-11 3:28 ` epic fsck SIGSEGV! (was Recovering from epic fail (deleted .git/objects/pack)) Linus Torvalds
@ 2008-12-11 3:44 ` Linus Torvalds
2008-12-11 7:33 ` epic fsck SIGSEGV! Junio C Hamano
2008-12-11 7:53 ` Junio C Hamano
2008-12-11 4:00 ` epic fsck SIGSEGV! (was Recovering from epic fail (deleted .git/objects/pack)) Boyd Stephen Smith Jr.
1 sibling, 2 replies; 22+ messages in thread
From: Linus Torvalds @ 2008-12-11 3:44 UTC (permalink / raw)
To: Nicolas Pitre
Cc: R. Tyler Ballance, Johannes Sixt, Junio C Hamano,
Git Mailing List
On Wed, 10 Dec 2008, Linus Torvalds wrote:
>
> On Wed, 10 Dec 2008, Nicolas Pitre wrote:
>
> > On Wed, 10 Dec 2008, Linus Torvalds wrote:
> >
> > > But we should definitely fix this braindamage in fsck. Rather than
> > > recursively walk the commits, we should add them to a commit list and just
> > > walk the list iteratively.
> >
> > What about:
> >
> > http://marc.info/?l=git&m=122889563424786&w=2
>
> Not very pretty. The basic notion is ok, but wouldn't it be nicer to at
> least use a "struct object_array" instead?
>
> Let me try to cook something up.
I dunno. I like this patch better. It's a bit larger. I think it's a bit
more clearly separated (ie a "mark_object_reachable()" _literally_ just
puts the object on a list, and the whole traversal is a whole separate
phase), but I guess it's a matter of taste.
It has gotten no real testing. Caveat emptor. And I didn't even bother to
check that it can run with less stack or that it makes any other
difference.
Linus
---
 builtin-fsck.c |   38 +++++++++++++++++++++++++++++++-------
 1 files changed, 31 insertions(+), 7 deletions(-)

diff --git a/builtin-fsck.c b/builtin-fsck.c
index afded5e..297b2c4 100644
--- a/builtin-fsck.c
+++ b/builtin-fsck.c
@@ -64,11 +64,11 @@ static int fsck_error_func(struct object *obj, int type, const char *err, ...)
 	return (type == FSCK_WARN) ? 0 : 1;
 }
 
+static struct object_array pending;
+
 static int mark_object(struct object *obj, int type, void *data)
 {
-	struct tree *tree = NULL;
 	struct object *parent = data;
-	int result;
 
 	if (!obj) {
 		printf("broken link from %7s %s\n",
@@ -96,6 +96,20 @@ static int mark_object(struct object *obj, int type, void *data)
 		return 1;
 	}
 
+	add_object_array(obj, (void *) parent, &pending);
+	return 0;
+}
+
+static void mark_object_reachable(struct object *obj)
+{
+	mark_object(obj, OBJ_ANY, 0);
+}
+
+static int traverse_one_object(struct object *obj, struct object *parent)
+{
+	int result;
+	struct tree *tree = NULL;
+
 	if (obj->type == OBJ_TREE) {
 		obj->parsed = 0;
 		tree = (struct tree *)obj;
@@ -107,15 +121,22 @@ static int mark_object(struct object *obj, int type, void *data)
 		free(tree->buffer);
 		tree->buffer = NULL;
 	}
-	if (result < 0)
-		result = 1;
-
 	return result;
 }
 
-static void mark_object_reachable(struct object *obj)
+static int traverse_reachable(void)
 {
-	mark_object(obj, OBJ_ANY, 0);
+	int result = 0;
+	while (pending.nr) {
+		struct object_array_entry *entry;
+		struct object *obj, *parent;
+
+		entry = pending.objects + --pending.nr;
+		obj = entry->item;
+		parent = (struct object *) entry->name;
+		result |= traverse_one_object(obj, parent);
+	}
+	return !!result;
 }
 
 static int mark_used(struct object *obj, int type, void *data)
@@ -237,6 +258,9 @@ static void check_connectivity(void)
 {
 	int i, max;
 
+	/* Traverse the pending reachable objects */
+	traverse_reachable();
+
 	/* Look up all the requirements, warn about missing objects.. */
 	max = get_max_object_index();
 	if (verbose)
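
For anyone who wants to look at the two-phase idea outside of builtin-fsck.c, here is a minimal standalone sketch. It is not the git code: "struct node", "struct work_list", mark_reachable(), traverse_reachable() and make_chain() are hypothetical names made up for illustration, and error handling is reduced to assert(). The point is only that marking puts a node on a list and a separate loop drains the list, so the C stack depth no longer depends on how deep the history is.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for git's "struct object": a node with outgoing links. */
struct node {
	int seen;               /* set once the node has been queued */
	size_t nr_links;
	struct node **links;    /* nodes this one refers to */
};

/* Growable array used as the work list (plays the role of "pending"). */
struct work_list {
	struct node **items;
	size_t nr, alloc;
};

/* Phase 1: only record the node on the list; never recurse from here. */
static void mark_reachable(struct work_list *wl, struct node *n)
{
	if (!n || n->seen)
		return;
	n->seen = 1;
	if (wl->nr == wl->alloc) {
		wl->alloc = wl->alloc ? 2 * wl->alloc : 64;
		wl->items = realloc(wl->items, wl->alloc * sizeof(*wl->items));
		assert(wl->items);
	}
	wl->items[wl->nr++] = n;
}

/* Phase 2: drain the list in a loop; returns how many nodes were visited. */
static size_t traverse_reachable(struct work_list *wl)
{
	size_t visited = 0;

	while (wl->nr) {
		struct node *n = wl->items[--wl->nr];
		size_t i;

		visited++;
		for (i = 0; i < n->nr_links; i++)
			mark_reachable(wl, n->links[i]);
	}
	return visited;
}

/*
 * Build a linear chain of "len" nodes (len must be > 0), similar in shape
 * to a long single-parent history. The allocations are never freed; this
 * is a sketch, not production code.
 */
static struct node *make_chain(size_t len)
{
	struct node *nodes = calloc(len, sizeof(*nodes));
	struct node **links = calloc(len, sizeof(*links));
	size_t i;

	assert(nodes && links);
	for (i = 0; i + 1 < len; i++) {
		links[i] = &nodes[i + 1];
		nodes[i].links = &links[i];
		nodes[i].nr_links = 1;
	}
	return nodes;
}
```

Calling mark_reachable() on the head of a chain from make_chain() and then traverse_reachable() walks every node without growing the call stack; a recursive walk over the same chain would need one stack frame per node.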
* Re: epic fsck SIGSEGV! (was Recovering from epic fail (deleted .git/objects/pack))
2008-12-11 3:28 ` epic fsck SIGSEGV! (was Recovering from epic fail (deleted .git/objects/pack)) Linus Torvalds
2008-12-11 3:44 ` Linus Torvalds
@ 2008-12-11 4:00 ` Boyd Stephen Smith Jr.
1 sibling, 0 replies; 22+ messages in thread
From: Boyd Stephen Smith Jr. @ 2008-12-11 4:00 UTC (permalink / raw)
To: git
Cc: Linus Torvalds, Nicolas Pitre, R. Tyler Ballance, Johannes Sixt,
Junio C Hamano
On Wednesday 2008 December 10 21:28:15 Linus Torvalds wrote:
>On Wed, 10 Dec 2008, Nicolas Pitre wrote:
>> http://marc.info/?l=git&m=122889563424786&w=2
>
>Not very pretty. The basic notion is ok, but wouldn't it be nicer to at
>least use a "struct object_array" instead?
As Junio pointed out, we may want to make similar changes with other calls in
fsck_walk with the function itself as a callback. It might even make sense
to have a fsck_walk_full that handles managing the object_array itself.
While we are making changes, there appears to be a copy and paste error from
line 74 to line 76 -- the second "broken link from" should probably be "
to".
I'd have already submitted a patch for that, but I can't figure out how to
tell kmail to not do quoted-printable. :( [And, if I can beat this client
into submission I will.]
Linus, sorry about the reply with no snipping or original content. I
mis-clicked. :(
--
Boyd Stephen Smith Jr. ,= ,-_-. =.
bss03@volumehost.net ((_/)o o(\_))
ICQ: 514984 YM/AIM: DaTwinkDaddy `-'(. .)`-'
http://iguanasuicide.org/ \_/
* Re: epic fsck SIGSEGV!
2008-12-11 3:44 ` Linus Torvalds
@ 2008-12-11 7:33 ` Junio C Hamano
2008-12-11 17:33 ` Linus Torvalds
2008-12-11 7:53 ` Junio C Hamano
1 sibling, 1 reply; 22+ messages in thread
From: Junio C Hamano @ 2008-12-11 7:33 UTC (permalink / raw)
To: Linus Torvalds
Cc: Nicolas Pitre, R. Tyler Ballance, Johannes Sixt, Git Mailing List
Linus Torvalds <torvalds@linux-foundation.org> writes:
> I dunno. I like this patch better. It's a bit larger. I think it's a bit
> more clearly separated (ie a "mark_object_reachable()" _literally_ just
> puts the object on a list, and the whole traversal is a whole separate
> phase), but I guess it's a matter of taste.
... which happens to match mine in this case ;-)
I'll consider this signed-off and do the usual forging (for people new on
the list, Cf. http://article.gmane.org/gmane.comp.version-control.git/19031).
* Re: epic fsck SIGSEGV!
2008-12-11 3:44 ` Linus Torvalds
2008-12-11 7:33 ` epic fsck SIGSEGV! Junio C Hamano
@ 2008-12-11 7:53 ` Junio C Hamano
1 sibling, 0 replies; 22+ messages in thread
From: Junio C Hamano @ 2008-12-11 7:53 UTC (permalink / raw)
To: Linus Torvalds
Cc: Nicolas Pitre, R. Tyler Ballance, Johannes Sixt, Git Mailing List
Linus Torvalds <torvalds@linux-foundation.org> writes:
> It has gotten no real testing. Caveat emptor. And I didn't even bother to
> check that it can run with less stack or that it makes any other
> difference.
A quick "git fsck --full" in a copy of git.git (eh, "not-so-quick" on a
not-so-quick machine, obviously) shows that the patch does reduce minor
faults significantly.
(without patch)
83.03user 0.60system 1:23.62elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+158275minor)pagefaults 0swaps
(with object_array patch)
82.88user 0.40system 1:23.28elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+95397minor)pagefaults 0swaps
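
The minor-fault figures above drop from 158275 to 95397, roughly 40% fewer. For anyone who wants to watch that counter from inside a program rather than through time(1), the following is a small sketch, assuming a POSIX system where getrusage() fills in ru_minflt. minor_faults() and faults_from_touching() are hypothetical helper names, not anything in git, and the exact counts depend on the kernel and the allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/resource.h>

/* Minor page faults charged to this process so far, or -1 on error. */
static long minor_faults(void)
{
	struct rusage ru;

	if (getrusage(RUSAGE_SELF, &ru) != 0)
		return -1;
	return ru.ru_minflt;
}

/*
 * Allocate a buffer, write to every byte so its pages have to be mapped
 * in, and return how many minor faults were charged while doing so.
 * "bytes" must be > 0.
 */
static long faults_from_touching(size_t bytes)
{
	long before = minor_faults();
	char *buf = malloc(bytes);
	volatile char sink;
	long after;

	assert(buf);
	memset(buf, 1, bytes);
	sink = buf[bytes - 1];  /* read back so the writes are not optimized away */
	(void)sink;
	after = minor_faults();
	free(buf);
	return after - before;
}
```

A deep recursion touches fresh stack pages in much the same way as the memset() here touches fresh heap pages, which is one plausible reason the iterative version faults less.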
* Re: epic fsck SIGSEGV!
2008-12-11 7:33 ` epic fsck SIGSEGV! Junio C Hamano
@ 2008-12-11 17:33 ` Linus Torvalds
2008-12-11 20:18 ` Linus Torvalds
0 siblings, 1 reply; 22+ messages in thread
From: Linus Torvalds @ 2008-12-11 17:33 UTC (permalink / raw)
To: Junio C Hamano
Cc: Nicolas Pitre, R. Tyler Ballance, Johannes Sixt, Git Mailing List
On Wed, 10 Dec 2008, Junio C Hamano wrote:
>
> I'll consider this signed-off and do the usual forging
Yea. I've even tested it a bit now:
[torvalds@nehalem git]$ ulimit -s 1024
[torvalds@nehalem git]$ git fsck --full
Segmentation fault
[torvalds@nehalem git]$ ./git-fsck --full
dangling commit 3d00b49495ceff119de52dc5443731e2d8d84b6b
dangling commit 4e0a3c7de9af3cbb53cc421329f0579679edbb51
...
so it does seem to fix the issue, and the patch looks safe enough.
It passes all the tests, and works fine on the kernel repo too (ugh, four
minutes! I used to run git-fsck religiously every day back in the early
days, now I realized that I must not have done so in _months_, and my
kernel tree has grown and so has fsck time).
But obviously the true test for fsck is some complex corruption, and I
didn't test that. I can't imagine that it introduces any new problems
though - but the bugs you can't imagine are always the worst ones ;)
Linus
* Re: epic fsck SIGSEGV!
2008-12-11 17:33 ` Linus Torvalds
@ 2008-12-11 20:18 ` Linus Torvalds
0 siblings, 0 replies; 22+ messages in thread
From: Linus Torvalds @ 2008-12-11 20:18 UTC (permalink / raw)
To: Junio C Hamano
Cc: Nicolas Pitre, R. Tyler Ballance, Johannes Sixt, Git Mailing List
On Thu, 11 Dec 2008, Linus Torvalds wrote:
>
> But obviously the true test for fsck is some complex corruption, and I
> didn't test that. I can't imagine that it introduces any new problems
> though - but the bugs you can't imagine are always the worst ones ;)
Btw, even if it doesn't introduce any bugs, it _does_ change the order
that we traverse things in. It shouldn't matter, of course, but because it
always picks the last entry from the object array (it really treats the
array as a stack), it ends up traversing parents of commits (and the
entries in trees) by looking at the last parent (or entry) first.
The whole two-phase thing also means that rather than traverse the references
as we find them, we'll end up traversing things later in one group. Again,
access ordering will change.
Absolutely nothing should care about this from a correctness angle, of
course, but I thought I'd point it out because I think it will change the
order that we print out errors in.
So if somebody has some test-case, and you get different output
before-and-after, it's not necessarily any indication of a problem, just
an effect of doing object traversal in slightly different order.
Linus
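
To make the ordering point concrete, here is a tiny sketch of why popping from the end of the array visits the last parent first. It assumes nothing about git internals: "struct label_stack", push(), pop() and visit_order() are hypothetical names, and single characters stand in for objects.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 16

/* Tiny fixed-size stack of labels; a stand-in for the pending object array. */
struct label_stack {
	char items[MAX_PENDING];
	size_t nr;
};

static void push(struct label_stack *s, char label)
{
	assert(s->nr < MAX_PENDING);
	s->items[s->nr++] = label;
}

/* Pop always takes the most recently added entry, like "--pending.nr". */
static char pop(struct label_stack *s)
{
	assert(s->nr > 0);
	return s->items[--s->nr];
}

/*
 * Queue the parents in their listed order, then record the order in which
 * they come back out. "out" needs room for one byte per parent plus a NUL.
 */
static void visit_order(const char *parents, char *out)
{
	struct label_stack s = { {0}, 0 };
	size_t i, n = 0;

	for (i = 0; parents[i]; i++)
		push(&s, parents[i]);
	while (s.nr)
		out[n++] = pop(&s);
	out[n] = '\0';
}
```

With parents queued as "ABC", visit_order() yields "CBA": the same set of objects is visited, only in reverse order, which is why error messages may come out in a different sequence.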