* Kernel failing to boot when compressed with bzip2
@ 2013-01-20 21:55 Thomas Capricelli
2013-01-30 13:58 ` Rob Landley
0 siblings, 1 reply; 3+ messages in thread
From: Thomas Capricelli @ 2013-01-20 21:55 UTC (permalink / raw)
To: linux-kernel
Hello,
I have run into a weird behaviour that seems important enough to
report on lkml. I have compiled my own kernels since ~1994 and never had
this bug before. There are actually two problems:
1st problem
Since around September 2012, i have tried to compile my kernel 3.6.x
with gcc-4.7. It failed this way: just after grub loaded the kernel,
after displaying "Decompressing Linux", the computer rebooted. The exact
same kernel (same .config) compiled with gcc-4.6 would boot perfectly. I
did several tests, it was all very reproducible.
I didn't pay much attention and just stuck with gcc-4.6 when compiling my
kernel. I thought it would get fixed with time.
2nd problem
Then the kernel 3.7 was released. I started testing with 3.7.1, did the
usual stuff and obviously, I used gcc-4.6, because of the previous
problem. Guess what, I've had _exactly_ the same behaviour. I double
checked and again I could reproduce it very well:
kernel 3.6 / gcc 4.6 -> OK
kernel 3.7 / gcc 4.6 -> reboot after printing "Decompressing Linux"
kernel 3.6 / gcc 4.7 -> reboot after printing "Decompressing Linux"
It took some time to find out the source of the problem, which will
probably feel obvious to many of you: a long time ago, i had switched
from using CONFIG_KERNEL_GZIP to using CONFIG_KERNEL_BZIP2, probably
just for the fun of testing it. I can't remember when but it was
probably very long ago (2 years ?).
Testing confirmed: going back to CONFIG_KERNEL_GZIP fixed it all and I
could not only use gcc-4.7 but also kernel 3.7.
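For reference, a minimal sketch of the switch described above. It assumes the usual .config conventions, where an enabled option reads `CONFIG_X=y` and a disabled one reads `# CONFIG_X is not set`. The helper name `switch_compression` and the sample text are hypothetical and not part of the kernel tree; in practice the same change is normally made through `make menuconfig`.

```python
def switch_compression(config_text, old="BZIP2", new="GZIP"):
    """Disable CONFIG_KERNEL_<old> and enable CONFIG_KERNEL_<new>.

    Only exact matching lines are rewritten; if the <new> option is not
    listed as "is not set", nothing is enabled. All other lines (such as
    CONFIG_DECOMPRESS_BZIP2) are left untouched.
    """
    out = []
    for line in config_text.splitlines():
        if line == f"CONFIG_KERNEL_{old}=y":
            out.append(f"# CONFIG_KERNEL_{old} is not set")
        elif line == f"# CONFIG_KERNEL_{new} is not set":
            out.append(f"CONFIG_KERNEL_{new}=y")
        else:
            out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical excerpt of a .config that selects bzip2 compression.
sample = """\
# CONFIG_KERNEL_GZIP is not set
CONFIG_KERNEL_BZIP2=y
# CONFIG_KERNEL_LZMA is not set
CONFIG_DECOMPRESS_BZIP2=y
"""

print(switch_compression(sample), end="")
```

The rewritten text would still need to go through the normal kernel configuration step so that dependent options are regenerated before rebuilding.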
So my guess is that there's something badly broken in the bzip2 kernel
decompressing code.. ? There's both a regression between kernel 3.6 and
3.7, and a problem with gcc-4.7.
Here is some more information; just ask if you need more. I can
even do some testing, but you'll need to cc: me as I'm no longer on lkml.
* the cpu is "AMD Athlon(tm) II X4 620 Processor" as reported by
/proc/cpuinfo
* CONFIG_DECOMPRESS_BZIP2=y was set on all my tests (not sure it's relevant)
* the last 3.6 kernel tested was 3.6.11
* 3.7 kernels tested were 3.7.1, 3.7.2, and 3.7.3
* the computer will reboot really fast just after the kernel kprinted
"Decompressing Linux", nothing is kprinted after this.
best regards,
--
Thomas Capricelli <orzel@freehackers.org>
http://www.freehackers.org/thomas/
* Re: Kernel failing to boot when compressed with bzip2
From: Rob Landley @ 2013-01-30 13:58 UTC (permalink / raw)
To: Thomas Capricelli; +Cc: linux-kernel
Has there been any follow-up on this? I've had this flagged to follow
replies to the thread for a while, but didn't see any.
On 01/20/2013 03:55:05 PM, Thomas Capricelli wrote:
> 1st problem
> Since around September 2012, i have tried to compile my kernel 3.6.x
> with gcc-4.7.
I can't do anything about the gcc guys going insane, switching the
license to GPLv3, rewriting their code in C++, and losing all the
pragmatists to other compiler projects.
Sorry the FSF is nuts, EGCS 2 is apparently called "LLVM". (Or PCC. Or
possibly Open64. Yeah, we might be better off with just one, but the
leading alternative is also in C++, so unification behind it isn't
quite happening yet.)
> So my guess is that there's something badly broken in the bzip2 kernel
> decompressing code.. ? There's both a regression between kernel 3.6
> and
> 3.7, and a problem with gcc-4.7.
This is the bit I'm interested in: I wrote the original bunzip2 code
the kernel wound up sucking in (lib/decompress_bunzip2.c "it Came From
BusyBox!" *dramatic*chord*), and it Worked For Me (tm).
According to git log it hasn't been touched in-kernel since 2011, so
something subtle's going on. I just built an i686 kernel with it...
(Wait for slow netbook to catch up...)
BZIP2 arch/x86/boot/compressed/vmlinux.bin.bz2
Wheee. Plug that kernel into my Aboriginal Linux project and use
run-emulator.sh to boot it under qemu-system-i386, and:
VFS: Mounted root (squashfs filesystem) readonly on device 3:0.
Freeing unused kernel memory: 216k freed
mount: mounting /dev/hdc on /mnt failed: No medium found
Not using distcc.
Type exit when done.
(i686:1) /home # e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex,
Flow Control: RX
That's a shell prompt. (Well, it's a shell prompt with the ethernet
driver barfing on it, but that's just delayed driver loading.
Parallelism! It's what's for dinner.)
Worked For Me. I note that I built with gcc 4.2.1 and binutils 2.17
(the last GPLv2 releases), and it's for i686, and it was with current
-git (as of yesterday), and I'm probably facing a different direction
than you are relative to magnetic north...
Need more info to reproduce this, sorry.
> Here are some more information, just ask if you need some more. I can
> even do some testing, but you'll need to cc: me as i'm not anymore on
> lkml.
I really, really, really miss kernel-traffic. And Jon Masters isn't
doing the kernel podcast anymore since he eloped with an Arm board. (As
soon as he got the debug results back they had a quiet, tasteful ceremony
in front of a compiler and a linker.)
> * the cpu is "AMD Athlon(tm) II X4 620 Processor" as reported by
> /proc/cpuinfo
> * CONFIG_DECOMPRESS_BZIP2=y was set on all my tests (not sure it's
> relevant)
> * the last 3.6 kernel tested was 3.6.11
> * 3.7 kernels tested were 3.7.1, 3.7.2, and 3.7.3
> * the computer will reboot really fast just after the kernel kprinted
> "Decompressing Linux", nothing is kprinted after this.
95% chance it's a bad memory access. If the kernel does a bad memory
access before the page fault handler is set up the interrupt turns into
a reboot, and during initial kernel decompression we haven't so much
got page tables set up as a couple TLB entries held in place with twist
ties. (The code to do a better job is in the compressed payload, the
setup code gets thrown away so it's as simple as possible. Occasionally
simpler, but I don't think that's the case here.)
It's also got a tiny, fixed size stack. My first wild-ass-guess is your
compiler's decided your C code really needs exception handling because
it can't tell the difference between C and C++ (reasoning: a mud pie
contains a glass of water, therefore mud pies are an excellent
beverage), and is crapping stuff on the stack.
That probably isn't it.
Rob
* Re: Kernel failing to boot when compressed with bzip2
From: Thomas Capricelli @ 2013-01-30 21:15 UTC (permalink / raw)
To: Rob Landley; +Cc: linux-kernel
On 30/01/2013 14:58, Rob Landley wrote:
>
> On 01/20/2013 03:55:05 PM, Thomas Capricelli wrote:
>
>> So my guess is that there's something badly broken in the bzip2 kernel
>> decompressing code.. ? There's both a regression between kernel 3.6 and
>> 3.7, and a problem with gcc-4.7.
> Worked For Me. I note that I built with gcc 4.2.1 and binutils 2.17
> (the last GPLv2 releases), and it's for i686
I guess that's the main point: my bug appears on AMD64, not i686, and I
can trigger it with either gcc-4.6 or gcc-4.7. I'm pretty sure it still
works with a gcc as old as 4.2.1.
I use binutils 2.23.1 but i doubt it matters here.
Again, amd64 using vanilla kernel releases (3.x.y)
linux-3.6 with gcc-4.6 : ok
linux-3.6 with gcc-4.7 : fail
linux-3.7 with gcc-4.6 : fail
linux-3.7 with gcc-4.7 : fail
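As a rough sketch, the four results above can be written down as a small table to make the pattern explicit: starting from the one combination that boots, changing either the kernel or the compiler alone is enough to break it. The data comes straight from the list above; the function names `passing` and `fails_alone` are hypothetical and only serve the illustration.

```python
# Boot results as reported above: (kernel, gcc) -> did it boot?
results = {
    ("3.6", "4.6"): True,
    ("3.6", "4.7"): False,
    ("3.7", "4.6"): False,
    ("3.7", "4.7"): False,
}


def passing(results):
    """Return the sorted list of (kernel, gcc) combinations that booted."""
    return sorted(combo for combo, ok in results.items() if ok)


def fails_alone(results, baseline, axis):
    """True if changing only one axis (0 = kernel, 1 = gcc) away from
    the passing baseline produces a combination that fails to boot."""
    other = 1 - axis
    return any(
        not ok
        for combo, ok in results.items()
        if combo[axis] != baseline[axis] and combo[other] == baseline[other]
    )


baseline = passing(results)[0]
print("baseline:", baseline)
print("newer kernel alone breaks boot:", fails_alone(results, baseline, 0))
print("newer gcc alone breaks boot:", fails_alone(results, baseline, 1))
```

Read this way, the table does not point at the kernel or the compiler individually, which fits the suspicion that the common factor is the bzip2 decompression path.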
I've had this BUNZIP2 option for very long, so i'm sure it was working
well with previous versions of both gcc and linux kernel.
greetings,
Thomas
--
Thomas Capricelli <orzel@freehackers.org>
http://www.freehackers.org/thomas/