* Kernel failing to boot when compressed with bzip2
@ 2013-01-20 21:55 Thomas Capricelli
2013-01-30 13:58 ` Rob Landley
0 siblings, 1 reply; 3+ messages in thread
From: Thomas Capricelli @ 2013-01-20 21:55 UTC (permalink / raw)
To: linux-kernel
Hello,
I have run into some weird behaviour that seems important enough to
report on lkml. I have compiled my own kernels since ~1994 and never had
this bug before. There are actually two problems:
1st problem
Since around September 2012, I have been trying to compile my 3.6.x
kernel with gcc-4.7. It failed this way: just after grub loaded the
kernel and "Decompressing Linux" was displayed, the computer rebooted.
The exact same kernel (same .config) compiled with gcc-4.6 would boot
perfectly. I did several tests, and it was all very reproducible.
I didn't pay much attention and just stuck with gcc-4.6 when compiling my
kernel. I thought it would just get fixed with time.
2nd problem
Then kernel 3.7 was released. I started testing with 3.7.1, did the
usual stuff and, obviously, used gcc-4.6 because of the previous
problem. Guess what, I've had _exactly_ the same behaviour. I double
checked and again I could reproduce it very well:
kernel 3.6 / gcc 4.6 -> OK
kernel 3.7 / gcc 4.6 -> reboot after printing "Decompressing Linux"
kernel 3.6 / gcc 4.7 -> reboot after printing "Decompressing Linux"
It took some time to find out the source of the problem, which will
probably feel obvious to many of you: a long time ago, i had switched
from using CONFIG_KERNEL_GZIP to using CONFIG_KERNEL_BZIP2, probably
just for the fun of testing it. I can't remember when but it was
probably very long ago (2 years ?).
Testing confirmed it: going back to CONFIG_KERNEL_GZIP fixed it all, and
I could use not only gcc-4.7 but also kernel 3.7.
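For readers who want to confirm which compressor a given build selects, here is a minimal Python sketch, not part of the original report. The helper name `selected_compressors` is hypothetical, and the list of candidate options (GZIP, BZIP2, LZMA, XZ, LZO) is an assumption about what the tree offers; only GZIP and BZIP2 are mentioned in this thread.

```python
from typing import List

# Assumed set of kernel image compression options; adjust for your tree.
CANDIDATES = ("GZIP", "BZIP2", "LZMA", "XZ", "LZO")


def selected_compressors(config_text: str) -> List[str]:
    """Return the CONFIG_KERNEL_<name> options set to 'y' in a .config."""
    enabled = []
    for line in config_text.splitlines():
        line = line.strip()
        for name in CANDIDATES:
            # Lines such as "# CONFIG_KERNEL_GZIP is not set" do not match.
            if line == f"CONFIG_KERNEL_{name}=y":
                enabled.append(name)
    return enabled
```

Feeding it the contents of the .config used for a failing build should report BZIP2, and GZIP for a working one, if the diagnosis above holds.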
So my guess is that there's something badly broken in the bzip2 kernel
decompressing code.. ? There's both a regression between kernel 3.6 and
3.7, and a problem with gcc-4.7.
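One way to tell a bad compressed image apart from a bad in-kernel decompressor is to check the bzip2 payload in user space. The following is a minimal Python sketch under stated assumptions, not something from the original report: `check_bz2_payload` is a hypothetical helper, and it assumes the build may append a few bytes (for example an uncompressed-size field) after the bzip2 stream, which it simply counts as trailing data.

```python
import bz2
from typing import Tuple


def check_bz2_payload(data: bytes) -> Tuple[int, int]:
    """Decompress one bzip2 stream.

    Returns (decompressed size, number of trailing bytes after the stream).
    Raises OSError if the stream is corrupt, ValueError if it is truncated.
    """
    decomp = bz2.BZ2Decompressor()
    out = decomp.decompress(data)
    if not decomp.eof:
        raise ValueError("bzip2 stream ended before its end-of-stream marker")
    # Bytes found after the end-of-stream marker are kept in unused_data.
    return len(out), len(decomp.unused_data)
```

If the payload produced by a failing build decompresses cleanly here, the compressed data itself is probably fine and the suspicion shifts to the code that runs at boot.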
Here is some more information; just ask if you need more. I can even do
some testing, but you'll need to cc: me as I'm no longer on lkml.
* the cpu is "AMD Athlon(tm) II X4 620 Processor" as reported by
/proc/cpuinfo
* CONFIG_DECOMPRESS_BZIP2=y was set on all my tests (not sure it's relevant)
* the last 3.6 kernel tested was 3.6.11
* 3.7 kernels tested were 3.7.1, 3.7.2, and 3.7.3
* the computer reboots really fast just after the kernel has printed
"Decompressing Linux"; nothing is printed after this.
best regards,
--
Thomas Capricelli <orzel@freehackers.org>
http://www.freehackers.org/thomas/
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: Kernel failing to boot when compressed with bzip2
2013-01-20 21:55 Kernel failing to boot when compressed with bzip2 Thomas Capricelli
@ 2013-01-30 13:58 ` Rob Landley
2013-01-30 21:15 ` Thomas Capricelli
0 siblings, 1 reply; 3+ messages in thread
From: Rob Landley @ 2013-01-30 13:58 UTC (permalink / raw)
To: Thomas Capricelli; +Cc: linux-kernel

Has there been any follow-up on this? I've had this flagged to follow
replies to the thread for a while, but didn't see any.

On 01/20/2013 03:55:05 PM, Thomas Capricelli wrote:
> 1st problem
> Since around September 2012, i have tried to compile my kernel 3.6.x
> with gcc-4.7.

I can't do anything about the gcc guys going insane, switching the
license to GPLv3, rewriting their code in C++, and losing all the
pragmatists to other compiler projects. Sorry the FSF is nuts, EGCS 2 is
apparently called "LLVM". (Or PCC. Or possibly Open64. Yeah, we might be
better off with just one, but the leading alternative is also in C++, so
unification behind it isn't quite happening yet.)

> So my guess is that there's something badly broken in the bzip2 kernel
> decompressing code.. ? There's both a regression between kernel 3.6
> and
> 3.7, and a problem with gcc-4.7.

This is the bit I'm interested in: I wrote the original bunzip2 code the
kernel wound up sucking in (lib/decompress_bunzip2.c "it Came From
BusyBox!" *dramatic*chord*), and it Worked For Me (tm). According to git
log it hasn't been touched in-kernel since 2011, so something subtle's
going on.

I just built an i686 kernel with it... (Wait for slow netbook to catch
up...)

  BZIP2 arch/x86/boot/compressed/vmlinux.bin.bz2

Wheee. Plug that kernel into my Aboriginal Linux project and use
run-emulator.sh to boot it under qemu-system-i386, and:

  VFS: Mounted root (squashfs filesystem) readonly on device 3:0.
  Freeing unused kernel memory: 216k freed
  mount: mounting /dev/hdc on /mnt failed: No medium found
  Not using distcc.
  Type exit when done.
  (i686:1) /home # e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX

That's a shell prompt. (Well, it's a shell prompt with the ethernet
driver barfing on it, but that's just delayed driver loading.
Parallelism! It's what's for dinner.)

Worked For Me. I note that I built with gcc 4.2.1 and binutils 2.17 (the
last GPLv2 releases), and it's for i686, and it was with current -git (as
of yesterday), and I'm probably facing a different direction than you are
relative to magnetic north...

Need more info to reproduce this, sorry.

> Here are some more information, just ask if you need some more. I can
> even do some testing, but you'll need to cc: me as i'm not anymore on
> lkml.

I really, really, really miss kernel-traffic. And Jon Masters isn't doing
the kernel podcast anymore since he eloped with an Arm board. (As soon as
he got the debug results back they had a quiet, tasteful ceremony in
front of a compiler and a linker.)

> * the cpu is "AMD Athlon(tm) II X4 620 Processor" as reported by
> /proc/cpuinfo
> * CONFIG_DECOMPRESS_BZIP2=y was set on all my tests (not sure it's
> relevant)
> * the last 3.6 kernel tested was 3.6.11
> * 3.7 kernels tested were 3.7.1, 3.7.2, and 3.7.3
> * the computer will reboot really fast just after the kernel kprinted
> "Decompressing Linux", nothing is kprinted after this.

95% chance it's a bad memory access. If the kernel does a bad memory
access before the page fault handler is set up the interrupt turns into a
reboot, and during initial kernel decompression we haven't so much got
page tables set up as a couple TLB entries held in place with twist ties.
(The code to do a better job is in the compressed payload, the setup code
gets thrown away so it's as simple as possible. Occasionally simpler, but
I don't think that's the case here.) It's also got a tiny, fixed size
stack.

My first wild-ass-guess is your compiler's decided your C code really
needs exception handling because it can't tell the difference between C
and C++ (reasoning: a mud pie contains a glass of water, therefore mud
pies are an excellent beverage), and is crapping stuff on the stack.

That probably isn't it.

Rob
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: Kernel failing to boot when compressed with bzip2
2013-01-30 13:58 ` Rob Landley
@ 2013-01-30 21:15 ` Thomas Capricelli
0 siblings, 0 replies; 3+ messages in thread
From: Thomas Capricelli @ 2013-01-30 21:15 UTC (permalink / raw)
To: Rob Landley; +Cc: linux-kernel

On 30/01/2013 14:58, Rob Landley wrote:
>
> On 01/20/2013 03:55:05 PM, Thomas Capricelli wrote:
>
>> So my guess is that there's something badly broken in the bzip2 kernel
>> decompressing code.. ? There's both a regression between kernel 3.6 and
>> 3.7, and a problem with gcc-4.7.
> Worked For Me. I note that I built with gcc 4.2.1 and binutils 2.17
> (the last GPLv2 releases), and it's for i686

I guess that's the main point: my bug appears on AMD64, not i686, and I
can trigger it with either gcc-4.6 or gcc-4.7. I'm pretty sure it still
works with such an old gcc as 4.2.1. I use binutils 2.23.1 but I doubt
it matters here.

Again, amd64 using vanilla kernel releases (3.x.y):

linux-3.6 with gcc-4.6 : ok
linux-3.6 with gcc-4.7 : fail
linux-3.7 with gcc-4.6 : fail
linux-3.7 with gcc-4.7 : fail

I've had this BUNZIP2 option for very long, so I'm sure it was working
well with previous versions of both gcc and the linux kernel.

greetings,
Thomas
--
Thomas Capricelli <orzel@freehackers.org>
http://www.freehackers.org/thomas/
^ permalink raw reply [flat|nested] 3+ messages in thread