public inbox for linux-kernel@vger.kernel.org
* Kernel failing to boot when compressed with bzip2
@ 2013-01-20 21:55 Thomas Capricelli
  2013-01-30 13:58 ` Rob Landley
  0 siblings, 1 reply; 3+ messages in thread
From: Thomas Capricelli @ 2013-01-20 21:55 UTC (permalink / raw)
  To: linux-kernel


Hello,

I have run into some weird behaviour that seems important enough to
report on lkml. I have compiled my own kernels since ~1994 and never had
this bug before. There are actually two problems:

1st problem
Since around September 2012, I have tried to compile my 3.6.x kernel
with gcc-4.7. It failed this way: just after grub loaded the kernel,
after displaying "Decompressing Linux", the computer rebooted. The exact
same kernel (same .config) compiled with gcc-4.6 would boot perfectly. I
did several tests, and it was all very reproducible.
I didn't pay much attention and just stuck with gcc-4.6 when compiling my
kernel. I thought it would just get fixed with time.

2nd problem
Then kernel 3.7 was released. I started testing with 3.7.1, did the
usual stuff and obviously, I used gcc-4.6, because of the previous
problem. Guess what, I've had _exactly_ the same behaviour. I double
checked and again I could reproduce it very well:
    kernel 3.6 / gcc 4.6 -> OK
    kernel 3.7 / gcc 4.6 -> reboot after printing "Decompressing Linux"
    kernel 3.6 / gcc 4.7 -> reboot after printing "Decompressing Linux"

It took some time to find the source of the problem, which will
probably feel obvious to many of you: a long time ago, I had switched
from using CONFIG_KERNEL_GZIP to using CONFIG_KERNEL_BZIP2, probably
just for the fun of testing it. I can't remember when, but it was
probably very long ago (2 years?).

Testing confirmed: going back to CONFIG_KERNEL_GZIP fixed it all, and I
could use not only gcc-4.7 but also kernel 3.7.
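
A minimal sketch of how one might confirm which kernel compression
option a given .config actually selects before comparing builds. The
helper name `selected_compression` and the sample text are hypothetical;
the CONFIG_KERNEL_* names are the x86 compression choices I believe
exist in kernels of this era.

```python
import re

# Kconfig symbols for the kernel image compression choice (assumed list
# for 3.x-era x86 kernels; adjust for the tree being tested).
COMPRESSION_OPTIONS = (
    "CONFIG_KERNEL_GZIP",
    "CONFIG_KERNEL_BZIP2",
    "CONFIG_KERNEL_LZMA",
    "CONFIG_KERNEL_XZ",
    "CONFIG_KERNEL_LZO",
)


def selected_compression(config_text):
    """Return the compression options that are set to 'y' in a .config."""
    selected = []
    for line in config_text.splitlines():
        match = re.fullmatch(r"(CONFIG_KERNEL_[A-Z0-9]+)=y", line.strip())
        if match and match.group(1) in COMPRESSION_OPTIONS:
            selected.append(match.group(1))
    return selected


# Hypothetical .config fragment resembling the failing setup.
sample = """\
# CONFIG_KERNEL_GZIP is not set
CONFIG_KERNEL_BZIP2=y
# CONFIG_KERNEL_LZMA is not set
CONFIG_DECOMPRESS_BZIP2=y
"""

print(selected_compression(sample))
```

Note that CONFIG_DECOMPRESS_BZIP2 is deliberately ignored here: it does
not match the CONFIG_KERNEL_* pattern, so only the image compression
choice is reported.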

So my guess is that there's something badly broken in the bzip2 kernel
decompressing code.. ? There's both a regression between kernel 3.6 and
3.7, and a problem with gcc-4.7.

Here is some more information, just ask if you need more. I can even
do some testing, but you'll need to cc: me as I'm no longer on lkml.

* the cpu is "AMD Athlon(tm) II X4 620 Processor" as reported by
/proc/cpuinfo
* CONFIG_DECOMPRESS_BZIP2=y was set on all my tests (not sure it's relevant)
* the last 3.6 kernel tested was 3.6.11
* 3.7 kernels tested were 3.7.1, 3.7.2, and 3.7.3
* the computer will reboot really fast just after the kernel kprinted
"Decompressing Linux", nothing is kprinted after this.

best regards,

-- 
Thomas Capricelli <orzel@freehackers.org>
http://www.freehackers.org/thomas/



* Re: Kernel failing to boot when compressed with bzip2
  2013-01-20 21:55 Kernel failing to boot when compressed with bzip2 Thomas Capricelli
@ 2013-01-30 13:58 ` Rob Landley
  2013-01-30 21:15   ` Thomas Capricelli
  0 siblings, 1 reply; 3+ messages in thread
From: Rob Landley @ 2013-01-30 13:58 UTC (permalink / raw)
  To: Thomas Capricelli; +Cc: linux-kernel

Has there been any follow-up on this? I've had this flagged to follow  
replies to the thread for a while, but didn't see any.

On 01/20/2013 03:55:05 PM, Thomas Capricelli wrote:
> 1st problem
> Since around September 2012, i have tried to compile my kernel 3.6.x
> with gcc-4.7.

I can't do anything about the gcc guys going insane, switching the  
license to GPLv3, rewriting their code in C++, and losing all the  
pragmatists to other compiler projects.

Sorry the FSF is nuts, EGCS 2 is apparently called "LLVM". (Or PCC.  Or  
possibly Open64. Yeah, we might be better off with just one, but the  
leading alternative is also in C++, so unification behind it isn't  
quite happening yet.)

> So my guess is that there's something badly broken in the bzip2 kernel
> decompressing code.. ? There's both a regression between kernel 3.6  
> and
> 3.7, and a problem with gcc-4.7.

This is the bit I'm interested in: I wrote the original bunzip2 code  
the kernel wound up sucking in (lib/decompress_bunzip2.c "it Came From  
BusyBox!" *dramatic*chord*), and it Worked For Me (tm).

According to git log it hasn't been touched in-kernel since 2011, so  
something subtle's going on. I just built an i686 kernel with it...

(Wait for slow netbook to catch up...)

   BZIP2   arch/x86/boot/compressed/vmlinux.bin.bz2

Wheee. Plug that kernel into my Aboriginal Linux project and use  
run-emulator.sh to boot it under qemu-system-i386, and:

   VFS: Mounted root (squashfs filesystem) readonly on device 3:0.
   Freeing unused kernel memory: 216k freed
   mount: mounting /dev/hdc on /mnt failed: No medium found
   Not using distcc.
   Type exit when done.
   (i686:1) /home # e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex,  
Flow Control: RX

That's a shell prompt. (Well, it's a shell prompt with the ethernet  
driver barfing on it, but that's just delayed driver loading.  
Parallelism! It's what's for dinner.)

Worked For Me. I note that I built with gcc 4.2.1 and binutils 2.17  
(the last GPLv2 releases), and it's for i686, and it was with current  
-git (as of yesterday), and I'm probably facing a different direction  
than you are relative to magnetic north...

Need more info to reproduce this, sorry.
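
One cheap host-side check that might help narrow this down is to
confirm that the compressed payload itself is a valid bzip2 stream,
which would point the finger at the boot-time decompressor rather than
the build. The sketch below assumes the payload is a single bzip2
stream followed by a 4-byte little-endian uncompressed size; that
layout is my assumption about how the build appends the size, so check
it against your tree. The function name `check_bz2_payload` is
hypothetical, and the data is a synthetic stand-in, not a real
vmlinux.bin.bz2.

```python
import bz2
import struct


def check_bz2_payload(blob):
    """Decompress a bzip2 stream and compare its length to a size trailer.

    Assumes: one bzip2 stream, then a 4-byte little-endian uncompressed
    size. Returns (decompressed_length, recorded_length).
    """
    decomp = bz2.BZ2Decompressor()
    data = decomp.decompress(blob)
    if not decomp.eof:
        raise ValueError("bzip2 stream is truncated")
    trailer = decomp.unused_data  # bytes left over after the stream ends
    if len(trailer) != 4:
        raise ValueError("expected a 4-byte size trailer, got %d bytes" % len(trailer))
    (recorded,) = struct.unpack("<I", trailer)
    return len(data), recorded


# Synthetic stand-in for the real payload.
raw = b"\x90" * 100000
blob = bz2.compress(raw, 9) + struct.pack("<I", len(raw))

actual, recorded = check_bz2_payload(blob)
print(actual == recorded)
```

If the two lengths agree on the host but the machine still reboots at
"Decompressing Linux", the payload is probably fine and the problem is
more likely in how the decompressor was compiled or where it runs.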

> Here are some more information, just ask if you need some more. I can
> even do some testing, but you'll need to cc: me as i'm not anymore on  
> lkml.

I really, really, really miss kernel-traffic. And Jon Masters isn't  
doing the kernel podcast anymore since he eloped with an Arm board. (As  
soon as he got the debug results back they had a quiet, tasteful  
ceremony in front of a compiler and a linker.)

> * the cpu is "AMD Athlon(tm) II X4 620 Processor" as reported by
> /proc/cpuinfo
> * CONFIG_DECOMPRESS_BZIP2=y was set on all my tests (not sure it's  
> relevant)
> * the last 3.6 kernel tested was 3.6.11
> * 3.7 kernels tested were 3.7.1, 3.7.2, and 3.7.3
> * the computer will reboot really fast just after the kernel kprinted
> "Decompressing Linux", nothing is kprinted after this.

95% chance it's a bad memory access. If the kernel does a bad memory  
access before the page fault handler is set up the interrupt turns into  
a reboot, and during initial kernel decompression we haven't so much  
got page tables set up as a couple TLB entries held in place with twist  
ties. (The code to do a better job is in the compressed payload, the  
setup code gets thrown away so it's as simple as possible. Occasionally  
simpler, but I don't think that's the case here.)

It's also got a tiny, fixed size stack. My first wild-ass-guess is your  
compiler's decided your C code really needs exception handling because  
it can't tell the difference between C and C++ (reasoning: a mud pie  
contains a glass of water, therefore mud pies are an excellent  
beverage), and is crapping stuff on the stack.

That probably isn't it.

Rob


* Re: Kernel failing to boot when compressed with bzip2
  2013-01-30 13:58 ` Rob Landley
@ 2013-01-30 21:15   ` Thomas Capricelli
  0 siblings, 0 replies; 3+ messages in thread
From: Thomas Capricelli @ 2013-01-30 21:15 UTC (permalink / raw)
  To: Rob Landley; +Cc: linux-kernel


On 30/01/2013 14:58, Rob Landley wrote:
>
> On 01/20/2013 03:55:05 PM, Thomas Capricelli wrote:
>
>> So my guess is that there's something badly broken in the bzip2 kernel
>> decompressing code.. ? There's both a regression between kernel 3.6 and
>> 3.7, and a problem with gcc-4.7.

> Worked For Me. I note that I built with gcc 4.2.1 and binutils 2.17
> (the last GPLv2 releases), and it's for i686

I guess that's the main point: my bug appears on AMD64, not i686, and I
can trigger it with either gcc-4.6 or gcc-4.7. I'm pretty sure it still
works with a gcc as old as 4.2.1.
I use binutils 2.23.1 but I doubt it matters here.

Again, amd64 using vanilla kernel releases (3.x.y)
    linux-3.6 with gcc-4.6 : ok
    linux-3.6 with gcc-4.7 : fail
    linux-3.7 with gcc-4.6 : fail
    linux-3.7 with gcc-4.7 : fail
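
The matrix above can be read as: starting from the one working
combination, each change on its own is enough to trigger the reboot. A
small sketch that makes this explicit, using only the four results
reported in this thread (the function name `single_change_failures` is
hypothetical):

```python
# Results as reported in this thread (amd64, CONFIG_KERNEL_BZIP2=y).
# Keys are (kernel version, gcc version).
results = {
    ("3.6", "4.6"): "ok",
    ("3.6", "4.7"): "fail",
    ("3.7", "4.6"): "fail",
    ("3.7", "4.7"): "fail",
}


def single_change_failures(results, baseline):
    """Failing combinations that differ from the baseline in exactly one axis."""
    found = []
    for key, outcome in sorted(results.items()):
        differing = sum(1 for a, b in zip(key, baseline) if a != b)
        if outcome == "fail" and differing == 1:
            found.append(key)
    return found


print(single_change_failures(results, ("3.6", "4.6")))
```

Both the gcc-only change and the kernel-only change show up, so neither
the compiler nor the kernel version can be singled out from this data
alone.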

I've had this BZIP2 option enabled for a very long time, so I'm sure it
was working well with previous versions of both gcc and the Linux kernel.

greetings,
Thomas

-- 
Thomas Capricelli <orzel@freehackers.org>
http://www.freehackers.org/thomas/

