* [regression?] 2.6.26 floppy boot failure with kernel packed using 'upx'
From: Frans Pop @ 2008-07-10 4:54 UTC
To: Ian Campbell; +Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, Linus Torvalds
For the Debian installer we've been tracing a problem with our installation
boot floppy. If booted with 'expert' at the syslinux prompt, it only shows:
<snip>
SYSLINUX 3.70 <etc.>
boot:expert
Loading linux......................
Loading initrd.gz....ready.
Probing EDD (edd=off to disable)... ok
</snip>
And then the emulator crashes (both VirtualBox and qemu), or on real
hardware the system reboots. The qemu crash is included at the bottom
of this mail.
The problem was initially seen with the 2.6.25 Debian kernel and traced
to a set of Xen-related patches backported from upstream 2.6.26. Floppies
using Debian 2.6.24 or pristine 2.6.24/2.6.25 don't show the problem.
Bisection has shown the culprit to be this very early 2.6.26 commit:
$ git bisect bad
099e1377269a47ed30a00ee131001988e5bcaa9c is first bad commit
commit 099e1377269a47ed30a00ee131001988e5bcaa9c
Author: Ian Campbell <ijc@hellion.org.uk>
Date: Wed Feb 13 20:54:58 2008 +0000
x86: use ELF format in compressed images.
An important factor here is that we "pack" the kernel using upx [1] (in order
to fit everything on a floppy). The original (unpacked) kernel after this
commit boots fine; only the packed version fails.
We have tried upx versions 2.01, 3.01 and 3.03, all with the same result.
Both "good" (before commit) and "bad" (after commit) images are available
at: http://people.debian.org/~fjp/tmp/d-i/floppy/upx/
Included are the boot floppy image, the raw kernel and the packed kernel.
The issue can also be reproduced using qemu without booting the floppy
itself. For the "bad" image:
# Boots correctly (but fails when mounting root fs):
$ qemu -kernel vmlinuz -hda /dev/zero
# Fails:
$ qemu -kernel vmlinuz.upx -hda /dev/zero
So, the primary question here is:
- is this a kernel regression because whatever changed no longer conforms
to the "kernel format specs", or
- is this a latent issue in upx that somehow creates an invalid image, or
- does this change effectively create a new "type" of image that upx
just doesn't yet know how to handle correctly?
And a follow-up question in the last two cases: how likely is it that this
change could/will cause similar issues in other comparable scenarios?
Note that we've been using this same compression technique for ages in
the Debian installer without any problems.
Cheers,
FJP
[1] http://upx.sourceforge.net/
upx command used for compression and output
-------------------------------------------
$ upx -f -9 vmlinuz
Ultimate Packer for eXecutables
Copyright (C) 1996,1997,1998,1999,2000,2001,2002,2003,2004,2005,2006,2007
UPX 3.01 Markus Oberhumer, Laszlo Molnar & John Reiser Jul 31st 2007
File size Ratio Format Name
-------------------- ------ ----------- -----------
1312304 -> 1245723 94.93% bvmlinuz/386 vmlinuz
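The figures in the upx output above can be checked with a little arithmetic. This is only a sketch: the two file sizes are taken from the output, while the floppy capacity (a standard 1.44 MB disk) is an assumption, since the thread does not state the medium size.

```python
# File sizes as reported in the upx output above.
original = 1_312_304
packed = 1_245_723

# Assumption: a standard 1.44 MB floppy (2880 sectors of 512 bytes).
FLOPPY_BYTES = 2880 * 512

saved = original - packed            # bytes gained by packing
ratio = packed / original            # upx prints this as "Ratio"
floppy_share = saved / FLOPPY_BYTES  # share of the assumed floppy that is freed

print(f"saved {saved} bytes, ratio {ratio:.2%}, "
      f"{floppy_share:.1%} of the floppy")
```

The ratio matches the 94.93% printed by upx, i.e. the packed kernel is roughly 5% smaller than the original.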
qemu crash output
-----------------
$ qemu -fda boot.img
qemu: fatal: triple fault
EAX=00000018 EBX=00000000 ECX=00000000 EDX=003646f6
ESI=000333a1 EDI=00000000 EBP=ffffffb0 ESP=0003c367
EIP=00101015 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 00000000 00000000 00000000
CS =0010 00000000 ffffffff 00cf9b00
SS =0018 00000000 ffffffff 00cf9300
DS =0000 00000000 00000000 00000000
FS =0018 00000000 ffffffff 00cf9300
GS =0018 00000000 ffffffff 00cf9300
LDT=0000 00000000 00000000 00008000
TR =0020 00001000 00000067 00008900
GDT= 656e6900 0000646e
IDT= 00000000 00000000
CR0=60000011 CR2=00000000 CR3=00000000 CR4=00000000
CCS=ffeb7200 CCD=00000000 CCO=LOGICB
FCW=037f FSW=4000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=00000000000000000000000000000000 XMM01=00000000000000000000000000000000
XMM02=00000000000000000000000000000000 XMM03=00000000000000000000000000000000
XMM04=00000000000000000000000000000000 XMM05=00000000000000000000000000000000
XMM06=00000000000000000000000000000000 XMM07=00000000000000000000000000000000
Aborted
* Re: [regression?] 2.6.26 floppy boot failure with kernel packed using 'upx'
From: Ian Campbell @ 2008-07-10 7:47 UTC
To: Frans Pop
Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, Linus Torvalds,
H. Peter Anvin
On Thu, 2008-07-10 at 06:54 +0200, Frans Pop wrote:
> The issue can also be reproduced using qemu without booting the floppy
> itself. For the "bad" image:
> # Boots correctly (but fails when mounting root fs):
> $ qemu -kernel vmlinuz -hda /dev/zero
> # Fails:
> $ qemu -kernel vmlinuz.upx -hda /dev/zero
I can repro this.
> So, the primary question here is:
> - is this a kernel regression because whatever changed no longer conforms
> to the "kernel format specs", or
> - is this a latent issue in upx that somehow creates an invalid image, or
> - does this change effectively create a new "type" of image that upx
> just doesn't yet know how to handle correctly?
This is the first time I've looked at UPX but from glancing through the
code it certainly appears to make a lot of assumptions about the
structure of the bzImage (to the point that it looks for specific code
sequences within the binary).
It seems that the way UPX works is that it extracts the compressed image
from the bzImage, recompresses it and rebuilds a new bzImage replacing
the decompression stage (and possibly some of the other 16 bit startup,
I'm not quite sure yet) with its own. The issue is that its new
decompressor does not understand the ELF format and expects a raw
binary.
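The distinction between an ELF payload and a raw binary can be illustrated with a short check. This is a minimal sketch, not UPX's or the kernel's actual logic: it assumes the payload has already been extracted from the bzImage and is gzip-compressed, and `classify_payload` is a hypothetical helper name.

```python
import zlib

ELF_MAGIC = b"\x7fELF"  # first four bytes of any ELF file


def classify_payload(compressed: bytes) -> str:
    """Decompress a gzip payload and report whether it looks like ELF.

    Hypothetical helper: returns "elf" if the decompressed data starts
    with the ELF magic, otherwise "raw".
    """
    # wbits = MAX_WBITS | 16 selects gzip framing; using a decompress
    # object tolerates trailing bytes after the compressed stream.
    decompressor = zlib.decompressobj(zlib.MAX_WBITS | 16)
    head = decompressor.decompress(compressed)[:4]
    return "elf" if head == ELF_MAGIC else "raw"
```

A decompressor that copies the unpacked bytes to the load address and jumps to them only works in the "raw" case; in the "elf" case the program headers have to be parsed first.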
I think that UPX probably has gone a bit beyond the documented
interfaces, but not in an unreasonable way. In fact the changeset which
you referenced (or one of the ones around it) actually adds further
documentation (in Documentation/x86/i386/boot.txt) and header fields to
aid in doing the sort of extraction UPX wants to do and documents more
explicitly the formats which can be expected to be found there. Also
around the same time a checksum field was defined which is invalidated
by the repacking.
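The header fields mentioned above could be read along these lines. This is a hedged sketch: the offsets used (setup_sects at 0x1F1, the "HdrS" magic at 0x202, the protocol version at 0x206, payload_offset at 0x248 and payload_length at 0x24C) are recalled from the boot protocol document and should be verified against boot.txt before being relied upon; `read_payload_fields` is a hypothetical name.

```python
import struct


def read_payload_fields(image: bytes):
    """Return (version, absolute payload offset, payload length).

    Offsets are assumptions taken from the x86 boot protocol document;
    for protocol versions before 2.08 the payload fields do not exist
    and None is returned for both.
    """
    if image[0x202:0x206] != b"HdrS":
        raise ValueError("no HdrS magic: not a bzImage with a setup header")

    (version,) = struct.unpack_from("<H", image, 0x206)
    if version < 0x0208:
        return version, None, None

    # setup_sects == 0 means 4 for backwards compatibility.
    setup_sects = image[0x1F1] or 4
    # Protected-mode code starts after the boot sector and setup sectors.
    pm_start = (setup_sects + 1) * 512

    # payload_offset is relative to the start of the protected-mode code.
    payload_offset, payload_length = struct.unpack_from("<II", image, 0x248)
    return version, pm_start + payload_offset, payload_length
```

With these two fields a repacking tool can locate the compressed payload without searching the image for specific code sequences.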
All in all I'd say it should be treated as a new subtype/variant which
UPX should support. I'd say adding support for bzImage v2.08 to UPX
would be pretty easy for someone who knows the code base (I couldn't
even find the decompressor code, but then it's pre-coffee time here..).
Ian.
--
Ian Campbell
Fame lost its appeal for me when I went into a public restroom and an
autograph seeker handed me a pen and paper under the stall door.
-- Marlo Thomas
* Re: [regression?] 2.6.26 floppy boot failure with kernel packed using 'upx'
From: Frans Pop @ 2008-07-10 9:25 UTC
To: Ian Campbell
Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, Linus Torvalds,
H. Peter Anvin
On Thursday 10 July 2008, Ian Campbell wrote:
> On Thu, 2008-07-10 at 06:54 +0200, Frans Pop wrote:
> > The issue can also be reproduced using qemu without booting the
> > floppy itself. For the "bad" image:
> > # Boots correctly (but fails when mounting root fs):
> > $ qemu -kernel vmlinuz -hda /dev/zero
> > # Fails:
> > $ qemu -kernel vmlinuz.upx -hda /dev/zero
>
> I can repro this.
>
> All in all I'd say it should be treated as a new subtype/variant which
> UPX should support. I'd say adding support for bzImage v2.08 to UPX
> would be pretty easy for someone who knows the code base (I couldn't
> even find the decompressor code, but then it's pre-coffee time here..).
Thanks for the quick and complete reply, Ian. I already expected this but
thought it would at least be useful to inform people here of the issue
(and get it indexed for search engines).
I have no problems with it not being treated as a bug/regression.
I've now filed a bug report against upx:
http://sourceforge.net/tracker/?func=detail&aid=2014835&group_id=2331&atid=102331
Cheers,
FJP
* Re: [regression?] 2.6.26 floppy boot failure with kernel packed using 'upx'
From: H. Peter Anvin @ 2008-07-10 13:49 UTC
To: Frans Pop
Cc: Ian Campbell, linux-kernel, Thomas Gleixner, Ingo Molnar,
Linus Torvalds
Frans Pop wrote:
>
> So, the primary question here is:
> - is this a kernel regression because whatever changed no longer conforms
> to the "kernel format specs", or
> - is this a latent issue in upx that somehow creates an invalid image, or
> - does this change effectively create a new "type" of image that upx
> just doesn't yet know how to handle correctly?
>
Sounds to me like UPX makes assumptions about the kernel image format
that it really doesn't have standing to assume.
-hpa
* Re: [regression?] 2.6.26 floppy boot failure with kernel packed using 'upx'
From: Linus Torvalds @ 2008-07-10 17:17 UTC
To: Frans Pop
Cc: Ian Campbell, Linux Kernel Mailing List, Thomas Gleixner,
Ingo Molnar, Peter Anvin
On Thu, 10 Jul 2008, Frans Pop wrote:
>
> An important factor here is that we "pack" the kernel using upx [1] (in order
> to fit everything on a floppy). The original (unpacked) kernel after this
> commit boots fine; only the packed version fails.
> We have tried upx versions 2.01, 3.01 and 3.03, all with the same result.
Ok, I do not consider this to be a regression.
Clearly UPX is expecting a very specific kernel format, and clearly UPX
just needs to be updated for the changes.
That said, if some UPX person can explain what it is that UPX needs, maybe
we can add some format markers into the kernel (and perhaps some extra
header info) so that future format changes are transparent or at least
break in obvious ways.
Linus
* Re: [regression?] 2.6.26 floppy boot failure with kernel packed using 'upx'
From: H. Peter Anvin @ 2008-07-10 17:49 UTC
To: Linus Torvalds
Cc: Frans Pop, Ian Campbell, Linux Kernel Mailing List,
Thomas Gleixner, Ingo Molnar
Linus Torvalds wrote:
>
> On Thu, 10 Jul 2008, Frans Pop wrote:
>> An important factor here is that we "pack" the kernel using upx [1] (in order
>> to fit everything on a floppy). The original (unpacked) kernel after this
>> commit boots fine; only the packed version fails.
>> We have tried upx versions 2.01, 3.01 and 3.03, all with the same result.
>
> Ok, I do not consider this to be a regression.
>
> Clearly UPX is expecting a very specific kernel format, and clearly UPX
> just needs to be updated for the changes.
>
> That said, if some UPX person can explain what it is that UPX needs, maybe
> we can add some format markers into the kernel (and perhaps some extra
> header info) so that future format changes are transparent or at least
> break in obvious ways.
>
The issue with UPX is that UPX actually wants to replace the kernel
decompressor, which also does other tasks (relocation and ELF parsing)
in recent kernels.
-hpa
* Re: [regression?] 2.6.26 floppy boot failure with kernel packed using 'upx'
From: Frans Pop @ 2008-07-10 18:18 UTC
To: H. Peter Anvin
Cc: Linus Torvalds, Ian Campbell, Linux Kernel Mailing List,
Thomas Gleixner, Ingo Molnar
On Thursday 10 July 2008, H. Peter Anvin wrote:
> The issue with UPX is that UPX actually wants to replace the kernel
> decompressor, which also does other tasks (relocation and ELF parsing)
> in recent kernels.
It does save us about 5% though, which is quite substantial when you're
talking about fitting a kernel on a boot floppy. For the past few kernel
versions it has made the difference between being able to base
installation floppies on the standard Debian kernel (generic 486 flavor)
or having to come up with radical new solutions.
From my tests for this issue it looks like .26 has grown so much that
we've now reached that point anyway though :-(
Cheers,
FJP