* wrong final bzImage build (regading #14270)
From: Michael Tokarev @ 2009-10-09 14:17 UTC
To: Kernel Mailing List
Cc: Rafael J. Wysocki, Cyrill Gorcunov, Kernel Testers List

Ok, finally the mystery solved.  After a week of
digging.

The original problem was titled "Cannot boot on
a PIII Celeron", and Rafael filed a bug #14270
for this.

In short, what I observed was that a new kernel
(2.6.31) fails to boot on a PIII Celeron machine.
But changing just the CPU to plain PIII and voila,
it now works.  I don't know why it behaved this
way, but I found where the problem was, finally.

And the problem is in the last stage of the build, when
building the bzImage.

  make -f scripts/Makefile.build obj=arch/x86/boot/compressed arch/x86/boot/compressed/vmlinux
  ...
  (cat arch/x86/boot/compressed/vmlinux.bin | lzma -9 && echo -ne \\x38\\xd6\\x37\\x00) > arch/x86/boot/compressed/vmlinux.bin.lzma
  ...

Note the echo command.

Now, Debian switched to dash as /bin/sh.  And dash
does not understand the -e option:

  $ dash -c 'echo -ne \\x38\\xd6\\x37\\x00' | od -x
  0000000 6e2d 2065 785c 3833 785c 3664 785c 3733
  0000020 785c 3030 000a

  $ bash -c 'echo -ne \\x38\\xd6\\x37\\x00' | od -x
  0000000 d638 0037

So the final size (it's the size of the uncompressed file)
becomes incorrect.  Here's what mkpiggy outputs for
this (in arch/x86/boot/compressed/piggy.S):

  z_output_len = 170930296

while it should be

  z_output_len = 3659320

And with the former (wrong, larger) size, the whole
thing just reboots on a PIII Celeron.  I've no idea
why, but the original problem is here.

The same thing happens with the bzip2 algorithm (which is
not new), not only with lzma.

The whole thing looks quite hackish to me: mkpiggy
can get the size from the original image just fine,
instead of reading it from the end of the already compressed
file.

For now, the quick fix is to change echo to printf in there.
The correct fix is to rewrite mkpiggy to look at the
original file for the size (IMHO anyway).

And this is a very good candidate for -stable as well.
The bug is very difficult to find.  And now that more
and more people who use Debian are switching to dash,
it will be more common.

Thanks!

/mjt
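The two z_output_len values in the message above can be checked by hand. The numbers are consistent with the size being read as a little-endian 32-bit value from the last four bytes of the compressed file: with dash, those bytes are the tail of the literal text (`x`, `0`, `0` and a newline, i.e. 78 30 30 0a in the od dump). A minimal sketch of that arithmetic follows; `le32_from_hex` is a hypothetical helper name used only for illustration and is not part of the kernel tree.

```shell
#!/bin/sh
# le32_from_hex B0 B1 B2 B3
# Combine four hex byte values, lowest byte first, into one
# little-endian 32-bit integer and print it in decimal.
# (Hypothetical helper, for illustration only.)
le32_from_hex() {
    printf '%d\n' "$(( 0x$1 | (0x$2 << 8) | (0x$3 << 16) | (0x$4 << 24) ))"
}

# Last four bytes written under dash: 'x' '0' '0' newline
le32_from_hex 78 30 30 0a

# The four bytes the build intended to append
le32_from_hex 38 d6 37 00
```

The first call prints 170930296 and the second prints 3659320, matching the wrong and the expected z_output_len quoted in the message.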
* Re: wrong final bzImage build (regading #14270)
From: Michael Tokarev @ 2009-10-09 14:26 UTC
To: Kernel Mailing List
Cc: Rafael J. Wysocki, Cyrill Gorcunov, Kernel Testers List

And I forgot to mention: this IS a regression in 2.6.31.

Michael Tokarev wrote:
> Ok, finally the mystery solved.  After a week of
> digging.
> [...]
* Re: wrong final bzImage build (regading #14270)
From: Cyrill Gorcunov @ 2009-10-09 14:58 UTC
To: Michael Tokarev
Cc: Kernel Mailing List, Rafael J. Wysocki, Kernel Testers List, Sam Ravnborg, H. Peter Anvin

Peter and Sam CC'ed

[Michael Tokarev - Fri, Oct 09, 2009 at 06:17:50PM +0400]
> Ok, finally the mystery solved.  After a week of
> digging.
> [...]

	-- Cyrill
* Re: wrong final bzImage build (regading #14270)
From: H. Peter Anvin @ 2009-10-09 17:03 UTC
To: Cyrill Gorcunov
Cc: Michael Tokarev, Kernel Mailing List, Rafael J. Wysocki, Kernel Testers List, Sam Ravnborg

On 10/09/2009 07:58 AM, Cyrill Gorcunov wrote:
> Peter and Sam CC'ed
>
> [Michael Tokarev - Fri, Oct 09, 2009 at 06:17:50PM +0400]
>> Ok, finally the mystery solved.  After a week of
>> digging.
>>
>> The original problem was titled "Cannot boot on
>> a PIII Celeron", and Rafael filed a bug #14270
>> for this.
>>
>> In short, what I observed was that a new kernel
>> (2.6.31) fails to boot on a PIII Celeron machine.
>> But changing just the CPU to plain PIII and voila,
>> it now works.  I don't know why it behaved this
>> way, but I found where the problem was, finally.
>>

We should switch to printf here.  Hexadecimal constants in echo aren't
guaranteed by POSIX.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.
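As a rough illustration of the printf suggestion above: POSIX specifies octal `\ddd` escapes in the printf format string, while `\x` hex escapes are an extension there too, so a portable variant converts each byte to octal first. The sketch below is an assumption-laden example, not the change that was eventually applied to the kernel; `emit_le32` is a hypothetical function name.

```shell
#!/bin/sh
# emit_le32 SIZE
# Write SIZE (a decimal integer) to stdout as four raw bytes,
# least significant byte first. (Hypothetical helper.)
emit_le32() {
    n=$1
    i=0
    while [ "$i" -lt 4 ]; do
        byte=$(( n & 255 ))
        # The inner printf builds a \ooo octal escape as text; the outer
        # printf then interprets that escape and emits the raw byte.
        printf "$(printf '\\%03o' "$byte")"
        n=$(( n >> 8 ))
        i=$(( i + 1 ))
    done
}

# Dump the four bytes for the size quoted in this thread.
emit_le32 3659320 | od -An -tx1
```

For the size 3659320 the dump shows the bytes 38 d6 37 00, the same sequence the original echo was meant to produce, and the result does not depend on which shell provides /bin/sh.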
* Re: wrong final bzImage build (regading #14270)
From: Michael Tokarev @ 2009-10-09 17:14 UTC
To: H. Peter Anvin
Cc: Cyrill Gorcunov, Kernel Mailing List, Rafael J. Wysocki, Kernel Testers List, Sam Ravnborg

H. Peter Anvin wrote:
> On 10/09/2009 07:58 AM, Cyrill Gorcunov wrote:
>> Peter and Sam CC'ed
>>
>> [Michael Tokarev - Fri, Oct 09, 2009 at 06:17:50PM +0400]
>>> Ok, finally the mystery solved.  After a week of
>>> digging.
>>>
>>> The original problem was titled "Cannot boot on
>>> a PIII Celeron", and Rafael filed a bug #14270
>>> for this.
>>>
>>> In short, what I observed was that a new kernel
>>> (2.6.31) fails to boot on a PIII Celeron machine.
>>> But changing just the CPU to plain PIII and voila,
>>> it now works.  I don't know why it behaved this
>>> way, but I found where the problem was, finally.
>
> We should switch to printf here.  Hexadecimal constants in echo aren't
> guaranteed by POSIX.

That's what I initially proposed.  However, as Scott Olson
pointed out, there's already a fix for this:

  http://lkml.org/lkml/2009/8/19/84
  http://patchwork.kernel.org/patch/42564/

which uses the still non-portable /bin/echo.

(I wish I had known about it a week ago; it wasn't
a pleasant week for me.)

Still an interesting result.  I could understand it
failing on systems with small amounts of memory, but
no: it fails with a Celeron on a 64MB system, yet works
on the same system if I replace the CPU with a real PIII...
Fun.

/mjt
* Re: wrong final bzImage build (regading #14270)
From: Michael Tokarev @ 2009-10-09 19:39 UTC
To: Cyrill Gorcunov
Cc: Kernel Mailing List, Rafael J. Wysocki, Kernel Testers List, Sam Ravnborg, H. Peter Anvin

Ok, some more to this.

It turns out dash's built-in echo command interprets \nnn octal
sequences by default, and there's no way to turn that off.  So,
for example, the sed-zoffset command from arch/x86/boot/Makefile
(which includes \1 \2 etc. substitutions for sed), when echoed
in verbose mode (V=1), produces... interesting characters (with
ASCII codes 1 and 2).

It's not practical to replace V=1's echo with /bin/echo, I think.

So I'd say it's not a bug in the build system after all, but
a bug in dash.  Well, at least this expanding-by-default didn't
trigger another very-difficult-to-find bug (hopefully), but it
has good potential.

I'll file a bug report against dash.

/mjt

> [Michael Tokarev - Fri, Oct 09, 2009 at 06:17:50PM +0400]
>> Ok, finally the mystery solved.  After a week of
>> digging.
>> [...]
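The garbled verbose output described above comes from echo expanding backslash sequences in its argument. One way to print such a string unchanged in any POSIX shell is to pass it as an argument to printf's `%s` conversion, which does not interpret escapes in arguments. A minimal sketch, assuming a made-up sed expression (not the actual sed-zoffset command) and a hypothetical helper name `show_cmd`:

```shell
#!/bin/sh
# show_cmd STRING
# Print STRING exactly as given, followed by a newline. Backslash
# sequences such as \1 and \2 are not interpreted, because %s copies
# its argument verbatim. (Hypothetical helper.)
show_cmd() {
    printf '%s\n' "$1"
}

# Illustrative sed expression with backreferences.
show_cmd 's/\(a\)\(b\)/\2\1/'
```

The backreferences come out as the two-character sequences `\2` and `\1` regardless of whether /bin/sh is dash or bash.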
* Re: wrong final bzImage build (regading #14270)
From: Cyrill Gorcunov @ 2009-10-09 19:59 UTC
To: Michael Tokarev
Cc: Kernel Mailing List, Rafael J. Wysocki, Kernel Testers List, Sam Ravnborg, H. Peter Anvin

[Michael Tokarev - Fri, Oct 09, 2009 at 11:39:48PM +0400]
> Ok, some more to this.
>
> It turns out dash's built-in echo command interprets \nnn octal
> sequences by default, and there's no way to turn that off.  So,
> for example, sed-zoffset command from arch/x86/boot/Makefile
> (which includes \1 \2 etc substitutions for sed), when echoed
> in verbose mode (V=1), produces.. interesting characters (with
> ascii code 1 and 2).
>
> It's not practical to replace V=1's echo with /bin/echo I think.
>
> So I'd say it's not a bug in the build system after all, but
> a bug in dash.  Well, at least this expanding-by-default didn't
> trigger another very-difficult-to-find bug (hopefully), but it
> has good potential.
>
> I'll file a bug report against dash.
>
> /mjt
>

OK, thanks Michael!

	-- Cyrill
* Re: wrong final bzImage build (regading #14270)
From: Arkadiusz Miskiewicz @ 2009-10-09 20:02 UTC
To: linux-kernel
Cc: Michael Tokarev, Cyrill Gorcunov, Rafael J. Wysocki, Kernel Testers List, Sam Ravnborg, H. Peter Anvin

On Friday 09 of October 2009, Michael Tokarev wrote:
> Ok, some more to this.
>
> It turns out dash's built-in echo command interprets \nnn octal
> sequences by default, and there's no way to turn that off.  So,
> for example, sed-zoffset command from arch/x86/boot/Makefile
> (which includes \1 \2 etc substitutions for sed), when echoed
> in verbose mode (V=1), produces.. interesting characters (with
> ascii code 1 and 2).
>
> It's not practical to replace V=1's echo with /bin/echo I think.
>
> So I'd say it's not a bug in the build system after all, but
> a bug in dash.

It's still a bug in the build system if you consider /bin/sh to be a
POSIX shell.  POSIX shells don't support \hex notation (see the Single
UNIX Specification).

I had exactly this problem a few weeks ago with pdksh as /bin/sh (and
reported the bug to the author of that change).  As a workaround I used
/bin/echo, but using printf is more sane/portable.

-- 
Arkadiusz Miśkiewicz        PLD/Linux Team
arekm / maven.pl            http://ftp.pld-linux.org/
* Re: wrong final bzImage build (regading #14270)
From: H. Peter Anvin @ 2009-10-09 20:56 UTC
To: Arkadiusz Miskiewicz
Cc: linux-kernel, Michael Tokarev, Cyrill Gorcunov, Rafael J. Wysocki, Kernel Testers List, Sam Ravnborg

On 10/09/2009 01:02 PM, Arkadiusz Miskiewicz wrote:
>
> I had exactly this problem a few weeks ago with pdksh as /bin/sh (and
> reported the bug to the author of that change).  As a workaround I used
> /bin/echo, but using printf is more sane/portable.
>

Yes, using printf is the right thing to do.

A patch would be appreciated.

	-hpa
* Re: wrong final bzImage build (regading #14270)
From: Michael Tokarev @ 2009-10-09 21:27 UTC
To: H. Peter Anvin
Cc: Arkadiusz Miskiewicz, linux-kernel, Cyrill Gorcunov, Rafael J. Wysocki, Kernel Testers List, Sam Ravnborg

H. Peter Anvin wrote:
> On 10/09/2009 01:02 PM, Arkadiusz Miskiewicz wrote:
>> I had exactly this problem a few weeks ago with pdksh as /bin/sh (and
>> reported the bug to the author of that change).  As a workaround I used
>> /bin/echo, but using printf is more sane/portable.
>>
>
> Yes, using printf is the right thing to do.
>
> A patch would be appreciated.

Come on, it's just a one-word change (s/echo/printf/ in
scripts/Makefile.lib).
But it should go to Sam's tree first I guess, which already
has s|echo|/bin/echo| so it'll conflict.
It's easier to change it in whatever tree it will be changed
without complete patches.

/mjt
* Re: wrong final bzImage build (regading #14270)
From: H. Peter Anvin @ 2009-10-09 21:29 UTC
To: Michael Tokarev
Cc: Arkadiusz Miskiewicz, linux-kernel, Cyrill Gorcunov, Rafael J. Wysocki, Kernel Testers List, Sam Ravnborg

On 10/09/2009 02:27 PM, Michael Tokarev wrote:
> H. Peter Anvin wrote:
>> On 10/09/2009 01:02 PM, Arkadiusz Miskiewicz wrote:
>>> I had exactly this problem a few weeks ago with pdksh as /bin/sh (and
>>> reported the bug to the author of that change).  As a workaround I used
>>> /bin/echo, but using printf is more sane/portable.
>>>
>>
>> Yes, using printf is the right thing to do.
>>
>> A patch would be appreciated.
>
> Come on, it's just a one-word change (s/echo/printf/ in
> scripts/Makefile.lib).
> But it should go to Sam's tree first I guess, which already
> has s|echo|/bin/echo| so it'll conflict.
> It's easier to change it in whatever tree it will be changed
> without complete patches.

So send a patch against Sam's tree.

	-hpa
Thread overview: 11+ messages
2009-10-09 14:17 wrong final bzImage build (regading #14270) Michael Tokarev
2009-10-09 14:26 ` Michael Tokarev
2009-10-09 14:58 ` Cyrill Gorcunov
2009-10-09 17:03 ` H. Peter Anvin
2009-10-09 17:14 ` Michael Tokarev
2009-10-09 19:39 ` Michael Tokarev
2009-10-09 19:59 ` Cyrill Gorcunov
2009-10-09 20:02 ` Arkadiusz Miskiewicz
2009-10-09 20:56 ` H. Peter Anvin
2009-10-09 21:27 ` Michael Tokarev
2009-10-09 21:29 ` H. Peter Anvin